Press release

Computer science: Improving AI-guided reviews of scientific literature (Nature)

5 February 2026

OpenScholar, an open-source language model that can outperform commercial large language models (LLMs) at producing accurate literature reviews, is presented in Nature this week. For example, in experiments carried out as part of this study, GPT4o hallucinated citations around 78–90% of the time, whereas OpenScholar’s citation accuracy was similar to that of human experts. Although further improvements may be needed, the tool has the potential to help scientists navigate the complex, ever-growing task of scientific literature review.

Reviewing scientific literature plays an important role in supporting evidence-based decisions, fine-tuning scientific processes, and directing new discoveries. However, the increasing volume of publications makes it difficult for researchers to stay fully informed. LLMs may be of assistance, but they are prone to errors such as limited attribution and hallucinated references.

With the goal of generating accurate, comprehensive, and transparent scientific literature reviews, Akari Asai, Hannaneh Hajishirzi and colleagues present OpenScholar, a retrieval-augmented language model designed specifically for scientific research tasks. Other systems have used this framework, but the authors combine it with a specialized data store of 45 million up-to-date open-access scientific papers and a self-assessment mechanism that the model uses to refine its own output. The authors also create a benchmarking tool called ScholarQABench to evaluate literature review automation. OpenScholar is shown to outperform existing systems such as GPT4o and PaperQA2 (a tool designed for literature synthesis) in correctness by 6.1% and 5.5%, respectively. In addition, OpenScholar generates answers that are more helpful than those produced by expert annotators around 50% to 70% of the time. These results, together with the substantial reduction in citation hallucinations, demonstrate the potential of OpenScholar to support and accelerate future research efforts, the authors conclude.
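
As a rough illustration of the general retrieval-augmented pattern described above (retrieve passages from a data store, draft an answer with citations, then check the draft before returning it), the following toy sketch may help. It is not OpenScholar's implementation: the `Passage` type, the keyword-overlap `retrieve` function, the `draft_answer` stand-in for a language model (which deliberately invents one citation, `paper-999`), and the `verify_citations` check are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """One entry in a toy data store (hypothetical structure)."""
    paper_id: str
    text: str


def tokenize(text: str) -> set[str]:
    # Lowercase words with surrounding punctuation stripped.
    return {w.strip(".,;:()?").lower() for w in text.split()} - {""}


def retrieve(query: str, store: list[Passage], k: int = 2) -> list[Passage]:
    # Rank passages by word overlap with the query; a real system would
    # use a trained retriever. Passages with no overlap are dropped.
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in store]
    ranked = sorted(scored, key=lambda s: s[0], reverse=True)
    return [p for score, p in ranked if score > 0][:k]


def draft_answer(passages: list[Passage]) -> list[tuple[str, str]]:
    # Stand-in for a language model: one (claim, cited paper) pair per
    # passage, plus one invented citation to mimic a hallucination.
    claims = [(p.text, p.paper_id) for p in passages]
    claims.append(("An unsupported claim.", "paper-999"))
    return claims


def verify_citations(claims: list[tuple[str, str]],
                     passages: list[Passage]) -> list[tuple[str, str]]:
    # Simplified self-check: keep only claims whose citation points to a
    # passage that was actually retrieved.
    known = {p.paper_id for p in passages}
    return [(claim, pid) for claim, pid in claims if pid in known]


def answer(query: str, store: list[Passage], k: int = 2) -> list[tuple[str, str]]:
    passages = retrieve(query, store, k)
    return verify_citations(draft_answer(passages), passages)


STORE = [
    Passage("paper-1", "Retrieval augmented language models ground answers in retrieved passages."),
    Passage("paper-2", "Citation hallucination is common in language models without retrieval."),
    Passage("paper-3", "Protein folding depends on amino acid sequence."),
]
```

Calling `answer("How do retrieval augmented language models reduce hallucination?", STORE)` returns only claims that cite `paper-1` or `paper-2`; the invented `paper-999` citation is filtered out because it does not correspond to any retrieved passage.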

However, they note that the system still has limitations and emphasize that language model-based systems cannot fully automate scientific literature synthesis. They are making both ScholarQABench and OpenScholar available to the community to encourage ongoing research and refinement.

Asai, A., He, J., Shao, R. et al. Synthesizing scientific literature with retrieval-augmented language models. Nature (2026). https://doi.org/10.1038/s41586-025-10072-4

 © 2026 Springer Nature Limited. All Rights Reserved.  
