Semantic Scholar provides a one-sentence summary of scientific literature. One of its aims was to address the challenge of reading numerous titles and lengthy abstracts on mobile devices.[7] It also seeks to ensure that the three million scientific papers published yearly reach readers, since it is estimated that only half of this literature is ever read.[8]
Another key AI-powered feature is Research Feeds, an adaptive research recommender that uses AI to quickly learn what papers users care about reading and recommends the latest research to help scholars stay up to date. It uses a paper embedding model trained using contrastive learning to find papers similar to those in each Library folder.[11]
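The similarity step described above can be illustrated with a minimal sketch. It is not Semantic Scholar's actual model or code: the three-dimensional vectors are made-up stand-ins for learned paper embeddings, the function names (`cosine`, `centroid`, `recommend`) are hypothetical, and summarizing a Library folder by the average of its embeddings is an assumption chosen for simplicity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def recommend(folder_embeddings, candidates, top_k=2):
    """Rank candidate papers by similarity to the folder's mean embedding.

    Assumption: the folder is summarized by its centroid; a real
    recommender may combine per-paper scores differently.
    """
    center = centroid(folder_embeddings)
    scored = [(cosine(center, vec), paper_id) for paper_id, vec in candidates.items()]
    scored.sort(reverse=True)
    return [paper_id for _, paper_id in scored[:top_k]]

# Toy embeddings (hypothetical): two papers already saved in a folder.
folder = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
# Toy embeddings (hypothetical): newly published candidate papers.
candidates = {
    "paper_a": [0.85, 0.15, 0.05],
    "paper_b": [0.0, 0.1, 0.9],
    "paper_c": [0.5, 0.5, 0.1],
}

print(recommend(folder, candidates))
```

In this toy setup, candidates whose vectors point in roughly the same direction as the folder's papers rank first, which is the property a contrastively trained embedding model is meant to provide for related papers.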
Semantic Scholar also offers Semantic Reader, an augmented reader with the potential to revolutionize scientific reading by making it more accessible and richly contextual.[12] Semantic Reader provides in-line citation cards that show cited papers alongside their TLDRs (short for Too Long, Didn't Read), which are automatically generated short summaries, as users read. It also provides skimming highlights that capture the key points of a paper so that users can digest it faster.
In contrast with Google Scholar and PubMed, Semantic Scholar is designed to highlight the most important and influential elements of a paper.[13] The AI technology is designed to identify hidden connections and links between research topics.[14] Like the previously cited search engines, Semantic Scholar also exploits graph structures, which include the Microsoft Academic Knowledge Graph, Springer Nature's SciGraph, and the Semantic Scholar Corpus (originally a corpus of 45 million papers in computer science, neuroscience and biomedicine).[15][16]
Each paper hosted by Semantic Scholar is assigned a unique identifier called the Semantic Scholar Corpus ID (abbreviated S2CID). The following entry is an example:
Liu, Ying; Gayle, Albert A; Wilder-Smith, Annelies; Rocklöv, Joacim (March 2020). "The reproductive number of COVID-19 is higher compared to SARS coronavirus". Journal of Travel Medicine. 27 (2). doi:10.1093/jtm/taaa021. PMID 32052846. S2CID 211099356.
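As an illustration of how the S2CID sits alongside other identifiers in such an entry, the sketch below pulls the DOI, PMID and S2CID out of the citation text with regular expressions. The function name `extract_identifiers` and the patterns are assumptions made for this example and are not part of any Semantic Scholar tooling.

```python
import re

# The example citation from the text, with identifiers at the end.
CITATION = (
    'Liu, Ying; Gayle, Albert A; Wilder-Smith, Annelies; Rocklöv, Joacim '
    '(March 2020). "The reproductive number of COVID-19 is higher compared '
    'to SARS coronavirus". Journal of Travel Medicine. 27 (2). '
    'doi:10.1093/jtm/taaa021. PMID 32052846. S2CID 211099356.'
)

def extract_identifiers(citation):
    """Return whichever of doi, pmid and s2cid appear in a citation string.

    Hypothetical helper: the patterns assume the identifiers are written
    as 'doi:<value>', 'PMID <digits>' and 'S2CID <digits>'.
    """
    patterns = {
        "doi": r"doi:\s*(10\.\d{4,9}/\S+)",
        "pmid": r"PMID\s*(\d+)",
        "s2cid": r"S2CID\s*(\d+)",
    }
    found = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, citation)
        if match:
            # Drop the sentence-ending period that follows each identifier.
            found[name] = match.group(1).rstrip(".")
    return found

identifiers = extract_identifiers(CITATION)
print(identifiers)
```

The S2CID is a plain integer, so unlike a DOI it carries no publisher prefix and is only meaningful within the Semantic Scholar corpus.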
One study compared the index scope of Semantic Scholar to Google Scholar, and found that for the papers cited by secondary studies in computer science, the two indices had comparable coverage, each only missing a handful of the papers.[17]
As of January 2018, following a 2017 project that added biomedical papers and topic summaries, the Semantic Scholar corpus included more than 40 million papers from computer science and biomedicine.[18] In March 2018, Doug Raymond, who developed machine learning initiatives for the Amazon Alexa platform, was hired to lead the Semantic Scholar project.[19] As of August 2019, the number of included paper metadata records (not the actual PDFs) had grown to more than 173 million[20] after the addition of the Microsoft Academic Graph records.[21] In 2020, a partnership between Semantic Scholar and the University of Chicago Press Journals made all articles published under the University of Chicago Press available in the Semantic Scholar corpus.[22] At the end of 2020, Semantic Scholar had indexed 190 million papers.[23] In 2020, Semantic Scholar reached seven million users per month.[7]
Matthews, David (1 September 2021). "Drowning in the literature? These smart software tools can help". Nature. Retrieved 5 September 2022. "...the publicly available corpus compiled by Semantic Scholar – a tool set up in 2015 by the Allen Institute for Artificial Intelligence in Seattle, Washington – amounting to around 200 million articles, including preprints."
"Semantic Scholar". International Journal of Language and Literary Studies. Retrieved 2021-11-09.
Baykoucheva, Svetla (2021). Driving Science Information Discovery in the Digital Age. Chandos Publishing. p. 91. ISBN 978-0-12-823724-3. OCLC 1241441806.
Jose, Joemon M.; Yilmaz, Emine; Magalhães, João; Castells, Pablo; Ferro, Nicola; Silva, Mário J.; Martins, Flávio (2020). Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, Proceedings, Part I. Cham, Switzerland: Springer Nature. p. 254. ISBN 978-3-030-45438-8. OCLC 1164658107.
Ammar, Waleed (2019). "Open Research Corpus". Semantic Scholar Lab Open Research Corpus. Archived from the original on 2019-03-29. Retrieved 2024-08-05.