“The General Index”: New tool allows you to search 107 million research papers for free
The creator of the index called it a public utility for accessing the “vast ocean” of human knowledge.

- Millions of research papers get published every year, but the majority lie behind paywalls.
- A new online catalogue called the General Index aims to make it easier to access and search through the world’s research papers.
- Unlike other databases which include the full text of research papers, the General Index only allows users to access snippets of content.
A new database aims to make it easier than ever to access and search through the world’s massive trove of research papers.
Each year, millions of scientific and academic papers get published across thousands of journals. The majority of those papers lie behind paywalls, costing $9 to $30 (or more) to read. Finding them can be difficult: Tools like Google Scholar allow you to search for paper titles and keywords, but more specialized queries are difficult.
The General Index was designed to reduce those obstacles without breaking the law. Developed by the technologist Carl Malamud and his nonprofit foundation Public Resource, the free-to-use index contains words and phrases from more than 107 million research papers, comprising 8.5 terabytes when compressed.
The General Index includes text from paywalled papers but not the whole text: only phrases up to five words long. This cut-off point was designed to keep the project in good legal standing. (The act of uploading millions of paywalled papers may prove more legally ambiguous.)
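To make the five-word limit concrete, here is a minimal Python sketch of how phrases of one to five words could be pulled from a sentence. The function name `extract_ngrams`, the lowercasing, and the whitespace tokenization are assumptions for illustration; the article does not describe how the General Index actually extracts its phrases.

```python
def extract_ngrams(text, max_len=5):
    """Return every run of 1..max_len consecutive words in text.

    Hypothetical helper: splits on whitespace only, which is far
    simpler than what a real indexing pipeline would need.
    """
    words = text.lower().split()
    ngrams = []
    for n in range(1, max_len + 1):
        # Slide a window of n words across the sentence.
        for i in range(len(words) - n + 1):
            ngrams.append(" ".join(words[i:i + n]))
    return ngrams


phrases = extract_ngrams("the steep cost of accessing research papers")
print(len(phrases))
print(phrases[-1])
```

No phrase in the result is longer than five words, so the full seven-word sentence never appears in the output as a single entry.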
The searchable content within the General Index includes:
- Billions of keywords (e.g., specific types of plants, genes, and materials)
- Paper titles
- Authors of research papers
- DOI article identifiers
Malamud described the index as a tool for mining the “vast ocean” of the world’s accumulated knowledge.
“This is a look-up tool, a dictionary of knowledge, a map to knowledge,” Malamud said in a video. “A tool that we believe is an essential facility for the practice of science in our modern age. […] We view this as a public utility. We assert no ownership over the General Index. It is dedicated to the Public Domain, a series of unencumbered facts with which you can do what you will. There are no rights reserved.”
Should research papers be free?
The high cost of accessing research papers has long been controversial in the scientific community. Universities sometimes pay more than $10 million for an annual subscription to a suite of academic journals. Some of that money ends up going to nonprofits like the Massachusetts Medical Society, the American Medical Association, and the American Geophysical Union, and revenue is also sometimes used to fund student travel and other costs associated with institutional research.
However, the bulk of the revenue ends up in the pockets of major publishers. These for-profit companies, like Elsevier and Wiley, do not directly produce the research they publish; in fact, researchers often have to pay thousands of dollars to get published in major journals. The value that publishers bring to the table, in theory, is quality control through curation and peer review, functions that are not free.
But some in the community argue that research should be free to the public, and that the steep cost of accessing papers holds back scientific progress. That is the ethos behind the open-access movement. One key figure in the movement is the Kazakhstani computer programmer Alexandra Elbakyan. In 2011, she created Sci-Hub, an online database, or “shadow library,” that lets anyone with an internet connection access millions of research papers and books for free.
Some considered Sci-Hub to be an altruistic tool for advancing scientific knowledge and research. But publishers considered it scientific piracy. The general argument was that Elbakyan had not only stolen the text of journal articles but also the time and expertise of editors and reviewers, not to mention the costs associated with uploading and archiving all of the papers.
In 2015, Elsevier, which owns thousands of academic journals that generate more than $1 billion annually, sued Elbakyan for copyright infringement. She wrote a letter to the judge describing how she found it “insane” that she, as a graduate student, had to pay $32 per paper “when you need to skim or read tens or hundreds of these papers to do research.”
“Authors of these papers do not receive money,” Elbakyan wrote. “Why would they send their work to Elsevier then? They feel pressured to do this, because Elsevier is an owner of so-called ‘high-impact’ journals. If a researcher wants to be recognized, make a career — he or she needs to have publications in such journals.”
In an opinion piece published in The New York Times, Elbakyan was quoted citing part of the United Nations' Universal Declaration of Human Rights: “Everyone has the right to freely share in scientific advancement and its benefits.”
A more modest step toward open access
Although the General Index is far from an act of piracy, it is still unclear whether it will face legal challenges. Malamud told Nature News that he is “very confident” in the legality of his project. Over time, he and his colleagues hope to add new features to the database, such as one that shows how important certain terms are in the overall literature, a metric known as term frequency-inverse document frequency (TF-IDF).
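As a rough illustration of that metric, the following Python sketch scores terms in a toy collection of three "documents". It uses one common textbook formulation, tf × log(N / df); the helper name `tf_idf` and the sample data are hypothetical, and nothing here reflects how the General Index team would implement the feature.

```python
import math
from collections import Counter


def tf_idf(term, doc, corpus):
    """Score a term in one document using tf * log(N / df).

    tf: share of the document's words that are the term.
    df: number of documents in the corpus containing the term.
    """
    counts = Counter(doc)
    tf = counts[term] / len(doc)
    df = sum(1 for d in corpus if term in d)
    if df == 0:
        return 0.0  # term appears nowhere in the corpus
    return tf * math.log(len(corpus) / df)


# Toy corpus (made up for this example), already split into words.
corpus = [
    "gene expression in plants".split(),
    "plants and soil chemistry".split(),
    "gene therapy in mice".split(),
]

score_common = tf_idf("in", corpus[0], corpus)          # found in 2 of 3 docs
score_rare = tf_idf("expression", corpus[0], corpus)    # found in 1 of 3 docs
print(score_rare > score_common)
```

Both words occur once in the first document, but "expression" scores higher because it appears in fewer documents overall, which is the sense in which the metric measures how distinctive a term is.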
“If we are to stand on the shoulders of giants, we must provide these maps to that vast world of ideas,” Malamud said in a video. “The General Index is but one tool.”