Anna's Archive homepage (January 15, 2025)

| Type of site | |
|---|---|
| Founder(s) | Anna Archivist, Pirate Library Mirror |
| URL | |
| Commercial | No |
| Registration | Optional |
| Launched | November 2022 |
Anna's Archive is an open source search engine for shadow libraries that was launched by the pseudonymous Anna shortly after law enforcement efforts to shut down Z-Library in 2022. The site aggregates records from Z-Library, Sci-Hub, and Library Genesis (LibGen), among other sources. It calls itself "the largest truly open library in human history",[† 1] and has said it aims to "catalog all the books in existence" and "track humanity's progress toward making all these books easily available in digital form". It claims not to be liable for downloads of copyrighted works, since the site indexes metadata but does not directly host any files, instead linking to third-party downloads. It has nonetheless faced government blocks and legal action from copyright holders and publishing trade associations for engaging in large-scale copyright infringement.
Anna's Archive emerged out of the Pirate Library Mirror (PiLiMi) project, an anonymous effort to mirror shadow libraries that completed a full copy of Z-Library in September 2022. PiLiMi acknowledged that it "deliberately violated the copyright law in most countries".[1][2] The project's initial focus was on preservation rather than on making its data searchable.[3] Days after US law enforcement seized several Z-Library domains and arrested its alleged operators in November 2022, PiLiMi member Anna (also known as Anna Archivist) launched Anna's Archive, which initially displayed results from Z-Library and LibGen.[1][2][4][5]
Anna's Archive has been variously described as a search engine,[4] a metasearch engine,[1] and a shadow library itself.[2] The site does not itself host any files (which it claims means it is not liable for downloads of copyrighted works), but it links to third-party downloads provided by anonymous partners.[† 1][6][7] It also offers downloads through the IPFS protocol.[a][1][8] Its source code is dedicated to the public domain under the CC0 license.[† 3] It operates three mirror sites under different top-level domains, currently .li, .se, and .org.[† 1]
The site's "source libraries" include LibGen, Sci-Hub, Z-Library, the Internet Archive (including "Borrowing Unavailable" items), DuXiu, MagzDB, Nexus/STC, and HathiTrust; Open Library, WorldCat, and Google Books are listed as metadata-only sources.[† 4] Some of these datasets are already publicly accessible, while others are scraped or otherwise privately acquired for distribution.[† 4][9] They are then released in bulk[b] with torrent files so as to make them resilient to website takedowns.[† 1] As of July 2025, Anna's Archive includes 52,875,045 books and 98,598,895 papers,[† 1] and its unified list of torrents totals roughly 1.1 petabytes in size.[† 6]
A 2025 study comparing the coverage of conventional library databases to various alternatives (including scholarly search engines, other web-based databases, academic social networks, and piracy sites) found that Anna's Archive had among the most comprehensive full-text coverage, but criticized it for having an unintuitive interface.[10] In March 2025, it averaged over 650,000 daily downloads, roughly 10 times the estimated distribution of the New York Public Library.[11]
High-speed downloads on Anna's Archive are only available to users with a paid membership, while nonmembers must use slower options with browser verification to prevent abuse by bots. It describes itself as a nonprofit, claiming that membership fees and donations are mostly spent on server infrastructure and that none are personally used by the site's operators.[† 1] It awards memberships and monetary "bounties" to some volunteer contributors.[† 7]
Anna's Archive offers high-speed access to its full collection via SFTP to groups training large language models (LLMs) in exchange for large contributions of money or data.[12] It said it provided such access to about 30 companies as of January 2025, primarily based in China, including both LLM companies and data brokers.[13][14] DeepSeek's VL model was trained on data from the site.[15] Some lawyers have criticized claims that this constitutes fair use under US copyright law, citing precedent for the importance of market harm.[14]
Anna's Archive is a non-profit project with two goals:
1. Preservation: Backing up all knowledge and culture of humanity.
2. Access: Making this knowledge and culture available to anyone in the world.
Anna's Archive has said its objectives are to "catalog all the books in existence" and "track humanity's progress toward making all these books easily available in digital form".[4] It has been described as both continuing and greatly extending the ambitions of earlier shadow libraries with its vision of a "universal library" that preserves as many books as possible. It has been interpreted as part of an ascendant "culture of mistrust towards corporations, institutions, governments, and laws... that perhaps began with the financial collapse of 2008 and the Occupy Wall Street movements" which saw the rise of decentralizing technologies.[11]
Anna has justified their opposition to copyright on ethical grounds, stating that they "believe that preserving and hosting these files is morally right"[11] and that they and other shadow librarians believe that "information wants to be free".[16] They have suggested that copyright law must be reformed as a matter of national security, proposing that Western countries make legal carveouts for text and data mining so as to remain ahead in the AI arms race.[13]
Anna cites programmer and information activist Aaron Swartz as inspiring the project's collection of metadata.[† 1] The site recommends Swartz's writings as well as Stephen Witt's How Music Got Free and Michele Boldrin and David K. Levine's Against Intellectual Monopoly, which criticize existing copyright law and have been associated with the copyleft movement.[11]

Since 2023, Anna's Archive domains have appeared in the annual Notorious Markets List of the Office of the United States Trade Representative, which highlights digital and physical markets allegedly involved in large-scale intellectual property infringement. These reports describe the site as related to Sci-Hub and LibGen.[17][18][19] In response to a request for comment by the Office on its 2023 List, the Association of American Publishers identified Anna's Archive as an infringing site, and analyzed its cryptocurrency wallets to find that it had received over $29,000 in funds as of July 2023.[20][21]
In response to a March 2024 lawsuit accusing Nvidia of training LLMs on data from a shadow library,[22] the company disputed the characterization of Anna's Archive and other repositories as "shadow libraries", despite Anna's own use of the term.[23][24]
In October 2023, Anna's Archive was reported to have scraped the entirety of WorldCat, the world's largest bibliographic database, and made its proprietary data freely available, which Anna described as "a major milestone in mapping out all the books in the world".[9] OCLC, WorldCat's maintainer, responded by suing the site in an Ohio federal court in January 2024, claiming the scrape was achieved through cyberattacks on its servers.[6] It sought over $5 million in total damages and an injunction to stop Anna's Archive from scraping or sharing its data.[25] OCLC clarified that although its internal systems were not breached, it believes the site's actions legally constitute hacking.[26] The only named defendant denied any involvement with the scrape or Anna's Archive.[27] Technology writer Glyn Moody criticized the suit as "costly and pointless", saying it went against OCLC's stated mission of making information accessible.[28]
In July 2024, in the wake of the suit, the .org mirror of Anna's Archive was replaced with a new .gs mirror to avoid falling under US jurisdiction; however, soon afterward, the .gs domain was suspended and the mirror reverted to the original .org domain.[25][29]
In March 2025, the court deferred judgement on aspects of the case to the Supreme Court of Ohio over concerns about its legal novelty, denying both a motion for default judgement from OCLC and a motion to dismiss from the named defendant.[30] In April, OCLC reached an agreement with the named defendant to drop her from the case, focusing instead on obtaining judgement against the site itself.[31]
In February 2025, internal emails unsealed in a California lawsuit against Meta, which is accused of training its AI models on copyrighted works, revealed that the company had downloaded over 81 terabytes of data through Anna's Archive torrents, in addition to data previously downloaded from LibGen. The plaintiffs in the case, a group of authors including Richard Kadrey, Sarah Silverman, and Christopher Golden, alleged that CEO Mark Zuckerberg personally authorized the use of shadow libraries. The company had argued that its use of copyrighted data in AI training constituted fair use.[32][33][34]
In June 2025, the court partially ruled in favor of Meta, finding that the training was "highly transformative" and therefore fair use. Vince Chhabria, the judge in the case, emphasized that the ruling did not mean that Meta's actions were in fact legitimate, but said that the plaintiffs failed to develop strong arguments. He identified "market dilution" as a convincing argument for financial harm that the plaintiffs had not pursued: the idea that "by training generative AI models with copyrighted works, companies are creating something that often will dramatically undermine the market for those works".[35][36][37]
In January 2024, Italy's national communications agency ordered major internet service providers (ISPs) in the country to block Anna's Archive due to a copyright complaint by the Italian Publishers Association.[38] An investigation by the Digital Services Directorate confirmed the presence of copyrighted works on the site and found that some of its servers were likely owned by a Ukrainian hosting provider, but failed to uncover the identity of its operators.[2]
In March 2024, the Rotterdam District Court ordered major ISPs in the Netherlands to block Anna's Archive and LibGen due to a request by advocacy group BREIN. The order was "dynamic", meaning that if the blocked sites changed domains or IP addresses in the future, ISPs would be obligated to update their blocks.[39][40][41][42]
In December 2024, the UK Publishers Association won an order from the High Court of Justice requiring major ISPs to block Anna's Archive and other copyright-infringing sites, extending a list of sites blocked since 2015 under section 97A of the Copyright, Designs and Patents Act. The Association said it identified over one million records of copyrighted books and journal articles on Anna's Archive domains.[43][44]
In July 2025, a group of organizations representing Belgian authors and copyright holders – including the Association of Belgian Publishers (ADEB), the Civil Society of Multimedia Authors (La Scam), the Cooperative for the Perception and Compensation of Belgian Publishers (Copiebel), Librius, the Educational and Scientific Publishers Group (GEWU), the General Publishers Group (GAU), and the Flemish Authors' Association (VAV) – successfully petitioned the Commercial Court to issue judgement against five alleged piracy sites: Anna's Archive, LibGen, Sci-Hub, Z-Library, and OceanofPDF. The judge ordered FPS Economy's anti-piracy service to block the sites in the interim. In the event of noncompliance, the sites face fines of up to 500,000 euros.[45][46][47][48]
On October 11, 2025, TorrentFreak reported that major ISPs in Germany had blocked access to the main domains of Anna's Archive. The blockade was initiated by the Clearing Body for Copyright on the Internet (CUII), a coalition of rightsholders and ISPs that coordinates voluntary site blocking measures.[49]
Anna's Archive was among Google Search's ten most reported domains for DMCA takedown by June 2024.[50] By November 2025, Google had removed 749 million Anna's Archive URLs from its search results, representing 5 percent of all takedown requests sent to the search engine since 2012. These requests came from over 1,000 authors and publishers.[7] Anna's Archive has also been one of the sites most targeted by the Dutch anti-piracy service Link-Busters, which sends takedown requests to Google and other search engines on behalf of major publishers.[51][52][53]
In January 2025, the messaging app Telegram suspended the Anna's Archive channel for copyright infringement, despite the operators reportedly taking precautions to avoid infringing posts on the app. Z-Library's Telegram channel was suspended the same week, and neither was notified of the action. The removals were speculated to be linked to legal action by an Indian court.[54]