NVIDIA has been accused of directly seeking access to millions of pirated books to train its AI models, according to a class-action lawsuit.
As originally reported by TorrentFreak, the complaint filed in the US District Court claims executives at NVIDIA authorized contact with Anna’s Archive, a site that offers access to millions of copyrighted books and academic papers for free.
Internal NVIDIA emails allegedly show a member of the company’s data strategy team reached out to Anna’s Archive to explore what the site could offer for large language model training.
NVIDIA accused of using pirated book site
As highlighted by TorrentFreak, the amended complaint claims that “competitive pressures drove NVIDIA to piracy,” alleging the company actively pursued access to illicit book datasets as demand for AI training data intensified.
The amended complaint also alleges NVIDIA relied on additional pirate sources, claiming the company downloaded copyrighted books from LibGen, Sci-Hub, and Z-Library.
According to the lawsuit, Anna’s Archive warned NVIDIA that its collection was illegally obtained and asked whether the company wanted to proceed. The complaint claims that NVIDIA gave the go-ahead within a week, after which Anna’s Archive allegedly offered the company access to roughly 500 terabytes of data.
The filing also alleges that some of the material offered was normally available only through the Internet Archive’s controlled digital lending system, which has itself been the subject of ongoing copyright cases.
Beyond direct use, authors accuse NVIDIA of distributing scripts and tools that enabled corporate customers to automatically download datasets containing pirated book content.
NVIDIA has previously defended its AI training practices as fair use, arguing that it created its AI model NeMo in full compliance with copyright law.
This isn’t the first time Anna’s Archive has been linked to major tech companies. In December 2025, Spotify confirmed it was investigating claims that Anna’s Archive had scraped 300 terabytes of data from the streaming service and uploaded it to its own site.


