Federal Judge Rules NVIDIA AI Training Scripts Have No Other Purpose Than Copyright Infringement
In a landmark ruling, U.S. District Judge Jon Tigar denied NVIDIA's motion to dismiss a contributory copyright infringement lawsuit, finding that scripts distributed to clients for downloading pirated book datasets had no legitimate purpose. It is the first AI training case to apply the Supreme Court's Cox v. Sony framework.

A federal court has delivered a significant blow to NVIDIA in an ongoing AI copyright infringement case, ruling that scripts the chip giant distributed to corporate clients for downloading pirated book datasets served no purpose other than enabling infringement.
U.S. District Judge Jon Tigar of the Northern District of California issued the order on May 5, 2026, denying most of NVIDIA's motion to dismiss the class action lawsuit brought by authors including Abdi Nazemian. The ruling marks the first time a court has applied the Supreme Court's recent Cox Communications v. Sony Music framework to an AI training copyright case, and the result was not what NVIDIA hoped for.
Background: Authors vs. NVIDIA
The lawsuit, originally filed in early 2024, accuses NVIDIA of training its NeMo Megatron AI models on copyrighted books obtained from pirated sources. Specifically, the authors allege that NVIDIA used the Books3 dataset — a collection of copyrighted works sourced from the pirate site Bibliotik — as training data for its large language models.
As the case progressed, discovery revealed that NVIDIA had also contacted Anna's Archive, one of the world's largest shadow libraries, inquiring about high-speed access to its massive collection of pirated books. This revelation added fuel to the authors' claims that NVIDIA knowingly sought out pirated content for AI training purposes.
The Cox v. Sony Standard
The ruling is particularly significant because it is the first to test AI training practices against the Supreme Court's Cox v. Sony decision, which reshaped contributory copyright infringement law. That decision, which wiped out a $1 billion piracy judgment against internet service provider Cox Communications, established that contributory infringement requires proof of active encouragement through specific acts rather than merely providing a platform that could be used for infringement.
NVIDIA argued that under this heightened standard, its NeMo Megatron Framework as a whole has substantial non-infringing uses and that plaintiffs needed to show NVIDIA marketed or promoted the framework as a piracy tool. The company essentially tried to shelter behind the same logic that protects general-purpose technologies from copyright liability.
Judge Tigar's Key Finding
Judge Tigar rejected NVIDIA's framing entirely. Rather than analyzing the Megatron framework as a monolithic product, the court zeroed in on specific scripts that NVIDIA distributed to corporate clients. These scripts were designed to automatically download and preprocess The Pile dataset, which contains the Books3 collection of pirated books.
"The scripts are alleged to have no other purpose than to speed up the process of infringement, unlike the digital video recorder systems at issue in Sony Corp. or the internet service provided in Cox," Judge Tigar wrote in his order.
This distinction is crucial. By isolating the scripts from the broader framework, the court found that the allegations satisfied both the "inducement" and "tailored to infringement" standards required for contributory infringement under the new Cox framework. In other words, as alleged, NVIDIA did not just provide a general tool that happened to be misused; it provided purpose-built scripts whose only function was to facilitate the downloading of infringing content.
BitTorrent Is Merely a Tool
In a secondary but notable ruling, Judge Tigar also denied NVIDIA's request to strike all references to BitTorrent from the case. NVIDIA had asked the court to dismiss allegations concerning its use of any BitTorrent Protocol, likely hoping to avoid the same fate as Meta, which faces direct copyright infringement claims in a parallel case for seeding pirated books via BitTorrent.
Judge Tigar found the request overly broad, noting that the complaint contains only one reference to BitTorrent: a descriptive line about how Bibliotik distributes pirated works. The judge offered a colorful analogy: "Asking to dismiss allegations concerning BitTorrent is like asking to dismiss allegations concerning paintbrushes in a case about a dolphin painting."
The court emphasized that BitTorrent is merely a tool, not a library or dataset, signaling that the technology itself is not the issue — it is how it was used.
What NVIDIA Won
NVIDIA did secure one partial victory: the court dismissed the vicarious copyright infringement claim. To sustain that claim, the authors needed to show that NVIDIA had both the legal right to control the direct infringers and a direct financial interest in the infringement. Judge Tigar found neither was adequately pleaded, though he gave the authors 21 days to amend and refile.
Broader Implications for the AI Industry
This ruling sends a clear signal to the AI industry: providing tools specifically designed to access pirated training data can constitute contributory copyright infringement, even under the Supreme Court's more demanding Cox standard. Companies cannot hide behind the general-purpose nature of their broader platforms when they distribute targeted scripts for downloading infringing content.
The timing is also notable. Just days before this ruling, major publishers filed a separate lawsuit against Meta and Mark Zuckerberg, also alleging AI training on pirated books. The NVIDIA ruling may embolden plaintiffs in that case and others like it.
As AI companies face mounting legal pressure over their training data practices, this decision establishes an important precedent: the courts will look beyond marketing language and examine the specific tools and scripts companies provide. If those tools have no other purpose than facilitating access to pirated content, contributory infringement liability will follow.
The case continues, with discovery likely to reveal more about NVIDIA's data acquisition practices for AI training. For the broader AI industry, the message is clear: how you obtain training data matters as much as what you do with it.
Case: Nazemian et al. v. NVIDIA Corporation, U.S. District Court, Northern District of California. Order issued May 5, 2026, by Judge Jon S. Tigar.