Wednesday, June 25, 2025

 
AI training is ‘fair use,’ federal judge rules in Anthropic copyright case

A federal judge in San Francisco has ruled that training an AI model on copyrighted works without specific permission to do so was not a violation of copyright law.

U.S. District Judge William Alsup said that AI company Anthropic could assert a “fair use” defense against copyright claims for training its Claude AI models on copyrighted books. But the judge also ruled that it mattered exactly how those books were obtained.

Alsup supported Anthropic’s claim that it was “fair use” for it to purchase millions of books and then digitize them for use in AI training. The judge said it was not okay, however, for Anthropic to have also downloaded millions of pirated copies of books from the internet and then maintained a digital library of those pirated copies.

The judge ordered a separate trial on Anthropic’s storage of those pirated books, which could determine the company’s liability and any damages related to that potential infringement. The judge has also not yet ruled whether to grant the case class action status, which could dramatically increase the financial risks to Anthropic if it is found to have infringed on authors’ rights.

In finding that it was “fair use” for Anthropic to train its AI models on books written by three authors—Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson—who had filed a lawsuit against the AI company for copyright violations, Alsup addressed a question that has simmered since before OpenAI’s ChatGPT kick-started the generative AI boom in 2022: Can copyrighted data be used to train generative AI models without the owner’s consent?

Dozens of AI-and-copyright-related lawsuits have been filed over the past three years, most of which hinge on the concept of fair use, a doctrine that allows the use of copyrighted material without permission if the use is sufficiently transformative—meaning it must serve a new purpose or add new meaning, rather than simply copying or substituting the original work. 

Alsup’s ruling may set a precedent for these other copyright cases—although it is also likely that many of these rulings will be appealed, meaning it will take years until there is clarity around AI and copyright in the U.S.

According to the judge’s ruling, Anthropic’s use of the books to train Claude was “exceedingly transformative” and constituted “fair use under Section 107 of the Copyright Act.” Anthropic told the court that its AI training was not only permissible, but aligned with the spirit of U.S. copyright law, which it argued “not only allows, but encourages” such use because it promotes human creativity. The company said it copied the books to “study Plaintiffs’ writing, extract uncopyrightable information from it, and use what it learned to create revolutionary technology.”

While training AI models with copyrighted data may be considered fair use, Anthropic’s separate action of building and storing a searchable repository of pirated books is not, Alsup ruled. He noted that the fact that Anthropic later bought a copy of a book it earlier stole off the internet “will not absolve it of liability for the theft, but it may affect the extent of statutory damages.”

The judge also looked askance at Anthropic’s acknowledgement that it had turned to downloading pirated books in order to save time and money in building its AI models. “This order doubts that any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use,” Alsup said.

The “transformative” nature of AI outputs is important, but it’s not the only thing that matters when it comes to fair use. There are three other factors to consider: what kind of work it is (creative works get more protection than factual ones); how much of the work is used (the less, the better); and whether the new use hurts the market for the original.

For example, there is the ongoing case against Meta and OpenAI by comedian Sarah Silverman and two other authors, who filed copyright infringement lawsuits in 2023 alleging that pirated versions of their works were used without permission to train AI language models. The defendants recently argued that the use falls under fair use doctrine because AI systems “study” works to “learn” and create new, transformative content.

Federal District Judge Vince Chhabria pointed out that even if this is true, the AI systems are “dramatically changing, you might even say obliterating, the market for that person’s work.” But he also took issue with the plaintiffs, saying that their lawyers had not provided enough evidence of potential market impacts. 

Alsup’s decision differed markedly from Chhabria’s on this point. Alsup said that while it was undoubtedly true that Claude could lead to increased competition for the authors’ works, this kind of “competitive or creative displacement is not the kind of competitive or creative displacement that concerns the Copyright Act.” Copyright’s purpose was to encourage the creation of new works, not to shield authors from competition, Alsup said, and he likened the authors’ objections to Claude to the fear that teaching schoolchildren to write well might also result in an explosion of competing books.

Alsup also noted in his ruling that Anthropic had built “guardrails” into Claude that were meant to prevent it from producing outputs that directly plagiarized the books on which it had been trained.

Neither Anthropic nor the plaintiffs’ lawyers immediately responded to requests for comment on Alsup’s decision.



This story originally appeared on Fortune
