A federal judge has handed artificial intelligence firm Anthropic a significant victory, holding that its use of copyrighted books to train its Claude AI model was “fair use” under copyright law.
U.S. District Judge William Alsup ruled late Monday that the Amazon-backed AI company had not infringed authors’ copyrights with its training practices, which he found “transformative.” The decision has the potential to set significant legal precedent as the AI sector faces mounting litigation over how companies use copyrighted material to train their large language models.
The Court’s Rationale
Judge Alsup underlined that Anthropic’s AI models did not appropriate the creative content of copyrighted material or imitate a particular author’s unique style. “The purpose and character of using copyrighted works to train LLMs to generate new text was quintessentially transformative,” he stated in his ruling.
The judge drew a notable comparison, likening AI training to human learning: “Like any reader wanting to become a writer.” The point is that just as a person can read copyrighted material and draw on it to produce original writing, an AI system can learn from existing works and generate new text without infringing copyright.
Background of the Lawsuit Against Anthropic
The lawsuit began in August, when three writers, Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, sued Anthropic in the U.S. District Court for the Northern District of California. They alleged that the company had built a “multibillion-dollar business by pirating hundreds of thousands of copyrighted books.”
The lawsuit highlighted a controversial aspect of AI creation: the acquisition and use of training data by corporations. The authors alleged that Anthropic had amassed a vast library of copyrighted material without permission to train its AI models to comprehend and generate human-like text.
Although Judge Alsup sided with Anthropic on the fair use question, he did not exonerate the company. The case showed that Anthropic had assembled what it called a “central library” of approximately 7 million pirated books, though the firm never used the pirated copies to train its language models.
Despite this ruling, Alsup ordered a separate trial to examine how these pirated books were used in building Anthropic’s central library. That trial will determine what damages, if any, the company must pay for possessing the pirated materials.
“That Anthropic subsequently purchased a copy of a book it previously pirated from the internet will not absolve it of liability for the piracy, but may impact the level of statutory damages,” the judge wrote, suggesting that Anthropic may still face penalties, though purchasing legitimate copies could reduce them.
Implications for the AI Industry
This decision marks a milestone for the AI sector, which is fighting several copyright lawsuits filed by authors, publishers, and other creators. OpenAI, Google, and Meta have all been targeted with similar suits over their training practices.
The ruling begins to draw clearer legal boundaries for AI companies, potentially serving as a guide to how they can use copyrighted works in developing their models. The “transformative use” analysis could become a key defense for other AI companies facing similar suits.
Anthropic welcomed the decision, with a company spokesperson saying the firm was “pleased” with the outcome and describing it as “consistent with copyright’s purpose in enabling creativity and fostering scientific progress.”
This viewpoint echoes the broader AI industry argument that these technologies build upon human knowledge and creativity rather than merely replicating existing works.
What’s Next
Though this decision is a victory for Anthropic and potentially other AI firms, the law around AI training remains unsettled. The trial over the pirated books has yet to be decided, and other suits against AI companies continue to make their way through the courts.
The ruling also does not resolve all the complexities of AI training and copyright law. As AI technology advances, courts will inevitably have to keep refining how copyright principles apply to the new technology.
In the short term, however, AI firms can take some comfort that their training practices may be protected by the existing fair use doctrine, provided they can demonstrate the transformative character of those practices.