Reddit has filed a lawsuit against Anthropic, the artificial intelligence company known for developing the Claude chatbot, accusing the firm of improperly harvesting Reddit’s user-generated content to train its AI models. The complaint, submitted to the San Francisco Superior Court on Wednesday, claims Anthropic accessed Reddit more than 100,000 times since July 2024 through automated bots—despite allegedly telling Reddit it had stopped doing so.
The legal challenge marks a significant escalation in tensions between platforms that host human-created content and AI developers increasingly reliant on that data to power their technologies.
Public Promises, Private Violations?
In its court filing, Reddit portrays Anthropic as a company that outwardly claims to follow ethical practices while privately disregarding those commitments. The lawsuit describes the AI firm as a “late-blooming” player in the industry that has marketed itself as a responsible alternative to bigger AI competitors—while allegedly violating digital boundaries behind the scenes.
“This case is about the two faces of Anthropic,” the lawsuit states. “The public face that claims to respect boundaries and the law, and the private face that ignores any rules that interfere with its efforts to profit.”
As of now, Anthropic has not publicly responded to the accusations.
Reddit Claims Billion-Dollar Impact
Reddit argues that its data—millions of conversations and comments collected over nearly two decades—is uniquely valuable in today’s AI-driven world. In a statement to The Verge, Reddit’s Chief Legal Officer Ben Lee said that the content allegedly misused by Anthropic could be worth billions.
“Reddit’s humanity is uniquely valuable in a world flattened by AI,” Lee said. “People are increasingly seeking real, authentic conversations—something Reddit has been cultivating for 20 years across virtually every topic imaginable.”
According to Lee, Reddit’s archives of nuanced, human-generated discourse are particularly important for building large language models (LLMs) like Claude, which depend heavily on high-quality training data.
A Strategic Shift Toward Licensing
Reddit’s lawsuit comes as the company actively seeks to monetize its data through legitimate licensing deals. In February 2024, Reddit signed a reported $60 million-per-year agreement with Google, granting the tech giant access to Reddit content for AI training. That move highlighted a broader shift in Reddit’s business strategy: transforming its vast community-generated data into a premium, licensed asset.
By suing Anthropic, Reddit appears determined to protect the commercial value of its platform and set clear limits on how its data can be used by third parties, especially AI companies that might try to sidestep licensing arrangements.
A Pattern of Legal Trouble for Anthropic
This isn’t the first time Anthropic has faced legal challenges over how it gathers and uses copyrighted materials. In October 2023, Universal Music Group sued Anthropic in Tennessee federal court, accusing the startup of widespread copyright infringement involving song lyrics. The case highlighted concerns from the entertainment industry about AI systems generating or reproducing copyrighted content without proper licensing.
Then, in August 2024, three authors filed a class-action lawsuit against the company in federal court, alleging that Anthropic used their copyrighted books without permission to train Claude. That lawsuit argued the company had built a multibillion-dollar operation by improperly sourcing content.
A Broader Legal Battle in the AI World
Reddit’s lawsuit joins a growing list of legal actions targeting AI companies over their use of protected or proprietary data. OpenAI, one of the industry’s most prominent players, has been sued multiple times. The New York Times launched a high-profile case in December 2023, accusing OpenAI of using its journalism without consent. A few months earlier, authors including George R.R. Martin had filed a class-action lawsuit over similar concerns.
A group of newspaper publishers, including The New York Daily News and The Chicago Tribune, has also taken legal action against OpenAI. More recently, Cohere, another AI startup, was sued by major publishers, including Condé Nast and Vox Media, for allegedly using copyrighted articles in its training processes.
These lawsuits reflect a larger reckoning in the AI space, as courts are increasingly asked to determine how AI firms can legally access and use the vast amounts of human-made content available online.