Two award-winning authors recently filed a lawsuit against OpenAI, alleging that the company violated copyright law by using their published books to train ChatGPT without obtaining their consent. The lawsuit, filed in late June, argues that ChatGPT’s underlying large language model “ingested” the copyrighted works of the plaintiffs, authors Mona Awad and Paul Tremblay. The authors claim that ChatGPT’s ability to generate detailed summaries of their books suggests that their works were included in the datasets used to train the AI system.
This legal action reflects the growing tension between creative professionals and generative AI tools, which can produce text and images in seconds. Many individuals working in creative fields are concerned about how this rapidly advancing technology could impact their careers and livelihoods. As a result, these concerns may increasingly manifest through legal challenges.
In an interview with Insider, Daniel Gervais, a law professor at Vanderbilt University, recently discussed the rise in copyright cases against generative AI tools. According to Gervais, the ongoing writers’ lawsuit represents one of several such cases nationwide. However, he believes this is just the beginning and anticipates a surge in legal challenges as large language models and generative AI programs continue to advance and improve in replicating the artistic styles of writers and creators.
Authors’ Legal Action Against AI Tools and Data Collection Practices
Gervais predicts many more authors will take legal action against companies developing these AI tools. As the capabilities of these programs progress, they become increasingly proficient at mimicking the unique styles and techniques of human writers and artists. Consequently, Gervais envisions a wave of lawsuits across the country that specifically target the output generated by tools like ChatGPT.
“This one,” Gervais stated while addressing the allegations of the lawsuit about AI data-scraping and training, “is truly dependent on the input. The output wave, however, is also on its way.”
Proving the authors’ monetary damages resulting from OpenAI’s data collection practices, as alleged in the complaint, poses a significant challenge. Gervais, speaking to Insider, acknowledged the possibility that ChatGPT might have obtained Awad and Tremblay’s work from alternative sources rather than directly from the authors themselves. However, he also acknowledged the potential validity of the lawsuit’s claim that the bot “ingested” their books.
Andres Guadamuz, an AI and copyright expert at the University of Sussex, expressed a similar concern, stating to Insider that even if the books are present in OpenAI’s training datasets, the company may have lawfully acquired the content from another dataset.
Guadamuz further explained to The Guardian that it would be difficult to demonstrate that ChatGPT would have behaved differently if it had never accessed the authors’ work, given the extensive data it collects from the web.
Lawsuit Alleges Unlawful Acquisition of Personal Data for Training ChatGPT
Last week, the Authors Guild, an advocacy group in the US supporting the rights of writers, released an open letter urging the CEOs of Big Tech and AI companies to obtain permission from writers and compensate them fairly for using their copyrighted works in training generative AI programs. The letter has gained significant traction, with over 2,000 signatures, according to statements provided to Insider by the organization.
Awad and Tremblay filed their lawsuit on the same day that OpenAI received a separate legal complaint. That separate complaint alleges that OpenAI unlawfully acquired “massive amounts of personal data,” which were then used to train ChatGPT. The 157-page complaint criticized OpenAI for collecting “virtually all data exchanged on the internet.”
Awad and Tremblay filed their lawsuit in a district court in Northern California. They are seeking damages and the recovery of their alleged lost profits.
The lawsuit also included documents that contained summaries of Awad’s novels, “13 Ways of Looking at a Fat Girl” and “Bunny,” as well as Tremblay’s novel, “The Cabin at the End of the World.” Tremblay’s novel was adapted into the M. Night Shyamalan film, “Knock at the Cabin.”
Insider contacted OpenAI and Awad for comment, but they did not respond. A representative for Tremblay declined to comment.