The ongoing legal battle between Asian News International (ANI) and artificial intelligence giant OpenAI intensified during recent Delhi High Court proceedings, as ANI’s lawyer Sidhant Kumar alleged that OpenAI continues to access ANI’s content through its subscribers despite previous claims to the contrary.
During the March 18 hearing, Kumar demonstrated that when ChatGPT was prompted to provide “latest headlines” from ANI’s news portal, the AI generated recent news articles from ANI alongside links to the source.
This directly contradicts OpenAI’s previous declaration that it had not scraped ANI’s website since October last year.
“This is a clear demonstration that quite apart from the question of my subscribers’ website being scraped, according to them, under a false defense of public availability. They are continuing to scrape my website,” Kumar told the court.
News Agency Sues OpenAI: Copyright Infringement in AI Training
The news agency initiated legal action against OpenAI last year for copyright infringement, claiming the AI company trained its products using ANI’s copyrighted material without permission. ANI is currently seeking an interim injunction to prevent OpenAI from collecting or using its content.
Kumar argued that while facts themselves cannot be copyrighted, the expression of those facts can be.
He cited the landmark R.G. Anand vs Delux Films judgment to support this position, explaining: “When I publish an article, which is authored by my journalists and vetted by my editors, it is a particular way in which I will narrate it. I cannot have a monopoly over reporting that particular fact, but I certainly do have a monopoly over reporting that fact in that manner.”
ANI’s lawyer also highlighted that the news agency employs translators to produce content in various languages.
He provided an example where ChatGPT offered a verbatim quote from Indian athlete Neeraj Chopra’s mother, originally given in Punjabi and translated to English by ANI. Kumar maintained that these translations independently have copyright protection.
Tokenization, Thin Copyright, and Search: Key Issues in OpenAI-ANI Case
OpenAI’s defense, presented by lawyer Amit Sibal, acknowledged that journalists hold copyright over their expression in news reports but argued that ANI does not hold copyright over quotations within its work. “The fact that a celebrity utters a quotation is a matter of fact over which he has no copyright,” Sibal stated.
Sibal further argued that news attracts only “thin copyright,” since multiple journalists covering the same facts might express them similarly. He also distinguished OpenAI’s search function from its training process: the company, he said, has blocklisted ANI’s website from its training data, while the search function merely crawls the web and provides summaries in ChatGPT’s own words without reproducing headlines.
ANI countered by arguing that OpenAI’s search function lacks transparency and that the company has resisted providing information about its operations.
Kumar also drew a contrast between ChatGPT’s search capability and traditional search engines like Google, noting that Google has a licensing agreement with ANI to display headlines and “merely indexes and gives you a preview of the exact page… It doesn’t host content. It doesn’t store content.”
The dispute also touches on technical aspects of AI development. ANI argued that the tokenization process used by OpenAI simply converts human-readable content into a machine-readable format and back again, which it says amounts to an adaptation of its work, since the process adds no creativity or skill of the kind that would produce a genuine derivative work.
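For readers unfamiliar with the term, the round trip ANI describes can be illustrated with a minimal sketch. This is a toy word-level tokenizer written for illustration only, not OpenAI’s actual tokenizer (production systems generally use learned subword vocabularies); the function names and the sample sentence are hypothetical.

```python
import re

# Toy tokenizer: an illustrative assumption, not OpenAI's implementation.
# The pattern splits text into runs of word characters or single
# non-word characters, so every character lands in exactly one piece.
PIECE_PATTERN = r"\w+|\W"


def build_vocab(text):
    """Assign an integer ID to each distinct piece, in order of first use."""
    vocab = {}
    for piece in re.findall(PIECE_PATTERN, text):
        if piece not in vocab:
            vocab[piece] = len(vocab)
    return vocab


def encode(text, vocab):
    """Human-readable text -> machine-readable list of integer IDs."""
    return [vocab[piece] for piece in re.findall(PIECE_PATTERN, text)]


def decode(ids, vocab):
    """List of integer IDs -> the original human-readable text."""
    inverse = {token_id: piece for piece, token_id in vocab.items()}
    return "".join(inverse[token_id] for token_id in ids)


sample = "Facts are free; the expression of facts is not."
vocab = build_vocab(sample)
ids = encode(sample, vocab)
restored = decode(ids, vocab)

print(ids)
print(restored == sample)
```

In this sketch the text survives the conversion to numbers and back unchanged, which is the property ANI’s argument relies on. Whether that conversion amounts to an adaptation under the Copyright Act is a legal question the code does not answer.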
ANI maintains that under Section 14 of India’s Copyright Act, it holds exclusive rights to store and issue copies of its content in any medium, as well as the right to make adaptations, and that OpenAI’s actions constitute infringement under Section 51.
The case continues to raise questions about how copyright applies to AI training and AI-powered search, and about the responsibilities of AI companies that use online content to train their models.