Concerns are rising in the research community after dozens of academics reported receiving suspicious peer reviews they believe were written by AI.
The issue came to light around submissions for the 2026 International Conference on Learning Representations, or ICLR, one of the world’s most influential gatherings for machine-learning specialists.
Researchers began flagging strange patterns in their reviews: unusually long, overly polished paragraphs, vague feedback, hallucinated citations, and technical requests that made little sense for the field.
For many, the comments felt disconnected from their papers, a red flag that the reviewers might have used large language models (LLMs) to produce the reports.
AI-Generated Reviews Roil ICLR 2026: Over 20% of Peer Assessments Found to be Fully Automated
One of the most vocal researchers, Graham Neubig from Carnegie Mellon University, received review reports that immediately felt off. Not only were they “very verbose with lots of bullet points,” he said, but they also demanded statistical analyses that were atypical for AI or machine-learning research.
Uncertain about how to prove that the reviews were AI-generated, Neubig turned to social media. He posted an offer on X (formerly Twitter), inviting anyone with the right tools to analyze conference submissions and identify AI-generated text.
The response was quicker than expected. The following day, an email came in from Max Spero, the chief executive officer of Pangram Labs, a New York-based company that specializes in detecting AI-generated text.
Spero and his team conducted a sweeping analysis of all 19,490 submissions and 75,800 peer reviews for ICLR 2026. The scale of the findings left the researchers stunned: about 21% of all peer reviews were completely written by AI applications, while over half contained at least some evidence of AI assistance.

“People were suspicious, but they didn’t have any concrete proof,” Spero said. “Over the course of 12 hours, we wrote some code to parse out all of the text content from these paper submissions.” The results, published online by Pangram Labs, quickly spread across academic networks.
Organizers have now confirmed that automated tools will be used to check whether reviewers breached policies on AI use. Bharath Hariharan, senior program chair for ICLR 2026, conceded that the conference has never dealt with AI-generated submissions or reviews on this scale. “After we go through all this process … that will give us a better notion of trust,” he said.
AI-Generated Content Found in Scientific Submissions and Peer Reviews, Raising Concerns About Academic Integrity
Pangram did not limit its analysis to peer reviews. The team also examined the papers themselves and found that 199 manuscripts, approximately 1%, were fully AI-generated, while another 9% contained more than 50% AI-written content.
While the majority of the submissions remained fundamentally human-written, the findings nevertheless underlined a deeply troubling trend: AI tools are being used substantially more often – and far more quietly – than the research world had previously presumed.
Ironically, Pangram Labs even submitted its preprint about the detection model to ICLR, only to find that two of its four peer reviews also exhibited signs of AI involvement; one was assessed as fully AI-generated.
For many researchers, Pangram’s public release of the data simply confirmed suspicions they had already raised. “One of them just seemed very disconnected from some of the main ideas in the work,” Desmond Elliott, a computer scientist at the University of Copenhagen, said of the reviews he received. His PhD student noticed numerical inaccuracies, odd phrasing and fabricated details, all of which strongly suggested LLM use.
When the results from Pangram were in, Elliott immediately checked his submission. Indeed, the review his student had questioned was flagged as fully AI-generated. The manuscript had received the lowest score from that reviewer, placing it at the borderline between acceptance and rejection. “It’s deeply frustrating,” Elliott said.
Science Must Confront AI’s Influence on Research Evaluation
What this means for the integrity of peer review, a cornerstone of scientific progress, is a question now before the broader academic community. Many researchers acknowledge that AI tools can be helpful in editing or summarizing, but the use of LLMs to fully generate peer reviews without disclosure is widely viewed as unethical.
Critics argue that AI-generated feedback lacks nuanced understanding, domain expertise, and accountability in its evaluations of complex scientific work. The ICLR case reveals a growing problem: the more powerful and available AI becomes, the harder it is to tell the difference between real expert evaluation and automated text.
Researchers now want clearer policies, better detection tools, and stronger cultural norms to ensure transparency. For now, the organizers’ decision to investigate and enforce AI-use rules marks an important first step. But many academics say the community must act quickly before AI quietly reshapes peer review in ways that undermine trust and, ultimately, scientific progress.




