In a striking admission, consulting giant Deloitte has agreed to partially refund Australia’s federal government after acknowledging that it used artificial intelligence to help write a $440,000 report that turned out to be riddled with errors and fictitious references.
The Department of Employment and Workplace Relations confirmed that Deloitte would repay the final payment under its contract, but the amount involved will not be made public until the transaction is finalized. The episode has intensified criticism of major consulting firms, with one Labor senator saying Deloitte has a “human intelligence problem.”
The trouble began when Deloitte was hired in December 2024 to review the government’s targeted compliance framework, a system used to automatically penalize welfare recipients who fail to meet their job-seeking obligations.
The review uncovered serious issues, including a lack of clear connection between the framework’s rules and actual legislation, as well as significant system defects. The report concluded that the IT system was “driven by punitive assumptions of participant non-compliance.”
The original report, released July 4, started to unravel in August when the Australian Financial Review found several inaccuracies, including citations of sources that do not exist. The department quietly uploaded a corrected version on Friday.
Deloitte Report’s AI “Hallucinations” Lead to Fictitious References and Nonexistent Court Case
Dr. Christopher Rudge, a University of Sydney academic who first flagged the issues, said the report showed typical signs of AI “hallucinations”, in which generative models make things up by filling gaps in their information or guessing at things they do not know.
More troubling is how Deloitte made its corrections. “Rather than swapping one hallucinated bogus reference for a new ‘true’ reference, they’ve replaced the bogus hallucinated references and in the revised version, there’s like five, six, seven, eight of them instead,” Rudge said.
That suggests the claims in the original report were not grounded in real evidence in the first place.
Among the false references were fictitious reports attributed to professors at the University of Sydney and at Lund University in Sweden. Perhaps most serious of all, the report cited a court judgment, “Deanna Amato v Commonwealth”, that does not exist.
The revised report includes a disclosure that was absent from the original: that generative AI was used in preparing it, specifically Azure OpenAI GPT-4o, a large language model tool licensed by the department itself.
Deloitte’s Report Errors Raise Concerns Over Government’s Reliance on Pricey Consultants
Even so, Deloitte has not directly blamed the AI for the errors. Both Deloitte and the department insist that the report’s main findings and recommendations are unchanged despite the problems with its references. According to a department official, the substance of the review, including its recommendations, remains intact.
That view was echoed in a Deloitte statement, which said “the updates made in no way impact or affect the substantive content, findings and recommendations in the report.” A spokesman for the firm added that the issue had been “resolved directly with the client.”
The case has rekindled concerns about the government’s heavy reliance on expensive consulting firms. Labor Senator Deborah O’Neill, a member of a Senate committee examining the integrity of consulting firms, did not hold back in her criticism.
“Deloitte’s got a human intelligence problem. That would be laughable if it weren’t so sad,” O’Neill said. “A partial refund is a partial apology for poor workmanship.”
She said it appears “AI is being left to do the heavy lifting” and questioned whether government departments should be checking exactly who, or what, is doing the work they pay for. “Perhaps instead of a large consultant firm, procurers should pay for a subscription of ChatGPT,” she added acidly.
When AI Conclusions Rest on False Data
Although Dr. Rudge helped expose the inaccuracies, he stopped short of dismissing the report as a whole. He noted that its conclusions are broadly consistent with other evidence about problems in the welfare compliance process.
That raises an uncomfortable question: if reports written with AI can reach sound conclusions while resting on false evidence, how should their quality be judged?
The case illustrates the growing difficulties governments face as professional services firms make greater use of artificial intelligence. Although automation can boost efficiency, the episode shows how important human verification and oversight remain, particularly for documents that shape public policy and can harm vulnerable citizens.



