KPMG withdraws AI report over hallucinations: UBS and NHS deny its claims

TL;DR: KPMG withdrew an AI report after UBS, NHS, and other organizations denied claims about their use of artificial intelligence. The incident is an example of generative AI hallucination and underscores the need for human oversight in content creation.

What happened?

KPMG, one of the Big Four consulting firms, withdrew a report titled “Redefining excellence in the age of agentic AI” after multiple organizations said its claims about their use of artificial intelligence were false or misleading. According to The Next Web, citing the Financial Times, UBS, the UK's National Health Service (NHS), Swiss Federal Railways (SBB), and Transport for London (TfL) flatly denied the report's descriptions of their respective AI implementations. The report, originally published on KPMG's website, cited agentic AI use cases supposedly deployed by these entities, but none of them confirmed having implemented such systems. After the Financial Times checked the claims, KPMG withdrew the report without offering a detailed explanation, stating only that there had been an error in the review process.

Why is it important?

This case is a paradigmatic example of hallucinations in generative language models: the invention of facts that seem plausible but are completely false. That a consultancy of KPMG's stature published and then withdrew a report with such errors undermines trust in the consulting industry and in AI technology itself. Moreover, it highlights the need for rigorous human oversight in automated content generation. Historically, the Big Four have been considered authoritative sources in business and technology advisory; however, incidents like this erode their credibility. This is not the first time a consultancy has faced criticism for exaggerating technological capabilities: in 2023, McKinsey was flagged for a report that overestimated AI's impact on productivity, though without inventing specific use cases. KPMG's case is more serious because it involves falsely attributing implementations to real organizations, which could have legal implications.

The impact on the consulting market could be significant. According to Statista data, the global AI consulting market reached $15 billion in 2025 and is expected to grow at a compound annual rate of 25% through 2030. Incidents like this could slow the adoption of generative AI-based consulting services, as clients may come to distrust the veracity of reports. They could also lead to tighter regulation: the European Commission is already working on guidelines for AI use in professional services, and this case could accelerate their implementation.
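
To put the growth figure in perspective, the compounding arithmetic can be sketched as follows. This is an illustration only: it takes the $15 billion base and the 25% annual rate quoted above at face value without verifying them, and the function name project_market is hypothetical.

```python
def project_market(base_billion: float, annual_rate: float, years: int) -> float:
    """Compound a base value forward by a fixed annual growth rate."""
    return base_billion * (1 + annual_rate) ** years


# Figures as quoted in the text; they are not independently verified here.
BASE_2025 = 15.0  # billions of US dollars
RATE = 0.25       # 25% compound annual growth

for year in range(2025, 2031):
    value = project_market(BASE_2025, RATE, year - 2025)
    print(f"{year}: ${value:.1f}B")
```

Under those assumptions the market would roughly triple, to about $46 billion by 2030, which gives a sense of how much revenue depends on clients trusting AI-assisted reports.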

Consequences and lessons

Damaged reputation: KPMG suffers a blow to its credibility as an advisor in digital transformation and AI adoption. The firm has faced earlier scandals, such as its role in the collapse of Carillion in 2018, which reinforces the perception of a lack of rigor.
Regulatory scrutiny: The incident could increase pressure on consultancies to verify the accuracy of their reports, especially those generated with AI assistance. In the UK, the Financial Reporting Council (FRC) has already expressed concern about the quality of Big Four reports after the Wirecard case.
Lesson for the sector: Companies must implement human review processes before publishing any AI-generated content, especially when citing real use cases. This includes verification with the cited sources, which apparently was not done in this case.
Impact on clients: The organizations mentioned without their consent could take legal action or demand public apologies. UBS has already stated it is considering measures, and the NHS has requested a formal explanation.

What should readers know?

This incident is not isolated. As more companies use language models to draft reports, whitepapers, and communications, the risk of hallucinations increases. A 2024 study from Stanford University found that generative language models invent facts in about 20% of responses when asked for specific information about companies. It is crucial that any data or claims attributed to third parties be verified with primary sources. AI is a powerful tool, but it does not replace human judgment or journalistic verification.

“Generative AI can produce very convincing texts, but without human oversight it can generate misinformation that damages the credibility of organizations that use it.”

For industry professionals, this case reinforces the importance of transparency and auditing in AI-assisted content creation processes. Companies should document which parts of a report were generated by AI and which were reviewed by humans. Additionally, they should implement automated verification tools, such as fact-checking systems that compare claims against external databases. The broader lesson is that trust in AI should not be blind: errors like this are avoidable with proper controls, and their reputational cost can be enormous.
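
The kind of control described above can be sketched in a few lines. This is a minimal illustration, not a description of any tool KPMG or another firm uses: the Claim record, the CONFIRMED set standing in for an external database, the review function, and the organization names are all hypothetical, and the exact-match lookup is deliberately naive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    organization: str
    statement: str
    ai_generated: bool  # provenance flag: was this text drafted by a model?


# Hypothetical stand-in for an external database of statements that the
# named organization has confirmed in writing.
CONFIRMED = {
    ("Example Bank", "uses a chatbot for customer support"),
}


def review(claims, confirmed):
    """Split claims into those already confirmed and those needing human follow-up."""
    cleared, flagged = [], []
    for claim in claims:
        key = (claim.organization, claim.statement)
        (cleared if key in confirmed else flagged).append(claim)
    return cleared, flagged


draft = [
    Claim("Example Bank", "uses a chatbot for customer support", ai_generated=True),
    Claim("Example Rail", "runs agentic AI for scheduling", ai_generated=True),
]

cleared, flagged = review(draft, CONFIRMED)
for claim in flagged:
    print(f"HOLD: confirm with {claim.organization} before publishing: {claim.statement}")
```

A check like this does not establish that a claim is true. It only blocks publication of any third-party claim that lacks a recorded confirmation, which leaves the final decision with a human reviewer.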