Software factories: bugs increase 54% with AI

TL;DR: Generative AI is driving software factories, but data shows an alarming increase in bugs and incidents. Speed without quality control turns the productivity promise into a technical debt trap.

What happened?

The mass adoption of LLM-based tools such as GitHub Copilot, ChatGPT, and other code assistants has drastically lowered the barrier to entry for writing software. Companies of all sizes are embracing the concept of the software factory, an approach that seeks to industrialize code production through AI agents, automated CI/CD pipelines, and accelerated reviews. However, recent data from Faros AI reveals a dark side: although developer throughput has increased by 33.7% and the PR merge rate by 16.2%, incidents per PR have skyrocketed by 242.7% and bugs per developer have grown by 54%.

This phenomenon is not isolated. According to VentureBeat, the idea of a "software factory" has solidified over the past year, driven by Luca Rossi's article "The Era of the Software Factory," which argues that AI changes not just writing speed but the entire production system. Yet the reality is that many companies are deploying agents and plugins without an orchestration platform, turning the factory into an "improvised workshop." As VentureBeat notes: "If you just put another machine in an empty room and call it a factory, you're not building a factory."

Why is it important?

The software factory concept promises greater speed and lower cost, similar to what assembly-line production achieved in manufacturing. But unlike physical goods, software accumulates technical debt invisibly. Speed without quality control creates a snowball effect: more code, more bugs, more incidents. As VentureBeat warns, many companies believe they are building a software factory when in reality they are just shipping errors faster.

Historically, industrial manufacturing faced similar problems. During the Industrial Revolution, mass production without quality control led to disasters such as bridge collapses and boiler explosions. The response was standardization and statistical process control, developed by Walter Shewhart in the 1920s. In software, the analogy is clear: we need quality metrics equivalent to "defects per million" in manufacturing. Faros AI and Google DORA are leading the way, but the industry is still far from applying these principles broadly.
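
To make the manufacturing analogy concrete, here is a minimal sketch of a Shewhart-style p-chart applied to software delivery. It assumes a team counts, per week, how many merged PRs were later linked to an incident. The function names and the sample numbers are hypothetical and do not come from Faros AI or DORA; the three-sigma limits are the conventional control-chart choice, not a recommendation from either source.

```python
from math import sqrt


def p_chart_limits(defects, sample_sizes):
    """Return the center line and per-sample (lower, upper) 3-sigma limits."""
    # Center line: overall defect proportion across all samples.
    p_bar = sum(defects) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        # Standard deviation of a proportion for a sample of size n.
        sigma = sqrt(p_bar * (1 - p_bar) / n)
        lower = max(0.0, p_bar - 3 * sigma)
        upper = min(1.0, p_bar + 3 * sigma)
        limits.append((lower, upper))
    return p_bar, limits


def out_of_control(defects, sample_sizes):
    """Return the indices of samples whose defect rate falls outside the limits."""
    _, limits = p_chart_limits(defects, sample_sizes)
    flagged = []
    for i, (d, n) in enumerate(zip(defects, sample_sizes)):
        lower, upper = limits[i]
        rate = d / n
        if rate < lower or rate > upper:
            flagged.append(i)
    return flagged


# Hypothetical data: PRs linked to incidents per week, out of 100 merged PRs.
incident_prs = [5, 4, 6, 5, 4, 18]
merged_prs = [100] * 6

center, _ = p_chart_limits(incident_prs, merged_prs)
print("defects per million PRs:", round(center * 1_000_000))
print("weeks outside control limits:", out_of_control(incident_prs, merged_prs))
```

With these made-up numbers only the last week is flagged. The point of the chart is to separate ordinary week-to-week noise from a real shift in the process, such as the one that might follow a sudden jump in AI-generated code.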

Consequences for companies and users

For companies, the increase in bugs and incidents translates into higher maintenance costs, loss of customer trust, and potential turnover among engineers frustrated by dealing with low-quality code. A Stripe study found that developers spend up to 42% of their time fixing bugs and technical debt, time that could be spent on innovation. With AI, this percentage could increase if quality is not controlled.

End users experience more failures, lower performance, and updates that fix one problem but create three new ones. Recent examples include the controversy over Microsoft's AI assistant generating insecure code in critical applications, or the case of a startup that had to roll back a massive update after a wave of incidents caused by unreviewed AI-generated code. Google DORA research also indicates that delivery speed does not correlate with stability when AI intervenes without proper human oversight. In fact, organizations with high speed but low stability (the opposite of the teams DORA calls "elite," which score well on both) suffer the most when AI accelerates production without controls.

What should readers know?

Not all speed is good: AI accelerates writing, but not domain understanding or requirements validation. The bottleneck shifts from "how do I write this?" to "should this be written?" As VentureBeat notes, the barrier to writing functional code has collapsed, but that doesn't mean the code is correct or maintainable.
Human review is key: AI tools must be integrated into a rigorous review and testing system. Without it, technical debt grows exponentially. GitClear data shows that AI-generated code has a bug reintroduction rate 40% higher than human code, underscoring the need for review.
Misleading metrics: Increasing PR throughput without improving quality is counterproductive. Companies should measure not only speed but also error rates and incident resolution time. Faros AI recommends a "balanced scorecard" that includes mean time to recovery (MTTR) and deployment failure rate.
The factory requires a platform, not patches: A true software factory needs an orchestrator that manages agents, tests, deployments, and feedback, not just a collection of loose prompts and plugins. VentureBeat compares this to physical factories: "You can't have a sewing machine, a lathe, and an oven in a shed and call it a factory."
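
As an illustration of the "balanced scorecard" idea above, the following sketch reports one speed metric next to two quality metrics, so throughput is never read in isolation. The Deployment record, its field names, and the sample data are hypothetical; a real pipeline would pull these from its own deployment and incident tooling, and would likely measure recovery from the time an incident was detected and not from the deployment time used here for simplicity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    deployed_at: datetime
    failed: bool                              # did this deployment cause an incident?
    restored_at: Optional[datetime] = None    # when service was restored, if it failed


def scorecard(deployments: List[Deployment], period_days: int) -> dict:
    """Summarize speed and quality side by side for one reporting period."""
    total = len(deployments)
    failures = [d for d in deployments if d.failed]
    restored = [d for d in failures if d.restored_at is not None]

    mttr_hours = None
    if restored:
        # Mean time to recovery, approximated as restore time minus deploy time.
        seconds = sum((d.restored_at - d.deployed_at).total_seconds() for d in restored)
        mttr_hours = seconds / 3600 / len(restored)

    return {
        "deploys_per_week": total / period_days * 7,
        "deployment_failure_rate": len(failures) / total if total else None,
        "mttr_hours": mttr_hours,
    }


# Hypothetical week: four deployments, one of which failed and took 2 hours to fix.
start = datetime(2025, 1, 6, 9, 0)
week = [
    Deployment(start, failed=False),
    Deployment(start + timedelta(days=1), failed=True,
               restored_at=start + timedelta(days=1, hours=2)),
    Deployment(start + timedelta(days=2), failed=False),
    Deployment(start + timedelta(days=3), failed=False),
]
print(scorecard(week, period_days=7))
```

A team could compare this dictionary before and after rolling out a code assistant: if deploys per week rise while the failure rate and recovery time also rise, the added speed is being paid for in incidents.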

"When you increase a person's output with machinery, you also increase the errors they can make. The speed at which code can now be generated is industrial scale." — VentureBeat

Looking ahead

The industry is at a crossroads: either it develops better AI-assisted quality assurance practices, or the bubble of AI-generated code will burst in the form of a maintainability crisis. Companies like Faros AI and research programs like DORA are paving the way for metrics that balance speed and quality. The success of the software factory will depend on whether companies learn to apply the quality control principles that manufacturing perfected over decades.

In the near future, we will see a rise in "quality as code" tools that integrate automated testing, static analysis, and AI review. Roles such as "AI quality engineer," specialized in validating model-generated code, will also emerge. But the deepest change will be cultural: companies must accept that speed without control is unsustainable. As Luca Rossi said, "the software factory is not a tool, it's a set of principles." Adopting those principles will be the difference between building a digital empire and building a house of cards.
