Qualcomm HBC: DRAM on Chips for AI

TL;DR: Qualcomm proposes stacking DRAM on top of its AI accelerators to shorten the distance data travels and multiply bandwidth. It promises 133 TB/s of effective bandwidth per card in its AI250, but the figures have been met with skepticism.

What happened?

During its 2026 Investor Day, Qualcomm revealed a near-memory computing architecture it calls High-Bandwidth Compute (HBC). The idea is to stack successive layers of DRAM directly on top of XPU chips (its AI accelerators) using through-silicon vias (TSVs), forming a single compute-and-memory module. According to Tony Pialis, EVP of datacenter at Qualcomm, this offers “all the performance advantages of SRAM, but with the density and capacity of HBM stacks.”

The technology will debut next year in the Dragonfly systems of the AI250 series, with which Qualcomm aims to compete directly with Nvidia and AMD in AI inference in the data center. The promised figures are spectacular: 768 GB of capacity and 133 TB/s of effective bandwidth per card. For context, Nvidia's Groq 3 LPU offers only 500 MB of SRAM and 150 TB/s of real bandwidth.

Why is it important?

The bottleneck in AI inference is not so much compute capacity as the speed at which data travels between memory and processors. In current architectures, moving data between HBM and compute dies consumes a lot of energy and time. By placing DRAM on top of compute, Qualcomm drastically reduces the distance data travels, resulting in lower latency and lower energy consumption. “Imagine working in the same building where you live, so you only travel up and down,” Pialis explained. “The highways connecting the suburbs to the city are cleared.”
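A back-of-the-envelope sketch can make the bandwidth argument concrete. It assumes the common simplification that, in memory-bound decoding, every generated token requires reading all model weights once, so bandwidth divided by weight size gives an upper bound on single-stream token rate. The 70 GB model size and the 8 TB/s baseline are hypothetical values chosen for illustration; only the 133 TB/s figure comes from Qualcomm's claims.

```python
def max_decode_tokens_per_s(bandwidth_tb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode rate, assuming each token
    reads every weight byte exactly once (memory-bound simplification)."""
    bytes_per_s = bandwidth_tb_s * 1e12   # decimal terabytes per second
    bytes_per_token = weights_gb * 1e9    # decimal gigabytes of weights
    return bytes_per_s / bytes_per_token


WEIGHTS_GB = 70.0            # hypothetical model size, not from the article
BASELINE_TB_S = 8.0          # hypothetical baseline accelerator bandwidth
AI250_EFFECTIVE_TB_S = 133.0 # Qualcomm's claimed effective bandwidth per card

baseline_rate = max_decode_tokens_per_s(BASELINE_TB_S, WEIGHTS_GB)
ai250_rate = max_decode_tokens_per_s(AI250_EFFECTIVE_TB_S, WEIGHTS_GB)

print(f"baseline bound: {baseline_rate:,.0f} tokens/s")
print(f"AI250 bound:    {ai250_rate:,.0f} tokens/s")
```

Under these assumptions the bound scales linearly with bandwidth, which is why the memory system, rather than raw compute, tends to set the ceiling. Real throughput also depends on batching, caching, and compute limits that this sketch ignores.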

Additionally, performing bandwidth-intensive operations on the base die reduces the amount of data that must be transferred to the SoC, which amplifies the effective bandwidth. This explains why Qualcomm uses the term “effective” so generously. The company claims the AI250 will offer 18 times the effective bandwidth of the previous-generation AI200, and that the future AI300 will reach 54 times.
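The multipliers can be sanity-checked with simple arithmetic. The sketch below derives the AI200 baseline implied by the 133 TB/s and 18x claims, then projects the AI300 figure from the 54x claim. It also includes a toy amplification model, which is an assumption and not Qualcomm's published definition: if only a fraction of the bytes read from DRAM has to cross to the SoC, the SoC sees the physical bandwidth divided by that fraction.

```python
AI250_EFFECTIVE_TB_S = 133.0  # per card, as claimed by Qualcomm
AI250_MULTIPLIER = 18         # claimed gain over the AI200
AI300_MULTIPLIER = 54         # claimed gain over the AI200

# Baseline implied by the two AI250 claims taken together.
implied_ai200_tb_s = AI250_EFFECTIVE_TB_S / AI250_MULTIPLIER
# Projection for the AI300 if the 54x claim uses the same baseline.
implied_ai300_tb_s = implied_ai200_tb_s * AI300_MULTIPLIER


def effective_bandwidth_tb_s(physical_tb_s: float, fraction_crossing: float) -> float:
    """Toy model (an assumption): the base die reduces data in place, so only
    `fraction_crossing` of the bytes read from DRAM travels on to the SoC."""
    if not 0.0 < fraction_crossing <= 1.0:
        raise ValueError("fraction_crossing must be in (0, 1]")
    return physical_tb_s / fraction_crossing


print(f"implied AI200 baseline: {implied_ai200_tb_s:.2f} TB/s")
print(f"implied AI300 figure:   {implied_ai300_tb_s:.0f} TB/s")
```

The implied baseline of about 7.39 TB/s and the AI300 projection of about 399 TB/s are derived values, not numbers Qualcomm has stated.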

What consequences will it have?

If Qualcomm delivers on its promises, HBC could change the game in AI inference, offering a more energy-efficient and potentially cheaper alternative to traditional GPUs. Eliminating the need for expensive silicon interposers (like TSMC's CoWoS) could also reduce manufacturing costs. However, the bandwidth figures have been met with skepticism. The Register notes that for the previous-generation AI200, Qualcomm claimed 414 TB/s of “effective” bandwidth across 56 chips, but achieving that with LPDDR5x at 8800 MT/s would require a 6720-bit bus per chip, something the company has not confirmed. Qualcomm insists the figure is “the pure physical bandwidth of the LPDDR interface,” but has not provided details on how it achieves this.
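The bus-width objection follows from straightforward arithmetic, sketched here. The function and variable names are illustrative, and the calculation assumes decimal units (1 TB = 10^12 bytes) and one transfer per bus bit per cycle at the stated MT/s rate.

```python
def required_bus_width_bits(total_tb_s: float, chips: int, rate_mt_s: float) -> float:
    """Bus width per chip needed to reach an aggregate bandwidth."""
    per_chip_bytes_per_s = total_tb_s * 1e12 / chips
    transfers_per_s = rate_mt_s * 1e6
    # Each transfer moves (width / 8) bytes, so width = bytes * 8 / transfers.
    return per_chip_bytes_per_s * 8 / transfers_per_s


def aggregate_bandwidth_tb_s(width_bits: int, chips: int, rate_mt_s: float) -> float:
    """Inverse check: aggregate bandwidth delivered by a given bus width."""
    per_chip_bytes_per_s = width_bits / 8 * rate_mt_s * 1e6
    return per_chip_bytes_per_s * chips / 1e12


width = required_bus_width_bits(414.0, 56, 8800.0)
check = aggregate_bandwidth_tb_s(6720, 56, 8800.0)

print(f"per-chip bandwidth: {414.0 / 56:.2f} TB/s")
print(f"required bus width: {width:.0f} bits per chip")
print(f"6720-bit bus gives: {check:.2f} TB/s across 56 chips")
```

The result lands within a bit or so of the 6720-bit figure cited by The Register, and the inverse check returns roughly 414 TB/s, so the reported numbers are at least internally consistent.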

Furthermore, stacking DRAM on top of logic poses thermal challenges. Although Qualcomm claims lower power consumption reduces heat, power density in the upper dies could be an issue. The company has not published detailed thermal specifications.

What should readers know?

Speculation vs. confirmation: The bandwidth figures are “effective” and not necessarily comparable to those of Nvidia or AMD. Qualcomm has not revealed the actual physical bandwidth or how it achieves those multipliers.
Market impact: If HBC works, it could democratize AI inference by reducing costs and power consumption. But Qualcomm is late to the data center and needs to prove its solution scales and is reliable.
Historical context: Near-memory computing is not new (Cerebras, Groq, Samsung), but Qualcomm is betting on stacking DRAM on top of logic, an approach other companies have avoided due to heat and performance issues.
For businesses and users: AI customers seeking energy efficiency should closely follow HBC developments, but with caution until independent benchmarks are available.

“We offer all the performance advantages of SRAM, but with the density and memory capacity that HBM stacks provide” — Tony Pialis, EVP of datacenter at Qualcomm.

Conclusion

Qualcomm has presented a bold vision to overcome the AI memory wall, but the performance claims need validation. If the company manages to execute its plan, it could become a relevant player in AI infrastructure. Otherwise, it risks being remembered as another case of exaggerated marketing. The coming months, with the first AI250 systems, will be crucial.
