Bottom Line First
The four tech giants (Amazon, Google, Meta, Microsoft) are projected to spend a combined ~$715 billion on capex in 2026, with nearly all of the incremental spend AI-driven.
Meanwhile, the bottleneck for AI compute is shifting from GPUs to HBM (High Bandwidth Memory): on the latest earnings call, Micron's CEO said 2026 HBM supply is already sold out, covering only 50-65% of customer demand.
Data Breakdown
2026 AI Capex by Company
| Company | 2026 Capex Ceiling | YoY Growth | Primary Use |
|---|---|---|---|
| Amazon (AWS) | ~$200B | Accelerating (Bedrock spend reaccelerated to its fastest pace in 15 quarters) | GPU clusters + data centers + power |
| Google | ~$190B | Sustained growth | TPU + GPU + data center infrastructure |
| Microsoft | ~$190B | Maintaining high levels | Azure AI + OpenAI infrastructure |
| Meta | ~$135B | Significant increase | Llama training + AI ads + metaverse |
| Total | ~$715B | — | — |
Supply Chain Bottleneck Shift
| Phase | Bottleneck | Current Status |
|---|---|---|
| 2023-2024 | GPU capacity (NVIDIA A100/H100) | Massive capacity expansion, easing |
| 2025 | Advanced packaging (CoWoS) | TSMC expanding |
| 2026 | HBM memory | Industry-wide sold out, supply crunch |
HBM Market Landscape
| Supplier | Market Share | 2026 Status | Notes |
|---|---|---|---|
| SK Hynix | ~50% | Q1 revenue tripled YoY, surpassing ₩50T for the first time | Announced $13B expansion plan |
| Micron | ~25% | Can only meet 50-65% of demand | Multi-year volume and pricing agreements locked |
| Samsung | ~20% | Catching up | HBM3E production ramping |
| Others | ~5% | — | — |
Why AI Is Becoming “Memory-First”
Micron’s CEO delivered a key signal on the earnings call:
“AI is becoming a memory-first industry—because models and agents need longer ‘thinking’ time and more context retention.”
Technical Logic
Token Throughput ≈ HBM Bandwidth ÷ Bytes Read per Token (weights + KV cache), capped by HBM Capacity

Longer agent thinking → larger context windows → KV cache bloat → superlinear HBM demand growth
When models scale from 7B to 70B parameters, and context windows from 8K to 128K, HBM demand grows far beyond linear.
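A back-of-envelope calculation makes this concrete. The sketch below assumes a hypothetical 70B-class decoder (80 layers, grouped-query attention with 8 KV heads, head dimension 128) and an FP16 cache; none of these figures come from a specific model.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Per-sequence KV cache: K and V for every layer at every position."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * ctx_len

# Hypothetical 70B-class config: 80 layers, GQA with 8 KV heads,
# head_dim 128, FP16 cache (2 bytes/element). Assumed, not a real model.
cfg = dict(layers=80, kv_heads=8, head_dim=128)
for ctx in (8_192, 131_072):
    gib = kv_cache_bytes(ctx_len=ctx, **cfg) / 2**30
    print(f"{ctx:>7}-token context -> {gib:5.1f} GiB KV cache per sequence")
```

Under these assumptions, the KV cache alone grows from about 2.5 GiB to 40 GiB per sequence as the context goes from 8K to 128K, on top of roughly 140 GB of FP16 weights for the 70B model itself; multiply the model-size and context-length factors together and total HBM demand grows far faster than either factor alone.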
SanDisk’s AI Reversal
SanDisk’s earnings also validate this trend:
- Last year: a loss of $0.30 per share; this quarter: EPS of $23.41 (vs. the $14.50 estimate)
- Revenue: $5.95B (vs. $4.70B estimate)
- 5 AI companies signed long-term supply agreements
The storage industry has reversed from losses to windfall profits driven by AI demand.
Landscape Judgment
Short-term Impact (2026)
- HBM supply tightness will persist throughout the year, driving up GPU inference costs
- Model optimization will increasingly focus on memory efficiency: quantization, MoE, and KV cache compression (see the MoE sketch after this list)
- Chinese domestic alternatives (e.g., CXMT) will receive policy-driven acceleration
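As referenced above, the toy comparison below illustrates the MoE point: an MoE model stores more parameters than it reads per generated token. All parameter counts are illustrative assumptions, not figures for any real model.

```python
# Toy comparison: parameters stored vs. parameters read per generated token.
# Every number here is an illustrative assumption, not a real model's config.
dense_params = 70e9                   # dense 70B: all weights read each token

shared_params = 12e9                  # assumed attention + embedding weights
expert_params = 8e9                   # assumed FFN weights per expert
total_experts, active_experts = 8, 2  # assumed top-2 routing

moe_stored = shared_params + total_experts * expert_params    # 76B in memory
moe_read   = shared_params + active_experts * expert_params   # 28B per token

print(f"dense: {dense_params/1e9:.0f}B stored, {dense_params/1e9:.0f}B read/token")
print(f"MoE:   {moe_stored/1e9:.0f}B stored, {moe_read/1e9:.0f}B read/token")
```

The tradeoff: MoE streams fewer bytes per generated token (bandwidth relief) while potentially storing more total weights (capacity pressure), so it helps most when bandwidth, not capacity, is the binding constraint.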
Medium-term Trends (2027-2028)
- HBM4 standard release may ease some supply pressure
- CXL memory pooling technology may change memory allocation paradigms
- “Compute-in-memory” chip architectures may become a new competitive dimension
Investment Logic
| Track | Certainty | Upside | Representative Targets |
|---|---|---|---|
| HBM manufacturers | ★★★★★ | ★★★☆☆ | SK Hynix, Micron |
| GPU vendors | ★★★★☆ | ★★★★☆ | NVIDIA, AMD |
| Data center REITs | ★★★★☆ | ★★☆☆☆ | Data center real estate funds |
| Memory optimization software | ★★★☆☆ | ★★★★★ | Quantization/compression toolchains |
Action Recommendations
For AI Application Teams
- Immediately evaluate your model’s memory usage efficiency; prioritize frameworks supporting quantized inference
- Consider MoE architectures: they read far fewer parameters per token at comparable quality, easing the bandwidth side of HBM demand (though total weight storage can grow)
- Watch KV cache optimization techniques (PagedAttention, FlashDecoding); a toy sketch of the paged-cache idea follows below
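Here is a toy sketch of the paged-KV-cache bookkeeping that PagedAttention popularized: the cache lives in fixed-size blocks, and each sequence maps logical positions to physical blocks via a block table. Block and pool sizes are arbitrary assumptions, and production engines (e.g., vLLM) track far more state.

```python
BLOCK_SIZE = 16      # tokens per KV block (assumed)
NUM_BLOCKS = 1024    # total blocks in the HBM pool (assumed)

free_blocks = list(range(NUM_BLOCKS))
block_tables = {}    # seq_id -> ordered list of physical block ids
seq_lengths = {}     # seq_id -> tokens cached so far

def append_token(seq_id):
    """Reserve KV space for one new token; allocate a block only on overflow."""
    n = seq_lengths.get(seq_id, 0)
    if n % BLOCK_SIZE == 0:  # last block is full (or sequence is new)
        if not free_blocks:
            raise MemoryError("KV pool exhausted: evict or preempt a sequence")
        block_tables.setdefault(seq_id, []).append(free_blocks.pop())
    seq_lengths[seq_id] = n + 1

def release(seq_id):
    """Return a finished sequence's blocks to the shared pool."""
    free_blocks.extend(block_tables.pop(seq_id, []))
    seq_lengths.pop(seq_id, None)
```

Because blocks are allocated on demand rather than reserved for the maximum context up front, fragmentation drops and far more concurrent sequences fit in the same HBM.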
For Hardware Procurement
- HBM supply tightness may last 12-18 months; consider locking in supply contracts early
- Evaluate AMD MI series as an NVIDIA alternative (better price-performance in some scenarios)
For Developers
- Learn model quantization techniques (INT4/INT8) to run larger models on limited hardware; a minimal INT8 sketch follows this list
- Watch memory optimization updates in local inference frameworks like llama.cpp and MLX
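To make the quantization item concrete, here is a minimal NumPy sketch of symmetric per-tensor INT8 weight quantization; production toolchains use per-channel or group-wise scales plus calibration data, so treat this as the idea only.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)  # one hypothetical layer
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()
print(f"{w.nbytes / 2**20:.0f} MiB (FP32) -> {q.nbytes / 2**20:.0f} MiB (INT8), "
      f"mean abs error {err:.4f}")
```

INT8 cuts FP32 weight memory 4x (2x vs. FP16), and INT4 halves it again at some accuracy cost.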