On April 24, DeepSeek released the V4 series of models: the flagship V4-Pro with 1.6 trillion parameters and the efficient V4-Flash with 284 billion parameters. More important than the models themselves, this is the first domestic large model built on Huawei Ascend chips from the training phase onward.
Key Metrics
| Metric | Value |
|---|---|
| V4-Pro Total Parameters | 1.6T (49B activated) |
| First-Token Latency | 20 ms |
| Inference Compute Consumption | 27% of the previous-generation V3.2 |
| Ascend 950 Single-Card Throughput | 4,700 TPS (8k input) |
| FP4 Compute (Ascend 950PR) | 1.56 PFLOPS, 2.87× that of the H20 |
| Procurement Cost | 1/3 to 1/4 of the H200 |
From “After-the-Fact Adaptation” to “Native First Release”
Previous domestic models were all trained first in NVIDIA’s CUDA ecosystem and then spent months being migrated to Huawei’s Ascend CANN framework. This time, DeepSeek V4 was trained directly on the Ascend 950, and Huawei announced full compatibility across the entire Ascend supernode series within hours of the release.
This means domestic computing power has gone from a “backup option” to a “primary choice.”
Leapfrog Breakthrough in Agent Capabilities
V4-Pro delivers a step-change in Agent capabilities: its coding experience surpasses Sonnet 4.5, and its delivery quality approaches Opus 4.6. It also launches “Fast Mode” and “Expert Mode” and has begun a phased rollout of an image-recognition mode.
Signal to the Industry
When the largest open-source model vendor and the largest domestic chip vendor integrate this deeply, the entire ecosystem’s flywheel starts spinning. Following the announcement, domestic AI-chip stocks surged more than 10% on the day.
Primary Sources: Toutiao, chinaz, Bilibili Ascend Livestream