In late April 2026, the AI community noticed a remarkable pattern: Kimi K2.6's underlying architecture inherits DeepSeek v3's design, while DeepSeek V4 trains with Muon, the optimizer the Kimi/Moonshot team scaled to frontier-size models. This is not mere "borrowing"; it is a technology cycle built on open-source licenses.
Conclusion First
Chinese open-source models are forming a distinctive competitive pattern: open symbiosis. Two companies independently chose open-source routes, absorb each other's work at the architecture level, contribute back at the optimizer level, and together have reached closed-source performance levels at roughly one-eighth of the training cost.
Technical Breakdown
Kimi K2.6 → Inherits DeepSeek v3 Architecture
| Dimension | DeepSeek v3 Architecture | Kimi K2.6 Evolution |
|---|---|---|
| Parameters | 671B total, 37B active | Expanded to 1.6T total |
| Context Window | 128K | 256K public, 1M supported by the hardware stack |
| Inference Efficiency | MLA shrinks the KV cache | MLA plus proprietary scheduling |
| Agent Capability | Basic tool calling | Leads on HLE and DeepSearchQA |
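The inference-efficiency row deserves a concrete illustration. MLA (Multi-head Latent Attention) projects keys and values down to a single low-rank latent per token and caches only that latent, so KV-cache size scales with the latent dimension instead of with heads × head_dim. Here is a minimal PyTorch sketch of the arithmetic; the dimensions are illustrative rather than DeepSeek v3's published configuration, and the decoupled RoPE path is omitted:

```python
import torch
import torch.nn as nn

# Illustrative dimensions, not DeepSeek v3's real config.
d_model, n_heads, head_dim, d_latent = 4096, 32, 128, 512

class MLACache(nn.Module):
    """Multi-head Latent Attention, caching view: store one low-rank
    latent per token instead of full per-head keys and values."""
    def __init__(self):
        super().__init__()
        self.down = nn.Linear(d_model, d_latent, bias=False)            # compress
        self.up_k = nn.Linear(d_latent, n_heads * head_dim, bias=False)  # expand K
        self.up_v = nn.Linear(d_latent, n_heads * head_dim, bias=False)  # expand V

    def forward(self, h, cache):
        latent = self.down(h)                      # (batch, 1, d_latent)
        cache = torch.cat([cache, latent], dim=1)  # only latents are cached
        k = self.up_k(cache)                       # K/V reconstructed on the fly
        v = self.up_v(cache)
        return k, v, cache

mla = MLACache()
h = torch.randn(1, 1, d_model)        # one new token's hidden state
cache = torch.zeros(1, 0, d_latent)   # empty cache at step 0
k, v, cache = mla(h, cache)

# Per-token cache entries: standard MHA vs. MLA.
mha_entries = 2 * n_heads * head_dim  # 8192 values per token
mla_entries = d_latent                # 512 values per token
print(mha_entries / mla_entries)      # ~16x smaller KV cache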
DeepSeek V4 → Adopts Kimi’s Muon Optimizer
DeepSeek V4 introduced the Muon optimizer in training. Muon was first proposed by Keller Jordan; the Kimi/Moonshot AI team scaled it to trillion-parameter MoE training and battle-tested it in the Kimi K2 line. Three properties matter in practice (a minimal sketch of the update rule follows this list):
- More efficient gradient updates: more stable convergence on MoE models than traditional AdamW
- Lower VRAM usage: one momentum buffer per weight matrix, versus AdamW's two state tensors, leaving room for larger batch sizes
- Domestic chip compatibility: adapts better to Huawei Ascend NPUs
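Muon's core idea is short enough to show: keep a momentum average of each 2-D weight's gradient, orthogonalize that matrix with a few Newton-Schulz iterations, and apply the result so the update has a well-conditioned spectrum. Below is a minimal single-matrix sketch in PyTorch; the quintic coefficients follow Keller Jordan's reference implementation, and this is the plain non-Nesterov variant, not the distributed MuonClip setup production trainers actually run (those also keep AdamW for embeddings and scalar parameters):

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately map G onto the nearest semi-orthogonal matrix with a
    quintic Newton-Schulz iteration (coefficients from the reference Muon)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + 1e-7)          # normalize so the iteration converges
    transposed = X.size(0) > X.size(1)
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X

def muon_step(param, grad, momentum, lr=0.02, beta=0.95):
    """One Muon update for a single 2-D weight matrix. Note the single
    state tensor (momentum) vs. AdamW's two: that is where the lower
    optimizer-VRAM claim in the list above comes from."""
    momentum.mul_(beta).add_(grad)
    update = newton_schulz_orthogonalize(momentum)
    # Scale so update magnitude is roughly shape-independent, as in the reference.
    scale = max(1.0, param.size(0) / param.size(1)) ** 0.5
    param.add_(update, alpha=-lr * scale)
    return momentum

# Usage: W is a hidden-layer weight; embeddings and norms would stay on AdamW.
W = torch.randn(1024, 1024)
g = torch.randn_like(W)
buf = torch.zeros_like(W)
muon_step(W, g, buf)
```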
Performance Comparison
| Model | Score | Params | Context | API Cost (vs GPT-5.5) |
|---|---|---|---|---|
| Kimi K2.6 | 73 | 1.6T | 256K-1M | ~1/8 |
| DeepSeek V4 Flash | 73 | — | 1M | ~1/8 |
| DeepSeek V4 Pro | 73 | — | 1M | ~1/10 |
| Gemma 4 31B | 72 | 31B | 128K | ~1/5 |
| Qwen3.6 27B | 71 | 27B | 128K | ~1/6 |
Key observation: the top three (Kimi K2.6, DeepSeek V4 Flash, and DeepSeek V4 Pro) all score 73 and tie for first place, at API costs of only 1/8 to 1/10 of GPT-5.5's.
Why This Pattern Is Unique
Comparison with Western Open-Source Ecosystem
| Dimension | Chinese pattern (Kimi ↔ DeepSeek) | Western pattern (Meta Llama) |
|---|---|---|
| Innovation Source | Cross-contribution between companies | Dominated by a single company |
| Open-Source Strategy | Open at the architecture level | Open at the weight level |
| Competitive Relationship | Symbiosis plus competition | Pure competition |
| Ecosystem Effect | Accelerating technology cycle | Single-model ecosystem |
Risks
- Technology homogenization: if everyone converges on similar architectures, differentiation gets harder
- License dependency: the symbiosis holds only as long as both parties stay open-source; a restrictive license upstream breaks the cycle
- Innovation ceiling: cross-borrowing can reach parity with closed-source models, but surpassing them may require entirely new architectures
Practical Recommendations
| Your Scenario | Recommendation |
|---|---|
| Agent/tool calling | Prioritize Kimi K2.6 |
| Reasoning/math/coding | Prioritize DeepSeek V4 Pro |
| Cost control priority | DeepSeek V4 Flash, best cost-performance |
| Local deployment need | Qwen3.6 27B, runs on consumer hardware |
| Long-term tech selection | Watch whether the two companies' architectures start to diverge |
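One practical note on the table above: Moonshot and DeepSeek both expose OpenAI-compatible chat endpoints, so switching between these recommendations can stay a configuration change rather than a rewrite. A hedged sketch follows; the base URLs reflect the two vendors' current conventions, while the model identifiers for K2.6 and V4 are placeholders, since release naming may differ:

```python
from openai import OpenAI

# Endpoints follow current Moonshot/DeepSeek conventions; the model IDs
# below are hypothetical placeholders for the K2.6/V4 releases.
PROVIDERS = {
    "agent":     dict(base_url="https://api.moonshot.cn/v1",  model="kimi-k2.6"),
    "reasoning": dict(base_url="https://api.deepseek.com/v1", model="deepseek-v4-pro"),
    "budget":    dict(base_url="https://api.deepseek.com/v1", model="deepseek-v4-flash"),
}

def chat(scenario: str, prompt: str, api_key: str) -> str:
    """Route a prompt to the provider recommended for the given scenario."""
    cfg = PROVIDERS[scenario]
    client = OpenAI(base_url=cfg["base_url"], api_key=api_key)
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Swapping models is a dict lookup, not a code change:
# chat("reasoning", "Prove that sqrt(2) is irrational.", api_key="sk-...")
```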