Xiaomi MiMo-V2.5 Dual Models Open-Sourced: 1T MoE + 310B MoE, Million-Token Context, 100T Token Incentive Program

Key Takeaways

In late April 2026, Xiaomi open-sourced two large language models built on MoE architectures at the 1T and 310B parameter scales, both supporting million-token context windows. The concurrently launched MiMo Orbit developer incentive program, with per-developer quotas of up to 1.6 billion free tokens, competes directly with domestic vendors’ developer subsidy strategies.

Model Specifications & Architecture

| Dimension | MiMo-V2.5-Pro | MiMo-V2.5 |
| --- | --- | --- |
| Total Parameters | 1T | 310B |
| Active Parameters | 42B | 15B |
| Context Window | 1M tokens | 1M tokens |
| Architecture | MoE | MoE |
| License | MIT | MIT |
| Positioning | Complex Agent + Software Engineering | Multimodal Agent |
| Commercial Use | ✅ No extra license | ✅ No extra license |
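
To see why a million-token window is demanding, a rough KV-cache estimate helps. The sketch below uses hypothetical layer and head counts, since this article does not give MiMo's full configuration; only the formula itself is general:

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 seq_len: int, bytes_per_elem: int = 2) -> float:
    """Rough KV-cache footprint: one K and one V tensor (factor of 2)
    per layer, per token, at the given element width (2 bytes = fp16)."""
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem
    return total_bytes / 1024**3

# Hypothetical GQA config -- NOT MiMo's published numbers:
# 60 layers, 8 KV heads, head_dim 128, fp16 cache.
print(f"{kv_cache_gib(60, 8, 128, 1_000_000):.0f} GiB")  # ~229 GiB per 1M-token sequence
```

Even under grouped-query attention, a single full-length sequence can consume hundreds of GiB of cache, which is what the long-context optimizations described below are meant to tame.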

Three architectural pillars support trillion-parameter sparsity alongside million-token context:

  1. Hybrid Attention: Combines sliding-window and global attention for efficiency at million-token scale
  2. Sparse MoE Routing: Only 42B of the 1T total parameters are activated per token, keeping inference costs manageable (see the routing sketch after this list)
  3. Long-Context Optimization: KV cache management and attention decay tuned specifically for 1M-token scenarios
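
To make pillar 2 concrete, here is a minimal PyTorch sketch of generic top-k expert routing. It illustrates the technique, not Xiaomi's implementation; the function name, sizes, and expert definitions are made up for the demo:

```python
import torch
import torch.nn.functional as F

def topk_moe_route(x, gate_w, experts, k=2):
    """Mix the outputs of each token's top-k experts, weighted by router scores.

    x:       (tokens, d_model) token representations
    gate_w:  (d_model, n_experts) learned router weights
    experts: list of n_experts feed-forward modules
    """
    probs = F.softmax(x @ gate_w, dim=-1)            # (tokens, n_experts)
    topk_p, topk_i = probs.topk(k, dim=-1)           # keep only k experts per token
    topk_p = topk_p / topk_p.sum(-1, keepdim=True)   # renormalize kept weights

    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        tok, slot = (topk_i == e).nonzero(as_tuple=True)  # tokens routed to expert e
        if tok.numel():
            out[tok] += topk_p[tok, slot].unsqueeze(-1) * expert(x[tok])
    return out

# Toy demo: 8 experts, 2 active per token -> only ~1/4 of expert weights run per token.
d, n_experts = 64, 8
experts = [torch.nn.Sequential(torch.nn.Linear(d, 4 * d), torch.nn.GELU(),
                               torch.nn.Linear(4 * d, d)) for _ in range(n_experts)]
y = topk_moe_route(torch.randn(16, d), torch.randn(d, n_experts), experts)
print(y.shape)  # torch.Size([16, 64])
```

Because only k of n_experts FFNs execute per token, compute scales with active rather than total parameters; this is the mechanism behind 42B active out of 1T total.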

Comparison with Competing Open Models

| Model | Total Params | Active Params | Context | License |
| --- | --- | --- | --- | --- |
| MiMo-V2.5-Pro | 1T | 42B | 1M | MIT |
| Kimi K2.6 | 1T | 32B | 1M | Open Source |
| DeepSeek-V4 | 1.6T | 49B | - | Open Source |
| Qwen 3.6 | Various | - | - | Apache 2.0 |

MiMo-V2.5-Pro’s 42B active parameters are close to Kimi K2.6’s 32B, and their total parameter counts match at 1T. On the Intelligence Index, MiMo-V2.5-Pro scores ~54, narrowly behind Kimi K2.6, with both trailing GPT-5.5 (60 points).

100T Token Incentive: Competing for the Developer Ecosystem

Xiaomi’s MiMo Orbit developer incentive program offers free tokens to global AI developers:

  • Maximum quota: 1.6 billion tokens per developer
  • Review mechanism: Automatic review based on GitHub activity and AI usage history
  • Approval speed: Users report ~1 minute approval time
  • Target audience: High-quality AI application developers

This strategy mirrors Baidu and Moonshot’s developer subsidies—exchanging free compute for ecosystem lock-in and model feedback.
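
For developers who receive an Orbit quota, evaluation would likely look like any standard chat-completion call. The snippet below is purely illustrative: it assumes an OpenAI-compatible endpoint, and the base URL, model ID, and key are placeholders rather than documented values:

```python
from openai import OpenAI

# Placeholder endpoint and model ID -- replace with the values Xiaomi
# actually issues with your MiMo Orbit approval; nothing here is official.
client = OpenAI(
    base_url="https://api.example-mimo.com/v1",  # hypothetical URL
    api_key="YOUR_ORBIT_API_KEY",
)

resp = client.chat.completions.create(
    model="mimo-v2.5-pro",  # hypothetical model ID
    messages=[{"role": "user", "content": "Summarize the MoE routing trade-offs."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
print(resp.usage.total_tokens)  # track consumption against the free-token quota
```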

Luo Fuli’s Leadership: From DeepSeek to Xiaomi

The MiMo series is led by Luo Fuli, formerly of Alibaba’s DAMO Academy and a core member of DeepSeek. In a 3.5-hour technical interview, she revealed:

  • Pre-training gap nearly closed: Top domestic teams are rapidly closing the gap with Anthropic in pre-training
  • Competition shifting to Agent RL: Next-generation model capability hinges on Agent reinforcement learning, not just pre-training scale
  • Open source is essential: Open-sourcing brings rapid community feedback and real-world usage data

Action Recommendations

| Scenario | Recommendation | Rationale |
| --- | --- | --- |
| Local Agent deployment | MiMo-V2.5 (15B active) | Low active parameter count, reduced VRAM needs |
| Complex coding tasks | MiMo-V2.5-Pro | Designed for software engineering, 1M context |
| Commercial applications | Either | MIT license, no extra authorization |
| Developer testing | MiMo Orbit free quota | Zero-cost model evaluation |

MiMo-V2.5’s significance extends beyond parameter counts: Xiaomi enters the open-source LLM competition as a hardware manufacturer. Combined with Xiaomi’s hardware ecosystem of phones, cars, and IoT devices, MiMo has unique edge-cloud synergy potential.