Search Agent Sky
Query
Why did Xiaomi MiMo-v2.5 price drop 99%? — start with https://platform.xiaomimimo.com/docs/en-US/welcome and browse beyond it.
Xiaomi's MiMo-V2.5 price drop of up to 99% is primarily driven by **significant technical optimizations in inference efficiency** and **strategic market positioning** rather than a simple discount. Here are the key reasons:

## Technical Optimizations

- **SGLang HiCache Integration**: Reduced data movement across the cache hierarchy (GPU VRAM, CPU RAM, SSD) to **one-seventh of previous levels** while increasing cacheable token capacity **fivefold**.
- **Sliding Window Attention (SWA)**: Interleaved with Global Attention to reduce KV cache usage while preserving long-context capability (1M token window).
- **Hybrid Attention Architecture**: Combines local sliding window attention with global attention for optimal performance.
- **Multi-Token Prediction (MTP)**: 3-layer module accelerates decoding speed.

## Market & Strategic Reasons

1. **Competitive Pressure**: DeepSeek V4-Pro's permanent 75% discount forced Xiaomi to match pricing cent-for-cent ($0.0036 per million cached-input tokens for both).
2. **Developer Ecosystem Growth**: Xiaomi prioritizes developer habit formation over near-term API margins, increasing Token Plan quotas **5-8×** while resetting usage.
3. **Ecosystem Strategy**: MiMo serves as a strategic wedge across Xiaomi's hardware ecosystem (phones, vehicles, robotics) rather than a standalone profit center.
4. **Market Leadership**: The 100-trillion-token "MiMo Orbit" incentive program concluded successfully, demonstrating market demand.

## Pricing Impact

- **MiMo-V2.5-Pro**: $0.0036/million cached-input tokens, $0.435/cache misses, $0.87/output tokens
- **Base MiMo-V2.5**: $0.0028/million cached input, $0.28/output (matching DeepSeek V4-Flash economics)

## Limitations

While the technical optimizations are substantial, Western enterprise adoption faces **geopolitical barriers** including data residency concerns and cybersecurity considerations, despite the models being open-source (MIT licensed).

The price drop reflects a broader industry trend where inference is becoming a **deflationary commodity**, with value shifting to specialized agentic workflows and multi-model routing strategies.
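To make the MiMo-V2.5-Pro figures above concrete, here is a minimal sketch of how a request bill might be estimated from them. It assumes all three quoted prices are in USD per million tokens (the answer states the unit only for cached input) and that input tokens split cleanly into cache hits and cache misses. The function name `estimate_cost` and the `cache_hit_rate` parameter are hypothetical and are not part of any Xiaomi API.

```python
# Prices quoted above for MiMo-V2.5-Pro.
# Assumption: every figure is USD per one million tokens.
PRICE_PER_MILLION = {
    "cached_input": 0.0036,    # input tokens served from cache
    "uncached_input": 0.435,   # input tokens that miss the cache
    "output": 0.87,            # generated tokens
}


def estimate_cost(input_tokens, output_tokens, cache_hit_rate):
    """Estimate the USD cost of a workload (hypothetical helper).

    cache_hit_rate is the fraction of input tokens served from cache,
    between 0.0 and 1.0.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0.0 and 1.0")

    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached

    total = (
        cached * PRICE_PER_MILLION["cached_input"]
        + uncached * PRICE_PER_MILLION["uncached_input"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload (made-up numbers): 10M input tokens,
    # 1M output tokens, 90% of the input served from cache.
    print(f"${estimate_cost(10_000_000, 1_000_000, 0.9):.4f}")
```

Because the cached-input price is roughly two orders of magnitude below the cache-miss price, the cache hit rate dominates the input side of the bill in this sketch, which is consistent with the emphasis the answer places on cache capacity.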

Sources visited

1. navigate https://platform.xiaomimimo.com/docs/en-US/welcome
2. click undefined
3. ddm --text
4. click undefined
5. click undefined
6. get_text
7. navigate https://platform.xiaomimimo.com/token-plan
8. search "Xiaomi MiMo-V2.5 price drop 99% May 2026"
9. click undefined
10. click undefined
11. navigate https://www.weste.net/2026/05-27/mimo-v2.5-price-cut.html
12. ddm --text
13. search "Xiaomi MiMo V2.5 price drop reasons technical improvements inference optimization"
14. click undefined
15. click undefined
16. navigate https://www.ctol.digital/news/xiaomi-slashes-mimo-ai-prices-99-how-the-deepseek-parity-signals-structural-inference-deflation/
17. ddm --text --find technical
18. search "Xiaomi MiMo V2.5 technical improvements SGLang HiCache Sliding Window Attention"
Shared by Proto Státis · May 27, 2026
