DeepSeek, Xiaomi Slash AI Prices 99% as US Labs Raise Costs
DeepSeek and Xiaomi have cut AI API prices by as much as 99%, citing architectural efficiency gains, while OpenAI and Anthropic have moved toward higher or less visible costs. The divergence is reshaping AI economics.
Quick Take
- DeepSeek locks in 75% V4-Pro discount, pricing output at $0.87/M tokens.
- Xiaomi slashes MiMo-V2.5 cached input to $0.0036/M tokens.
- OpenAI's GPT-5.5 doubles output price to $30/M tokens.
- New tokenizer inflates Anthropic's Claude Opus 4.7 costs by up to 35%.
Market Impact Analysis
Neutral: The article focuses on AI API pricing, with no direct crypto market implications.
Key Takeaways
- DeepSeek locked in a permanent 75% discount on V4-Pro, pricing output at $0.87 per million tokens.
- Xiaomi slashed MiMo-V2.5 cached input pricing to $0.0036 per million tokens, a 99% cut.
- OpenAI doubled GPT-5.5 output costs to $30 per million tokens at launch, moving in the opposite direction from Chinese labs.
- Anthropic's updated tokenizer quietly inflates Claude Opus 4.7's actual costs by up to 35%.
What Happened
DeepSeek and Xiaomi just reset AI economics. On May 22, DeepSeek made its 75% V4-Pro discount permanent, locking output pricing at $0.87 per million tokens. Four days later, Xiaomi cut MiMo-V2.5 prices by up to 99% for cached inputs, bringing that rate to $0.0036 per million tokens for the Pro model. These cuts arrive as US labs move in the opposite direction. OpenAI's GPT-5.5 launched with output prices doubled to $30 per million tokens. Anthropic's Claude Opus 4.7 shipped with an updated tokenizer that, without warning, inflates actual costs by as much as 35%. The gap between what Chinese and American AI models cost developers has never been wider.
The Numbers
DeepSeek V4-Pro now charges $0.87 per million output tokens, down from $3.48 before the discount. Xiaomi's MiMo-V2.5 cached input rate is $0.0036 per million tokens, a 99% drop. For perspective, Xiaomi's $100 Max plan now buys 82 billion tokens, up from 1.6 billion. OpenAI's GPT-5.5 costs $30 per million output tokens, double its predecessor's price. Anthropic's tokenizer change effectively adds up to 35% to bills, because the same text now consumes more tokens. KV cache optimizations drove the Chinese cuts, reducing compute costs by around 80%, according to Xiaomi's technical lead.
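The figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the prices are the ones quoted in this article, the helper names are hypothetical, and the one-million-token workload is an assumed example rather than a reported number.

```python
# Illustrative arithmetic only. Prices are the per-million-token figures
# quoted in the article; the helper names are hypothetical.

def discounted_price(list_price: float, discount: float) -> float:
    """Price after a fractional discount (0.75 means 75% off)."""
    return list_price * (1.0 - discount)

def tokens_after_inflation(tokens: int, inflation: float) -> float:
    """Token count for the same text when a tokenizer emits more tokens."""
    return tokens * (1.0 + inflation)

def cost_usd(tokens: float, price_per_million: float) -> float:
    """Cost of a workload at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million

# DeepSeek V4-Pro: 75% off the $3.48 pre-discount output price.
deepseek_output = discounted_price(3.48, 0.75)

# GPT-5.5 output price relative to the discounted V4-Pro price.
gpt55_output = 30.00
output_price_ratio = gpt55_output / deepseek_output

# Xiaomi $100 Max plan: tokens bought now versus before.
xiaomi_plan_multiplier = 82e9 / 1.6e9

# Assumed example: text that used to be 1M tokens, at the 35% upper bound.
inflated_tokens = tokens_after_inflation(1_000_000, 0.35)

print(f"V4-Pro output price: ${deepseek_output:.2f} per million tokens")
print(f"GPT-5.5 / V4-Pro output price ratio: {output_price_ratio:.1f}x")
print(f"Max plan token multiplier: {xiaomi_plan_multiplier:.2f}x")
print(f"Tokens after 35% inflation: {inflated_tokens:,.0f}")
```

On these quoted prices, GPT-5.5 output costs about 34 times as much per token as discounted V4-Pro output. The tokenizer effect is separate from list price: the per-token rate stays the same, but the bill grows because the token count does.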
Why It Happened
Chinese labs achieved these cuts through architectural efficiency rather than subsidized loss-leader pricing. Xiaomi's MiMo team implemented hierarchical KV cache optimization for sliding window attention. This lets the model store and reuse five times more previously processed data, cutting redundant compute. The result: an 80% reduction in storage and processing costs. DeepSeek's V4-Pro benefits from similar advances in attention architecture. Meanwhile, US labs rely on brute-force scaling and premium positioning. Anthropic's tokenizer inflation suggests a search for revenue in a market where unit economics are under pressure.
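To make the mechanism concrete, here is a toy sketch of a sliding-window KV cache. It is not Xiaomi's hierarchical scheme, which this article does not describe in detail. The class and function names are hypothetical, and the last helper rests on a simplifying assumption: that cost per unit of cached state scales inversely with how much reusable state fits on the same hardware.

```python
from collections import deque

class SlidingWindowKVCache:
    """Toy cache that keeps only the most recent `window` key/value pairs."""

    def __init__(self, window: int):
        self.window = window
        # deque(maxlen=...) silently drops the oldest entry once full.
        self.keys = deque(maxlen=window)
        self.values = deque(maxlen=window)

    def append(self, key, value) -> None:
        self.keys.append(key)
        self.values.append(value)

    def __len__(self) -> int:
        return len(self.keys)

def cache_entries(seq_len: int, window=None) -> int:
    """Entries held after seq_len tokens: all of them under full attention,
    at most `window` under sliding window attention."""
    return seq_len if window is None else min(seq_len, window)

def cost_reduction_from_capacity_gain(gain: float) -> float:
    """Assumption: holding `gain` times more reusable state on the same
    hardware lowers the cost per unit of cached state to 1/gain."""
    return 1.0 - 1.0 / gain

# Usage: a window of 4 positions over a 10-token sequence.
cache = SlidingWindowKVCache(window=4)
for t in range(10):
    cache.append(f"k{t}", f"v{t}")

print(len(cache), list(cache.keys))
print(cache_entries(10), cache_entries(10, window=4))
print(f"{cost_reduction_from_capacity_gain(5.0):.0%}")
```

Under that assumption, a fivefold capacity gain works out to an 80% cost reduction, which lines up with the two figures quoted above. The article does not confirm that the 80% figure was derived this way.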
Broader Impact
The pricing divergence could accelerate AI adoption in cost-sensitive markets and force a pricing war. Efficient architectures may become standard as developers flock to cheaper options. China's AI labs are now competing on cost, not just capability, challenging the dominance of US models. This shift mirrors historical tech cycles where commoditization follows innovation. Watch for downstream effects on AI startups that built their unit economics on pricier US APIs.
What to Watch Next
- Whether OpenAI and Anthropic respond with price cuts or double down on premium features to justify costs.
- Adoption metrics for Chinese models among Western developers—will performance gaps matter less than savings?
- Xiaomi's capacity: the team says servers are running near full load at reduced prices; can supply meet demand?
This article is for informational purposes only and does not constitute financial advice.
Disclaimer: Bytewit is an independent media outlet that delivers news, research, and data.
© 2026 Bytewit. All Rights Reserved.