Xiaomi's new AI model uses up to 60% fewer tokens than GPT-5.4

Xiaomi has released a pair of open-source AI models that challenge the token efficiency of leading systems from OpenAI and Google, signaling a major shift in the economics of agentic AI.

Chinese technology firm Xiaomi released two open-source AI models, MiMo-V2.5 and V2.5-Pro, that consume up to 60% fewer tokens than models from OpenAI and Google for complex agentic tasks. The flagship MiMo-V2.5-Pro model, released under a permissive MIT license, aims to lower the cost of AI development as the industry shifts toward usage-based pricing.

"A model's value isn't measured by rankings alone — it's measured by the problems it solves," Fuli Luo, the project lead at Xiaomi MiMo and a former member of the DeepSeek team, said on the social media platform X.

According to benchmarks published by Xiaomi, MiMo-V2.5-Pro achieves a 63.8% success rate on the ClawEval benchmark while consuming only about 70,000 tokens, which is 40-60% fewer than Anthropic's Claude Opus 4.6, Google's Gemini 3.1 Pro, and OpenAI's GPT-5.4 require for similar results. The model is available via API at a competitive $1.00 per million input tokens.
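To put the efficiency claim in concrete terms, the short calculation below works backward from Xiaomi's figures to the token counts they imply for the competing models. Only the 70,000-token figure and the 40-60% range come from the reported benchmarks; the derived numbers are arithmetic, not published measurements, and the function name is purely illustrative.

```python
# Back-of-the-envelope: what "40-60% fewer tokens" implies for other models.
# MIMO_TOKENS and the 40-60% range are taken from the article; the results
# are derived estimates, not measured values.

MIMO_TOKENS = 70_000  # approximate ClawEval token usage reported by Xiaomi


def implied_baseline(tokens_used: int, fraction_saved: float) -> float:
    """Tokens a competitor would use if `tokens_used` is `fraction_saved` fewer."""
    if not 0 <= fraction_saved < 1:
        raise ValueError("fraction_saved must be in [0, 1)")
    return tokens_used / (1 - fraction_saved)


low = implied_baseline(MIMO_TOKENS, 0.40)
high = implied_baseline(MIMO_TOKENS, 0.60)
print(f"Implied competitor usage: {low:,.0f} to {high:,.0f} tokens")
```

Under those assumptions, the competing models would be using roughly 117,000 to 175,000 tokens for comparable results.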

The launch directly challenges the economic model of closed-source AI leaders like OpenAI, Anthropic, and Google by offering a lower-cost, open-source alternative. As services like GitHub Copilot move to metered billing, MiMo's high efficiency and permissive license could attract developers and enterprises looking to avoid an escalating "SaaS tax" on AI workflows.

Efficiency Through Specialization

At the core of the MiMo series is a Sparse Mixture-of-Experts (MoE) architecture. The 1.02 trillion-parameter MiMo-V2.5-Pro activates only 42 billion of its parameters, about 4% of the total, for any given token, a design that significantly reduces computational cost. The Pro model pairs this with high performance on agentic "claw" tasks, in which an AI agent completes complex workflows on a user's behalf, while consuming substantially fewer tokens than its peers. On the GDPVal-AA benchmark, the Pro model scored 1581, outperforming models such as Zhipu's GLM 5.1 and Moonshot's Kimi K2.6.
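For readers unfamiliar with sparse MoE, the sketch below shows the general idea of top-k expert routing in plain Python. It is not Xiaomi's implementation: the expert count, the value of k, the toy "experts," and the random router scores are all hypothetical and chosen only for illustration. The one figure tied to the article is the active-parameter fraction computed from the 42 billion and 1.02 trillion numbers.

```python
import math
import random

# Figures from the article: total vs. active parameters in MiMo-V2.5-Pro.
TOTAL_PARAMS = 1.02e12
ACTIVE_PARAMS = 42e9
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Select the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_forward(x, experts, router_scores, k):
    """Weighted sum of outputs from the selected experts only."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Hypothetical toy setup (not MiMo's real configuration): 8 experts, 2 active.
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]
rng = random.Random(0)
scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

selected = route_top_k(scores, TOP_K)
output = moe_forward(1.0, experts, scores, TOP_K)
print(f"Active fraction (from article figures): {active_fraction:.1%}")
print(f"Toy router ran experts {sorted(selected)} out of {NUM_EXPERTS}")
```

The point of the design is that the unselected experts are never evaluated, so compute per token scales with the active parameters rather than the full parameter count.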

Xiaomi demonstrated the model's power by having it autonomously complete several complex projects. V2.5-Pro implemented a complete compiler in the Rust programming language in 4.3 hours over 672 tool calls and produced an 8,192-line video editor application in 11.5 hours.

An Open-Source Challenge to the 'SaaS Tax'

By releasing the models under the permissive MIT License, Xiaomi allows any developer or enterprise to use, modify, and deploy them commercially with few restrictions beyond preserving the license notice. This move is a direct challenge to the "walled garden" approach of many top AI labs and arrives as the industry's pricing models are changing. With GitHub Copilot and other services moving from flat-rate subscriptions to usage-based billing, the cost of running powerful AI agents is becoming a significant concern for enterprises.

Xiaomi's API pricing further undercuts the market. The MiMo-V2.5-Pro costs $1.00 per million input tokens and $3.00 per million output tokens, compared to $2.00 and $12.00 for Google's Gemini 3 Pro or $5.00 and $30.00 for OpenAI's GPT-5.5. To accelerate adoption, the company also announced a free grant of 100 trillion tokens for developers.
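As a rough illustration of what those list prices mean per task, the sketch below prices a single agentic run on each model. The per-million-token prices are the ones quoted above; the 60,000-input, 10,000-output token split is a hypothetical workload, and it deliberately holds token counts equal across models, which understates any gap if the other models also consume more tokens.

```python
# Per-million-token list prices quoted in the article (USD).
PRICES = {
    "MiMo-V2.5-Pro": {"input": 1.00, "output": 3.00},
    "Gemini 3 Pro": {"input": 2.00, "output": 12.00},
    "GPT-5.5": {"input": 5.00, "output": 30.00},
}

# Hypothetical workload; the article does not give an input/output split.
INPUT_TOKENS = 60_000
OUTPUT_TOKENS = 10_000


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one run at the quoted per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


costs = {name: task_cost(name, INPUT_TOKENS, OUTPUT_TOKENS) for name in PRICES}
for name, cost in costs.items():
    print(f"{name}: ${cost:.2f} per run")
```

With this assumed workload, a run costs about $0.09 on MiMo-V2.5-Pro, $0.24 on Gemini 3 Pro, and $0.60 on GPT-5.5.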

The launch reflects a broader trend of highly capable open-source models from Chinese firms, including Alibaba's Qwen series and Zhipu AI's GLM, gaining ground on Western competitors. TIME recently named ByteDance, Zhipu AI, and Alibaba to its list of the 10 most influential AI companies of 2026, underscoring the growing influence of Chinese technology in the global AI landscape. For US enterprises, running these open-source models on private servers offers a path to leverage low-cost, high-performance AI while mitigating data residency and compliance risks.

This article is for informational purposes only and does not constitute investment advice.