Tencent Holdings Ltd. released its most capable large language model yet, with benchmark scores showing a roughly 40% generational improvement on coding tasks that puts the model in direct competition with rivals from Anthropic and Google at a fraction of the cost.
"The model was built to balance three things: capability breadth, honest evaluation, and cost-efficiency," Tencent said in a statement accompanying the release. The company open-sourced the model weights and is offering API access on its cloud platform.
The new model, Hy3 preview, is a 295 billion-parameter Mixture-of-Experts (MoE) system that keeps only 21 billion parameters active during inference. On SWE-bench Verified, a coding test that evaluates a model's ability to fix real-world bugs from GitHub, Hy3 scored 74.4%, a dramatic jump from its predecessor's 53.0%. That result places it just behind open-weight rivals GLM-5 (77.8%) and Kimi-K2.5 (76.8%) and within striking distance of Anthropic's Claude Opus 4.6 (80.8%).
The release marks a strategic pivot for Tencent toward commercially viable AI, with the model’s pricing and architecture designed for large-scale deployment. Citigroup analysts, who maintained their Buy rating and HK$783 price target on Tencent, called the model's focus on balancing quality, speed, and cost the "correct strategic direction" for enterprise adoption. The pricing, at approximately $0.18 per million input tokens, is roughly 90% cheaper than comparable GPT-4-class models.
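As a back-of-envelope illustration of the pricing gap, the figures above imply roughly a 10x difference per input token. The sketch below uses the article's $0.18-per-million figure; the GPT-4-class price is inferred from the "roughly 90% cheaper" claim, and the workload volume is hypothetical.

```python
# Rough cost comparison based on the article's pricing figures.
# The GPT-4-class price and the monthly token volume are assumptions
# for illustration, not reported numbers.

HY3_PRICE_PER_M = 0.18          # USD per million input tokens (from article)
GPT4_CLASS_PRICE_PER_M = 1.80   # implied by "roughly 90% cheaper" (assumption)

monthly_tokens_m = 5_000        # hypothetical workload: 5B input tokens/month

hy3_cost = HY3_PRICE_PER_M * monthly_tokens_m
gpt4_cost = GPT4_CLASS_PRICE_PER_M * monthly_tokens_m
saving = 1 - hy3_cost / gpt4_cost
print(f"Hy3: ${hy3_cost:,.0f}/mo vs GPT-4-class: ${gpt4_cost:,.0f}/mo "
      f"({saving:.0%} saving)")
```

At that spread, a deployment consuming 5 billion input tokens a month would pay about $900 instead of $9,000, which is the kind of gap that matters for the mass-deployment scenarios discussed below.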
A Focus on Commercial Viability
Tencent is explicitly targeting the enterprise market by co-designing the model and its inference framework so that capability gains do not price the model out of mass deployment. The MoE architecture, which routes each query to a small set of specialized sub-networks rather than activating the full model, is key to this strategy and significantly lowers compute cost per query. The company noted that its previous flagship model had more than 400 billion parameters, a count it deliberately scaled down to find an optimal balance between reasoning maturity and cost.
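The routing idea behind that cost saving can be sketched in a few lines. This is a generic top-k MoE layer in NumPy, not Tencent's implementation; the expert count, hidden size, and k are illustrative stand-ins.

```python
import numpy as np

# Minimal sketch of top-k Mixture-of-Experts routing. All sizes are
# illustrative; Hy3's real expert count and dimensions are not public
# in this article.
rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total expert sub-networks
TOP_K = 2         # experts activated per token
D_MODEL = 16      # hidden dimension

# Router: a linear layer that scores every expert for a given token.
router_w = rng.standard_normal((D_MODEL, NUM_EXPERTS))

# Each "expert" is just a small linear map here, standing in for a
# full feed-forward sub-network.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]

def moe_forward(x):
    """Route one token vector to its top-k experts and combine their
    outputs, weighted by a softmax over the selected router scores."""
    logits = x @ router_w                      # shape: (NUM_EXPERTS,)
    top = np.argsort(logits)[-TOP_K:]          # indices of the top-k experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                   # softmax over the chosen k
    # Only TOP_K of NUM_EXPERTS experts actually run, which is why the
    # "active" parameter count is a small fraction of the total.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_forward(token)
print(out.shape)
```

The fraction of parameters touched per token scales with TOP_K / NUM_EXPERTS, which is the same mechanism that lets a 295B-parameter model run inference with only 21B parameters active.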
The model is already integrated into more than ten of Tencent’s own products, including Yuanbao, QQ, and Tencent Docs. Within internal applications like CodeBuddy and WorkBuddy, the company reported first-token latency dropped 54% and end-to-end generation time fell 47%, demonstrating the model's stability in production environments for complex agent workflows.
Infrastructure Overhaul Enables Speed
The Hy3 preview model went from a cold start to an open-source release in under three months, a timeline Tencent attributes to a complete overhaul of its pre-training and reinforcement learning stack in February. Led by Chief AI Scientist Yao Shunyu, the rebuild was guided by a principle of integrating the model development loop directly with product teams to shape training priorities with live metrics.
This tight integration of model and product gives Tencent a data flywheel that few competitors can match, allowing the company to convert real-world user interaction into rapid model improvements. While Hy3 still trails the absolute frontier models from OpenAI and Google DeepMind on some benchmarks, its performance-per-cost ratio makes it a formidable new entrant in the AI infrastructure race.
This article is for informational purposes only and does not constitute investment advice.