0G, in collaboration with China Mobile, trained a 107 billion parameter AI model using decentralized infrastructure, marking the first time a model above 100 billion parameters has been developed without centralized data center clusters.
"Decentralized training at this scale proves that large model development no longer requires exclusive access to hyperscale GPU farms," said Michael Heinrich, co-founder of 0G Labs. "Telecom providers sitting on underutilized compute capacity can now participate in the AI supply chain."
The model was trained using 0G's decentralized training framework, which incorporates the DiLoCoX method. According to research from 0G Labs, the technique can train models up to 357 times faster than previous decentralized approaches, even over network links as slow as 1 gigabit per second (Gbps). By distributing the computational load across China Mobile's existing infrastructure rather than a single data center, the project bypassed the traditional bottleneck of centralized GPU clusters, which has confined large-scale AI development to a handful of hyperscalers.
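A back-of-envelope calculation helps show why link speed is the limiting factor at this scale. The sketch below is illustrative only and is not 0G's implementation: it assumes 16-bit weights, and the step count and synchronization interval are hypothetical values chosen for the example. It estimates how long it takes to push one full copy of a 107 billion parameter model through a slow link, and how much of that cost disappears when nodes synchronize less often.

```python
def full_sync_seconds(num_params: float, bytes_per_param: int, link_gbps: float) -> float:
    """Seconds needed to send one full copy of the weights over a link."""
    bits_to_send = num_params * bytes_per_param * 8
    return bits_to_send / (link_gbps * 1e9)


PARAMS = 107e9        # model size reported in the article
LINK_GBPS = 1.0       # assumed link speed: 1 gigabit per second
BYTES_PER_PARAM = 2   # assumption: 16-bit weights, no compression

per_sync = full_sync_seconds(PARAMS, BYTES_PER_PARAM, LINK_GBPS)


def sync_hours(total_steps: int, steps_between_syncs: int) -> float:
    """Total hours spent on communication for a given synchronization interval."""
    num_syncs = total_steps // steps_between_syncs
    return num_syncs * per_sync / 3600


if __name__ == "__main__":
    print(f"One full sync: {per_sync / 60:.1f} minutes")
    # Hypothetical run of 1,000 steps, syncing every step vs. every 100 steps.
    print(f"Sync every step:      {sync_hours(1000, 1):.1f} hours")
    print(f"Sync every 100 steps: {sync_hours(1000, 100):.1f} hours")
```

Under these assumptions a single uncompressed exchange takes roughly half an hour, so synchronizing on every training step would be impractical on such a link. Low-communication training methods generally attack this cost by exchanging data less often, sending less data per exchange, or both.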
Why decentralized training matters for enterprise AI
The achievement addresses a structural problem in the AI industry: training large models has required massive upfront capital expenditure on GPU clusters, locking out all but the wealthiest technology companies. Decentralized training inverts that approach by treating any network-connected compute as a potential training node. For telecom operators like China Mobile, which runs vast but often idle compute infrastructure across its network, this creates a new revenue stream from existing assets.
The approach also reduces dependency on Nvidia's H100 and B200 GPUs, which have faced supply constraints and export controls. By aggregating heterogeneous compute resources across a distributed network, 0G's framework can train models using a mix of hardware types rather than requiring uniform GPU clusters. This could ease pressure on the $200 billion data center GPU market, where lead times for Nvidia's latest chips have stretched beyond 12 months.
However, data readiness remains a barrier. Gartner estimates that as many as 60% of AI projects may be abandoned by 2026 because of fragmented or siloed data, a problem that decentralized training alone does not solve. Enterprises looking to adopt this approach must first unify their data infrastructure before benefiting from distributed compute.
Competitive implications for the AI infrastructure stack
The 0G-China Mobile milestone challenges the centralized training model championed by Nvidia and the major cloud providers. If decentralized training gains adoption, it could shift procurement patterns away from hyperscaler GPU-as-a-service offerings toward a more fragmented market where telecom operators and edge providers monetize spare capacity.
Bittensor and Render Network, two projects that tokenize compute resources, could see increased demand as enterprises explore decentralized alternatives. The ability to train models across distributed infrastructure also aligns with growing regulatory pressure in regions like the European Union and China, where data sovereignty requirements make centralized cross-border training difficult.
For investors, the development introduces a new variable into the AI infrastructure thesis. Nvidia's data center revenue, which reached $47.5 billion in its most recent fiscal year, has been built on the assumption that large model training requires concentrated GPU clusters. If decentralized methods prove viable at scale, the total addressable market for centralized AI compute could narrow, benefiting infrastructure providers that can aggregate distributed resources.
This article is for informational purposes only and does not constitute investment advice.