Goldman Sachs projects a 24-fold surge in AI token use

A new report from Goldman Sachs argues the artificial intelligence industry is approaching a critical inflection point where falling compute costs will unlock a 24-fold surge in profitable demand for AI-generated tokens by 2030, providing a powerful counter-narrative to fears over unsustainable capital spending by tech giants.

"The AI industry is moving from a phase of uncertain inference economics that could dilute margins to a new stage where incremental token growth is accretive," Goldman Sachs analysts wrote in the May 5 report. The bank suggests this profitability拐点 (guǎidiǎn, inflection point) could arrive within the next three to 12 months.

The core of the argument rests on a "scissors" chart showing the divergence between the cost to produce AI and the price charged for it. While the pricing for mainstream large models has stabilized after steep declines, the underlying cost to compute each token—powered by chips from Nvidia, AMD, and Google—continues to fall by 60 to 70 percent annually, according to Goldman's analysis. This widening gap creates a durable profit margin for providers like Amazon's AWS and Google Cloud.
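The compounding behind the "scissors" effect can be sketched in a few lines of Python. This is a rough illustration, not Goldman's model. It assumes prices stay exactly flat and that a provider starts at break-even, with price and cost both normalized to 1.0. Neither assumption comes from the report; only the 60 to 70 percent annual cost decline is taken from the analysis above.

```python
def unit_cost_after(years: int, annual_decline: float, start_cost: float = 1.0) -> float:
    """Cost per token after compounding a fixed annual decline for `years` years."""
    return start_cost * (1.0 - annual_decline) ** years


def gross_margin(price: float, cost: float) -> float:
    """Gross margin expressed as a fraction of price."""
    return (price - cost) / price


# Assumption (not from the report): price is held flat at 1.0 and the
# provider starts at break-even, so cost also starts at 1.0.
PRICE = 1.0

for decline in (0.60, 0.70):  # the range cited in the article
    print(f"annual cost decline: {decline:.0%}")
    for year in range(4):
        cost = unit_cost_after(year, decline)
        print(f"  year {year}: cost={cost:.3f}, margin={gross_margin(PRICE, cost):.1%}")
```

Under these simplified assumptions, a single year of a 60 percent cost decline takes gross margin from zero to 60 percent, which is why the gap widens so quickly once prices stop falling.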

This analysis reframes the debate around the massive infrastructure spending from hyperscalers. While firms like Microsoft and Meta are spending over 100 percent of their operating cash flow on AI capital expenditures, Goldman's report argues that the coming wave of profitable token consumption makes these investments economically sustainable, directly challenging the bear case that enterprise AI has yet to show a return on investment.

The AI Agent Economy

The engine for this growth is what Goldman calls the "AI Agent Economy," in which autonomous software agents drive a massive increase in compute usage. The bank estimates that, as these agents become integrated into business workflows, global token consumption will reach 24 times current levels by 2030 and 55 times by 2040.
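To put the 24-fold and 55-fold multiples in annual terms, the short Python sketch below converts a total multiple into a constant yearly growth rate. It is an illustration only. The article does not state the baseline year for "current levels," so the number of years to 2030 is an assumption and several horizons are shown. The sketch also assumes both multiples are measured against the same baseline.

```python
def implied_annual_multiple(total_multiple: float, years: float) -> float:
    """Constant yearly growth multiple that compounds to `total_multiple` over `years`."""
    return total_multiple ** (1.0 / years)


# Assumption: the horizon to 2030 depends on the baseline year, which the
# article does not give, so a few candidate horizons are printed.
for years in (4, 5, 6):
    rate = implied_annual_multiple(24, years) - 1.0
    print(f"24x over {years} years -> {rate:.0%} per year")

# Assumption: both multiples share one baseline, so growth from 2030 to 2040
# is 55 / 24 spread over 10 years.
later_rate = implied_annual_multiple(55 / 24, 10) - 1.0
print(f"2030 to 2040 -> {later_rate:.1%} per year")
```

Whatever baseline year is chosen, the comparison shows that most of the projected growth is front-loaded into the period before 2030.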

Enterprise agents are the most significant factor, projected to account for over 70 percent of all token usage by 2040. Unlike simple chatbots, these agents perform complex, multi-step tasks that are far more token-intensive. Goldman's model shows a programming agent could consume 7 million tokens per day, while a data entry agent might use 25 million. At current API prices, the cost of these agents remains well below the cost of human labor for the same tasks, creating a clear economic incentive for adoption.
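The cost comparison with human labor comes down to simple arithmetic, sketched here in Python. The daily token volumes are the figures quoted above. The API price and the daily labor cost are hypothetical placeholders chosen only to make the example run; neither appears in the report, and readers should substitute their own numbers.

```python
TOKENS_PER_MILLION = 1_000_000


def daily_agent_cost(tokens_per_day: float, usd_per_million_tokens: float) -> float:
    """Daily API spend for an agent at a given blended price per million tokens."""
    return tokens_per_day / TOKENS_PER_MILLION * usd_per_million_tokens


def breakeven_price(tokens_per_day: float, daily_labor_cost_usd: float) -> float:
    """Price per million tokens at which the agent costs as much as the human."""
    return daily_labor_cost_usd / (tokens_per_day / TOKENS_PER_MILLION)


# Token volumes from the article.
AGENT_TOKENS_PER_DAY = {"programming": 7_000_000, "data entry": 25_000_000}

# Hypothetical placeholders, NOT from the report.
HYPOTHETICAL_USD_PER_MILLION = 5.0
HYPOTHETICAL_DAILY_LABOR_COST = 200.0

for role, tokens in AGENT_TOKENS_PER_DAY.items():
    cost = daily_agent_cost(tokens, HYPOTHETICAL_USD_PER_MILLION)
    ceiling = breakeven_price(tokens, HYPOTHETICAL_DAILY_LABOR_COST)
    print(f"{role}: ${cost:.2f}/day, break-even at ${ceiling:.2f} per million tokens")
```

The break-even price is the more useful output: it shows how high token prices could go before the agent stops being cheaper than the assumed labor cost.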

Consumer-facing agents are also expected to drive a 12-fold increase in token use by 2030. The key shift occurs when agents move from on-demand tasks to "always-on" background monitoring of emails, calendars, and other data streams. A simple chatbot query might use 1,000 tokens, but a persistent assistant could exceed 100,000 tokens daily.

Investment Implications

The report's primary conclusion is that improving profit margins will sustain the high levels of infrastructure investment from hyperscalers. Goldman reiterated its positive view on Amazon, citing AWS's revenue re-acceleration and $364 billion backlog, and on Google, noting its cloud division's 63 percent growth and $460 billion backlog.

For the broader market, this thesis provides a justification for the high valuations of companies enabling the AI buildout. If the unit cost of intelligence continues to fall, the total addressable market for compute is likely to grow faster than the cost per unit declines—a pattern seen in previous technology shifts like cloud computing and mobile data. This supports a long-term bullish outlook for semiconductor firms like Nvidia and the cloud platforms that deploy their chips.
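The condition for the market to grow while unit costs fall can be made concrete with a small Python sketch. It treats total compute spend as volume multiplied by unit cost, which is a simplifying assumption rather than the report's methodology, and uses the 60 to 70 percent annual cost decline cited earlier in the article.

```python
def breakeven_volume_multiple(annual_cost_decline: float) -> float:
    """Yearly volume multiple needed to keep total compute spend flat."""
    return 1.0 / (1.0 - annual_cost_decline)


def spend_multiple(volume_multiple: float, annual_cost_decline: float) -> float:
    """Year-over-year change in total spend: volume change times unit-cost change."""
    return volume_multiple * (1.0 - annual_cost_decline)


for decline in (0.60, 0.70):  # range cited in the article
    needed = breakeven_volume_multiple(decline)
    print(f"cost decline {decline:.0%}: volume must grow more than {needed:.2f}x per year")
```

In this simplified framing, total spend expands only in years when token volume grows by more than the break-even multiple.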

The report suggests that investors should shift their focus from questioning the cost of AI to analyzing the new business models that emerge as the cost of intelligence approaches zero.

This article is for informational purposes only and does not constitute investment advice.