DRAMeXchange : Weekly Research : 【Market View】

【Market View】Aggressive Procurement of NVIDIA GB and Rubin Racks by North American CSPs Will Drive a 1.2x Surge in AI Inference Computing Power in 2026, Says TrendForce


Published 2026-05-20 (GMT+8)

According to the latest AI sector research by global market intelligence firm TrendForce, the top five North American cloud service providers (CSPs) will significantly increase their procurement of rack-scale AI servers in 2026 to expand the deployment of AI training and inference models. These five companies are expected to account for over 60% of global demand for NVIDIA’s GB and VR series servers this year. Moreover, TrendForce estimates that this aggressive infrastructure expansion will boost their combined AI training computing power by more than 56% YoY, while driving a massive YoY surge of approximately 122% in their combined AI inference computing power.

TrendForce projects YoY growth of over 28% for global AI server shipments in 2026. High-end AI training servers will continue to lead the market, accounting for about 55% of this year’s total shipments. Over the medium to long term, however, AI inference servers are projected to overtake training servers as CSPs rapidly roll out AI applications to commercialize related AI cloud services. Meanwhile, NVIDIA is set to expand its AI inference solutions and use cases. For instance, when promoting this year’s flagship AI servers, the GB and VR systems, NVIDIA explicitly emphasizes that these systems support AI inference workloads in addition to AI training.

TrendForce estimates that the combined capital expenditures of Google, Amazon, Microsoft, Meta, and Oracle will exceed US$770 billion in 2026, marking a YoY increase of nearly 87%. An analysis of the computing power that the top five North American CSPs will acquire through their purchases of NVIDIA GB and VR servers reveals significant growth. Based on FP16 and BF16 estimates for AI training, their combined computing power is estimated to have exceeded 9 ExaFLOPS in 2025 and is projected to grow by more than 56% in 2026.
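As a rough back-of-the-envelope check on the figures above, the following Python sketch backs out the implied 2025 capex base and projects the 2026 training computing power. It treats the thresholds quoted in the text ("exceed US$770 billion", "nearly 87%", "exceed 9 ExaFLOPS", "more than 56%") as point values, which is an assumption, and the helper names are illustrative rather than part of TrendForce's methodology.

```python
def implied_base(current: float, yoy_growth_pct: float) -> float:
    """Back out the prior-year value from a current value and its YoY growth rate."""
    return current / (1 + yoy_growth_pct / 100)


def project(base: float, yoy_growth_pct: float) -> float:
    """Project the next-year value from a base value and a YoY growth rate."""
    return base * (1 + yoy_growth_pct / 100)


# Assumption: the quoted thresholds are used as point values.
capex_2026_busd = 770.0    # combined 2026 capex, US$ billion
capex_growth_pct = 87.0    # "nearly 87%" YoY
training_2025_ef = 9.0     # FP16/BF16 training compute in 2025, ExaFLOPS
training_growth_pct = 56.0 # "more than 56%" YoY

capex_2025_busd = implied_base(capex_2026_busd, capex_growth_pct)
training_2026_ef = project(training_2025_ef, training_growth_pct)

print(f"Implied 2025 capex: about US${capex_2025_busd:.0f} billion")
print(f"Projected 2026 training compute: about {training_2026_ef:.1f} ExaFLOPS")
```

Under these assumptions, the implied 2025 capex base is roughly US$412 billion and the 2026 training figure is roughly 14 ExaFLOPS. Because the source figures are stated as thresholds, these results are approximate floors rather than precise values.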

When evaluating AI inference capabilities based on FP4/NVFP4 performance metrics, the combined computing capacity of the top five CSPs is estimated to have exceeded 37 ExaFLOPS in 2025. This figure is expected to surge by nearly 122% in 2026, growing at a significantly faster rate than AI training. This reflects NVIDIA’s specific focus on optimizing AI inference performance across its hardware and software systems, as demonstrated by the new generation of GB300 and VR200 rack-scale solutions.
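To make the growth rate concrete, the Python sketch below converts the roughly 122% YoY increase into a multiple of the 2025 level and projects the 2026 inference figure. The inputs are the rounded values quoted above, treated as point values (an assumption), and the function name is illustrative.

```python
def growth_to_multiple(yoy_growth_pct: float) -> float:
    """Convert a YoY growth percentage into a multiple of the prior-year level."""
    return 1 + yoy_growth_pct / 100


# Assumption: the quoted thresholds are used as point values.
inference_2025_ef = 37.0      # FP4/NVFP4 inference compute in 2025, ExaFLOPS
inference_growth_pct = 122.0  # "nearly 122%" YoY

multiple = growth_to_multiple(inference_growth_pct)
inference_2026_ef = inference_2025_ef * multiple
added_ef = inference_2026_ef - inference_2025_ef

print(f"2026 level as a multiple of 2025: {multiple:.2f}x")
print(f"Projected 2026 inference compute: about {inference_2026_ef:.0f} ExaFLOPS")
print(f"Capacity added in 2026: about {added_ef:.0f} ExaFLOPS")
```

Under these assumptions, a 122% increase means the 2026 level is about 2.2 times the 2025 level, or roughly 82 ExaFLOPS. The increase alone is about 1.2 times the 2025 level. The inference figures are FP4/NVFP4 estimates and the training figures are FP16/BF16 estimates, so the two sets of ExaFLOPS values are not directly comparable.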

Alongside GPU-based solutions, CSPs are steadily advancing rack products equipped with their in-house ASICs, with Google leading the charge. TrendForce projects that Google’s demand for its proprietary TPU chips will grow in volume by nearly 80% YoY in 2026, with a gradual transition from the v7 to the v8 generation in the second half of the year. Amazon ranks second only to Google in the in-house ASIC push, with TrendForce projecting that its Trainium series will account for over 40% of Amazon’s own AI server shipments in 2026.

TrendForce notes that the latest server racks featuring NVIDIA and AMD GPUs, as well as those built on CSPs’ in-house ASICs, have all adopted liquid cooling systems. Liquid cooling helps reduce the U-count (or server height), thereby enabling a single AI server rack to accommodate more accelerators. Nevertheless, as the thermal design power (TDP) of individual AI GPUs and ASICs continues to climb, the overall system power consumption of AI servers is seeing a structural increase.

TrendForce estimates that in 2023, the total power consumption of all servers operated by the top five North American CSPs grew by 2.8 GW compared to the previous year. In 2026, their total server power consumption is projected to surge by 18 GW compared to the previous year, with the YoY growth rate expected to hit a staggering 116%. This sharp rise is primarily driven by the intensifying AI arms race, which is leading to the simultaneous mass deployment of platforms such as NVIDIA’s GB300, AMD’s Helios, and various ASICs developed by CSPs.
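The text does not spell out what the 116% growth rate is measured against, so the Python sketch below works through two possible readings of the power figures. Both readings are assumptions made for illustration, not TrendForce's stated methodology, and the variable names are hypothetical.

```python
# Figures quoted in the text, treated as point values (an assumption).
increase_2026_gw = 18.0  # projected YoY increase in total server power, 2026
growth_pct = 116.0       # quoted YoY growth rate for 2026
increase_2023_gw = 2.8   # estimated YoY increase in total server power, 2023

# Reading A (assumption): 116% is the growth of TOTAL server power consumption.
# Then increase = base * 1.16, so the 2025 base is increase / 1.16.
base_2025_gw = increase_2026_gw / (growth_pct / 100)
total_2026_gw = base_2025_gw + increase_2026_gw

# Reading B (assumption): 116% is the growth of the ANNUAL INCREMENT itself.
# Then the 2026 increment is 2.16 times the 2025 increment.
increment_2025_gw = increase_2026_gw / (1 + growth_pct / 100)

# Independent of either reading: the 2026 increase relative to the 2023 increase.
increment_ratio = increase_2026_gw / increase_2023_gw

print(f"Reading A: 2025 base of about {base_2025_gw:.1f} GW, 2026 total of about {total_2026_gw:.1f} GW")
print(f"Reading B: 2025 increment of about {increment_2025_gw:.1f} GW")
print(f"2026 increase is about {increment_ratio:.1f}x the 2023 increase")
```

Under Reading A, the five CSPs' total server power would rise from roughly 15.5 GW to roughly 33.5 GW. Under Reading B, the 2025 increase would have been roughly 8.3 GW. In either case, the projected 2026 increase is about 6.4 times the 2.8 GW added in 2023.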


About DRAMeXchange

DRAMeXchange is a leading global provider of market intelligence, in-depth analysis reports, and advisory services for the DRAM and flash memory industry. Its coverage includes current business conditions, spot trading prices, market trends, capital spending and wafer capacity trends, the impact of DRAM and flash memory products on the market, and other relevant PC industry information.

© DRAMeXchange ® Tech.Inc. All rights reserved.