Microsoft did not waste time making its position known. During Nvidia's GTC event this week, Microsoft CEO Satya Nadella confirmed the company had brought up an Nvidia Vera Rubin NVL72 system for validation in a lab setting, staking a claim as the first cloud provider to reach that milestone. Full deployment across Azure data centers will follow over the coming months, but the announcement itself reflects just how much weight this hardware generation carries across the entire cloud industry right now.
The Vera Rubin platform gives that urgency some context. Nvidia positions it as a significant leap beyond Blackwell, citing roughly five times the inference performance and three and a half times the training performance of the previous generation. The NVL72 rack-scale system pairs 36 Vera CPUs with 72 Rubin GPUs, operates entirely on liquid cooling, and ships with a cable-free modular tray design that cuts installation time from two hours to approximately five minutes. At the scale these providers operate, that kind of deployment efficiency compounds quickly.
Google Cloud confirmed its own Vera Rubin plans shortly after, targeting NVL72 availability in the second half of 2026 through its AI Hypercomputer infrastructure platform. AWS took a wider view, announcing it will deploy over one million Nvidia GPUs across Blackwell and Rubin architectures within the next 12 months. Neither company is treating this as a distant roadmap item.
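AWS's figure is easier to grasp in rack terms. As a rough sketch, assuming every GPU landed in an NVL72 configuration (a simplification; the article does not break out the Blackwell/Rubin mix or system types), a million-GPU fleet maps to rack counts like this:

```python
# Back-of-envelope sizing for an NVL72-based fleet.
# Assumption: all GPUs are deployed in NVL72 racks; real deployments
# will mix architectures and form factors, so treat this as illustrative.
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36

def racks_needed(total_gpus: int) -> int:
    """Racks required to host total_gpus, rounding partial racks up."""
    return -(-total_gpus // GPUS_PER_RACK)  # ceiling division

fleet = 1_000_000
racks = racks_needed(fleet)
print(racks)                     # 13889 racks
print(racks * CPUS_PER_RACK)     # 500004 co-deployed CPUs
```

Roughly 14,000 racks, each saving the better part of two hours at install time under the new tray design, is where the claimed deployment efficiency starts to matter.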
Smaller and mid-tier providers are moving with the same urgency. CoreWeave announced Rubin GPU deployment plans in January 2026 and additionally intends to offer Vera as a standalone platform. Lambda confirmed NVL72 deployments around the same time, also targeting second-half 2026 availability. Vultr plans an optimized inference stack built on the Rubin platform, designed for deployment across public, private, and sovereign cloud environments.
On the neocloud side, Nebius secured a contract with Meta worth up to $27 billion that includes one of the first large-scale Vera Rubin deployments. Nscale, operating under a separate Microsoft contract, plans a 1.35 gigawatt Vera Rubin cluster at its Monarch Compute Campus.
Vera Rubin entered full production at the start of 2026. The providers lined up behind it suggest that whatever comes next in AI infrastructure, this hardware sits near the center of it.
