- Google Cloud signed a multi-billion-dollar AI infrastructure deal with Mira Murati’s Thinking Machines Lab, the startup’s first cloud partnership.
- The AI Hypercomputer with Nvidia GB300 NVL72 GPUs delivers 2X faster training and serving speeds compared to prior-generation hardware.
- TML’s Tinker fine-tuning platform now runs on Google Kubernetes Engine with near-instant weight transfers via the Jupiter network.
Google Cloud landed its first cloud infrastructure customer among the new generation of ex-OpenAI startups, inking a multi-billion-dollar agreement with Thinking Machines Lab. The deal, unveiled at Google Cloud Next in Las Vegas on April 22, 2026, gives Thinking Machines access to Google's AI Hypercomputer stack, built around Nvidia GB300 NVL72 rack-scale systems on the Blackwell Ultra architecture, with training and serving speeds roughly twice those of prior-generation hardware and integrated services from GKE to Cloud Storage.
Thinking Machines Lab, founded in February 2025 by former OpenAI Chief Technology Officer Mira Murati and valued at $12 billion following a $2 billion seed round, had previously partnered with Nvidia directly. This cloud agreement with Google marks the startup's first formal infrastructure provider relationship, placing it alongside Anthropic's dual Google/AWS capacity deals in the intensifying AI compute arms race.
The technical integration spans Google’s full AI stack: Thinking Machines runs reinforcement learning workloads on A4X Max VMs, uses Spanner for transactional metadata, accesses petabyte-scale Cloud Storage buckets, and orchestrates via Google Kubernetes Engine. Cluster Director provides automated remediation when preemptible instances terminate mid-training—a critical capability for cost-efficient frontier model development.
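Cluster Director handles the rescheduling, but a training job still has to checkpoint its own state if work is to survive a preemption. A minimal sketch of that pattern, assuming a hypothetical Cloud Storage bucket and a generic PyTorch loop (illustrative only, not Thinking Machines' actual code):

```python
# Illustrative only: periodic checkpointing to Cloud Storage so a training
# job can resume after a preemptible VM is reclaimed. Bucket and object
# names are hypothetical placeholders.
import signal

import torch
from google.cloud import storage

CHECKPOINT_BUCKET = "example-training-checkpoints"  # hypothetical bucket
CHECKPOINT_BLOB = "run-001/latest.pt"               # hypothetical object path

_preempted = False


def _handle_sigterm(signum, frame):
    # Preemptible/Spot VMs receive a termination signal shortly before shutdown.
    global _preempted
    _preempted = True


signal.signal(signal.SIGTERM, _handle_sigterm)


def save_checkpoint(model, optimizer, step):
    """Serialize training state locally, then upload it to the checkpoint bucket."""
    local_path = "/tmp/latest.pt"
    torch.save(
        {"step": step, "model": model.state_dict(), "optim": optimizer.state_dict()},
        local_path,
    )
    storage.Client().bucket(CHECKPOINT_BUCKET).blob(CHECKPOINT_BLOB).upload_from_filename(local_path)


def train(model, optimizer, data_loader, start_step=0, save_every=500):
    for step, batch in enumerate(data_loader, start=start_step):
        loss = model(batch).mean()  # placeholder forward pass and loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        if step % save_every == 0 or _preempted:
            save_checkpoint(model, optimizer, step)
            if _preempted:
                break  # exit cleanly; the orchestrator reschedules and the job resumes
```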
Why Tinker Benefits From Google’s Integrated Stack
Tinker, Thinking Machines' fine-tuning and frontier model creation platform launched in October 2025, depends on rapid iteration cycles and reliable checkpoints. Google's interconnected services reduce operational overhead: the Jupiter network enables near-instantaneous weight transfers between GPU nodes, while Anywhere Cache keeps frequently accessed model shards on SSD in the same zone as the compute that reads them. "This seamless integration of high-performance compute, fast storage, GKE orchestration, and automated remediation via Cluster Director has allowed us to focus on the unique aspects of the stack like Tinker and reinforcement learning," said Myle Ott, Founding Researcher at Thinking Machines Lab.
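Resuming after a restart is the mirror image of the save sketch above. With Anywhere Cache enabled on a bucket, repeated reads of the same shards are served from zonal SSD cache with no change to client code; the sketch below reuses the hypothetical checkpoint layout and is likewise illustrative rather than TML's actual implementation:

```python
# Illustrative resume path: pull the latest checkpoint from Cloud Storage
# and restore model/optimizer state. Anywhere Cache, when enabled on the
# bucket, accelerates these reads transparently.
import torch
from google.cloud import storage


def load_latest_checkpoint(model, optimizer,
                           bucket_name="example-training-checkpoints",  # hypothetical
                           blob_name="run-001/latest.pt"):              # hypothetical
    """Download the most recent checkpoint, if any, and return the next step to run."""
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    if not blob.exists():
        return 0  # fresh run: no checkpoint yet
    local_path = "/tmp/latest.pt"
    blob.download_to_filename(local_path)
    state = torch.load(local_path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optim"])
    return state["step"] + 1  # resume from the step after the saved one
```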
The 2X performance gain over prior-generation Nvidia hardware stems from Blackwell's fifth-generation NVLink, which doubles per-GPU interconnect bandwidth across the rack-scale NVLink domain. For reinforcement learning workloads that continuously generate and validate new training data, the kind TML runs to push frontier capabilities, the faster inter-GPU transfers translate directly into shorter iteration cycles.
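A rough back-of-envelope shows why the bandwidth matters. Assuming a 70B-parameter model in bf16 and per-GPU NVLink bandwidth of roughly 900 GB/s for the prior generation versus 1.8 TB/s for Blackwell-class hardware (approximate figures used only for illustration), the time to move one full copy of the weights halves:

```python
# Back-of-envelope only: time to move one bf16 copy of a 70B-parameter
# model's weights at prior-generation vs. doubled per-GPU NVLink bandwidth.
# Real transfer times depend on topology, collectives, and overlap with compute.
params = 70e9            # 70B parameters (illustrative model size)
bytes_per_param = 2      # bf16
payload_gb = params * bytes_per_param / 1e9  # ~140 GB

for label, gb_per_s in [("prior generation", 900), ("GB300-class NVLink", 1800)]:
    ms = payload_gb / gb_per_s * 1000
    print(f"{label}: {payload_gb:.0f} GB at {gb_per_s} GB/s ≈ {ms:.0f} ms per full-weight transfer")
```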
Non-exclusivity defines the deal’s shape. Thinking Machines retains the right to provision capacity from other clouds alongside Google, a standard term for well-capitalized AI startups that want to avoid dependency lock-in. Anthropic pursued a similar dual-route strategy, simultaneously signing capacity agreements with Google for TPU v5e pods and Amazon Web Services for up to 5 gigawatts of compute.
The Cloud AI Infrastructure Battlefield Heats Up
Google Cloud is aggressively bundling infrastructure to make switching costs prohibitive. A Thinking Machines engineer can spin up a Kubernetes cluster with pre-installed PyTorch containers connected to Spanner and Cloud Storage in under ten minutes—no manual configuration required. The sales pitch stresses reliability and throughput over raw price; spokespeople note the AI Hypercomputer achieved 98.7% uptime across Q1 2026 despite preemptible VM churn.
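The under-ten-minutes claim comes down to how little glue code sits between a provisioned cluster and a running job. As a rough illustration (not Google's or TML's actual tooling), submitting a PyTorch training Job to an existing GKE cluster with the standard Kubernetes Python client looks roughly like this; the job name, container image tag, and accelerator label value are placeholders:

```python
# Illustrative sketch: submit a PyTorch training Job to an existing GKE
# cluster via the Kubernetes Python client. Names, the image tag, and the
# accelerator label value are placeholders, not real configuration.
from kubernetes import client, config

config.load_kube_config()  # assumes kubectl credentials for the GKE cluster

container = client.V1Container(
    name="trainer",
    image="pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime",  # example public image
    command=["python", "train.py"],
    resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "8"}),
)

pod_spec = client.V1PodSpec(
    containers=[container],
    restart_policy="Never",
    # GKE exposes the accelerator type as a node label; the value here is hypothetical.
    node_selector={"cloud.google.com/gke-accelerator": "nvidia-gb300"},
)

job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="finetune-demo"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(spec=pod_spec),
        backoff_limit=4,
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```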
Competition among the big three clouds is now measured in reinforcement learning compute deals rather than general-purpose instances. Amazon's $1.5 billion Anthropic commitment now sits alongside its new deal with Thinking Machines for specialized Spot Instance fleets optimized for long-running RLHF training jobs. Microsoft Azure is countering with a hybrid offering that pairs on-premises Nvidia DGX systems with Azure-hosted GB300 clusters in its restricted data-residency zones in Europe.
For frontier AI labs, infrastructure choices increasingly reflect strategic risk tolerance—not just cost metrics. Moving massive Tinker fine-tuning workloads off Google would require re-architecting checkpointing and orchestrating data transfers across a different network fabric, introducing operational friction that could slow research velocity by weeks. “The team at Thinking Machines Lab is generating very exciting research and product offerings that will help organizations more effectively utilize AI,” wrote Mark Lohmeyer, VP & GM of AI and Computing Infrastructure at Google Cloud.
The partnership positions the Nvidia GB300 NVL72 rack-scale system as the de facto hardware standard for 2026 frontier AI workloads, with Google Cloud betting its integrated services will differentiate the offering. Thinking Machines gains access to the world’s largest privately owned AI training clusters without building its own physical infrastructure—preserving capital for research talent and Tinker product development instead of data center operations.
The deal was negotiated over six weeks and includes annual commitment-based pricing with graduated discounts that kick in at the 10,000 GPU-month threshold, according to sources familiar with the terms. Exact volumes and dollar figures remain confidential, though TechCrunch reports the agreement runs to single-digit billions over three years.
