Google Unveils TPU 8t and 8i: Two Chips Built for the Agentic Era
Google split its eighth-generation TPU into a training giant and an inference workhorse, paired with a 134,000-chip Virgo Network fabric aimed squarely at agentic AI.
Google used Cloud Next 2026 to take the wraps off its eighth-generation TPU, and for the first time the company is splitting its custom AI silicon into two distinct chips: TPU 8t for training and TPU 8i for inference. Both were co-designed with Google DeepMind specifically for agentic workloads that demand iterative reasoning, multi-step workflows, and continuous learning loops.
TPU 8t is the training giant. A single superpod scales to 9,600 chips linked by interchip bandwidth double that of the prior generation, exposing 2 petabytes of shared high-bandwidth memory and 121 exaFLOPS of compute. Google says the pod delivers nearly 3x the performance of its Ironwood predecessor while sustaining over 97% goodput, and that the architecture scales near-linearly all the way up to one million chips for frontier-model training runs.
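Google did not publish per-chip figures, but dividing the stated pod-level specs gives a rough back-of-envelope picture, around 208 GB of HBM and roughly 12.6 petaFLOPS per chip:

```python
# Back-of-envelope per-chip estimates derived from Google's stated
# TPU 8t superpod specs. These are rough inferences, not official numbers.
POD_CHIPS = 9_600
POD_HBM_PB = 2          # 2 petabytes of shared HBM per superpod
POD_COMPUTE_EF = 121    # 121 exaFLOPS per superpod

hbm_per_chip_gb = POD_HBM_PB * 1e6 / POD_CHIPS        # PB -> GB
flops_per_chip_pf = POD_COMPUTE_EF * 1e3 / POD_CHIPS  # EF -> PF

print(f"HBM per chip:     ~{hbm_per_chip_gb:.0f} GB")        # ~208 GB
print(f"Compute per chip: ~{flops_per_chip_pf:.1f} PFLOPS")  # ~12.6 PFLOPS
```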
TPU 8i is tuned for low latency and cost efficiency. The chip ships with 288 GB of high-bandwidth memory and 384 MB of on-chip SRAM, enough to hold larger key-value caches entirely on silicon. A new Boardfly topology cuts the maximum network diameter by more than 50%, doubled interconnect bandwidth lifts ICI throughput to 19.2 Tb/s, and a dedicated Collectives Acceleration Engine cuts on-chip collective latency by as much as 5x. The headline number for cloud customers: 80% better performance per dollar than the previous generation, which Google says lets buyers serve nearly twice the traffic at the same cost.
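For a sense of what 384 MB of SRAM buys, the standard transformer KV-cache sizing formula is instructive. The model dimensions below are hypothetical, chosen only to illustrate the arithmetic; they are not from Google's announcement:

```python
# Standard KV-cache sizing: 2 tensors (K and V) per layer, each
# [kv_heads, head_dim] per token. All model dimensions here are
# hypothetical examples, not figures from Google's announcement.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=1):
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Example: a grouped-query-attention model with an 8-bit KV cache.
size = kv_cache_bytes(layers=48, kv_heads=8, head_dim=128, seq_len=4_096)
print(f"KV cache: {size / 2**20:.0f} MiB")  # ~384 MiB for one 4K-token sequence
```

Under those assumed dimensions, a single 4K-token sequence lands right around the chip's stated SRAM capacity; smaller models, shorter contexts, or lower-precision caches would leave room for batching entirely on-chip.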
Both chips slot into Virgo Network, a new megascale data center fabric that knits up to 134,000 TPU 8t accelerators (or NVIDIA Vera Rubin NVL72 systems) into a single supercomputer with up to 47 petabits per second of non-blocking bisection bandwidth. The host complex moves to Google’s custom Axion Arm-based CPUs, fourth-generation liquid cooling sustains the higher rack densities, and customers get bare-metal access with no hypervisor in the path.
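Dividing the stated fabric figure gives a rough per-chip share. This is back-of-envelope arithmetic assuming the worst case, where half the chips send across the bisection cut simultaneously; it is not an official figure:

```python
# Rough per-chip share of Virgo Network's stated bisection bandwidth.
# Assumes the worst case: half the chips simultaneously sending across
# the bisection cut. Back-of-envelope estimate only.
TOTAL_CHIPS = 134_000
BISECTION_BW_PBPS = 47  # petabits per second, non-blocking

senders = TOTAL_CHIPS // 2
per_chip_gbps = BISECTION_BW_PBPS * 1e6 / senders  # Pb/s -> Gb/s

print(f"~{per_chip_gbps:.0f} Gb/s per chip across the bisection")  # ~701 Gb/s
```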
Software support arrives day one for JAX, MaxText, PyTorch, SGLang, and vLLM, and Google is open-sourcing its Tunix reinforcement-learning stack. Citadel Securities is already named as an evaluation customer, and general availability is slated for later in 2026, putting Google in direct competition with NVIDIA’s Vera Rubin roadmap as the agentic-AI buildout accelerates.
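Day-one JAX support means the existing sharding APIs should apply unchanged. The sketch below is ordinary, current JAX device-mesh code, with nothing TPU 8-specific in it, showing how a computation gets partitioned across whatever slice of chips the runtime exposes:

```python
# Generic JAX sketch: shard a computation across every visible chip.
# This is standard public JAX API, not anything TPU 8t-specific.
import jax
import jax.numpy as jnp
from jax.experimental import mesh_utils
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh over all chips in this slice.
devices = mesh_utils.create_device_mesh((jax.device_count(),))
mesh = Mesh(devices, axis_names=("data",))

# Shard the batch along its leading axis; replicate the weights.
batch = jax.device_put(jnp.ones((8_192, 1_024)),
                       NamedSharding(mesh, P("data", None)))
weights = jax.device_put(jnp.ones((1_024, 1_024)),
                         NamedSharding(mesh, P(None, None)))

@jax.jit
def forward(x, w):
    # XLA partitions this matmul across the mesh automatically.
    return jnp.tanh(x @ w)

out = forward(batch, weights)
print(out.sharding)  # output stays sharded over the "data" axis
```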