Date: 2023-08-31

Google has introduced its TPU v5e AI chip as a mainstream alternative to Nvidia GPUs. This new chip comes with a suite of software and tools for large-scale orchestration of AI workloads in virtual environments. It is currently available in preview to Google Cloud customers.

The TPU v5e succeeds the previous-generation TPU v4, which was used to train language models in Google search, mapping, and online productivity applications. Google has emphasized that it aims to offer its customers a variety of AI chips: Nvidia's H100 GPUs in the A3 supercomputer, and the TPU v5e for inferencing and training.

One notable feature of the TPU v5e is that it is the first Google AI chip available outside of the United States. It will be installed in the Netherlands for the EMEA markets and in Singapore for the Asia-Pacific markets. Despite the controversy surrounding its development, Google’s TPU v5e has become an essential part of the company’s data centers as it integrates AI features into its product lines.

Compared to its predecessor, the TPU v5e delivers higher INT8 throughput, peaking at 393 trillion operations per second per chip. However, it falls short on BF16 performance, with 197 teraflops compared to the TPU v4’s 275 teraflops.

But what sets the TPU v5e apart is its scalability. While the TPU v4 could be configured in clusters of 4,096 chips, the TPU v5e can expand to hundreds or thousands more configurations, enabling the handling of larger training and inferencing models.

Google has also introduced a technology called “Multislice,” which allows users to easily scale AI models beyond the physical boundaries of TPU pods. This feature enables the networking of hundreds of thousands of AI chips together in a cluster.

To further optimize the TPU v5e, Google has tuned its virtual machines so that a chip can serve multiple virtual machines simultaneously. The company has also added Cloud TPU v5e and v4 support to its Kubernetes service, which helps orchestrate AI workloads across the TPUs.

With regard to cost, the TPU v5e offers better price-performance than the TPU v4. For every dollar spent, the TPU v5e delivers up to two times the training performance and up to 2.5 times the inferencing performance. It is also priced at $1.20 per chip-hour, while the TPU v4 costs around $3.20 per chip-hour.
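As a rough sanity check on the price-performance claim, the per-chip figures quoted in this article can be combined into peak BF16 teraflops per dollar of hourly chip time. The sketch below is a back-of-the-envelope calculation, not a benchmark. It assumes only the peak BF16 numbers and hourly prices given above, and the names `CHIPS`, `tflops_per_dollar`, and `v5e_advantage` are illustrative.

```python
# Figures quoted in the article: peak BF16 teraflops per chip and
# approximate price per chip-hour in US dollars.
CHIPS = {
    "TPU v4": {"bf16_tflops": 275.0, "usd_per_chip_hour": 3.20},
    "TPU v5e": {"bf16_tflops": 197.0, "usd_per_chip_hour": 1.20},
}


def tflops_per_dollar(chip: str) -> float:
    """Peak BF16 teraflops obtained per dollar of hourly chip cost."""
    spec = CHIPS[chip]
    return spec["bf16_tflops"] / spec["usd_per_chip_hour"]


def v5e_advantage() -> float:
    """Ratio of TPU v5e to TPU v4 peak BF16 teraflops per dollar."""
    return tflops_per_dollar("TPU v5e") / tflops_per_dollar("TPU v4")


if __name__ == "__main__":
    for name in CHIPS:
        print(f"{name}: {tflops_per_dollar(name):.1f} peak BF16 TFLOPS per $/hour")
    print(f"v5e advantage: {v5e_advantage():.2f}x")
```

On these assumptions the ratio comes out at roughly 1.9, which is in line with the "up to two times" training claim. Real workloads will differ, since peak throughput is rarely sustained in practice.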

Overall, Google’s TPU v5e AI chip presents itself as a viable alternative to Nvidia GPUs. With its improved performance, scalability, and cost-efficiency, it opens up possibilities for organizations to train and deploy larger and more complex AI models.
