AMD has introduced its latest offering: the AMD Instinct MI300X accelerators, boasting industry-leading memory bandwidth for generative AI and leadership performance for large language model (LLM) training and inference. Accompanying this release is the AMD Instinct MI300A accelerated processing unit (APU), which combines the latest AMD CDNA 3 architecture with “Zen 4” CPUs to deliver groundbreaking performance for high-performance computing (HPC) and AI workloads.
Leading the charge in adopting the new AMD Instinct accelerator portfolio is Microsoft, which recently unveiled the Azure ND MI300X v5 Virtual Machine (VM) series, optimized for AI workloads and powered by AMD Instinct MI300X accelerators. Lawrence Livermore National Laboratory’s supercomputer El Capitan, equipped with AMD Instinct MI300A APUs, is poised to become the second exascale-class supercomputer powered by AMD (after Frontier), promising over two exaflops of double-precision performance upon full deployment.
“AMD Instinct MI300 Series accelerators are designed with our most advanced technologies, delivering leadership performance, and will be in large-scale cloud and enterprise deployments. By leveraging our leadership hardware, software and open ecosystem approach, cloud providers, OEMs and ODMs are bringing to market technologies that empower enterprises to adopt and deploy AI-powered solutions,” said Victor Peng, President of AMD.
Oracle Cloud Infrastructure is set to enhance its high-performance accelerated computing instances for AI with the addition of AMD Instinct MI300X-based bare metal instances, supporting OCI Supercluster with ultrafast RDMA networking.
Several major original equipment manufacturers (OEMs) showcased their accelerated computing systems in conjunction with the AMD Advancing AI event. Dell presented the PowerEdge XE9680 server featuring eight AMD Instinct MI300 Series accelerators, while HPE announced the HPE Cray Supercomputing EX255a, the first supercomputing accelerator blade powered by AMD Instinct MI300A APUs. Lenovo announced design support for the new AMD Instinct MI300 Series accelerators, with planned availability in the first half of 2024. Supermicro introduced new additions to its H13 generation of accelerated servers, featuring 4th Gen AMD EPYC CPUs and AMD Instinct MI300 Series accelerators.
AMD Instinct MI300X: A Leap in AI & HPC Performance
The AMD Instinct MI300X accelerators, driven by the cutting-edge AMD CDNA 3 architecture, deliver nearly 40% more compute units, 1.5x more memory capacity, and 1.7x more peak theoretical memory bandwidth than the previous-generation MI250X. Tailored for AI and HPC workloads, these accelerators also add support for new math formats such as FP8, along with sparsity.
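Much of FP8’s appeal for memory-bound inference is simply that each value occupies one byte instead of two. The short sketch below is our own illustration, not AMD sample code; it assumes PyTorch 2.1 or newer, which ships the torch.float8_e4m3fn dtype:

    import torch

    # FP8 stores one byte per value versus two for FP16, halving the memory
    # footprint of weights and activations that tolerate the reduced precision.
    x_fp16 = torch.randn(1024, 1024, dtype=torch.float16)
    x_fp8 = x_fp16.to(torch.float8_e4m3fn)  # requires PyTorch >= 2.1

    print(x_fp16.element_size())  # 2 bytes per element
    print(x_fp8.element_size())   # 1 byte per element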
In response to the escalating demands of increasingly complex large language models, the MI300X accelerators feature an industry-leading 192 GB of HBM3 memory capacity and a peak memory bandwidth of 5.3 TB/s. The AMD Instinct Platform, built on an industry-standard OCP design with eight MI300X accelerators, offers an unparalleled 1.5 TB of HBM3 memory capacity. This standardized design enables OEM partners to seamlessly integrate MI300X accelerators into existing AI offerings, streamlining deployment and accelerating the adoption of AMD Instinct accelerator-based servers.
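A back-of-envelope sizing, our own heuristic rather than an AMD tool, shows why those capacities matter for LLM weights:

    # Approximate memory needed for model weights alone (FP16 by default),
    # ignoring KV cache, activations, and framework overhead.
    def weight_footprint_gb(num_params: float, bytes_per_param: int = 2) -> float:
        return num_params * bytes_per_param / 1e9

    print(weight_footprint_gb(70e9))   # ~140 GB: fits in one 192 GB MI300X
    print(weight_footprint_gb(176e9))  # ~352 GB: needs several accelerators
    print(8 * 192)                     # 1536 GB, i.e. the platform's ~1.5 TB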
Compared to the Nvidia H100 HGX, the AMD Instinct Platform exhibits up to a 1.6x throughput increase when running inference on large language models such as BLOOM 176B. Notably, it is the only platform capable of running inference on a 70B-parameter model, such as Llama 2, on a single MI300X accelerator, simplifying enterprise-class large language model deployments and offering outstanding total cost of ownership.
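In practice, a single-accelerator deployment could look like the following minimal sketch, assuming a ROCm build of PyTorch (which exposes the accelerator through the familiar cuda device namespace) and the Hugging Face transformers library; the model ID and generation settings are illustrative:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-70b-chat-hf"  # illustrative; gated on Hugging Face

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # FP16 weights (~140 GB) fit within a single MI300X's 192 GB of HBM3,
    # so no tensor parallelism or multi-GPU sharding is required.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda:0")

    inputs = tokenizer("The MI300X accelerator", return_tensors="pt").to("cuda:0")
    outputs = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))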
AMD Instinct MI300A: Pioneering Data Center APU for HPC & AI
The AMD Instinct MI300A APUs, the world’s first data center APUs for high-performance computing and AI, leverage 3D packaging and the 4th Gen AMD Infinity Architecture. Combining high-performance AMD CDNA 3 GPU cores, the latest AMD “Zen 4” x86-based CPU cores, and 128 GB of next-generation HBM3 memory, the MI300A APUs deliver approximately 1.9x the performance-per-watt on FP32 HPC and AI workloads compared to the previous-generation AMD Instinct MI250X.
Energy efficiency is crucial for the HPC and AI communities, and AMD Instinct MI300A APUs address it by integrating CPU and GPU cores on a single package, yielding a highly efficient platform. This integration also unifies memory and cache resources, providing customers with an easily programmable GPU platform, efficient compute performance, fast AI training, and impressive energy efficiency to meet the demands of the most challenging HPC and AI workloads.
ROCm Software & Ecosystem Advancements
AMD has not only introduced the latest AMD ROCm 6 open software platform but also reaffirmed its commitment to contributing state-of-the-art libraries to the open-source community. ROCm 6 marks a significant leap forward, delivering an approximately 8x improvement in AI acceleration performance for Llama 2 text generation on MI300 Series accelerators compared with the previous generation of hardware and software.
ROCm 6 also introduces support for several key new generative AI features, including FlashAttention, HIPGraph, and vLLM, among others. AMD’s dedication to open-source AI software development positions the company uniquely, allowing it to leverage widely used models, algorithms, and frameworks such as Hugging Face, PyTorch, and TensorFlow. Through strategic acquisitions and partnerships, including Nod.ai, Mipsology, Lamini, and MosaicML, AMD continues to invest in software capabilities, unlocking the true potential of generative AI and simplifying the deployment of AMD AI solutions.
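As one example of how those pieces fit together, serving a model through vLLM on a ROCm system might look like the following hedged sketch using vLLM’s public Python API; the model ID and sampling values are placeholders:

    from vllm import LLM, SamplingParams

    # vLLM handles batching and paged attention under the hood; ROCm 6
    # adds the support needed to run it on MI300 Series accelerators.
    llm = LLM(model="meta-llama/Llama-2-70b-chat-hf")  # illustrative model ID
    params = SamplingParams(temperature=0.8, max_tokens=64)

    outputs = llm.generate(["Explain HBM3 memory in one sentence."], params)
    print(outputs[0].outputs[0].text)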