Oracle and AMD Expand Partnership for Next-Generation AI Solutions

Major Extension of Collaboration

Oracle and AMD have announced a major expansion of their long-standing partnership, aimed at giving customers more capacity for AI workloads. Starting in the third quarter of 2026, Oracle will offer a publicly available AI supercluster powered by an initial 50,000 AMD Instinct MI450 Series GPUs, with further expansion planned for 2027 and beyond.

Building on Prior Achievements

The announcement builds on earlier joint work that began in 2024 with the launch of AMD Instinct MI300X instances on Oracle Cloud Infrastructure (OCI). That was followed by OCI Compute instances featuring AMD Instinct MI355X GPUs, which form part of the OCI Supercluster designed for zettascale workloads.

Advanced Infrastructure Design

The forthcoming AI supercluster will use AMD’s “Helios” rack design, which combines AMD Instinct MI450 GPUs, the upcoming EPYC CPUs codenamed “Venice,” and next-generation AMD Pensando networking codenamed “Vulcano.” This vertically integrated rack-scale design is intended to improve performance, energy efficiency, and scalability for large-scale AI training and inference.

Statements from Key Executives

Mahesh Thiagarajan, Executive Vice President of Oracle Cloud Infrastructure, said, “Our customers are building some of the world’s most ambitious AI applications, and that requires robust, scalable, and high-performance infrastructure.” He added that the collaboration lets customers combine AMD’s latest innovations with OCI’s secure and flexible platform so they can innovate with confidence.

Forrest Norrod, Executive Vice President and General Manager of AMD’s Data Center Solutions Business Group, said, “AMD and Oracle continue to set the pace for AI innovation in the cloud.” He added that pairing AMD’s GPUs and CPUs with Oracle’s networking gives customers new options for training and deploying next-generation AI models.

High-Performance GPU Architecture

The MI450 GPUs are designed for demanding workloads such as generative AI and high-performance computing. Each GPU provides up to 432 GB of HBM4 memory and 20 TB/s of memory bandwidth, allowing considerably larger models to be processed entirely in memory. The “Helios” rack supports dense liquid-cooled configurations and combines UALoE (UALink over Ethernet) scale-up connectivity with Ethernet-based scale-out networking to reduce latency and increase throughput across GPU clusters.
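
To put the per-GPU figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It is not a vendor sizing tool. It assumes decimal units (1 GB = 10^9 bytes), counts model weights only (no activations, optimizer state, or KV cache), and treats the quoted "up to" figures as fully usable. The function names and the precision table are illustrative choices, not part of the announcement.

```python
# Back-of-the-envelope sizing from the figures quoted in the article.
# Assumptions (not from the announcement): decimal units, weights only,
# and the "up to" figures treated as fully usable.

HBM_CAPACITY_GB = 432       # per-GPU HBM4 capacity quoted in the article
HBM_BANDWIDTH_TB_S = 20     # per-GPU memory bandwidth quoted in the article
INITIAL_GPU_COUNT = 50_000  # initial deployment size quoted in the article

# Illustrative storage cost per parameter at common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "fp4": 0.5}


def max_params_billions(capacity_gb: float, bytes_per_param: float) -> float:
    """Upper bound on parameters (in billions) if weights alone filled memory."""
    # capacity_gb * 1e9 bytes / bytes_per_param / 1e9 = billions of parameters
    return capacity_gb / bytes_per_param


def full_memory_sweep_ms(capacity_gb: float, bandwidth_tb_s: float) -> float:
    """Idealized time to read all of HBM once, in milliseconds."""
    seconds = capacity_gb / (bandwidth_tb_s * 1000)  # 1 TB/s = 1000 GB/s
    return seconds * 1000


def aggregate_hbm_pb(gpu_count: int, capacity_gb: float) -> float:
    """Total HBM across the cluster in petabytes (1 PB = 1e6 GB)."""
    return gpu_count * capacity_gb / 1_000_000


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        limit = max_params_billions(HBM_CAPACITY_GB, size)
        print(f"{name}: up to ~{limit:.0f}B parameters per GPU")
    sweep = full_memory_sweep_ms(HBM_CAPACITY_GB, HBM_BANDWIDTH_TB_S)
    print(f"one full HBM sweep: ~{sweep:.1f} ms")
    total = aggregate_hbm_pb(INITIAL_GPU_COUNT, HBM_CAPACITY_GB)
    print(f"aggregate HBM: ~{total:.1f} PB")
```

Under these assumptions a single GPU could hold the weights of a model with roughly 216 billion 16-bit parameters. Real deployments fit less, because memory is also needed for activations and caches.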

Enhanced Security and Networking Features

The supercluster’s head nodes will use EPYC “Venice” CPUs, which include confidential computing and other built-in security features to protect data. AMD Pensando Data Processing Units (DPUs) will provide converged networking with line-rate data ingestion. For scale-out traffic, the design supports high-speed, lossless connectivity through up to three 800 Gbps “Vulcano” AI-NICs that follow the RoCE (RDMA over Converged Ethernet) and UEC (Ultra Ethernet Consortium) standards.
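
The networking figures above lend themselves to a quick unit conversion. The sketch below is illustrative arithmetic only. It assumes ideal line rate with no protocol overhead, and it assumes all three NICs carry traffic for the same endpoint, which the article does not spell out. The 432 GB payload is simply the per-GPU HBM4 capacity quoted earlier, used here as a convenient example size.

```python
# Idealized scale-out bandwidth from the NIC figures quoted in the article.
# Assumptions (not from the announcement): perfect line rate, no protocol
# overhead, and all NICs serving the same endpoint.

NIC_COUNT = 3         # "up to three" AI-NICs, per the article
NIC_RATE_GBPS = 800   # gigabits per second per NIC, per the article


def aggregate_gbps(nic_count: int, rate_gbps: float) -> float:
    """Combined line rate of all NICs in gigabits per second."""
    return nic_count * rate_gbps


def gbps_to_gigabytes_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbps / 8


def transfer_seconds(payload_gb: float, nic_count: int, rate_gbps: float) -> float:
    """Idealized time to move payload_gb gigabytes over the combined NICs."""
    rate_gb_s = gbps_to_gigabytes_per_s(aggregate_gbps(nic_count, rate_gbps))
    return payload_gb / rate_gb_s


if __name__ == "__main__":
    total = aggregate_gbps(NIC_COUNT, NIC_RATE_GBPS)
    print(f"aggregate line rate: {total:.0f} Gbps")
    print(f"equivalent: {gbps_to_gigabytes_per_s(total):.0f} GB/s")
    seconds = transfer_seconds(432, NIC_COUNT, NIC_RATE_GBPS)
    print(f"time to move 432 GB: ~{seconds:.2f} s")
```

Three 800 Gbps links add up to 2,400 Gbps, or 300 GB/s, which is well below the 20 TB/s of on-package memory bandwidth. That gap is one reason the design separates scale-up and scale-out fabrics.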

Simplified Data Sharing and Resource Allocation

A key feature of the architecture is its use of the open UALink and UALoE fabric, which lets GPUs within a rack share data directly instead of routing it through CPUs. The open-source ROCm software stack supports a range of AI and HPC frameworks and reduces dependence on a single vendor. In addition, partitioning and virtualization features allow secure multi-tenant sharing and fine-grained allocation of GPU resources.
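
As one illustration of what framework support on ROCm can look like in practice, the sketch below uses PyTorch, which is this example's choice and is not named in the announcement. ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface used on other hardware and set `torch.version.hip`. The helper name `describe_accelerator` is hypothetical, and the function falls back gracefully when PyTorch or a GPU is not present.

```python
# Minimal sketch: report which GPU backend a PyTorch installation is using.
# PyTorch is an assumed example framework; the article does not name it.


def describe_accelerator() -> str:
    """Return a one-line description of the GPU backend visible to PyTorch."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    if not torch.cuda.is_available():
        return "No GPU visible to PyTorch"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    backend = f"ROCm/HIP {hip_version}" if hip_version else "CUDA"
    count = torch.cuda.device_count()
    name = torch.cuda.get_device_name(0)
    return f"{count} GPU(s) via {backend}: {name}"


if __name__ == "__main__":
    print(describe_accelerator())
```

Because the device interface is shared, code written against `torch.cuda` generally needs no changes to run on a ROCm build, although individual kernels and extensions may still differ in support.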

OCI Compute Instances Availability

Alongside the MI450 announcement, Oracle also announced the general availability of OCI Compute instances powered by AMD Instinct MI355X GPUs. These instances will be available in the zettascale OCI Supercluster, which can scale up to 131,072 GPUs, giving customers more choice in how they size and run AI workloads.


Published on October 17, 2025 (UTC) • Category: Resources
