Generative AI Inferencing with Cisco UCS X-Series M7 Blade Servers / 5th Gen Intel Xeon Processors
819855
2024-04-02
Public
Preview is not available for this file. Please download the file.
Description
A solution based on Cisco UCS® with Intel® Xeon® Scalable processors and Cisco Nexus® offers a compelling and scalable foundation for deploying generative AI at scale. This architecture offers a combination of:
- Optimal performance: Cisco UCS with Intel Xeon Scalable processors with specialized AI accelerators and optimized software frameworks significantly improves inferencing performance and scalability. Cisco Nexus 9000 Switches provide high bandwidth, low latency, congestion management mechanisms, and telemetry to meet the demanding networking requirements of AI/ML applications.
- Balanced architecture: Cisco UCS excels in both deep-learning and non-deep-learning compute, critical for the entire inference pipeline. This balanced approach leads to better overall performance and resource utilization.
- Scalability on demand: Cisco UCS seamlessly scales with your generative AI inferencing needs. Add or remove servers, adjust memory capacities, and configure resources in an automated manner as your models evolve and workloads grow using Cisco Intersight®.
The Cisco UCS X-Series Modular System, and C240 and C220 rack servers, support 5th Gen Intel Xeon Scalable processors so that you have the option to run inferencing in the data center or at the edge, using either a modular or a rack form factor.
Usage instructions
Related Assets
Title and Description
Format
Language
Action
Microsoft SQL Server 2022 on Cisco UCS X210c M6/M7 on 4th Gen Intel® Xeon® Scalable Processors — White Paper
This white paper contains a reference architecture that illustrates the benefits of Microsoft SQL Server 2022 on Cisco UCS X210c M6/M7 on 4th Gen Intel® Xeon® Scalable Processors for bare-metal and hybrid cloud deployments.
Cisco UCS M7 and Pure Storage FlashArray: FlashStack VSI with VMware vSphere 8.0 — Design Guide
Cisco 7th generation of UCS C-Series and UCS X-Series Servers, powered using 4th Gen Intel Xeon Scalable processors., and Pure Storage FlashArray FlashStack on VMware vSphere 8 solution.
Cisco UCS M7 IMM FlexPod Datacenter with VMware vSphere 8.0, and NetApp ONTAP 9.12 Powered by Intel — Design Guide
Cisco UCS M7 IMM FlexPod Datacenter with VMware vSphere 8.0, and NetApp ONTAP 9.12 powered by Intel design guide
Generative AI Inference Operations with Cisco UCS / 5th Gen and 4th Gen Intel Xeon Processors
Cisco UCS, powered by 5th Gen Intel® Xeon® processors and Cisco Nexus, is a scalable foundation for deploying Generative AI at scale.
GenAI Inferencing Powered by Cisco UCS X-Series / 5th Gen Intel Xeon Processors on Red Hat OpenShift AI — Cisco Validated Design
Cisco, Red Hat, and Intel provide a proven AI infrastructure to enable VMware-based Red Hat® OpenShift® AI.