Generative AI Inference Operations with Cisco UCS / 5th Gen and 4th Gen Intel Xeon Processors
Intel® QuickAssist Technology (Intel® QAT), HPC Library, Intel® Xeon® Scalable Processors
812502
2023-12-29
Public
Description
Cisco UCS, powered by 5th Gen Intel® Xeon® processors and Cisco Nexus, is a scalable foundation for deploying Generative AI. This architecture delivers:
- Optimal performance: Cisco UCS, combining Intel Xeon Scalable processors, specialized AI accelerators, and optimized software frameworks, significantly improves inferencing performance and scalability.
- Balanced architecture: Cisco UCS excels in both Deep Learning and non-Deep Learning compute, critical for the entire inference pipeline. This balanced approach leads to better overall performance and resource utilization.
- Scalability on demand: Cisco UCS seamlessly scales with your Generative AI inferencing needs. Add or remove servers, adjust memory capacities, and configure resources in an automated manner as your models evolve and workloads grow using Cisco Intersight.
You have the option to run inferencing in the data center or at the edge, using either a modular or a rack form factor.
Related Assets
Title and Description
Microsoft SQL Server 2022 on Cisco UCS X210c M6/M7 on 4th Gen Intel® Xeon® Scalable Processors — White Paper
This white paper contains a reference architecture that illustrates the benefits of Microsoft SQL Server 2022 on Cisco UCS X210c M6/M7 on 4th Gen Intel® Xeon® Scalable Processors for bare-metal and hybrid cloud deployments.
Cisco UCS M7 and Pure Storage FlashArray: FlashStack VSI with VMware vSphere 8.0 — Design Guide
Design guide for the FlashStack solution on VMware vSphere 8, built on Cisco's 7th generation of UCS C-Series and UCS X-Series servers, powered by 4th Gen Intel Xeon Scalable processors, and Pure Storage FlashArray.
Cisco UCS M7 IMM FlexPod Datacenter with VMware vSphere 8.0, and NetApp ONTAP 9.12 Powered by Intel — Design Guide
Design guide for Cisco UCS M7 IMM FlexPod Datacenter with VMware vSphere 8.0 and NetApp ONTAP 9.12, powered by Intel.
FlashStack Cisco UCS X-Series and Pure Storage FlashArray//X R3 for VMware Horizon 8 — Design Guide
Design guide for FlashStack Virtual Desktop Infrastructure for VMware Horizon 8 with VMware vSphere 8.0 U1 and 4th Gen Intel® Xeon® Scalable processors.
Cisco UCS with 5th Gen and 4th Gen Intel Xeon Processors for Generative AI
Cisco UCS, powered by 5th Gen Intel® Xeon® processors, delivers a compelling solution for maximizing Generative AI performance.
Generative AI Inferencing with Cisco UCS X-Series M7 Blade Servers / 5th Gen Intel Xeon Processors
Cisco UCS® with Intel® Xeon® Scalable processors and Cisco Nexus® offers a compelling and scalable foundation for deploying generative AI at scale.
GenAI Inferencing Powered by Cisco UCS X-Series / 5th Gen Intel Xeon Processors on Red Hat OpenShift AI — Cisco Validated Design
Cisco, Red Hat, and Intel provide a proven AI infrastructure to enable VMware-based Red Hat® OpenShift® AI.