How Prediction Guard Delivers Trustworthy AI on Intel® Gaudi® 2 AI Accelerators
Overview
Large language models (LLMs) promise to revolutionize how enterprises operate, but making them production-ready means solving privacy risks, security vulnerabilities, and performance bottlenecks.
Not so easy.
This session focuses on how AI startup Prediction Guard addressed these challenges using the processing power of Intel® Gaudi® 2 AI accelerators in the Intel® Tiber™ Developer Cloud.1 The topics include:
- Prediction Guard’s pioneering work hosting open source LLMs such as Llama 2 and neural-chat-7B in a secure, privacy-preserving environment with filters for PII, prompt-injection attacks, toxic outputs, and factual inconsistencies.
- How Prediction Guard tuned batching, model replication, tensor shapes, and hyperparameters to double throughput and achieve industry-leading time to first token for streaming.
- Architectural insights and best practices for capitalizing on LLMs.
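To make the first bullet concrete, here is a minimal sketch of output filtering for PII. The function name, the placeholder format, and the regex patterns are illustrative assumptions, not Prediction Guard's actual implementation; a production filter would cover many more PII categories.

```python
import re

# Hypothetical PII patterns for illustration only; a real filter would
# cover far more categories (names, addresses, credit cards, etc.).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace any matched PII in model output with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label.upper()}>", text)
    return text
```

A filter like this sits between the model and the caller, so sensitive strings never leave the serving environment.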
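The throughput point in the second bullet rests on batching: grouping incoming prompts so one forward pass serves several requests at once. This sketch shows the idea only; the function and parameter names are hypothetical, and real serving stacks batch dynamically rather than in fixed chunks.

```python
from typing import Callable, List

def batch_generate(prompts: List[str],
                   model_fn: Callable[[List[str]], List[str]],
                   batch_size: int = 4) -> List[str]:
    """Run model_fn over prompts in fixed-size batches.

    Batching amortizes per-call overhead across requests, trading a
    small queueing delay for higher overall throughput.
    """
    outputs: List[str] = []
    for i in range(0, len(prompts), batch_size):
        outputs.extend(model_fn(prompts[i:i + batch_size]))
    return outputs
```

In practice the batch size is tuned against latency targets such as time to first token, which is the trade-off the session discusses.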
Skill level: Expert
Featured Software
- This session showcases the Intel Tiber Developer Cloud: Learn More | Sign Up
Product and Performance Information
1. Formerly Intel® Developer Cloud.