AI Inference
Inference is the process by which AI infers information from data. Before this can happen, an AI model must be trained on a dataset that has been processed for use in AI models. During training, the model or machine learning algorithm is taught how to interpret and learn from the data; during inference, the trained model applies what it has learned to make predictions on new data.
Numenta* demonstrates how to dramatically reduce the overall cost of running language models in production on Intel® processors, unlocking entirely new natural language processing (NLP) capabilities for customers.
Real-World Use Cases
Inference is used in a wide variety of real-world AI applications. Companies that integrate inference into their use cases see a broad range of benefits, from quickly analyzing large amounts of data to uncovering overlooked insights.
One company uses AI inference to help healthcare professionals increase efficiency, improve consistency, and focus on patient care.
Virtual meeting enhancements in Zoom*, such as virtual background images and background noise detection, rely on AI inference and deep learning.
The Netflix* performance engineering team reduces cloud infrastructure costs by using AI inference to increase efficiency in its streaming environment.
How AI Inference Works
At its essence, AI is the process of converting raw data into actionable insights. The typical AI workflow occurs in three stages: data engineering, AI training, and inference. Each stage has different memory, compute, and latency requirements.
Data engineering has high memory requirements so that large datasets can be preprocessed efficiently, shortening the time needed to sort, filter, label, and transform the data.
AI training is usually the most computationally intense stage of the workflow. Depending on the size of the dataset, this process can take several hours or even days to complete.
The inference stage has stringent latency requirements, often demanding results in milliseconds or less. A minimal sketch of all three stages follows.
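To make the three stages concrete, here is a minimal sketch in PyTorch; the toy tabular dataset, tiny model, and latency measurement are illustrative placeholders, not part of any Intel reference workflow.

```python
import time
import numpy as np
import torch
from torch import nn

# Stage 1: data engineering (memory-bound).
# Toy stand-in for sorting, filtering, labeling, and transforming a dataset.
rng = np.random.default_rng(0)
features = rng.normal(size=(10_000, 16)).astype(np.float32)
labels = (features.sum(axis=1) > 0).astype(np.int64)

# Stage 2: training (compute-bound).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.from_numpy(features), torch.from_numpy(labels)
for _ in range(5):  # a few epochs; real training can run for hours or days
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Stage 3: inference (latency-bound).
model.eval()
sample = x[:1]
with torch.no_grad():
    start = time.perf_counter()
    prediction = model(sample).argmax(dim=1)
    elapsed_ms = (time.perf_counter() - start) * 1000
print(f"predicted class {prediction.item()} in {elapsed_ms:.2f} ms")
```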
End-to-End Solution for Language Identification
Language identification is the process of identifying the primary language from multiple audio input samples. In NLP, language identification is applied to tasks such as entering text on your phone, finding news articles you enjoy, or answering questions. The AI must conduct language identification to decide which AI model to invoke to perform a specific task.
Step 1: Data Engineering
The Common Voice* dataset is used for this example. Japanese and Swedish are the default languages, although others can be included. After the Common Voice dataset is downloaded, the data is preprocessed by converting the MP3 files to WAV format to avoid information loss in later processing steps.
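One common way to perform this conversion in Python is with the pydub library, which wraps FFmpeg; the directory layout and resampling choices below are hypothetical.

```python
from pathlib import Path

from pydub import AudioSegment  # requires FFmpeg on the system path

# Hypothetical locations for the downloaded Common Voice clips.
src_dir = Path("common_voice/clips")
dst_dir = Path("common_voice/wav")
dst_dir.mkdir(parents=True, exist_ok=True)

for mp3_path in src_dir.glob("*.mp3"):
    audio = AudioSegment.from_mp3(mp3_path)
    # 16 kHz mono is a typical input format for speech models.
    audio = audio.set_frame_rate(16000).set_channels(1)
    audio.export(dst_dir / (mp3_path.stem + ".wav"), format="wav")
```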
Step 2: AI Training
The Emphasized Channel Attention, Propagation, and Aggregation Time Delay Neural Network (ECAPA-TDNN) model is trained on the Common Voice dataset. The model is implemented with the SpeechBrain library, a PyTorch*-based speech toolkit whose pretrained models are distributed through Hugging Face*.
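The sketch below shows how such a model might be assembled from SpeechBrain's building blocks and stepped once on dummy audio; the hyperparameters, the two-class head (Japanese vs. Swedish), and the plain cross-entropy loss are simplifications of a full training recipe, which would also handle batching, augmentation, and checkpointing.

```python
import torch
from speechbrain.lobes.features import Fbank
from speechbrain.lobes.models.ECAPA_TDNN import ECAPA_TDNN, Classifier

# Illustrative hyperparameters: 80 mel filterbanks, 192-dim embeddings,
# and a two-class head (0 = Japanese, 1 = Swedish).
feature_extractor = Fbank(n_mels=80)
embedding_model = ECAPA_TDNN(input_size=80, lin_neurons=192)
classifier = Classifier(input_size=192, out_neurons=2)

optimizer = torch.optim.Adam(
    list(embedding_model.parameters()) + list(classifier.parameters()), lr=1e-3
)
loss_fn = torch.nn.CrossEntropyLoss()

# One training step on a dummy batch of one-second 16 kHz waveforms.
waveforms = torch.randn(4, 16000)           # (batch, samples)
targets = torch.tensor([0, 1, 0, 1])

feats = feature_extractor(waveforms)        # (batch, frames, n_mels)
embeddings = embedding_model(feats)         # (batch, 1, lin_neurons)
logits = classifier(embeddings).squeeze(1)  # (batch, 2) cosine-similarity logits
loss = loss_fn(logits, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```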
Step 3: Inference
Run inference using the testing set from Common Voice or custom data in WAV format. Audio segments are passed into the ECAPA-TDNN model to predict the language, and the top two candidate languages are returned as the final result. The inference model is optimized using Intel® Extension for PyTorch* and Intel® Neural Compressor.
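Here is a sketch of this step, assuming a pretrained SpeechBrain EncoderClassifier checkpoint from the Hugging Face hub (the public VoxLingua107 model stands in for the Common Voice-trained one) and using the documented ipex.optimize entry point of Intel® Extension for PyTorch*; attribute names follow SpeechBrain's interface and may differ across versions, and quantization with Intel® Neural Compressor is omitted.

```python
import torch
import intel_extension_for_pytorch as ipex
from speechbrain.pretrained import EncoderClassifier

# Illustrative pretrained ECAPA-TDNN language-ID checkpoint; the model
# described above is trained on Common Voice instead.
classifier = EncoderClassifier.from_hparams(
    source="speechbrain/lang-id-voxlingua107-ecapa",
    savedir="pretrained_models/lang-id",
)

# Optimize the eval-mode embedding model for Intel CPUs.
emb = classifier.mods.embedding_model.eval()
classifier.mods.embedding_model = ipex.optimize(emb)

# classify_file returns log-probabilities, the best score and index,
# and the decoded label for the given audio file (path is illustrative).
out_prob, score, index, text_lab = classifier.classify_file("sample.wav")

# Keep the top two candidate languages as the final result.
top2 = torch.topk(out_prob.squeeze(), k=2)
print("top-2 class indices:", top2.indices.tolist())
print("best label:", text_lab)
```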
Inferencing on Intel® CPUs
Intel® AI Software Portfolio
Intel offers an end-to-end AI software portfolio for use cases across computer vision, NLP, audio, and recommender systems.
AI Frameworks
All major frameworks for deep learning and classical machine learning are optimized with oneAPI libraries that provide optimal performance across Intel CPUs and XPUs. These Intel software optimizations help deliver orders-of-magnitude performance gains over stock implementations of the same frameworks.
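As one concrete example of these drop-in optimizations, Intel® Extension for Scikit-learn* patches stock scikit-learn at import time; the clustering workload below is illustrative.

```python
import numpy as np
from sklearnex import patch_sklearn

# Re-route supported scikit-learn estimators to Intel-optimized
# implementations; call this before importing the estimators themselves.
patch_sklearn()

from sklearn.cluster import KMeans

X = np.random.rand(100_000, 8).astype(np.float32)
labels = KMeans(n_clusters=8, n_init=10).fit_predict(X)
print("first ten cluster labels:", labels[:10])
```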
AI Tools, Libraries & Framework Optimizations
Intel provides a comprehensive portfolio of tools for all your AI needs, including data preparation, training, inference, deployment, and scaling. All tools are built on the foundation of a standards-based, unified oneAPI programming model with interoperability, openness, and extensibility as core tenets.
An AI Platform for Developers
Intel is empowering developers to run end-to-end AI pipelines on Intel® Xeon® Scalable processors. From data preprocessing and modeling to production, Intel has the software, compute platforms, and solution partnerships you need to accelerate the integration of AI everywhere.
Recommended Resources
AI Reference Kits
Downloadable AI reference kits built with Intel AI application tools help data scientists and developers accelerate AI workflow development.
Intel® Developer Catalog
Find software and tools to develop and deploy solutions optimized for Intel® architecture.
Performance Data for Intel AI Data Center Products
Get the latest AI benchmark performance data for Intel® AI data center products, including detailed hardware and software configurations.