OpenVINO™ toolkit: An open source AI toolkit that makes it easier to write once, deploy anywhere.
What's New in Version 2024.2
The OpenVINO™ toolkit 2024.2 release enhances generative AI (GenAI) accessibility with improved large language model (LLM) performance and expanded model coverage. It also boosts portability and performance for deployment anywhere: at the edge, in the cloud, or locally.
Latest Features
Easier Model Access and Conversion
| Product | Details |
| --- | --- |
| New Model Support | Support for Phi-3-mini, a family of AI models that leverages the power of small language models for faster, more accurate, and cost-effective text processing. Llama 3 optimizations for CPUs, built-in GPUs, and discrete GPUs deliver improved performance and efficient memory usage. |
| Python* | Python custom operations are now enabled in the OpenVINO toolkit, making it easier for Python developers to code their custom operations instead of using C++ custom operations (also supported). Custom operations let you implement your own specialized operations in any model; a minimal sketch follows this table. |
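To make the Python custom-operation path concrete, here is a minimal, hedged sketch of the subclassing pattern: a toy `AddConstant` op invented for this example. The class and method names (`openvino.Op`, `DiscreteTypeInfo`, `validate_and_infer_types`, `evaluate`) follow our reading of the OpenVINO Python API and should be verified against the 2024.2 documentation.

```python
# Hedged sketch of a Python custom operation; the op itself is a toy
# example, and the exact Op constructor signature is an assumption.
import numpy as np
import openvino as ov
import openvino.runtime.opset13 as ops
from openvino import Op
from openvino.runtime import DiscreteTypeInfo


class AddConstant(Op):
    """Toy custom op that adds a fixed scalar to its single input."""

    class_type_info = DiscreteTypeInfo("AddConstant", "extension")

    def __init__(self, inputs, value=1.0):
        super().__init__(self, inputs)  # per our reading of the Op API
        self.value = value
        self.constructor_validate_and_infer_types()

    def validate_and_infer_types(self):
        # Output keeps the input's element type and shape.
        self.set_output_type(0, self.get_input_element_type(0),
                             self.get_input_partial_shape(0))

    def clone_with_new_inputs(self, new_inputs):
        return AddConstant(new_inputs, self.value)

    def get_type_info(self):
        return AddConstant.class_type_info

    def evaluate(self, outputs, inputs):
        # Reference implementation executed at inference time.
        outputs[0].shape = inputs[0].shape
        outputs[0].data[:] = inputs[0].data + self.value
        return True

    def has_evaluate(self):
        return True


# Build a tiny model around the custom op and run it on CPU.
param = ops.parameter([2, 2], dtype=np.float32)
node = AddConstant([param.output(0)], value=5.0)
model = ov.Model([node], [param], "custom_op_demo")
compiled = ov.Core().compile_model(model, "CPU")
print(compiled(np.zeros((2, 2), dtype=np.float32))[0])  # expect all 5.0
```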
Generative AI and LLM Enhancements
Expanded model support and accelerated inference.
| Feature | Details |
| --- | --- |
| New Jupyter* Notebooks | An expanded set of Jupyter Notebooks ensures better coverage for new models, with several noteworthy notebooks added in this release. |
| Performance Improvements for LLMs | A GPTQ method for 4-bit weight compression was added to the Neural Network Compression Framework (NNCF) for more efficient inference and improved performance of compressed LLMs; a sketch follows this table. LLM performance is significantly improved, and latency reduced, on both built-in and discrete GPUs. |
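As a hedged illustration of the new GPTQ option, the sketch below applies 4-bit weight compression with `nncf.compress_weights()`. The `gptq` flag, the calibration-dataset requirement, and the placeholder model path and random token ids are assumptions from our reading of the NNCF API at the time of this release; check the NNCF documentation for the exact parameters.

```python
# Hedged sketch of 4-bit GPTQ weight compression with NNCF; the IR path
# is a placeholder and random token ids stand in for real prompts.
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("llm/openvino_model.xml")  # placeholder IR path

# GPTQ is data-aware, so a small calibration set is required.
samples = [np.random.randint(0, 32000, size=(1, 128)) for _ in range(8)]
calibration = nncf.Dataset(samples, lambda ids: {"input_ids": ids})

compressed = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,  # 4-bit symmetric weights
    ratio=0.8,              # share of weights compressed to 4 bits
    group_size=128,         # per-group quantization granularity
    dataset=calibration,    # calibration data consumed by GPTQ
    gptq=True,              # enable the GPTQ algorithm (assumed flag)
)
ov.save_model(compressed, "llm/openvino_model_int4.xml")
```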
More Portability and Performance
Develop once, deploy anywhere. OpenVINO toolkit enables developers to run AI at the edge, in the cloud, or locally.
| Product | Details |
| --- | --- |
| Model Serving Enhancements | Preview: The OpenVINO model server now supports an OpenAI*-compatible API, continuous batching, and PagedAttention, which enable significantly higher throughput for parallel inferencing, especially on Intel® Xeon® processors serving LLMs to many concurrent users; a client sketch follows this table. The OpenVINO toolkit back end for the NVIDIA Triton* Inference Server now supports dynamic input shapes. TorchServe is integrated through torch.compile with the OpenVINO toolkit back end for easier model deployment, provisioning to multiple instances, model versioning, and maintenance. |
| Intel Hardware Support | A significant improvement in second-token latency and memory footprint for FP16-weight LLMs on CPU platforms with Intel® Advanced Vector Extensions 2 (13th gen Intel® Core™ processors) and Intel® Advanced Vector Extensions 512 (3rd gen Intel® Xeon® Scalable processors), particularly for small batch sizes. Preview: Support for the Intel® Xeon® 6 processor. |
| Generate API | Preview: The new Generate API simplifies text generation with LLMs to only a few lines of code; a sketch follows the serving example below. The API is available through the newly launched OpenVINO Toolkit GenAI package. |
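To make the OpenAI-compatible serving preview concrete, here is a hedged client sketch using the standard `openai` Python package. The host, port, `/v3` base path, and model name are assumptions about a typical OpenVINO model server deployment; consult the model server documentation for the exact endpoint.

```python
# Hypothetical client for the model server's OpenAI-compatible chat
# endpoint; host, port, /v3 path, and model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v3",  # assumed server endpoint
    api_key="unused",                     # the server does not check keys
)

stream = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # as configured on the server
    messages=[{"role": "user", "content": "What is OpenVINO?"}],
    stream=True,  # continuous batching serves many such streams at once
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```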
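And a minimal sketch of the Generate API preview from the new OpenVINO GenAI package (installed as `openvino-genai`); the model directory is a placeholder for an LLM already converted to OpenVINO format.

```python
# Minimal text generation with the preview Generate API; the model
# directory is a placeholder for a converted OpenVINO LLM.
import openvino_genai as ov_genai

pipe = ov_genai.LLMPipeline("TinyLlama-1.1B-Chat-v1.0-ov", "CPU")
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```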