Intel® Distribution of OpenVINO™ Toolkit

ID: 753640
Date: 11/20/2024

Introduction

This package contains the Intel® Distribution of OpenVINO™ Toolkit software version 2024.5 for Linux*, Windows* and macOS*.

Available Downloads

  • Debian Linux*
    • Size: 29.8 MB
    • SHA256: 6FBFF98E228D400609B50B0E8EE805B3FFBF0A2675DAC85D51F1ADC35F0F54F3
  • CentOS 7 (1908)*
    • Size: 52.2 MB
    • SHA256: 0986EED55951D7AE8ECFA300F5BFEFD4374087C3AA1E3523F45906FB3E69227F
  • Red Hat Enterprise Linux 8*
    • Size: 57 MB
    • SHA256: F0638B10DD063BA1EC00A9A61F1D80962567C9BACEAEB0142300BCC34F6F62B2
  • Ubuntu 20.04 LTS*
    • Size: 32.9 MB
    • SHA256: C96EE2B4B50ACE80DC71D171D3CFD188EE9686D2413778F73FC86C6340C5D0C9
  • Ubuntu 20.04 LTS*
    • Size: 48.8 MB
    • SHA256: 2EAE0638B595F844FB72903A1B42A2124C5D9645858FFA9B9B15C60E2F97C633
  • Ubuntu 22.04 LTS*
    • Size: 51.3 MB
    • SHA256: F597E56E405A03F67869985FB0B85D5A4E14C219AA8458DD2AD3017C022EA373
    • Size: 52.5 MB
    • SHA256: B602AE818064E4BB909B07BAC508A9B7C6A5DA1035D5D3899D9D99C5EABCBDE2
  • macOS*
    • Size: 138.7 MB
    • SHA256: AA1920E52D394387EA3ED8F3F817B598A2A13B8FEB9D14A5ED3FE77545896E0B
  • macOS*
    • Size: 33.4 MB
    • SHA256: F4CA2BB87032135359B2D96A6315F6E6DBCDE2ED5633BF2DC02BB20E190FC868
  • Windows 11*, Windows 10*
    • Size: 106.1 MB
    • SHA256: E30C60518B6A3CA5D7F1B4FC56673C5B55CAF1962A34F1B50FB6B8A6436AB0C7

Detailed Description

Summary of major features and improvements

  • More GenAI coverage and framework integrations to minimize code changes
    • New models supported: Llama* 3.2 (1B & 3B), Gemma* 2 (2B & 9B), and YOLO11*.
    • LLM support on NPU: Llama 3 8B, Llama 2 7B, Mistral-v0.2-7B, Qwen2-7B-Instruct, and Phi-3.
    • Noteworthy notebooks added: SAM2*, Llama 3.2*, Llama 3.2 Vision*, Wav2Lip*, Whisper*, and LLaVA*.
    • Preview: support for Flax*, a high-performance Python* neural network library based on JAX*. Its modular design allows for easy customization and accelerated inference on GPUs.
  • Broader Large Language Model (LLM) support and more model compression techniques.
    • Optimizations for built-in GPUs on Intel® Core™ Ultra Processors (Series 1) and Intel® Arc™ Graphics include KV cache compression for reduced memory use, usability improvements, and model load time optimizations that improve first token latency for LLMs.
    • Dynamic quantization was enabled to improve first token latency for LLMs on built-in Intel® GPUs without impacting accuracy on Intel® Core™ Ultra Processors (Series 1). Second token latency will also improve for large batch inference.
    • A new method to generate synthetic text data is implemented in the Neural Network Compression Framework (NNCF), allowing LLMs to be compressed more accurately using data-aware methods without datasets (a hedged sketch follows this feature list). This feature will soon be accessible via Optimum Intel on Hugging Face.
  • More portability and performance to run AI at the edge, in the cloud, or locally.
    • Support for Intel® Xeon® 6 Processors with P-cores (formerly codenamed Granite Rapids) and Intel® Core™ Ultra 200S series processors (formerly codenamed Arrow Lake-S).
    • Preview: the GenAI API enables multimodal AI deployment with support for multimodal pipelines for improved contextual awareness, transcription pipelines for easy audio-to-text conversions, and image generation pipelines for streamlined text-to-visual conversions.
    • A speculative decoding feature was added to the GenAI API for improved performance and efficient text generation, using a small draft model that is periodically corrected by the full-size model (see the sketch after this list).
    • Preview: LoRA adapters are now supported in the GenAI API, letting developers quickly and efficiently customize image and text generation models for specialized tasks (see the sketch after this list).
    • The GenAI API now also supports LLMs on NPU, allowing developers to specify NPU as the target device, specifically for the Whisper* pipeline (for whisper-base, whisper-medium, and whisper-small) and the LLM pipeline (for Llama 3 8B, Llama 2 7B, Mistral-v0.2-7B, Qwen2-7B-Instruct, and Phi-3 Mini-Instruct). Use NPU driver version 32.0.100.3104 or later for best performance (see the device-selection sketch after this list).
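
The NNCF synthetic-data flow might look roughly like the following; the generate_text_data entry point and its parameters are assumptions inferred from the description above, not a confirmed signature:

    import nncf
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # any causal LM
    model = AutoModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Assumed API: the model generates its own calibration text, which can
    # then feed a data-aware compression call such as nncf.compress_weights.
    synthetic_texts = nncf.data.generate_text_data(model, tokenizer, dataset_size=32)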
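
Speculative decoding pairs the full-size model with a small draft model, as described above; a minimal sketch using the openvino-genai Python package, with hypothetical model paths:

    import openvino_genai as ov_genai

    # Hypothetical IR folders: the small draft model proposes tokens that the
    # full-size main model then verifies or corrects.
    draft = ov_genai.draft_model("Llama-3.2-1B-ov", "CPU")
    pipe = ov_genai.LLMPipeline("Llama-3-8B-ov", "CPU", draft_model=draft)

    config = ov_genai.GenerationConfig()
    config.max_new_tokens = 128
    config.num_assistant_tokens = 5  # candidate tokens proposed per draft step
    print(pipe.generate("Explain speculative decoding in one paragraph.", config))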
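
A sketch of attaching a LoRA adapter through the GenAI API, assuming a safetensors adapter trained for the base model (both paths are hypothetical):

    import openvino_genai as ov_genai

    # Hypothetical files: a base model in OpenVINO IR plus a LoRA adapter.
    adapter = ov_genai.Adapter("adapter_model.safetensors")
    pipe = ov_genai.LLMPipeline("base-model-ov", "CPU",
                                adapters=ov_genai.AdapterConfig(adapter))

    # Generate with the adapter applied; passing an empty AdapterConfig()
    # would run the same request without it.
    print(pipe.generate("Write a haiku about inference at the edge.",
                        adapters=ov_genai.AdapterConfig(adapter),
                        max_new_tokens=64))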
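
Finally, a minimal sketch of the NPU device selection mentioned above, assuming a model already exported to OpenVINO IR (the model folder name is hypothetical):

    import openvino_genai as ov_genai

    # Hypothetical path to a model exported to OpenVINO IR, e.g. with
    # optimum-cli. The second argument selects the target device.
    pipe = ov_genai.LLMPipeline("Llama-3-8B-ov", "NPU")
    print(pipe.generate("What is OpenVINO?", max_new_tokens=100))

The transcription pipeline accepts the device the same way, e.g. ov_genai.WhisperPipeline("whisper-base-ov", "NPU").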

Support Change and Deprecation Notices

  • Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using discontinued features, you will have to revert to the last LTS OpenVINO™ version supporting them. For more details, refer to the OpenVINO Legacy Features and Components page.
  • Discontinued in 2024.0:
  • Deprecated and to be removed in the future:
    • The macOS x86_64 debug bins will no longer be provided with the OpenVINO toolkit, starting with OpenVINO 2024.5.
    • Python 3.8 is no longer supported, starting with OpenVINO 2024.5.
      • As MXNet does not support Python versions higher than 3.8 (per the MXNet PyPI project), OpenVINO no longer supports it either.
    • Discrete Keem Bay devices are no longer supported, starting with OpenVINO 2024.5.
    • NPU support for discrete devices (formerly codenamed Raptor Lake) is no longer available.

Installation instructions

You can choose how to install OpenVINO™ Runtime according to your operating system; step-by-step instructions for each supported OS are available in the OpenVINO installation documentation.
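
For Python-only development, the runtime can also be installed from PyPI (for example, pip install openvino, plus pip install openvino-genai for the GenAI API); the archive packages listed above provide the full C/C++ and Python runtime for offline installation.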


What's included in the download package

  • OpenVINO™ Runtime/Inference Engine for C/C++ and Python APIs
