Skip to content

2024.2.0

Compare
Choose a tag to compare
@artanokhov artanokhov released this 17 Jun 17:21
· 2219 commits to master since this release
5c0f38f

Summary of major features and improvements  

  • More Gen AI coverage and framework integrations to minimize code changes

    • Llama 3 optimizations for CPUs, built-in GPUs, and discrete GPUs for improved performance and efficient memory usage.
    • Support for Phi-3-mini, a family of AI models that leverages the power of small language models for faster, more accurate and cost-effective text processing.
    • Python Custom Operation is now enabled in OpenVINO making it easier for Python developers to code their custom operations instead of using C++ custom operations (also supported). Python Custom Operation empowers users to implement their own specialized operations into any model.
    • Notebooks expansion to ensure better coverage for new models. Noteworthy notebooks added: DynamiCrafter, YOLOv10, Chatbot notebook with Phi-3, and QWEN2.
  • Broader Large Language Model (LLM) support and more model compression techniques.

    • GPTQ method for 4-bit weight compression added to NNCF for more efficient inference and improved performance of compressed LLMs.
    • Significant LLM performance improvements and reduced latency for both built-in GPUs and discrete GPUs.
    • Significant improvement in 2nd token latency and memory footprint of FP16 weight LLMs on AVX2 (13th Gen Intel® Core™ processors) and AVX512 (3rd Gen Intel® Xeon® Scalable Processors) based CPU platforms, particularly for small batch sizes.
  • More portability and performance to run AI at the edge, in the cloud, or locally.

    • Model Serving Enhancements:
      • Preview: OpenVINO Model Server (OVMS) now supports OpenAI-compatible API along with Continuous Batching and PagedAttention, enabling significantly higher throughput for parallel inferencing, especially on Intel® Xeon® processors, when serving LLMs to many concurrent users.
      • OpenVINO backend for Triton Server now supports built-in GPUs and discrete GPUs, in addition to dynamic shapes support.
      • Integration of TorchServe through torch.compile OpenVINO backend for easy model deployment, provisioning to multiple instances, model versioning, and maintenance.
    • Preview: addition of the Generate API, a simplified API for text generation using large language models with only a few lines of code. The API is available through the newly launched OpenVINO GenAI package.
    • Support for Intel Atom® Processor X Series. For more details, see System Requirements.
    • Preview: Support for Intel® Xeon® 6 processor.

Support Change and Deprecation Notices

  • Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using discontinued features, you will have to revert to the last LTS OpenVINO version supporting them. For more details, refer to the OpenVINO Legacy Features and Components page.
  • Discontinued in 2024.0:
  • Deprecated and to be removed in the future:
    • The OpenVINO™ Development Tools package (pip install openvino-dev) will be removed from installation options and distribution channels beginning with OpenVINO 2025.0.
    • Model Optimizer will be discontinued with OpenVINO 2025.0. Consider using the new conversion methods instead. For more details, see the model conversion transition guide.
    • OpenVINO property Affinity API will be discontinued with OpenVINO 2025.0. It will be replaced with CPU binding configurations (ov::hint::enable_cpu_pinning).
    • OpenVINO Model Server components:
      • “auto shape” and “auto batch size” (reshaping a model in runtime) will be removed in the future. OpenVINO’s dynamic shape models are recommended instead.
    • A number of notebooks have been deprecated. For an up-to-date listing of available notebooks, refer to the OpenVINO™ Notebook index (openvinotoolkit.github.io).

You can find OpenVINO™ toolkit 2024.2 release here:

Acknowledgements

Thanks for contributions from the OpenVINO developer community:
@siddhant-0707
@adismort14
@LucaTamSapienza
@hongbo-wei
@awayzjj
@qxprakash
@keyonjie
@Huanli-Gong
@hegdeadithyak
@inbasperu
@Thodoris1999
@hongbo-wei
@himanshugupta11002
@tranchung163
@SANJITH-KUMAR-20
@anzr299
@Vladislav-Denisov

Release documentation is available here: https://docs.openvino.ai/2024
Release Notes are available here: https://www.intel.com/content/www/us/en/developer/articles/release-notes/openvino/2024-2.html