DeepSparse v1.1.0

jeanniefinks released this 25 Aug 19:29

New Features:

Python 3.10 support added.
Zero-shot text classification pipeline implemented.
Haystack Information Retrieval pipeline implemented.
YOLACT pipeline native integration for deployments is available.
DeepSparse pipelines now support dynamic batch, dynamic shape through bucketing, and asynchronous execution.
CustomTaskPipeline added to enable easier custom pipeline creation.
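
As a rough illustration of what "dynamic shape through bucketing" generally means, the sketch below picks the smallest fixed size that fits an input and pads up to it. This is not DeepSparse's implementation: the function names, the bucket sizes, and the pad value are all assumptions made for the example.

```python
from bisect import bisect_left
from typing import List, Sequence


def pick_bucket(length: int, buckets: Sequence[int]) -> int:
    """Return the smallest bucket size that can hold `length` items.

    Hypothetical helper, not a DeepSparse API.
    """
    ordered = sorted(buckets)
    # bisect_left finds the first bucket whose size is >= length.
    idx = bisect_left(ordered, length)
    if idx == len(ordered):
        raise ValueError(
            f"input of length {length} exceeds largest bucket {ordered[-1]}"
        )
    return ordered[idx]


def pad_to_bucket(
    token_ids: List[int], buckets: Sequence[int], pad_id: int = 0
) -> List[int]:
    """Pad `token_ids` with `pad_id` up to the chosen bucket size."""
    size = pick_bucket(len(token_ids), buckets)
    return token_ids + [pad_id] * (size - len(token_ids))


# Example bucket sizes (assumed values, chosen only for illustration).
BUCKETS = [64, 128, 256]
padded = pad_to_bucket(list(range(1, 71)), BUCKETS)
print(len(padded))
```

With a small set of fixed sizes, each input can be served by a model compiled for a static shape, at the cost of some padding on inputs that fall between bucket sizes.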

Changes:

The behavior of the Multi-stream scheduler is now identical to the Elastic scheduler, and the old Multi-stream scheduler has been removed.
NLP pipelines for question answering, text classification, and token classification upgraded to improve accuracy and better match the SparseML training pathways.
Updates made across the repository for new SparseZoo Python APIs.
Max torchvision version increased to 0.12.0 for computer vision deployment pathways.

Performance:

Inference performance improvements for:
- unstructured sparse quantized Transformer models.
- slow activation functions (such as Gelu or Swish) when they follow a QuantizeLinear operator.
- some sparse 1D convolutions. Speedups of up to 3x are observed.
- Squeeze, when operating on a single axis.

Resolved Issues:

Assertion errors no longer occur when one node has multiple inputs that both come from the same node.
An assertion error no longer occurs when a MatMul operator follows a Transpose or Reshape operator.
Pipelines now support hyphenated versions of standard task names such as question-answering.
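
One common way to accept hyphenated task names is to normalize them to a single canonical spelling before lookup. The sketch below assumes underscores are the canonical form. The helper names and the small task set are hypothetical and are not taken from the DeepSparse source.

```python
def normalize_task_name(task: str) -> str:
    """Map spellings like 'Question-Answering' to 'question_answering'.

    Hypothetical helper: assumes underscores are the canonical separator.
    """
    return task.strip().lower().replace("-", "_")


# Illustrative task set built from the NLP tasks named in these notes;
# the underscore spellings are an assumption.
SUPPORTED_TASKS = {
    "question_answering",
    "text_classification",
    "token_classification",
}


def resolve_task(task: str) -> str:
    """Return the canonical task name or raise if it is unknown."""
    key = normalize_task_name(task)
    if key not in SUPPORTED_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return key


print(resolve_task("question-answering"))
```

Normalizing once at the entry point keeps the lookup table small and means both spellings resolve to the same pipeline.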

Known Issues:

In the C++ interface, the engine will crash with a segmentation fault when the num_streams provided to the engine_context_t is greater than the number of physical CPU cores.
