DeepSparse v1.1.0

@jeanniefinks jeanniefinks released this 25 Aug 19:29
2f89568

New Features:

Changes:

  • The behavior of the Multi-stream scheduler is now identical to the Elastic scheduler, and the old Multi-stream scheduler has been removed.
  • NLP pipelines for question answering, text classification, and token classification upgraded to improve accuracy and better match the SparseML training pathways.
  • Updates made across the repository for new SparseZoo Python APIs.
  • Max torchvision version increased to 0.12.0 for computer vision deployment pathways.

Performance:

  • Inference performance improvements for:
    • unstructured sparse quantized Transformer models.
    • slow activation functions (such as Gelu or Swish) when they follow a QuantizeLinear operator.
    • some sparse 1D convolutions. Speedups of up to 3x are observed.
    • Squeeze, when operating on a single axis.

Resolved Issues:

  • An assertion error no longer occurs when a node has multiple inputs that come from the same node.
  • An assertion error no longer occurs when a MatMul operator follows a Transpose or Reshape operator.
  • Pipelines now support hyphenated versions of standard task names such as question-answering.
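
To illustrate what accepting hyphenated task names can look like, here is a minimal sketch that maps a hyphenated name to its underscored form before looking it up. The helper names `normalize_task_name` and `resolve_task` and the `TASKS` set are hypothetical and chosen for illustration only; this is not the DeepSparse implementation.

```python
# Hypothetical registry of underscored task names (illustration only,
# not the actual DeepSparse task registry).
TASKS = {"question_answering", "text_classification", "token_classification"}


def normalize_task_name(task: str) -> str:
    """Hypothetical helper: map a hyphenated task name to its underscored form."""
    return task.strip().lower().replace("-", "_")


def resolve_task(task: str) -> str:
    """Return the registered task name, accepting hyphens or underscores."""
    name = normalize_task_name(task)
    if name not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return name


if __name__ == "__main__":
    # Both spellings resolve to the same registered name.
    print(resolve_task("question-answering"))
    print(resolve_task("question_answering"))
```

Normalizing once at the lookup boundary keeps a single canonical spelling in the registry while accepting either form from callers.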

Known Issues:

  • In the C++ interface, the engine will crash with a segmentation fault when the num_streams provided to the engine_context_t is greater than the number of physical CPU cores.
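
Until this is resolved, one possible workaround is to cap the requested stream count at the physical core count before constructing the engine context. The sketch below shows only that check, written in Python for brevity; the same comparison would apply in C++ before `num_streams` is passed to `engine_context_t`. The helper name `safe_num_streams` is hypothetical, and the physical core count must be supplied by the caller, because Python's `os.cpu_count()` reports logical CPUs rather than physical cores.

```python
def safe_num_streams(requested: int, physical_cores: int) -> int:
    """Hypothetical helper: cap a requested stream count at the physical core count.

    Both arguments must be positive. The physical core count is an input
    because the standard library only reports logical CPUs.
    """
    if requested < 1:
        raise ValueError("requested must be at least 1")
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return min(requested, physical_cores)


if __name__ == "__main__":
    # Example: 16 streams requested on a machine assumed to have 8 physical cores.
    print(safe_num_streams(16, 8))
```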