Commit

Update readme based on feedback and review for deepsparse enterprise (#723) (#726)

* Update readme based on feedback and review for deepsparse enterprise

* fixes from discussion on deepsparse vs deepsparse-ent and where those docs will live. New destination is the docs.neuralmagic.com website

* add proper css and html for readme headers and chips

* update links in readme to deepsparse-ent for docs

* update license messaging

* update license messaging

Co-authored-by: Mark Kurtz <[email protected]>
rahul-tuli and markurtz authored Oct 27, 2022
1 parent 873594b commit f583c0b
Showing 1 changed file with 51 additions and 48 deletions.
99 changes: 51 additions & 48 deletions README.md
@@ -15,74 +15,88 @@ limitations under the License.
-->


<div align="center">
<h1><img alt="tool icon" src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/icon-deepsparse.png" />&nbsp;&nbsp;DeepSparse Engine</h1>
<p>
<b>
<h3> Sparsity-aware neural network inference engine for GPU-class performance on CPUs </h3>
</b>
</p>

<p>
<div style="display: flex; flex-direction: column; align-items: center;">
<h1>
<img alt="tool icon" src="https://raw.githubusercontent.com/neuralmagic/deepsparse/main/docs/source/icon-deepsparse.png" />
&nbsp;&nbsp;DeepSparse Engine
</h1>
<h3> Sparsity-aware neural network inference engine for GPU-class performance on CPUs </h3>
<div style="display: flex; align-items: center; justify-content: center; flex-wrap: wrap">
<a href="https://docs.neuralmagic.com/deepsparse/">
<img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height=25>
<img alt="Documentation" src="https://img.shields.io/badge/documentation-darkred?&style=for-the-badge&logo=read-the-docs" height="25" />
</a>
<a href="https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ/">
<img src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height=25>
<img alt="Slack" src="https://img.shields.io/badge/slack-purple?style=for-the-badge&logo=slack" height="25" />
</a>
<a href="https://github.com/neuralmagic/deepsparse/issues/">
<img src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height=25>
<img alt="Support" src="https://img.shields.io/badge/support%20forums-navy?style=for-the-badge&logo=github" height="25" />
</a>
<a href="https://github.com/neuralmagic/deepsparse/actions/workflows/quality-check.yaml">
<img alt="Main" src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height=25>
<img alt="Main" src="https://img.shields.io/github/workflow/status/neuralmagic/deepsparse/Quality%20Checks/main?label=build&style=for-the-badge" height="25" />
</a>
<a href="https://github.com/neuralmagic/deepsparse/releases">
<img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height=25>
<img alt="GitHub release" src="https://img.shields.io/github/release/neuralmagic/deepsparse.svg?style=for-the-badge" height="25" />
</a>
<a href="https://github.com/neuralmagic/deepsparse/blob/main/CODE_OF_CONDUCT.md">
<img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height=25>
<img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg?color=yellow&style=for-the-badge" height="25" />
</a>
<a href="https://www.youtube.com/channel/UCo8dO_WMGYbWCRnj_Dxr4EA">
<img src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height=25>
<img alt="YouTube" src="https://img.shields.io/badge/-YouTube-red?&style=for-the-badge&logo=youtube&logoColor=white" height="25" />
</a>
<a href="https://medium.com/limitlessai">
<img src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height=25>
<a href="https://medium.com/limitlessai">
<img alt="Medium" src="https://img.shields.io/badge/medium-%2312100E.svg?&style=for-the-badge&logo=medium&logoColor=white" height="25" />
</a>
<a href="https://twitter.com/neuralmagic">
<img src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height=25>
<img alt="Twitter" src="https://img.shields.io/twitter/follow/neuralmagic?color=darkgreen&label=Follow&style=social" height="25" />
</a>
</p>

<br>
</div>
</div>

A CPU runtime that takes advantage of sparsity within neural networks to reduce compute. Read more about sparsification [here](https://docs.neuralmagic.com/main/source/getstarted.html#sparsification).

Neural Magic's DeepSparse Engine is able to integrate into popular deep learning libraries (e.g., Hugging Face, Ultralytics) allowing you to leverage DeepSparse for loading and deploying sparse models with ONNX. ONNX gives the flexibility to serve your model in a framework-agnostic environment. Support includes [PyTorch,](https://pytorch.org/docs/stable/onnx.html) [TensorFlow,](https://github.com/onnx/tensorflow-onnx) [Keras,](https://github.com/onnx/keras-onnx) and [many other frameworks](https://github.com/onnx/onnxmltools).
Neural Magic's DeepSparse Engine is able to integrate into popular deep learning libraries (e.g., Hugging Face, Ultralytics) allowing you to leverage DeepSparse for loading and deploying sparse models with ONNX.
ONNX gives the flexibility to serve your model in a framework-agnostic environment.
Support includes [PyTorch,](https://pytorch.org/docs/stable/onnx.html) [TensorFlow,](https://github.com/onnx/tensorflow-onnx) [Keras,](https://github.com/onnx/keras-onnx) and [many other frameworks](https://github.com/onnx/onnxmltools).

The DeepSparse Engine is available in two editions:
1. [**The Community Edition**](#installation) is open-source and free for evaluation, research, and non-production use with our [Engine Community License](https://neuralmagic.com/legal/engine-license-agreement/).
2. [**The Enterprise Edition**](https://docs.neuralmagic.com/products/deepsparse-ent) requires a Trial License or [can be fully licensed](https://neuralmagic.com/legal/master-software-license-and-service-agreement/) for production, commercial applications.

## Features

- 🔌 [DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server)
- 📜 [DeepSparse Benchmark](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark)
- 👩‍💻 [NLP and Computer Vision Tasks Supported](https://github.com/neuralmagic/deepsparse/tree/main/examples)
- 🧰 [CPU Hardware Support for Various Architectures](https://docs.neuralmagic.com/deepsparse/source/hardware.html)

## 🧰 Hardware Support and System Requirements

Review [CPU Hardware Support for Various Architectures](https://docs.neuralmagic.com/deepsparse/source/hardware.html) to understand system requirements.
The DeepSparse Engine works natively on Linux; Mac and Windows require running Linux in a Docker or virtual machine; it will not run natively on those operating systems.

The DeepSparse Engine is tested on Python 3.7-3.10, ONNX 1.5.0-1.12.0, ONNX opset version 11+, and manylinux compliant.
Using a [virtual environment](https://docs.python.org/3/library/venv.html) is highly recommended.
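
As a rough illustration of the requirements above, the following sketch checks whether the current interpreter and operating system fall inside the tested range (Python 3.7-3.10 on Linux). It uses only the Python standard library; the function name `environment_issues` and the constant names are hypothetical and are not part of the DeepSparse package.

```python
import platform
import sys

# Range stated in the README: tested on Python 3.7 through 3.10.
TESTED_PYTHON_MIN = (3, 7)
TESTED_PYTHON_MAX = (3, 10)


def environment_issues(version=None, system=None):
    """Return a list of warnings; an empty list means no issue was found.

    `version` is a (major, minor) tuple and `system` is the value of
    platform.system(); both default to the running interpreter's values.
    """
    version = tuple(version or sys.version_info[:2])[:2]
    system = system or platform.system()

    issues = []
    if not (TESTED_PYTHON_MIN <= version <= TESTED_PYTHON_MAX):
        issues.append(
            "Python %d.%d is outside the tested range 3.7-3.10" % version
        )
    if system != "Linux":
        issues.append(
            "%s is not supported natively; run Linux in Docker or a virtual machine"
            % system
        )
    return issues


if __name__ == "__main__":
    for issue in environment_issues():
        print("warning:", issue)
```

Running the script before `pip install deepsparse` prints one warning per mismatch and nothing when the environment matches the tested configuration.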

## Installation
The DeepSparse Engine is tested on Python 3.7-3.10, ONNX 1.5.0-1.12.0, ONNX opset version 11+, and manylinux compliant. Using a [virtual environment](https://docs.python.org/3/library/venv.html) is highly recommended. Install the engine using the following command:

Install the DeepSparse Community Edition as follows:

```bash
pip install deepsparse
```

## 🔌 DeepSparse Server
To trial or inquire about licensing for DeepSparse Enterprise Edition, see the [DeepSparse Enterprise documentation](https://docs.neuralmagic.com/products/deepsparse-ent).

## Features

### 🔌 DeepSparse Server

The DeepSparse Server allows you to serve models and pipelines from the terminal. The server runs on top of the popular FastAPI web framework and Uvicorn web server. Install the server using the following command:

```bash
pip install deepsparse[server]
```

### Single Model
#### Single Model

Once installed, the following example CLI command is available for running inference with a single BERT model:

@@ -94,7 +108,7 @@ deepsparse.server \

To look up arguments run: `deepsparse.server --help`.

### Multiple Models
#### Multiple Models
To serve multiple models in your deployment you can easily build a `config.yaml`. In the example below, we define two BERT models in our configuration for the question answering task:

```yaml
@@ -118,7 +132,7 @@ deepsparse.server config config.yaml

[Getting Started with the DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) for more info.

## 📜 DeepSparse Benchmark
### 📜 DeepSparse Benchmark

The benchmark tool is available on your CLI to run expressive model benchmarks on the DeepSparse Engine with minimal parameters.

@@ -138,7 +152,7 @@ deepsparse.benchmark [-h] [-b BATCH_SIZE] [-shapes INPUT_SHAPES]
- Asynchronous (Multi-stream) Scenario


## 👩‍💻 NLP Inference Example
### 👩‍💻 NLP Inference Example

```python
from deepsparse import Pipeline
@@ -164,7 +178,7 @@ Tasks Supported:
- [Text Classification: Sentiment Analysis](https://neuralmagic.com/use-cases/sparse-sentiment-analysis/)
- [Question Answering](https://neuralmagic.com/use-cases/sparse-question-answering/)

## 🦉 SparseZoo ONNX vs. Custom ONNX Models
### 🦉 SparseZoo ONNX vs. Custom ONNX Models

DeepSparse can accept ONNX models from two sources:

@@ -194,7 +208,7 @@ outputs = engine.run(inputs)

The [GitHub repository](https://github.com/neuralmagic/deepsparse) includes package APIs along with examples to quickly get started benchmarking and inferencing sparse models.

## Scheduling Single-Stream, Multi-Stream, and Elastic Inference
### Scheduling Single-Stream, Multi-Stream, and Elastic Inference

The DeepSparse Engine offers up to three types of inferences based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md).

@@ -214,22 +228,7 @@ PRO TIP: The most common use cases for the multi-stream scheduler are where para

Use Case: A workload that might benefit from the elastic scheduler is one in which multiple requests need to be handled simultaneously, but where performance is hindered when those requests have to share an L3 cache.

## 🧰 CPU Hardware Support

With support for AVX2, AVX-512, and VNNI instruction sets, the DeepSparse Engine is validated to work on x86 Intel (Haswell generation and later) and AMD CPUs running Linux. Mac and Windows require running Linux in a Docker or virtual machine.

Here is a table detailing specific support for some algorithms over different microarchitectures:

| x86 Extension | Microarchitectures | Activation Sparsity | Kernel Sparsity | Sparse Quantization |
|:------------------:|:-----------------------------------:|:-------------------:|:---------------:|:-------------------:|
| [AMD AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2) | [Zen 2,](https://en.wikipedia.org/wiki/Zen_2) [Zen 3](https://en.wikipedia.org/wiki/Zen_3) | not supported | optimized | emulated |
| [Intel AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2) | [Haswell,](<https://en.wikipedia.org/wiki/Haswell_(microarchitecture)>) [Broadwell,](<https://en.wikipedia.org/wiki/Broadwell_(microarchitecture)>) and newer | not supported | optimized | emulated |
| [Intel AVX-512](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512) | [Skylake,](<https://en.wikipedia.org/wiki/Skylake_(microarchitecture)>) [Cannon Lake,](<https://en.wikipedia.org/wiki/Cannon_Lake_(microarchitecture)>) and newer | optimized | optimized | emulated |
| [Intel AVX-512](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512) VNNI (DL Boost) | [Cascade Lake,](<https://en.wikipedia.org/wiki/Cascade_Lake_(microarchitecture)>) [Ice Lake,](<https://en.wikipedia.org/wiki/Ice_Lake_(microprocessor)>) [Cooper Lake,](<https://en.wikipedia.org/wiki/Cooper_Lake_(microarchitecture)>) [Tiger Lake](<https://en.wikipedia.org/wiki/Tiger_Lake_(microprocessor)>) | optimized | optimized | optimized |
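
To see which row of the table a given Linux machine falls into, one option is to read the `flags` line of `/proc/cpuinfo`. The sketch below is an illustration only and is not how the DeepSparse Engine itself detects hardware; the helper name `detect_extensions` is hypothetical, and the mapping assumes the usual Linux flag names (`avx2`, `avx512f`, `avx512_vnni`).

```python
from pathlib import Path

# Assumed mapping from the table's extension names to the flag names
# that Linux reports in the "flags" line of /proc/cpuinfo.
EXTENSION_FLAGS = {
    "AVX2": "avx2",
    "AVX-512": "avx512f",
    "AVX-512 VNNI": "avx512_vnni",
}


def detect_extensions(cpuinfo_text):
    """Map each extension name to True/False based on cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return {name: flag in flags for name, flag in EXTENSION_FLAGS.items()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for name, present in detect_extensions(cpuinfo.read_text()).items():
            print(name, "available" if present else "not reported")
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux")
```

Parsing text rather than reading the file inside the function keeps the helper easy to try against captured `cpuinfo` output from another machine.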

## Resources


#### Libraries
- [DeepSparse](https://docs.neuralmagic.com/deepsparse/)

@@ -267,7 +266,11 @@ For more general questions about Neural Magic, complete this [form.](http://neur

### License

The project's binary containing the DeepSparse Engine is licensed under the [Neural Magic Engine License.](https://github.com/neuralmagic/deepsparse/blob/main/LICENSE-NEURALMAGIC) Example files and scripts included in this repository are licensed under the [Apache License Version 2.0](https://github.com/neuralmagic/deepsparse/blob/main/LICENSE) as noted.
The Community Edition of the project's binary containing the DeepSparse Engine is licensed under the [Neural Magic Engine License.](https://github.com/neuralmagic/deepsparse/blob/main/LICENSE-NEURALMAGIC)
Example files and scripts included in this repository are licensed under the [Apache License Version 2.0](https://github.com/neuralmagic/deepsparse/blob/main/LICENSE) as noted.

[The Enterprise Edition](https://docs.neuralmagic.com/products/deepsparse-ent) requires a Trial License or [can be fully licensed](https://neuralmagic.com/legal/master-software-license-and-service-agreement/) for production, commercial applications.

### Cite

Find this project useful in your research or other communications? Please consider citing: