Mass integration for 23.08 release
1. update pytorch-quantization to 2.1.3
2. update SD to support torch 2.x
3. update docker container
4. misc fixes in samples

Signed-off-by: Vincent Huang <[email protected]>
ttyio authored and rajeevsrao committed Aug 7, 2023
1 parent a167852 commit 35477bd
Showing 85 changed files with 560 additions and 314 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -31,7 +31,7 @@ To build the TensorRT-OSS components, you will first need the following software
**System Packages**
* [CUDA](https://developer.nvidia.com/cuda-toolkit)
* Recommended versions:
* cuda-12.0.1 + cuDNN-8.8
* cuda-12.2.0 + cuDNN-8.8
* cuda-11.8.0 + cuDNN-8.8
* [GNU make](https://ftp.gnu.org/gnu/make/) >= v4.1
* [cmake](https://github.com/Kitware/CMake/releases) >= v3.13
@@ -99,9 +99,9 @@ For Linux platforms, we recommend that you generate a docker container for build
1. #### Generate the TensorRT-OSS build container.
The TensorRT-OSS build container can be generated using the supplied Dockerfiles and build scripts. The build containers are configured for building TensorRT OSS out-of-the-box.
**Example: Ubuntu 20.04 on x86-64 with cuda-12.0 (default)**
**Example: Ubuntu 20.04 on x86-64 with cuda-12.1 (default)**
```bash
./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.0
./docker/build.sh --file docker/ubuntu-20.04.Dockerfile --tag tensorrt-ubuntu20.04-cuda12.1
```
**Example: CentOS/RedHat 7 on x86-64 with cuda-11.8**
```bash
@@ -119,7 +119,7 @@ For Linux platforms, we recommend that you generate a docker container for build
2. #### Launch the TensorRT-OSS build container.
**Example: Ubuntu 20.04 build container**
```bash
./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.0 --gpus all
./docker/launch.sh --tag tensorrt-ubuntu20.04-cuda12.1 --gpus all
```
> NOTE:
<br> 1. Use the `--tag` corresponding to build container generated in Step 1.
@@ -130,7 +130,7 @@ For Linux platforms, we recommend that you generate a docker container for build
## Building TensorRT-OSS
* Generate Makefiles and build.
**Example: Linux (x86-64) build with default cuda-12.0**
**Example: Linux (x86-64) build with default cuda-12.1**
```bash
cd $TRT_OSSPATH
mkdir -p build && cd build
@@ -146,7 +146,7 @@ For Linux platforms, we recommend that you generate a docker container for build
export PATH="/opt/rh/devtoolset-8/root/bin:${PATH}"
```
**Example: Linux (aarch64) build with default cuda-12.0**
**Example: Linux (aarch64) build with default cuda-12.1**
```bash
cd $TRT_OSSPATH
mkdir -p build && cd build
@@ -174,7 +174,7 @@ For Linux platforms, we recommend that you generate a docker container for build
> NOTE: The latest JetPack SDK v5.1 only supports TensorRT 8.5.2.
> NOTE:
<br> 1. The default CUDA version used by CMake is 12.0.1. To override this, for example to 11.8, append `-DCUDA_VERSION=11.8` to the cmake command.
<br> 1. The default CUDA version used by CMake is 11.4.1. To override this, for example to 11.8, append `-DCUDA_VERSION=11.8` to the cmake command.
<br> 2. If samples fail to link on CentOS7, create this symbolic link: `ln -s $TRT_OUT_DIR/libnvinfer_plugin.so $TRT_OUT_DIR/libnvinfer_plugin.so.8`
* Required CMake build arguments are:
- `TRT_LIB_DIR`: Path to the TensorRT installation directory containing libraries.
2 changes: 1 addition & 1 deletion demo/Diffusion/README.md
@@ -16,7 +16,7 @@ cd TensorRT
Install nvidia-docker using [these intructions](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker).

```bash
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.02-py3 /bin/bash
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.06-py3 /bin/bash
```

### Install latest TensorRT release
2 changes: 1 addition & 1 deletion demo/Diffusion/demo_img2img.py
@@ -94,7 +94,7 @@ def parseArgs():
force_export=args.force_onnx_export, force_optimize=args.force_onnx_optimize, \
force_build=args.force_engine_build, \
static_batch=args.build_static_batch, static_shape=not args.build_dynamic_shape, \
enable_refit=args.build_enable_refit, enable_preview=args.build_preview_features, enable_all_tactics=args.build_all_tactics, \
enable_refit=args.build_enable_refit, enable_all_tactics=args.build_all_tactics, \
timing_cache=args.timing_cache, onnx_refit_dir=args.onnx_refit_dir)
demo.loadResources(image_height, image_width, batch_size, args.seed)

2 changes: 1 addition & 1 deletion demo/Diffusion/demo_inpaint.py
@@ -104,7 +104,7 @@ def parseArgs():
force_export=args.force_onnx_export, force_optimize=args.force_onnx_optimize, \
force_build=args.force_engine_build, \
static_batch=args.build_static_batch, static_shape=not args.build_dynamic_shape, \
enable_preview=args.build_preview_features, enable_all_tactics=args.build_all_tactics, \
enable_all_tactics=args.build_all_tactics, \
timing_cache=args.timing_cache)
demo.loadResources(image_height, image_width, batch_size, args.seed)

2 changes: 1 addition & 1 deletion demo/Diffusion/demo_txt2img.py
@@ -82,7 +82,7 @@ def parseArgs():
force_export=args.force_onnx_export, force_optimize=args.force_onnx_optimize, \
force_build=args.force_engine_build, \
static_batch=args.build_static_batch, static_shape=not args.build_dynamic_shape, \
enable_refit=args.build_enable_refit, enable_preview=args.build_preview_features, enable_all_tactics=args.build_all_tactics, \
enable_refit=args.build_enable_refit, enable_all_tactics=args.build_all_tactics, \
timing_cache=args.timing_cache, onnx_refit_dir=args.onnx_refit_dir)
demo.loadResources(image_height, image_width, batch_size, args.seed)

2 changes: 1 addition & 1 deletion demo/Diffusion/requirements.txt
@@ -11,5 +11,5 @@ onnxruntime==1.14.1
onnx-graphsurgeon==0.3.26
polygraphy==0.47.1
scipy
torch<2.0.0
torch
transformers==4.26.1
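
The requirement change above lifts the `torch<2.0.0` cap, so the installed torch may now be 1.x or 2.x. As a minimal sketch that is not part of this commit, a script could report which major version is present. The helper names `major_version` and `installed_torch_major` are hypothetical, and only the standard library is used.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.0.1+cu118'."""
    return int(version.split(".", 1)[0])


def installed_torch_major():
    """Return the installed torch major version, or None if torch is absent."""
    try:
        return major_version(metadata.version("torch"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("torch major version:", installed_torch_major())
```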
4 changes: 0 additions & 4 deletions demo/Diffusion/stable_diffusion_pipeline.py
@@ -195,7 +195,6 @@ def loadEngines(
static_batch=False,
static_shape=True,
enable_refit=False,
enable_preview=False,
enable_all_tactics=False,
timing_cache=None,
onnx_refit_dir=None,
@@ -229,8 +228,6 @@ def loadEngines(
Build engine only for specified opt_image_height & opt_image_width. Default = True.
enable_refit (bool):
Build engines with refit option enabled.
enable_preview (bool):
Enable TensorRT preview features.
enable_all_tactics (bool):
Enable all tactic sources during TensorRT engine builds.
timing_cache (str):
@@ -304,7 +301,6 @@ def loadEngines(
static_batch=static_batch, static_shape=static_shape
),
enable_refit=enable_refit,
enable_preview=enable_preview,
enable_all_tactics=enable_all_tactics,
timing_cache=timing_cache,
workspace_size=self.max_workspace_size)
7 changes: 1 addition & 6 deletions demo/Diffusion/utilities.py
@@ -190,7 +190,7 @@ def map_name(name):
print("Failed to refit!")
exit(0)

def build(self, onnx_path, fp16, input_profile=None, enable_refit=False, enable_preview=False, enable_all_tactics=False, timing_cache=None, workspace_size=0):
def build(self, onnx_path, fp16, input_profile=None, enable_refit=False, enable_all_tactics=False, timing_cache=None, workspace_size=0):
print(f"Building TensorRT engine for {onnx_path}: {self.engine_path}")
p = Profile()
if input_profile:
@@ -200,10 +200,6 @@ def build(self, onnx_path, fp16, input_profile=None, enable_refit=False, enable_

config_kwargs = {}

config_kwargs['preview_features'] = [trt.PreviewFeature.DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]
if enable_preview:
# Faster dynamic shapes made optional since it increases engine build time.
config_kwargs['preview_features'].append(trt.PreviewFeature.FASTER_DYNAMIC_SHAPES_0805)
if workspace_size > 0:
config_kwargs['memory_pool_limits'] = {trt.MemoryPoolType.WORKSPACE: workspace_size}
if not enable_all_tactics:
@@ -1201,7 +1197,6 @@ def add_arguments(parser):
parser.add_argument('--build-static-batch', action='store_true', help="Build TensorRT engines with fixed batch size.")
parser.add_argument('--build-dynamic-shape', action='store_true', help="Build TensorRT engines with dynamic image shapes.")
parser.add_argument('--build-enable-refit', action='store_true', help="Enable Refit option in TensorRT engines during build.")
parser.add_argument('--build-preview-features', action='store_true', help="Build TensorRT engines with preview features.")
parser.add_argument('--build-all-tactics', action='store_true', help="Build TensorRT engines using all tactic sources.")
parser.add_argument('--timing-cache', default=None, type=str, help="Path to the precached timing measurements to accelerate build.")
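
With `--build-preview-features` removed from `add_arguments`, existing command lines that still pass it are no longer recognized. The following is a minimal sketch of that effect, using only the build flags visible in this hunk. `make_parser` is a hypothetical stand-in for the demo's real parser, which defines more options than are shown here.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Parser holding only the build flags that remain in this hunk."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--build-static-batch', action='store_true')
    parser.add_argument('--build-dynamic-shape', action='store_true')
    parser.add_argument('--build-enable-refit', action='store_true')
    parser.add_argument('--build-all-tactics', action='store_true')
    parser.add_argument('--timing-cache', default=None, type=str)
    return parser


# parse_known_args collects unrecognized flags instead of exiting,
# which makes the removed option easy to spot.
args, unknown = make_parser().parse_known_args(
    ['--build-enable-refit', '--build-preview-features']
)
print(args.build_enable_refit, unknown)
```

With plain `parse_args`, the same command line would terminate with an "unrecognized arguments" error, so scripts that invoke the demos with the old flag need updating.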

4 changes: 2 additions & 2 deletions demo/HuggingFace/GPT2/GPT2ModelConfig.py
@@ -51,7 +51,7 @@ def add_args(parser: argparse.ArgumentParser) -> None:
network_group.add_argument(
"--num-beams", type=int, default=1, help="Enables beam search during decoding."
)

network_group.add_argument(
"--fp16", action="store_true", help="Enables fp16 TensorRT tactics."
)
@@ -84,7 +84,7 @@ def add_benchmarking_args(parser: argparse.ArgumentParser) -> None:


class GPT2ModelTRTConfig(NNConfig):
TARGET_MODELS = ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl", "EleutherAI/gpt-j-6B"]
TARGET_MODELS = ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl", "EleutherAI/gpt-j-6b"]
NETWORK_DECODER_SEGMENT_NAME = "gpt2_decoder"
NETWORK_SEGMENTS = [NETWORK_DECODER_SEGMENT_NAME]
NETWORK_FULL_NAME = "full"
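
The only change to `TARGET_MODELS` is the case of the GPT-J entry (`6B` to `6b`). Assuming the list is checked by exact string comparison, which this hunk does not show, the old spelling would no longer match. A minimal sketch, with `is_supported` as a hypothetical helper:

```python
TARGET_MODELS = ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl", "EleutherAI/gpt-j-6b"]


def is_supported(variant: str) -> bool:
    """Exact, case-sensitive membership test against TARGET_MODELS."""
    return variant in TARGET_MODELS


print(is_supported("EleutherAI/gpt-j-6b"))  # new spelling
print(is_supported("EleutherAI/gpt-j-6B"))  # old spelling
```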
8 changes: 6 additions & 2 deletions docker/centos-7.Dockerfile
@@ -15,7 +15,7 @@
# limitations under the License.
#

ARG CUDA_VERSION=12.0.1
ARG CUDA_VERSION=12.1.1

FROM nvidia/cuda:${CUDA_VERSION}-cudnn8-devel-centos7
LABEL maintainer="NVIDIA CORPORATION"
@@ -60,7 +60,11 @@ RUN if [ "${CUDA_VERSION}" = "10.2" ] ; then \
libnvinfer-lean-devel-=${v} libnvinfer-vc-plugin8-=${v} libnvinfer-vc-plugin-devel-=${v} \
libnvinfer-headers-devel-=${v} libnvinfer-headers-plugin-devel-=${v}; \
else \
v="${TRT_VERSION}-1.cuda${CUDA_VERSION%.*}" &&\
ver="${CUDA_VERSION%.*}" &&\
if [ "${ver%.*}" = "12" ] ; then \
ver="12.0"; \
fi &&\
v="${TRT_VERSION}-1.cuda${ver}" &&\
yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo &&\
yum -y install libnvinfer8-${v} libnvparsers8-${v} libnvonnxparsers8-${v} libnvinfer-plugin8-${v} \
libnvinfer-devel-${v} libnvparsers-devel-${v} libnvonnxparsers-devel-${v} libnvinfer-plugin-devel-${v} \
8 changes: 6 additions & 2 deletions docker/ubuntu-20.04-aarch64.Dockerfile
@@ -15,7 +15,7 @@
# limitations under the License.
#

ARG CUDA_VERSION=12.0.1
ARG CUDA_VERSION=12.2.0

# Multi-arch container support available in non-cudnn containers.
FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu20.04
@@ -69,7 +69,11 @@ RUN apt-get install -y --no-install-recommends \
ln -s /usr/bin/pip3 pip;

# Install TensorRT. This will also pull in CUDNN
RUN v="${TRT_VERSION}-1+cuda${CUDA_VERSION%.*}" &&\
RUN ver="${CUDA_VERSION%.*}" &&\
if [ "${ver%.*}" = "12" ] ; then \
ver="12.0"; \
fi &&\
v="${TRT_VERSION}-1+cuda${ver}" &&\
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/3bf863cc.pub &&\
apt-get update &&\
sudo apt-get -y install libnvinfer8=${v} libnvonnxparsers8=${v} libnvparsers8=${v} libnvinfer-plugin8=${v} \
8 changes: 6 additions & 2 deletions docker/ubuntu-20.04.Dockerfile
@@ -15,7 +15,7 @@
# limitations under the License.
#

ARG CUDA_VERSION=12.0.1
ARG CUDA_VERSION=12.1.1

FROM nvidia/cuda:${CUDA_VERSION}-cudnn8-devel-ubuntu20.04
LABEL maintainer="NVIDIA CORPORATION"
@@ -79,7 +79,11 @@ RUN if [ "${CUDA_VERSION}" = "10.2" ] ; then \
libnvinfer-lean-dev=${v} libnvinfer-vc-plugin8=${v} libnvinfer-vc-plugin-dev=${v} \
libnvinfer-headers-dev=${v} libnvinfer-headers-plugin-dev=${v}; \
else \
v="${TRT_VERSION}-1+cuda${CUDA_VERSION%.*}" &&\
ver="${CUDA_VERSION%.*}" &&\
if [ "${ver%.*}" = "12" ] ; then \
ver="12.0"; \
fi &&\
v="${TRT_VERSION}-1+cuda${ver}" &&\
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub &&\
apt-get update &&\
sudo apt-get -y install libnvinfer8=${v} libnvonnxparsers8=${v} libnvparsers8=${v} libnvinfer-plugin8=${v} \
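
All three Dockerfiles now derive the TensorRT package version the same way: strip the patch component from `CUDA_VERSION`, then pin any CUDA 12.x to the `cuda12.0` package suffix. The logic can be sketched in Python as below. The function names are hypothetical and the TensorRT version string is a placeholder, not a value taken from this commit. The `sep` parameter reflects the `-1+cuda` (Ubuntu) versus `-1.cuda` (CentOS) difference visible in the hunks.

```python
def strip_last_component(version: str) -> str:
    """Mimic the shell expansion ${var%.*}: drop the final '.<suffix>' if present."""
    return version.rsplit(".", 1)[0]


def tensorrt_package_version(trt_version: str, cuda_version: str, sep: str = "+") -> str:
    """Build the package version string used to pin the TensorRT packages."""
    ver = strip_last_component(cuda_version)   # "12.1.1" -> "12.1"
    if strip_last_component(ver) == "12":      # any CUDA 12.x release
        ver = "12.0"                           # packages are tagged cuda12.0
    return f"{trt_version}-1{sep}cuda{ver}"


TRT_VERSION = "8.6.1.6"  # placeholder value, not taken from this commit

print(tensorrt_package_version(TRT_VERSION, "12.1.1"))           # Ubuntu style
print(tensorrt_package_version(TRT_VERSION, "12.1.1", sep="."))  # CentOS style
print(tensorrt_package_version(TRT_VERSION, "11.8.0"))           # CUDA 11.x is left as-is
```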
2 changes: 1 addition & 1 deletion python/CMakeLists.txt
@@ -35,7 +35,7 @@ endfunction()
# -------- CMAKE OPTIONS --------

set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/${TENSORRT_MODULE}/)
set(CPP_STANDARD 11 CACHE STRING "CPP Standard Version")
set(CPP_STANDARD 14 CACHE STRING "CPP Standard Version")
set(CMAKE_CXX_STANDARD ${CPP_STANDARD})

if (NOT MSVC)
28 changes: 20 additions & 8 deletions python/README.md
@@ -2,6 +2,11 @@

## Installation

### Set environment variables

Set `TRT_OSSPATH` and `TRT_LIBPATH` environment variables to point to your OSS clone
and TensorRT library location, respectively.

### Download pybind11

Create a directory for external sources and download pybind11 into it.
@@ -19,12 +24,12 @@ git clone https://github.com/pybind/pybind11.git
1. Get the source code from the official [python sources](https://www.python.org/downloads/source/)
2. Copy the contents of the `Include/` directory into `$EXT_PATH/pythonX.Y/include/` directory.

Example: Python 3.9
Example: Python 3.10
```bash
wget https://www.python.org/ftp/python/3.9.16/Python-3.9.16.tgz
tar -xvf Python-3.9.16.tgz
mkdir -p $EXT_PATH/python3.9
cp -r Python-3.9.16/Include/ $EXT_PATH/python3.9/include
wget https://www.python.org/ftp/python/3.10.11/Python-3.10.11.tgz
tar -xvf Python-3.10.11.tgz
mkdir -p $EXT_PATH/python3.10/include
cp -r Python-3.10.11/Include/* $EXT_PATH/python3.10/include
```

#### Add PyConfig.h
@@ -36,15 +41,22 @@ cp -r Python-3.9.16/Include/ $EXT_PATH/python3.9/include
3. Unpack the contained `data.tar.xz` with `tar -xvf`
4. Find `pyconfig.h` in the `./usr/include/<platform>/pythonX.Y/` directory and copy it into `$EXT_PATH/pythonX.Y/include/`.

Example: Python 3.10
```bash
wget http://http.us.debian.org/debian/pool/main/p/python3.10/libpython3.10-dev_3.10.12-1_amd64.deb
ar x libpython3.10-dev*.deb
mkdir debian && tar -xf data.tar.xz -C debian
cp debian/usr/include/x86_64-linux-gnu/python3.10/pyconfig.h python3.10/include/
```

### Build Python bindings

Use `build.sh` to generate the installable wheels for intended python version and target architecture.
Use `build.sh` to generate the installable wheels for the intended Python version and target architecture.

Example: for Python 3.9 `x86_64` wheel,
Example: for Python 3.10 `x86_64` wheel,
```bash
cd $TRT_OSSPATH/python
TENSORRT_MODULE=tensorrt PYTHON_MAJOR_VERSION=3 PYTHON_MINOR_VERSION=9 TARGET_ARCHITECTURE=x86_64 ./build.sh
TENSORRT_MODULE=tensorrt PYTHON_MAJOR_VERSION=3 PYTHON_MINOR_VERSION=10 TARGET_ARCHITECTURE=x86_64 ./build.sh
```

### Install the python wheel
2 changes: 1 addition & 1 deletion python/build.sh
@@ -62,10 +62,10 @@ pushd ${ROOT_PATH}/python/packaging
for dir in $(find . -type d); do mkdir -p ${WHEEL_OUTPUT_DIR}/$dir; done
for file in $(find . -type f); do expand_vars_cp $file ${WHEEL_OUTPUT_DIR}/${file}; done
popd

cp tensorrt/tensorrt.so bindings_wheel/tensorrt/tensorrt.so

pushd ${WHEEL_OUTPUT_DIR}/bindings_wheel

python3 setup.py -q bdist_wheel --python-tag=cp${PYTHON_MAJOR_VERSION}${PYTHON_MINOR_VERSION} --plat-name=linux_${TARGET}

popd
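
The `setup.py` invocation above builds the wheel's Python tag and platform name from the build variables. As a rough sketch, assuming `${TARGET}` carries the same value as the `TARGET_ARCHITECTURE` passed in the README example (the hunk does not show how it is set), the two tags come out as follows. `wheel_tags` is a hypothetical helper.

```python
from typing import Tuple


def wheel_tags(major: int, minor: int, target: str) -> Tuple[str, str]:
    """Compose the --python-tag and --plat-name values passed to bdist_wheel."""
    python_tag = f"cp{major}{minor}"   # e.g. cp310 for Python 3.10
    plat_name = f"linux_{target}"      # e.g. linux_x86_64
    return python_tag, plat_name


print(wheel_tags(3, 10, "x86_64"))
```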