feat: add T5 acceleration support (#58)
* fix: fix ONNX shape fixed dim size
* fix: fix ONNX shape fixed dim size
* fix: update docker file base
* feat: add script to generate T5 model
* feat: add support for provided output shape on ORT inference
* fix: disable symbolic shape inference (not working with ONNX last version 1.11.0)
* fix: improve documentation (TRT supported op link)
* fix: add ipython as dependency in Docker image to run Jupyter notebook easily
* feat: better support for dynamic axis (T5)
* feat: change shape management in TensorRT inference
* feat: T5 test script
* fix: error message in tensorrt inference function
* update Pytorch dependencies
* feat: convert manually onnx to fp16
* fix: add generic support of fp16 on trt
* fix: refactoring
* fix. refactoring
* fix: refactoring
* feat: apply new fp32 node detector to t5 dec module
* fix: refactoring
* fix: fix tests, refactoring
* fix: update ORT dependencies
* feat: add cache toy model + cache notebook
* fix: display output correctly
* fix: export T5 graph with cache support
* fix: end to end conversion process
* fix: working inference
* fix: text generation works but is slow
* fix: add some explanations, clean scripts
* fix: clean scripts
* fix: script a bit more stable
* fix: add graphs
* feat: capture pytorch timings
* feat: no more output shape guess for ORT
* fix: fix tests
* fix: fix tests
* feat: no memory copy during ORT inference + refactoring
* fix: mixed precision with several models
* fix: fix link to demo folder
* fix: fix code for CPU only execution
* fix: fp16 conversion works
* fix: refactoring to hide most of the fp16 logic from the lib user eyes. works on t5-base
* feat: improve copy less and FP16 transfo
* fix: linter
* fix: update text and results
* fix: extend gitignore
* fix: add doc dependency
* feat: add log severity setup to ORT loading
* feat: update notebook text
* feat: script for tensorrt + t5
* feat: script for tensorrt + t5
* feat: start to support ONNX external data
* fix: fix linter
* fix: better wording in notebook
* fix: notebook works on 3b
* fix: notebook works on large
* fix: fix encoder
* feat: update notebook text
* feat: notebook text
* feat: text
* fix: linter
* fix: timings
* fix: reduce memory footprint
* fix: move