Changelog

Unreleased

  • Fixed handling of the TritonConfig.cache_directory option - the directory was always overwritten with the default value (see the sketch after this list).
  • Fixed the tritonclient dependency - PyTriton requires a tritonclient version that supports HTTP headers and parameters.
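
A minimal sketch of setting the affected option (the cache path is illustrative); after the fix, the configured value is kept rather than replaced with the default:

```python
from pytriton.triton import Triton, TritonConfig

# The configured cache directory is now respected instead of being
# overwritten with the default value.
config = TritonConfig(cache_directory="/tmp/pytriton-cache")

with Triton(config=config) as triton:
    ...  # bind models and serve as usual
```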

0.2.0 (2023-05-30)

  • Added support for using custom HTTP/gRPC request headers and parameters.

    This change breaks backward compatibility of the inference function signature. The undecorated inference function now accepts a list of Request instances instead of a list of dictionaries. The Request class holds the input data together with the parameters built from the combined request parameters and headers.

    See docs/custom_params.md for further information. A sketch of the new signature follows this list.

  • Added FuturesModelClient, which enables sending inference requests in parallel (see the second sketch after this list).

  • Added displaying a documentation link after models are loaded.
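
A minimal sketch of the new inference function signature described above, assuming Request exposes input tensors via a data mapping and the combined parameters/headers via a parameters mapping; the input name, output name, and "scale" parameter are illustrative:

```python
import numpy as np


def infer_fn(requests):
    # `requests` is now a list of Request instances instead of a list of dicts.
    responses = []
    for request in requests:
        # Assumption: input tensors live in a `data` mapping on the request.
        inputs = request.data
        # Combined request parameters and HTTP/gRPC headers arrive in one
        # mapping; the "scale" parameter name is illustrative.
        scale = float(request.parameters.get("scale", 1.0))
        responses.append({"output": (inputs["input"] * scale).astype(np.float32)})
    return responses
```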
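And a sketch of parallel requests with the new FuturesModelClient; the server address, the model name "MyModel", and the input name "data" are assumptions:

```python
import numpy as np
from pytriton.client import FuturesModelClient

samples = [np.array([float(i)], dtype=np.float32) for i in range(8)]

with FuturesModelClient("localhost:8000", "MyModel") as client:
    # Each call returns a concurrent.futures.Future immediately.
    futures = [client.infer_sample(data=sample) for sample in samples]
    # result() blocks until the corresponding response arrives.
    results = [future.result() for future in futures]
```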

0.1.5 (2023-05-12)

  • Improved the pytriton.decorators.group_by_values function
    • Modified the function to avoid calling the inference callable on each individual sample when grouping by string/bytes input
    • Added a pad_fn argument for easy padding and combining of the inference results (see the sketch after this list)
  • Fixed the search for Triton binaries
  • Improved Workspace management (the workspace is now removed on shutdown)
  • Versions of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the Triton Inference Server support matrix for a detailed summary.
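
A sketch of the improved decorator; the input names lang/text, the output name scores, and the use of the ConstantPadder helper as pad_fn are assumptions based on the description above:

```python
import numpy as np
from pytriton.decorators import ConstantPadder, batch, group_by_values


@batch
@group_by_values("lang", pad_fn=ConstantPadder(0))  # assumption: pads group results with zeros
def infer_fn(lang, text):
    # Called once per group of samples sharing the same `lang` value,
    # rather than once per individual sample; `pad_fn` pads the per-group
    # results so they can be recombined into one batch.
    scores = np.ones((text.shape[0], 1), dtype=np.float32)
    return {"scores": scores}
```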

0.1.4 (2023-03-16)

  • Added validation of the model name passed to the Triton bind method (see the sketch after this list).
  • Added monkey patching of the InferenceServerClient.__del__ method to prevent unhandled exceptions.
  • Versions of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the Triton Inference Server support matrix for a detailed summary.
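
A minimal sketch of the bind call where the added validation applies; the model name, tensor names, and shapes are illustrative:

```python
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton


@batch
def infer_fn(data):
    return {"result": data * 2.0}


with Triton() as triton:
    # The model name passed here is now validated up front; invalid names
    # fail fast instead of surfacing later inside Triton Inference Server.
    triton.bind(
        model_name="Multiplier",
        infer_func=infer_fn,
        inputs=[Tensor(name="data", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="result", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=16),
    )
    triton.serve()
```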

0.1.3 (2023-02-20)

  • Fixed getting the model config in the fill_optionals decorator (see the sketch after this list).
  • Versions of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the Triton Inference Server support matrix for a detailed summary.
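
A sketch of the decorator whose model-config lookup was fixed; the input names and the default value are illustrative:

```python
import numpy as np
from pytriton.decorators import batch, fill_optionals


@fill_optionals(threshold=np.array([0.5], dtype=np.float32))
@batch
def infer_fn(data, threshold):
    # `threshold` is filled from the default above whenever a request omits
    # this optional input; the fix concerns how the model config is read here.
    mask = (data > threshold).astype(np.float32)
    return {"mask": mask}
```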

0.1.2 (2023-02-14)

  • Fixed the wheel build to support installations on operating systems with glibc version 2.31 or higher.
  • Updated the documentation on custom builds of the package.
  • Change: the TritonContext instance is shared across bound models and contains a model_configs dictionary.
  • Fixed support for binding multiple models that use methods of the same class (see the sketch after this list).
  • Versions of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the Triton Inference Server support matrix for a detailed summary.
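
A sketch of the fixed scenario, binding two models backed by methods of one class instance; the class, model names, and shapes are illustrative:

```python
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton


class Codec:
    @batch
    def encode(self, data):
        return {"result": data + 1.0}

    @batch
    def decode(self, data):
        return {"result": data - 1.0}


codec = Codec()

with Triton() as triton:
    # Two models backed by methods of the same instance; the shared
    # TritonContext keeps a model_configs entry for each bound model.
    for name, infer_func in (("Encoder", codec.encode), ("Decoder", codec.decode)):
        triton.bind(
            model_name=name,
            infer_func=infer_func,
            inputs=[Tensor(name="data", dtype=np.float32, shape=(-1,))],
            outputs=[Tensor(name="result", dtype=np.float32, shape=(-1,))],
            config=ModelConfig(max_batch_size=8),
        )
    triton.serve()
```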

0.1.1 (2023-01-31)

  • Change: The @first_value decorator has been updated with new features (see the sketch after this list):
    • Renamed from @first_values to @first_value
    • Added a strict flag to toggle checking that the values on a single selected input of the request are equal; defaults to True
    • Added a squeeze_single_values flag to toggle squeezing single-value ND arrays to scalars; defaults to True
  • Fix: @fill_optionals now supports non-batching models
  • Fix: @first_value fixed to work with optional inputs
  • Fix: @group_by_values fixed to work with string inputs
  • Fix: @group_by_values fixed to group values sample-wise
  • Versions of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the Triton Inference Server support matrix for a detailed summary.
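
A sketch of the updated decorator with both new flags written out at their defaults; the input names are illustrative:

```python
import numpy as np
from pytriton.decorators import batch, first_value


@batch
@first_value("temperature", strict=True, squeeze_single_values=True)
def infer_fn(data, temperature):
    # strict=True checks that all `temperature` values in the batch are equal;
    # squeeze_single_values=True delivers it as a scalar instead of an ND array.
    return {"result": (data / temperature).astype(np.float32)}
```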

0.1.0 (2023-01-12)

  • Initial release of PyTriton
  • Versions of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the Triton Inference Server support matrix for a detailed summary.