- Fixed handling of the `TritonConfig.cache_directory` option - the directory was always overwritten with the default value (see the sketch after this list).
- Fixed the tritonclient dependency - PyTriton needs a tritonclient version that supports HTTP headers and parameters.
- Version of Triton Inference Server embedded in wheel: 2.33.0
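
A minimal sketch of setting the option; the cache path below is just a placeholder:

```python
from pytriton.triton import Triton, TritonConfig

# With the fix, the configured cache directory is respected instead of being
# replaced with the default value. The path is a placeholder.
config = TritonConfig(cache_directory="/tmp/pytriton-cache")

with Triton(config=config) as triton:
    # ... bind models here ...
    triton.serve()
```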
- Added support for using custom HTTP/gRPC request headers and parameters. This change breaks backward compatibility of the inference function signature: the undecorated inference function now accepts a list of `Request` instances instead of a list of dictionaries. The `Request` class carries the input data and the parameters built from the combined parameters and headers. See docs/custom_params.md for further information and the first sketch after this list.
- Added `FuturesModelClient`, which enables sending inference requests in parallel (see the second sketch after this list).
- Added displaying a documentation link after models are loaded.
- Version of Triton Inference Server embedded in wheel: 2.33.0
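
A minimal sketch of an undecorated inference callable under the new signature. The input name `input_1` and the `scale` parameter are placeholders, and the `data`/`parameters` attribute names are assumed from the description above; see docs/custom_params.md for the authoritative API.

```python
import numpy as np


def infer_fn(requests):
    responses = []
    for request in requests:
        # Combined custom parameters and headers (assumed attribute name).
        scale = float(request.parameters.get("scale", 1.0))
        # Input tensors keyed by name (assumed attribute name).
        input_1 = request.data["input_1"]
        responses.append({"output_1": (input_1 * scale).astype(np.float32)})
    return responses
```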
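
And a sketch of parallel requests with `FuturesModelClient`; the server address, model name, and input name are placeholders.

```python
import numpy as np
from pytriton.client import FuturesModelClient

batches = [np.random.rand(8, 10).astype(np.float32) for _ in range(4)]

# Each infer_batch call returns a concurrent.futures.Future, so the requests
# are in flight at the same time and collected afterwards.
with FuturesModelClient("localhost", "MyModel") as client:
    futures = [client.infer_batch(input_1=batch) for batch in batches]
    results = [future.result() for future in futures]
```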
- Improved the `pytriton.decorators.group_by_values` function (see the sketch after this list):
  - Modified the function to avoid calling the inference callable on each individual sample when grouping by a string/bytes input
  - Added a `pad_fn` argument for easy padding and combining of the inference results
- Fixed Triton binaries search
- Improved Workspace management (remove workspace on shutdown)
- Version of external components used during testing:
  - Triton Inference Server: 2.29.0
  - Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the container support matrix for a detailed summary.
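
A hedged sketch of the decorator in use; input names are placeholders, and the exact `pad_fn` contract (not shown) is described in the decorators documentation.

```python
from pytriton.decorators import batch, group_by_values


@batch
@group_by_values("lang")  # the new pad_fn argument can also be passed here
def infer_fn(lang, text):
    # Runs once per group of equal `lang` values instead of once per sample;
    # PyTriton merges the per-group results back into a single response batch.
    return {"lang_out": lang, "text_out": text}
```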
- Added validation of the model name passed to the Triton `bind` method (see the sketch after this list).
- Added monkey patching of the `InferenceServerClient.__del__` method to prevent unhandled exceptions.
- Version of external components used during testing:
  - Triton Inference Server: 2.29.0
  - Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the container support matrix for a detailed summary.
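
For context, a minimal sketch of binding a model; the model name passed to `bind` is now validated. Names, shapes, and the batch size below are placeholders.

```python
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton


@batch
def infer_fn(input_1):
    return {"output_1": input_1 * 2.0}


with Triton() as triton:
    triton.bind(
        model_name="Doubler",  # now validated; an invalid name fails at bind time
        infer_func=infer_fn,
        inputs=[Tensor(name="input_1", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="output_1", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=16),
    )
    triton.serve()
```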
- Fixed getting the model config in the `fill_optionals` decorator (see the sketch after this list).
- Version of external components used during testing:
  - Triton Inference Server: 2.29.0
  - Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the container support matrix for a detailed summary.
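
A hedged sketch of the decorator, which needs the model config to know which inputs are optional; the input names, default value, and decorator ordering shown here are assumptions.

```python
import numpy as np
from pytriton.decorators import batch, fill_optionals


@fill_optionals(temperature=np.array([0.5], dtype=np.float32))
@batch
def infer_fn(prompt, temperature):
    # `temperature` is always present here: either sent by the client or
    # injected from the default above, which relies on the model config that
    # the fix restores.
    return {"prompt_out": prompt, "temperature_out": temperature}
```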
- Fixed wheel build to support installations on operating systems with glibc version 2.31 or higher.
- Updated the documentation on custom builds of the package.
- Change: the `TritonContext` instance is shared across bound models and contains the `model_configs` dictionary.
- Fixed support for binding multiple models that use methods of the same class (see the sketch after this list).
- Version of external components used during testing:
  - Triton Inference Server: 2.29.0
  - Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the container support matrix for a detailed summary.
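
A sketch of the now-supported pattern, with two models bound to methods of one class instance; the names, shapes, and batch size are placeholders.

```python
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton


class MathOps:
    @batch
    def double(self, data):
        return {"result": data * 2.0}

    @batch
    def negate(self, data):
        return {"result": -data}


ops = MathOps()
io_spec = dict(
    inputs=[Tensor(name="data", dtype=np.float32, shape=(-1,))],
    outputs=[Tensor(name="result", dtype=np.float32, shape=(-1,))],
    config=ModelConfig(max_batch_size=8),
)

with Triton() as triton:
    # Both bound callables are methods of the same `ops` instance and share
    # the TritonContext mentioned above.
    triton.bind(model_name="Double", infer_func=ops.double, **io_spec)
    triton.bind(model_name="Negate", infer_func=ops.negate, **io_spec)
    triton.serve()
```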
- Change: the `@first_value` decorator has been updated with new features (see the sketch after this list):
  - Renamed from `@first_values` to `@first_value`
  - Added a `strict` flag to toggle checking that the values of the selected input are equal across the request. Default is True
  - Added a `squeeze_single_values` flag to toggle squeezing single-value ND arrays to scalars. Default is True
- Fix: `@fill_optionals` now supports non-batching models
- Fix: `@first_value` fixed to work with optional inputs
- Fix: `@group_by_values` fixed to work with string inputs
- Fix: `@group_by_values` fixed to work sample-wise
- Version of external components used during testing:
  - Triton Inference Server: 2.29.0
  - Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the container support matrix for a detailed summary.
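
A hedged sketch of the updated decorator; the input names are placeholders and the flag values shown are the documented defaults.

```python
import numpy as np
from pytriton.decorators import batch, first_value


@batch
@first_value("temperature", strict=True, squeeze_single_values=True)
def infer_fn(prompt, temperature):
    # `temperature` arrives as a scalar taken from the first sample
    # (squeeze_single_values=True); strict=True rejects requests whose samples
    # carry different temperature values.
    return {"prompt_out": prompt * np.float32(temperature)}
```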
- Initial release of PyTriton
- Version of external components used during testing:
  - Triton Inference Server: 2.29.0
  - Other component versions depend on the framework and Triton Inference Server container versions used. Refer to the container support matrix for a detailed summary.