Triton Model Navigator v0.7.5

@jkosek jkosek released this 20 Dec 18:37
· 139 commits to main since this release

Updates:

  • new: FP8 precision support for TensorRT
  • new: Support for autocast and inference mode configuration for Torch runners
  • new: Allow selecting the device for Torch and ONNX runners
  • new: Add support for default_model_filename in Triton model configuration
  • new: Detailed profiling of inference steps (pre- and postprocessing, memcpy and compute)
  • fix: JAX export and TensorRT conversion failing when a custom workspace is used
  • fix: Max workspace size not being passed to TensorRT conversion
  • fix: TensorRT optimize raising an error while handling output metadata
  • fix: Pinned the Polygraphy version to work correctly with the onnxruntime-gpu package
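
The autocast and inference-mode options above map onto standard PyTorch contexts. A minimal sketch of what those two settings do in plain PyTorch (this is not Model Navigator's API; the `run` helper and its parameters are illustrative only):

```python
import torch

def run(model, x, use_autocast=True, use_inference_mode=True):
    # Hypothetical helper: inference_mode disables autograd tracking
    # (faster, lower memory than no_grad for pure inference).
    ctx = torch.inference_mode() if use_inference_mode else torch.no_grad()
    with ctx:
        if use_autocast:
            # autocast runs eligible ops in lower precision
            # (bfloat16 on CPU, float16/bfloat16 on CUDA).
            with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
                return model(x)
        return model(x)

model = torch.nn.Linear(4, 2)
out = run(model, torch.randn(1, 4))
print(out.dtype)          # lower-precision dtype from autocast
print(out.requires_grad)  # False: inference_mode disables autograd
```

A runner exposing these as configuration flags can trade accuracy for speed (autocast) and skip autograd bookkeeping entirely (inference mode) without touching the model code.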

Version of external components used during testing: