Releases: triton-inference-server/model_navigator
Triton Model Navigator v0.7.4
Updates:
- new: decoupled mode configuration in Triton Model Config
- new: support for PyTorch ExportedProgram and ONNX dynamo export
- new: added GraphSurgeon ONNX optimization
- fix: compatibility of generating PyTriton model config through the adapter
- fix: installation of packages that are platform dependent
- fix: update package config with model loaded from source
- change: in the TensorRT runner, when TensorType.TORCH is the return type, lazily convert tensors to Torch
- change: move from the Polygraphy CLI to the Polygraphy Python API
- change: removed Windows from the support list
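For context, decoupled mode corresponds to the model transaction policy in Triton's model configuration. A hypothetical excerpt of a generated config.pbtxt (the model name, backend, and batch size below are placeholders, not output of Model Navigator) might look like:

```
# Hypothetical config.pbtxt excerpt with decoupled mode enabled.
# The decoupled transaction policy lets the model send zero, one, or many
# responses per request, potentially out of order.
name: "my_model"
backend: "python"
max_batch_size: 8
model_transaction_policy {
  decoupled: true
}
```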
Version of external components used during testing:
- PyTorch 2.1.0a0+32f93b1
- TensorFlow 2.13.0
- TensorRT 8.6.1
- ONNX Runtime 1.16.0
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.27
- tf2onnx v1.15.1
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.7.3
Updates:
- new: Data dependent dynamic control flow support in nav.Module (multiple computation graphs per module)
- new: Added find max batch size utility
- new: Added utilities API documentation
- new: Add Timer class for measuring execution time of models and Inplace modules.
- fix: Use wide range of shapes for TensorRT conversion
- fix: Sorting of samples loaded from workspace
- change: in Inplace, store one sample per module by default and store shape info for all samples
- change: always execute export for all supported formats
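The find-max-batch-size utility mentioned above typically works by growing the batch size until execution fails (for example, out of memory), then bisecting between the last success and first failure. A schematic pure-Python sketch, where runs_ok is a placeholder for an actual model execution check (hypothetical names, not the library's API):

```python
def find_max_batch_size(runs_ok, start=1, limit=2**20):
    """Find the largest batch size for which runs_ok(bs) is True,
    assuming runs_ok is monotone (True up to some threshold, then False)."""
    if not runs_ok(start):
        return 0
    # Phase 1: grow exponentially until failure (or the hard limit).
    bs = start
    while bs * 2 <= limit and runs_ok(bs * 2):
        bs *= 2
    # Phase 2: binary search between the last success and the first failure.
    lo, hi = bs, min(bs * 2, limit)
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if runs_ok(mid):
            lo = mid
        else:
            hi = mid
    return hi if runs_ok(hi) else lo

# Simulated device that "fits" batches of up to 113 samples.
print(find_max_batch_size(lambda bs: bs <= 113))  # 113
```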
Known issues and limitations:
- nav.Module moves the original torch.nn.Module to the CPU; in the case of weight sharing this might result in unexpected behaviour
- For data-dependent dynamic control flow (multiple computation graphs), nav.Module might copy the weights for each separate graph
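The Timer class added in this release measures the execution time of models and Inplace modules. A minimal pure-Python sketch of such a utility (hypothetical names and behaviour, not the library's actual API):

```python
import time

class Timer:
    """Minimal sketch of an execution-time measurement utility.

    Hypothetical stand-in for the Timer class mentioned above; the real
    model_navigator implementation may differ.
    """

    def __init__(self):
        self.measurements = []  # elapsed times in seconds

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.measurements.append(time.perf_counter() - self._start)
        return False  # do not suppress exceptions

    @property
    def total(self):
        return sum(self.measurements)


timer = Timer()
for _ in range(3):
    with timer:
        sum(range(1000))  # stand-in for a model call

print(len(timer.measurements))  # 3
```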
Version of external components used during testing:
- PyTorch 2.1.0a0+29c30b1
- TensorFlow 2.13.0
- TensorRT 8.6.1
- ONNX Runtime 1.15.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.27
- tf2onnx v1.15.1
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.7.2
Updates:
- fix: Obtaining input names from an ONNX file for TensorRT conversion
- change: Raise an exception instead of exiting with an error code when a required command fails
Version of external components used during testing:
- PyTorch 2.1.0a0+b5021ba
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.15.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.27
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.7.1
Updates:
- fix: gather ONNX input names based on the model's forward signature
- fix: do not run the TensorRT max batch size search when max batch size is None
- fix: use pytree metadata to flatten complex Torch outputs
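The pytree fix concerns flattening nested model outputs into a flat list of leaves plus metadata (a "treespec") that lets the original structure be rebuilt later. A simplified pure-Python illustration of the flatten/unflatten idea (not model_navigator's or PyTorch's actual implementation):

```python
def flatten(tree):
    """Flatten a nested structure of dicts/lists/tuples into a flat list of
    leaves plus a spec sufficient to rebuild the original structure."""
    if isinstance(tree, dict):
        keys = sorted(tree)
        leaves, specs = [], []
        for k in keys:
            sub_leaves, sub_spec = flatten(tree[k])
            leaves.extend(sub_leaves)
            specs.append(sub_spec)
        return leaves, ("dict", keys, specs)
    if isinstance(tree, (list, tuple)):
        leaves, specs = [], []
        for item in tree:
            sub_leaves, sub_spec = flatten(item)
            leaves.extend(sub_leaves)
            specs.append(sub_spec)
        return leaves, (type(tree).__name__, None, specs)
    return [tree], ("leaf", None, None)

def unflatten(leaves, spec):
    """Rebuild the original nested structure from flat leaves and the spec."""
    it = iter(leaves)

    def build(s):
        kind, keys, specs = s
        if kind == "leaf":
            return next(it)
        if kind == "dict":
            return {k: build(sub) for k, sub in zip(keys, specs)}
        seq = [build(sub) for sub in specs]
        return tuple(seq) if kind == "tuple" else seq

    return build(spec)

# A complex model output: dict with nested tuple/list/dict values.
out = {"logits": (1, 2), "aux": [3, {"score": 4}]}
leaves, spec = flatten(out)
print(leaves)                          # [3, 4, 1, 2] (dict keys sorted: aux, logits)
print(unflatten(leaves, spec) == out)  # True
```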
Version of external components used during testing:
- PyTorch 2.1.0a0+b5021ba
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.15.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.27
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.7.0
Updates:
- new: Inplace Optimize feature - optimize models directly in the Python code
- new: Non-tensor inputs and outputs support
- new: Model warmup support in Triton model configuration
- new: nav.tensorrt.optimize API added for testing and measuring performance of TensorRT models
- new: Extended custom configs to pass arguments directly to export and conversion operations like torch.onnx.export or polygraphy convert
- new: Collect GPU clock during model profiling
- new: Add option to configure minimal trials and stabilization windows for performance verification and profiling
- change: Navigator package version changed to 0.2.3. Custom configurations now use a trt_profiles list instead of a single value
- change: Store separate reproduction scripts for runners used during correctness verification and profiling
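Model warmup maps to the model_warmup field in Triton's model configuration. A hypothetical config.pbtxt fragment (the input name and dims are placeholders for a real model's input, not Model Navigator output) might look like:

```
# Hypothetical config.pbtxt fragment enabling model warmup in Triton.
# INPUT__0 and its dims are placeholders for a real model's input.
model_warmup [
  {
    name: "warmup_request"
    batch_size: 1
    inputs {
      key: "INPUT__0"
      value {
        data_type: TYPE_FP32
        dims: [3, 224, 224]
        random_data: true
      }
    }
  }
]
```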
Version of external components used during testing:
- PyTorch 2.1.0a0+b5021ba
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.15.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.27
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.6.3
Updates:
- fix: Conditional imports of supported frameworks in export commands
Version of external components used during testing:
- PyTorch 2.1.0a0+4136153
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.13.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.26
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.6.2
Updates:
- new: Collect information about TensorRT shapes used during conversion
- fix: Invalid link in documentation
- change: Improved documentation rendering
Version of external components used during testing:
- PyTorch 2.1.0a0+4136153
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.13.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.26
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.6.1
Updates:
- fix: Add model from package to Triton model store with custom configs
Version of external components used during testing:
- PyTorch 2.1.0a0+4136153
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.13.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.26
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.6.0
Updates:
- new: Zero-copy runners for Torch, ONNX and TensorRT - omit H2D and D2H memory copies between runner executions
- new: nav.package.profile API method to profile generated models on a provided dataloader
- change: ProfilerConfig replaced with OptimizationProfile:
  - new: OptimizationProfile impacts the conversion for TensorRT
  - new: batch_sizes and max_batch_size limit the max profile in TensorRT conversion
  - new: Allow providing a separate dataloader for profiling - only the first sample is used
- new: Allow running nav.package.optimize on an empty package - status generation only
- new: Use torch.inference_mode for the inference runner when PyTorch 2.x is available
- fix: Missing model in config when passing a package generated during nav.{framework}.optimize directly to the nav.package.optimize command
- Other minor fixes and improvements
Version of external components used during testing:
- PyTorch 2.1.0a0+4136153
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.13.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.26
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.
Triton Model Navigator v0.5.6
Updates:
- fix: Load samples as sorted to keep a valid order
- fix: Execute conversion when the model already exists in the path
- Other minor fixes and improvements
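The sample-ordering fix reflects a classic pitfall: a plain lexicographic directory listing puts sample_10 before sample_2. A small illustration of numeric-aware sorting (hypothetical file names, not the library's actual loader):

```python
import re

def numeric_sort(names):
    """Sort file names by the numeric index embedded in them, so that
    'sample_10' comes after 'sample_2' (plain sorted() would reverse them)."""
    def key(name):
        match = re.search(r"(\d+)", name)
        return int(match.group(1)) if match else -1
    return sorted(names, key=key)

files = ["sample_10.npz", "sample_2.npz", "sample_1.npz"]
print(sorted(files))        # ['sample_1.npz', 'sample_10.npz', 'sample_2.npz']
print(numeric_sort(files))  # ['sample_1.npz', 'sample_2.npz', 'sample_10.npz']
```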
Version of external components used during testing:
- PyTorch 2.1.0a0+fe05266f
- TensorFlow 2.12.0
- TensorRT 8.6.1
- ONNX Runtime 1.13.1
- Polygraphy: 0.47.1
- GraphSurgeon: 0.3.26
- tf2onnx v1.14.0
- Other component versions depend on the framework container versions used. See the support matrix for a detailed summary.