Triton Model Navigator v0.10.0

piotr-bazan-nv released this 24 Jun 12:49

Updates:
- new: In inplace mode, nav.Module accepts a batching flag, which overrides the config setting, and a precision argument, which selects the appropriate configuration for TensorRT
- new: Allow setting the device when loading optimized modules using nav.load_optimized()
- new: Add support for custom I/O names and dynamic shapes in the Torch ONNX Dynamo path
- new: Add nav.bundle.save and nav.bundle.load to save and load optimized models from cache
- change: Improved optimize and profile status in inplace mode
- change: Improved handling of defaults for ONNX Dynamo when executing nav.package.optimize
- fix: Maintain the module's device in nav.profile()
- fix: Add support for all precisions for TensorRT in nav.profile()
- fix: Forward method was not passed to other inplace modules
Versions of external components used during testing:
- PyTorch 2.4.0a0+07cecf4
- TensorFlow 2.15.0
- TensorRT 10.0.1.6
- Torch-TensorRT 2.4.0.a0
- ONNX Runtime 1.18.0
- Polygraphy 0.49.10
- GraphSurgeon 0.5.2
- tf2onnx 1.16.1
- Other component versions depend on the versions of the framework containers used.
  See the containers' support matrix for a detailed summary.
