TensorRT model not loading #6453
-
Hi! I'm starting to play with Triton Server using models trained with the TAO SDK, but for some reason I can't load the TensorRT models. I tried the 23.07 and 23.03 Docker images of Triton Server with the same results (I tried to match the TensorRT version on both sides). The error it gives me is: "UNAVAILABLE: Internal: unable to create TensorRT engine". I tried the engine files from the YOLOv4 and SSD examples (and they seem to work fine with the TAO inferencer), with FP16 in both cases.

This is the content of config.pbtxt:

name: "TensorRT_Test1"

The model file has been renamed to model.plan. Is there something I am missing here? Does the latest version of TensorRT work with models built with older versions?
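For reference, the repository layout I'm using has the plan file at TensorRT_Test1/1/model.plan, and the config looks roughly like this (tensor names, types, and dims below are placeholders, not the actual TAO export values):

```
name: "TensorRT_Test1"
platform: "tensorrt_plan"
max_batch_size: 8
# Tensor names, data types, and dims are placeholders -- they must match the
# bindings of the exported engine (Triton can also auto-complete these fields
# from the plan file when strict model config is disabled).
input [
  {
    name: "Input"
    data_type: TYPE_FP32
    dims: [ 3, 544, 960 ]
  }
]
output [
  {
    name: "Output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```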
Replies: 1 comment
-
Quoting TensorRT documentation: "By default, TensorRT engines are compatible only with the version of TensorRT with which they are built."

23.09 added version-compatibility support. To use this feature with a 23.09 or later container, you need to generate the models to be version-compatible and pass the version-compatible backend flag to Triton. Examples of how to do this exist in the related server pull request.

You would need to look at the rest of your logs (verbose logging could be helpful here) to confirm the issue is a version mismatch. If so, the most surefire way to fix this is to generate the TensorRT model inside the NGC TensorRT container of the same version as Triton (e.g. 23.07). The other option is the version-compatibility route above, which comes at a slight cost.
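As a rough sketch of the "same version" route (container tags and file names here are illustrative, and TAO's .etlt exports typically go through TAO's own converter rather than trtexec, but the principle is the same): build the engine with the TensorRT version that ships in your Triton release, then relaunch Triton with verbose logging to confirm the engine deserializes.

```bash
# Rebuild the engine inside the TensorRT container matching the Triton release
# (23.07 here), assuming an ONNX export of the model is available.
docker run --rm --gpus all -v "$PWD":/workspace nvcr.io/nvidia/tensorrt:23.07-py3 \
  trtexec --onnx=model.onnx --fp16 --saveEngine=model.plan
# (For the version-compatible route on 23.09+, trtexec also accepts
#  --versionCompatible, and Triton needs the corresponding TensorRT backend
#  option -- see the server pull request referenced above.)

# Start Triton with verbose logging to see the full deserialization error.
docker run --rm --gpus all -v "$PWD"/model_repository:/models \
  nvcr.io/nvidia/tritonserver:23.07-py3 \
  tritonserver --model-repository=/models --log-verbose=1
```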