Dynamic batching and dynamic padding #7652
Unanswered
FlorentRamb asked this question in Q&A
Hello,
I'd like to deploy a Transformers model and use the dynamic batching feature. However, the input shapes may vary from one request to another (different sequence lengths).
Is there a way to perform dynamic padding (pad to the longest sequence length) within the dynamic batch created by Triton? What would be the best approach?
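To make the question concrete, here is a minimal sketch of what I mean by dynamic padding. It is plain Python and not a Triton API: the helper name `pad_to_longest`, the pad token id `0`, and the example token ids are all assumptions for illustration. It shows the behavior I would like applied to the requests that end up in the same dynamic batch.

```python
from typing import List, Tuple


def pad_to_longest(
    batch: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence to the length of the longest one in the batch.

    Hypothetical helper (not part of Triton or Transformers).
    pad_id=0 is an assumed padding token id.
    Returns the padded token ids and a matching attention mask.
    """
    # Length of the longest sequence in this particular batch.
    max_len = max((len(seq) for seq in batch), default=0)

    # Right-pad each sequence with pad_id up to max_len.
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]

    # 1 marks a real token, 0 marks padding.
    attention_mask = [
        [1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch
    ]
    return input_ids, attention_mask


# Two requests with different sequence lengths (made-up token ids).
requests = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
ids, mask = pad_to_longest(requests)
print(ids)
print(mask)
```

The point is that the padded length depends on the batch contents, so it would have to be computed after the batch is formed rather than by each client.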
Thanks in advance!