Support for Logprobs Output in Triton Inference Server with vLLM Backend #7557
RafaelXokito asked this question in Q&A (unanswered)
I recently deployed Triton Inference Server with the vLLM backend. We have a use case where we need the log probabilities (logprobs) of the generated tokens alongside the regular text output.
Currently, it appears that Triton with the vLLM backend does not directly support returning logprobs as part of the response.
Is there an existing workaround, or an upcoming feature, that would allow us to retrieve logprobs from the vLLM backend through Triton?
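For reference, this is the kind of request I would expect to work if logprobs were exposed. It is only a sketch: the model name `vllm_model` is a placeholder, and it assumes the backend forwards the `sampling_parameters` JSON string to vLLM's `SamplingParams`, which does accept a `logprobs` count. As far as I can tell, the stock backend returns only `text_output`, so logprobs would not show up in the response without changes to the backend's `model.py`.

```python
# Sketch only: assumes a local Triton server with the HTTP "generate" endpoint
# and a vLLM model deployed under the placeholder name "vllm_model".
import json

import requests

url = "http://localhost:8000/v2/models/vllm_model/generate"
payload = {
    "text_input": "The capital of France is",
    "stream": False,
    # Assumed to be forwarded to vLLM's SamplingParams; "logprobs" requests
    # the top-N log probabilities for each generated token.
    "sampling_parameters": json.dumps({"max_tokens": 16, "logprobs": 5}),
}

resp = requests.post(url, json=payload)
resp.raise_for_status()
# With the stock backend this currently only contains "text_output";
# returning logprobs would require modifying model.py to add them here.
print(json.dumps(resp.json(), indent=2))
```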