Implementing Logit Functionality in vLLM #1587
-
Hello, I have been working with the vLLM project, and I am interested in implementing a feature to make inference faster. Specifically, I want to generate probabilities for the output, similar to a discussion I found on Hugging Face (you can find the discussion here). However, I noticed that vLLM does not provide logits for the prompt tokens; it only gives the logits of the generated tokens. Here is the specific part of the code I am referring to.
I am seeking guidance on how I could modify or use the vLLM source code to get this functionality. Any help or direction would be greatly appreciated.
Replies: 1 comment 3 replies
-
The `prompt_logprobs` parameter in `SamplingParams` might be helpful here: https://github.com/vllm-project/vllm/blob/1a2bbc930135cd3b94fbff2aafbdf5c568acc8bd/vllm/sampling_params.py#L79C9-L79C24
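For reference, here is a minimal sketch of how that parameter could be used (the model name and prompt are illustrative, not from the original discussion):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any model supported by vLLM

# logprobs=1 requests log-probabilities for the generated tokens;
# prompt_logprobs=1 additionally requests them for each prompt token.
params = SamplingParams(temperature=0.0, max_tokens=8,
                        logprobs=1, prompt_logprobs=1)

outputs = llm.generate(["The capital of France is"], params)

for output in outputs:
    # One {token_id: logprob} mapping per prompt token; the first
    # entry is None, since the first token has no preceding context.
    print(output.prompt_logprobs)
    # Log-probabilities of the generated tokens.
    print(output.outputs[0].logprobs)
```

Note that vLLM returns log-probabilities (post-softmax) rather than raw logits, so recovering the unnormalized logits themselves would still require modifying the sampler code.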