
How to Truncate the input prompt? #2443

Closed · Answered by chenxu2048
AamodThakur asked this question in Q&A


Thank you for your correction! llm.generate accepts a List[List[int]] type instead of a torch.Tensor.

Also getting a similar error in a different part of the code when "prompt_token_ids" is converted to a list.

This code was tested in my environment on the latest main branch, and it should work.

from vllm import LLM

llm = LLM(model="lmsys/vicuna-7b-v1.5", max_model_len=4096, max_num_batched_tokens=4096)
tokenizer = llm.get_tokenizer()

# encode(..., return_tensors="pt") returns a tensor of shape (1, num_tokens)
prompt_token_ids = tokenizer.encode("def main", return_tensors="pt")

# Truncate to the last 4096 tokens (slice the token dimension, not the batch dimension)
prompt_token_ids = prompt_token_ids[:, -4096:]

# generate() expects List[List[int]], so convert the tensor to nested lists
llm.generate(prompt_token_ids=prompt_token_ids.tolist())
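
The same truncation can be done without torch tensors at all. The sketch below is an assumption-based variant, not part of the original answer: it assumes a Hugging Face tokenizer whose encode() returns a plain List[int] when no return_tensors argument is given, and a vLLM version that still accepts the prompt_token_ids keyword as in the snippet above.

from vllm import LLM

# Same model and context length as in the snippet above (assumed setup).
llm = LLM(model="lmsys/vicuna-7b-v1.5", max_model_len=4096, max_num_batched_tokens=4096)
tokenizer = llm.get_tokenizer()

# Without return_tensors, encode() yields a plain List[int].
token_ids = tokenizer.encode("def main")

# Keep only the last 4096 tokens so the prompt fits within max_model_len.
token_ids = token_ids[-4096:]

# generate() expects one token list per prompt, i.e. List[List[int]].
outputs = llm.generate(prompt_token_ids=[token_ids])
print(outputs[0].outputs[0].text)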
