ChatQnA queries return just Gaudi TGI errors when rerank is used #487
Comments
@lianhao @yongfengdu Any comments on this?
Note: TGI outputs this error only after a document has been uploaded with data-prep, i.e. when ChatQnA uses reranking (reranking adds more input tokens for TGI). If I minimize the input or use a smaller max tokens limit, there is still an error; it just changes a bit:
On a quick check of the HF TEI docs, it does not seem to have options for limiting tokens (except for warmup), but the issue goes away if I double the current TGI token limits:
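For illustration only, here is a minimal Python sketch of why doubling the limits can make both errors disappear. It is a simplified model and not TGI code: it assumes the server rejects a request when input tokens plus requested new tokens exceed a total-token limit, or when the input alone exceeds an input limit. The `TokenLimits` class, the `check_request` helper, the order of the two checks, and all numbers are hypothetical and are not the values from this setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenLimits:
    """Hypothetical stand-in for a server's input and total token limits."""
    max_input_tokens: int
    max_total_tokens: int

    def doubled(self) -> "TokenLimits":
        # Model "double current token limits" as doubling both values.
        return TokenLimits(self.max_input_tokens * 2, self.max_total_tokens * 2)


def check_request(limits: TokenLimits, input_tokens: int, max_new_tokens: int) -> str:
    """Return "ok", or a short reason why the request would be rejected.

    The order of the checks is an assumption made for this sketch.
    """
    if input_tokens + max_new_tokens > limits.max_total_tokens:
        return "input + max_new_tokens exceeds total limit"
    if input_tokens > limits.max_input_tokens:
        return "input exceeds input limit"
    return "ok"


# Hypothetical numbers, chosen only to show the pattern.
limits = TokenLimits(max_input_tokens=1024, max_total_tokens=2048)
plain_query = 200       # question only, no retrieved context
reranked_query = 1500   # question plus reranked context from data-prep

print(check_request(limits, plain_query, 1024))               # fits
print(check_request(limits, reranked_query, 1024))            # total limit error
print(check_request(limits, reranked_query, 128))             # error changes to input limit
print(check_request(limits.doubled(), reranked_query, 1024))  # fits after doubling
```

In this model, lowering the requested output tokens only moves the failure from the total limit to the input limit, because the reranked prompt alone is already too long. Only raising the limits (or shrinking the context passed to TGI) lets the request through.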
Doesn't current CI make sure that
Installing the git HEAD setup with Helm (-f chatqna/gaudi-values.yaml), and then querying ChatQnA:
Just gives TGI errors as answers:
Which is indeed how GenAIInfra is configured for Gaudi:
PS. To make things worse: