Llamacpp generation incoherent (always <eos>). Driver version on ubuntu 22.04.5? #12258
Hi @ultoris, we could not reproduce this issue on our native Linux MTL machine; the llama-3.1 Q8_0 model output is normal:
our env check result:
We notice that you are not using the latest
I've tried using llamacpp in both docker and native versions using the provided guides:
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/DockerGuides/docker_cpp_xpu_quickstart.md
and cannot get correct model output in either case.
When using docker, generation stops at the first token because the model outputs `<eos>` no matter the prompt.
When using the native version I can get an answer, but the model quality is heavily degraded: it uses a lot of `*` tokens and becomes incoherent after a few hundred tokens.
I've tried the latest version of the docker image and pip packages, and also older versions such as ipex-llm[cpp]==2.1.0 (pip). These show slight variations, but the problem persists.
I'm using Bartowski's GGUF Q8_0 versions of gemma2 (27b) and llama3.1 (8b) models. The models work fine on pure cpp ggerganov/llamacpp.
This is the output of the env-check script:
I suspect the problem is related to the driver version, 24.35.30872, which is lower than the 31.0.101.5522 specified in the FAQ of the llama_cpp_quickstart.md guide. I followed the instructions at https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/install_linux_gpu.md#install-gpu-driver (option 1 for kernel 6.5), and the version that gets installed via apt is 24.35.30872.
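The comparison behind this suspicion can be sketched as a naive numeric check of the two version strings. This is only an illustration: the helper `parse_version` is hypothetical, and it assumes both strings follow the same dotted numbering scheme, which may not hold if the FAQ number and the apt package number come from different driver release lines.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a dotted version string into a tuple of integers (hypothetical helper)."""
    return tuple(int(part) for part in version.split("."))


# Version installed via apt, as reported above.
installed = parse_version("24.35.30872")
# Version quoted in the FAQ of llama_cpp_quickstart.md.
required = parse_version("31.0.101.5522")

# Tuples compare element by element, so the first differing field decides.
# ASSUMPTION: both strings use the same numbering scheme; if they do not,
# this result says nothing about which driver is actually newer.
print("installed is lower than required:", installed < required)
```

If the two numbers turn out to belong to different numbering schemes, this check is not meaningful and the driver may not be the cause.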