run harness on A770 error #12290

Open
tao-ov opened this issue Oct 29, 2024 · 5 comments

@tao-ov

tao-ov commented Oct 29, 2024

When I run the harness on an A770 following the link below:

https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/dev/benchmark/harness/run_llb.py

the command is: python run_llb.py --model ipex-llm --pretrained /home/test/models/LLM/baichuan2-7b/pytorch/ --precision sym_int4 --device xpu --tasks hellaswag --batch 1 --no_cache

it fails with this error:
RuntimeError: Job config of task=hellaswag, precision=sym_int4 failed. Error Message: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte

@glorysdj glorysdj assigned glorysdj and lalalapotter and unassigned glorysdj Oct 30, 2024
@lalalapotter
Contributor

Could you please remove the try-except clause here and provide the full error log?
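
For context, the summarized RuntimeError in the first comment ("Job config of task=..., precision=... failed. Error Message: ...") suggests the job is wrapped in a try-except that re-raises only a short message. A minimal sketch of that pattern follows; the names and structure are assumptions, not the actual run_llb.py:

# Hypothetical sketch of the wrapping pattern implied by the summarized error;
# run_llb.py internals are an assumption, not the actual file.
def run_job(task: str, precision: str) -> None:
    try:
        b"\xb5".decode("utf-8")  # stand-in for the real failure inside the harness
    except Exception as e:
        raise RuntimeError(
            f"Job config of task={task}, precision={precision} failed. Error Message: {e}"
        )

run_job("hellaswag", "sym_int4")
# Removing the try-except (or re-raising with a bare `raise`) exposes the full traceback.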

@tao-ov
Author

tao-ov commented Oct 30, 2024

(llm) test@test-Z590-VISION-D:~/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness$ python run_llb.py --model ipex-llm --pretrained /home/test/models/LLM/baichuan2-7b/pytorch/ --precision sym_int4 --device xpu --tasks hellaswag --batch 1 --no_cache
/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
2024-10-30 11:06:38,081 - INFO - intel_extension_for_pytorch auto imported
Selected Tasks: ['hellaswag']
The repository for /home/test/models/LLM/baichuan2-7b/pytorch/ contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//home/test/models/LLM/baichuan2-7b/pytorch/.
You can avoid this prompt in future by passing the argument trust_remote_code=True.

Do you wish to run the custom code? [y/N] y
2024-10-30 11:06:40,365 - WARNING - Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
2024-10-30 11:06:55,197 - INFO - Converting the current model to sym_int4 format......
Traceback (most recent call last):
File "/home/test/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness/run_llb.py", line 147, in
main()
File "/home/test/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness/run_llb.py", line 101, in main
results = evaluator.simple_evaluate(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/utils.py", line 243, in _wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/evaluator.py", line 89, in simple_evaluate
task_dict = lm_eval.tasks.get_task_dict(tasks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/tasks/init.py", line 390, in get_task_dict
task_name_dict = {
^
File "/home/test/lm-evaluation-harness/lm_eval/tasks/init.py", line 391, in
task_name: get_task(task_name)()
^^^^^^^^^^^^^^^^^^^^^
File "/home/test/lm-evaluation-harness/lm_eval/base.py", line 481, in init
self.download(data_dir, cache_dir, download_mode)
File "/home/test/lm-evaluation-harness/lm_eval/base.py", line 510, in download
self.dataset = datasets.load_dataset(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 2606, in load_dataset
builder_instance = load_dataset_builder(
^^^^^^^^^^^^^^^^^^^^^
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 2277, in load_dataset_builder
dataset_module = dataset_module_factory(
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 1923, in dataset_module_factory
raise e1 from None
File "/home/test/miniforge3/envs/llm/lib/python3.11/site-packages/datasets/load.py", line 1875, in dataset_module_factory
can_load_config_from_parquet_export = "DEFAULT_CONFIG_NAME" not in f.read()
^^^^^^^^
File "", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte
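
For reference, the decode happens while the datasets library reads a dataset script as UTF-8 text (the f.read() call in dataset_module_factory above). A quick way to check whether a specific cached file is the one that is not valid UTF-8 is sketched below; the path is a placeholder, substitute whatever file your environment actually reads:

# Hypothetical check that a suspect file decodes as UTF-8; the path is an assumption.
import pathlib

suspect = pathlib.Path("~/.cache/huggingface").expanduser() / "some_dataset_script.py"
raw = suspect.read_bytes()
try:
    raw.decode("utf-8")
    print("decodes cleanly as UTF-8")
except UnicodeDecodeError as err:
    print(f"not valid UTF-8: {err}")  # e.g. invalid start byte 0xb5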

@lalalapotter
Contributor

This issue may be caused by the version of the datasets library. Could you please provide your Python library versions so that we can reproduce the issue?
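
For example, a quick way to dump the versions in question from inside the env (the package selection below is only a suggestion):

# Print versions of the libraries most likely involved; the selection is an assumption.
import datasets
import torch
import transformers

print("datasets:", datasets.__version__)
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)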

@tao-ov
Author

tao-ov commented Oct 31, 2024

(llm) test@test-Z590-VISION-D:~/ipexllm_whowhat/ipex-llm/python/llm/dev/benchmark/harness$ pip show datasets
DEPRECATION: Loading egg at /home/test/miniforge3/envs/llm/lib/python3.11/site-packages/whowhatbench-1.0.0-py3.11.egg is deprecated. pip 24.3 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at pypa/pip#12330
Name: datasets
Version: 2.21.0
Summary: HuggingFace community-driven open-source library of datasets
Home-page: https://github.com/huggingface/datasets
Author: HuggingFace Inc.
Author-email: thomas@huggingface.co
License: Apache 2.0
Location: /home/test/miniforge3/envs/llm/lib/python3.11/site-packages
Requires: aiohttp, dill, filelock, fsspec, huggingface-hub, multiprocess, numpy, packaging, pandas, pyarrow, pyyaml, requests, tqdm, xxhash
Required-by: lm_eval, optimum, optimum-intel

@lalalapotter
Contributor

Our verified datasets library version is 2.14.6; could you please try it in your environment? In the meantime, we will also try to reproduce the issue with the datasets version you provided.
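
For reference, downgrading in place should be enough to test this, then re-run the same run_llb.py command from the earlier comment:

pip install datasets==2.14.6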
