diff --git a/README.md b/README.md
index 6089f7dc7d2..33aa7e00454 100644
--- a/README.md
+++ b/README.md
@@ -15,7 +15,7 @@
 - **[New]** `bigdl-llm` now supports QLoRA finetuning on Intel GPU; see the example [here](python/llm/example/GPU/QLoRA-FineTuning).
 - `bigdl-llm` now supports Intel GPU (including Arc, Flex and MAX); see the latest GPU examples [here](python/llm/example/GPU).
 - `bigdl-llm` tutorial is released [here](https://github.com/intel-analytics/bigdl-llm-tutorial).
-- Over 20 models have been optimized/verified on `bigdl-llm`, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, Mistral, Falcon, MPT, Dolly, StarCoder, Whisper, Baichuan, InternLM, QWen, Aquila, MOSS,* and more; see the complete list [here](python/llm/README.md#verified-models).
+- Over 20 models have been optimized/verified on `bigdl-llm`, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, Mistral, Falcon, MPT, Dolly, StarCoder, Whisper, Baichuan, InternLM, QWen, Aquila, MOSS,* and more; see the complete list [here](#verified-models).

 ### `bigdl-llm` Demos
 See the ***optimized performance*** of `chatglm2-6b` and `llama-2-13b-chat` models on 12th Gen Intel Core CPU and Intel Arc GPU below.

@@ -52,6 +52,7 @@ See the ***optimized performance*** of `chatglm2-6b` and `llama-2-13b-chat` mode
 - [CPU INT4](#cpu-int4)
 - [GPU INT4](#gpu-int4)
 - [More Low-Bit Support](#more-low-bit-support)
+- [Verified Models](#verified-models)

 #### CPU INT4
 ##### Install
@@ -125,6 +126,31 @@ model = AutoModelForCausalLM.from_pretrained('/path/to/model/', load_in_low_bit=
 ```
 *See the complete example [here](python/llm/example/CPU/HF-Transformers-AutoModels/More-Data-Types).*

+#### Verified Models
+Over 20 models have been optimized/verified on `bigdl-llm`, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, Mistral, Falcon, MPT, Baichuan/Baichuan2, InternLM, QWen* and more; see the example list below.
+
+| Model | CPU Example | GPU Example |
+|------------|----------------------------------------------------------------|-----------------------------------------------------------------|
+| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/vicuna) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/vicuna) |
+| LLaMA 2 | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2) |
+| ChatGLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm) | |
+| ChatGLM2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm2) |
+| Mistral | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mistral) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mistral) |
+| Falcon | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/falcon) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/falcon) |
+| MPT | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mpt) |
+| Dolly-v1 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/dolly_v1) |
+| Dolly-v2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/dolly_v2) |
+| Replit Code | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/replit) |
+| RedPajama | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/redpajama) | |
+| Phoenix | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phoenix) | |
+| StarCoder | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/starcoder) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/starcoder) |
+| Baichuan | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan) |
+| Baichuan2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2) |
+| InternLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm) |
+| Qwen | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen) |
+| Aquila | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila) |
+| MOSS | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/moss) | |
+| Whisper | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/whisper) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/whisper) |

 ***For more details, please refer to the `bigdl-llm` [Document](https://test-bigdl-llm.readthedocs.io/en/main/doc/LLM/index.html),
 [Readme](python/llm), [Tutorial](https://github.com/intel-analytics/bigdl-llm-tutorial) and [API Doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/LLM/index.html).***
diff --git a/docs/readthedocs/source/index.rst b/docs/readthedocs/source/index.rst
index 8a08d6ccc8e..85a1ad5c16c 100644
--- a/docs/readthedocs/source/index.rst
+++ b/docs/readthedocs/source/index.rst
@@ -27,7 +27,7 @@ Latest update
 - **[New]** ``bigdl-llm`` now supports QLoRA finetuning on Intel GPU; see the example `here `_.
 - ``bigdl-llm`` now supports Intel GPU (including Arc, Flex and MAX); see the latest GPU examples `here `_.
 - ``bigdl-llm`` tutorial is released `here <https://github.com/intel-analytics/bigdl-llm-tutorial>`_.
-- Over 20 models have been verified on ``bigdl-llm``, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, MPT, Falcon, Dolly, StarCoder, Whisper, InternLM, QWen, Baichuan, Aquila, MOSS* and more; see the complete list `here `_.
+- Over 20 models have been verified on ``bigdl-llm``, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, MPT, Falcon, Dolly, StarCoder, Whisper, InternLM, QWen, Baichuan, Aquila, MOSS* and more; see the complete list `here `_.

 ============================================
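
For quick reference, the low-bit loading flow that the `@@ -125,6 +126,31 @@` hunk's context line shows can be sketched as follows. This is a minimal sketch, not part of the change itself: the model path is a placeholder taken from the README snippet, the prompt and `max_new_tokens` are arbitrary, and `"sym_int5"` is assumed here as one of the `load_in_low_bit` format strings documented for `bigdl-llm`.

```python
# Sketch of the "More Low-Bit Support" flow referenced in the README hunk above.
# Assumptions: '/path/to/model/' is a placeholder, and "sym_int5" is one of the
# low-bit formats accepted by `load_in_low_bit` in the bigdl-llm documentation.
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = '/path/to/model/'  # placeholder path, as in the README snippet

# Weights are quantized to the requested low-bit format at load time;
# the returned object is then used like a regular Hugging Face model.
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_low_bit="sym_int5")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

input_ids = tokenizer.encode("What is AI?", return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

For the GPU INT4 path listed in the table of contents, the GPU examples additionally move the loaded model to the XPU device (`model = model.to('xpu')`) after importing `intel_extension_for_pytorch`; see the linked GPU examples for the exact steps.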