Baichuan-7B supports native deployment as well as int8 and int4 quantized deployment; the code is as follows:

```python
import torch
from accelerate import dispatch_model  # required for the multi-GPU dispatch below
from transformers import AutoTokenizer, AutoModelForCausalLM


def auto_configure_device_map(num_gpus: int):
    """Spread the 32 transformer layers of Baichuan-7B evenly across GPUs."""
    num_trans_layers = 32
    per_gpu_layers = num_trans_layers / num_gpus
    device_map = {
        'model.embed_tokens': 0,
        'model.norm': num_gpus - 1,
        'lm_head': num_gpus - 1,
    }
    for i in range(num_trans_layers):
        device_map[f'model.layers.{i}'] = int(i // per_gpu_layers)
    return device_map


MODEL_NAME = "baichuan-inc/baichuan-7B"
# Use 0 (not None) when CUDA is unavailable so the comparisons below don't fail on CPU-only machines.
NUM_GPUS = torch.cuda.device_count() if torch.cuda.is_available() else 0
device_map = auto_configure_device_map(NUM_GPUS) if NUM_GPUS > 0 else None
device = torch.device("cuda") if NUM_GPUS > 0 else torch.device("cpu")
device_dtype = torch.half if NUM_GPUS > 0 else torch.float

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=device_dtype,
    trust_remote_code=True,
).quantize(8)  # int8 quantization; change 8 to 4 for int4. For native (unquantized) deployment, drop .quantize(8).
if device_map is not None:
    model = dispatch_model(model, device_map=device_map)  # shard layers across GPUs
else:
    model = model.to(device)  # CPU fallback (quantization is intended for GPU use)
model = model.eval()
```
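For completeness, a minimal inference sketch using the model loaded above (the prompt is illustrative; Baichuan-7B is a base model, so plain text-completion prompts rather than chat turns are expected):

```python
# Assumes the loading code above has already run.
inputs = tokenizer('登鹳雀楼->王之涣\n夜雨寄北->', return_tensors='pt').to(device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```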
Thanks to the folks in #50 for sharing this valuable approach.
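For quick reference, the three deployment modes mentioned above differ only in the model-loading line; a sketch (the `.quantize()` helper is provided by the model's `trust_remote_code` implementation):

```python
# int8 quantized deployment (as used in the code above)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16, trust_remote_code=True).quantize(8)

# int4 quantized deployment: change 8 to 4
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16, trust_remote_code=True).quantize(4)

# native (unquantized) deployment: omit .quantize() entirely
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16, trust_remote_code=True)
```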