https://github.com/THUDM/ChatGLM-6B/issues/6
Hardware specifications are as follows:
MacBook Pro
Model Identifier: Mac14,10
Chip: Apple M2 Pro
Total Number of Cores: 12 (8 performance and 4 efficiency)
Memory: 32 GB
My steps are as follows:
1. Download the model and modify the code
```shell
brew install git-lfs
# Clone to the `chatglm` folder
git clone https://huggingface.co/THUDM/chatglm-6b chatglm
cd chatglm
git lfs install
# This will take a long time
git lfs pull
```
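If `git lfs pull` is skipped or interrupted, the large model files in the clone are still small text pointer stubs rather than real weights, and loading will fail later. A quick sanity check, as a sketch (`is_lfs_pointer` is a hypothetical helper, not part of the repo; the pointer header is the standard git-lfs one):

```python
def is_lfs_pointer(data: bytes) -> bool:
    """Return True if the file content is a git-lfs pointer stub,
    i.e. `git lfs pull` has not yet replaced it with the real file."""
    return data.startswith(b"version https://git-lfs.github.com/spec/v1")

# Usage sketch: inspect the first bytes of one of the weight shards.
# with open("pytorch_model-00001-of-00008.bin", "rb") as f:
#     print(is_lfs_pointer(f.read(64)))  # True -> weights were not pulled
```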
Modify `modeling_chatglm.py` and comment out the following two lines:
```diff
--- a/modeling_chatglm.py
+++ b/modeling_chatglm.py
@@ -1166,6 +1166,6 @@ class ChatGLMForConditionalGeneration(ChatGLMPreTrainedModel):
         return torch.tensor(return_seqs, dtype=torch.long, device=kwargs['input_ids'].device)

     def quantize(self, bits: int):
-        from .quantization import quantize
-        self.transformer = quantize(self.transformer, bits)
+        # from .quantization import quantize
+        # self.transformer = quantize(self.transformer, bits)
```
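Commenting the lines out works, but the same effect can be had without editing the method body each time the model updates: let the call degrade gracefully when the CUDA-only quantization kernels cannot be imported. A minimal sketch (`safe_quantize` is a name I made up, not part of the repo):

```python
def safe_quantize(model, bits: int):
    """Quantize the model if the quantization kernels are importable;
    otherwise (e.g. on Apple Silicon, where they are CUDA-only)
    return the model unchanged."""
    try:
        from quantization import quantize  # assumption: CUDA-only module
    except ImportError:
        return model  # no kernels available: run unquantized
    return quantize(model, bits)
```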
2. Modify the demo code
```shell
git clone https://github.com/THUDM/ChatGLM-6B.git
cd ChatGLM-6B
pip install -r requirements.txt
pip install gradio
```
Modify `web_demo.py`:
```diff
--- a/web_demo.py
+++ b/web_demo.py
@@ -1,8 +1,8 @@
 from transformers import AutoModel, AutoTokenizer
 import gradio as gr

-tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
-model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
+tokenizer = AutoTokenizer.from_pretrained("../chatglm", trust_remote_code=True)
+model = AutoModel.from_pretrained("../chatglm", trust_remote_code=True).float()
 model = model.eval()
```
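The logic behind this diff: the upstream demo assumes a CUDA GPU and loads the weights in half precision, but on a machine without CUDA the model must stay on CPU in float32 (many CPU ops do not support float16 well). That choice can be sketched as a pure function (`pick_device_and_dtype` is hypothetical; in practice the condition would come from `torch.cuda.is_available()`):

```python
def pick_device_and_dtype(has_cuda: bool):
    """Mirror the demo's two configurations: half precision on a CUDA GPU,
    float32 on CPU (the Apple M2 Pro here has no CUDA)."""
    if has_cuda:
        return "cuda", "float16"
    return "cpu", "float32"

# In web_demo.py this corresponds to:
#   model = model.half().cuda() if has_cuda else model.float()
```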
Then run:

```shell
python web_demo.py
```
After successful execution, it will display:
```
Running on local URL: http://127.0.0.1:7860
```