1. When loading the model, using float() works fine, but using to("mps") raises an error:

```python
model = AutoModel.from_pretrained("chatglm-6b-int4", trust_remote_code=True).float()
```

```
File "/Users/diaojunxian/anaconda3/envs/py3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _call_impl
  return forward_call(*args, **kwargs)
File "/Users/diaojunxian/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization.py", line 392, in forward
  output = W8A16Linear.apply(input, self.weight, self.weight_scale, self.weight_bit_width)
File "/Users/diaojunxian/anaconda3/envs/py3.11/lib/python3.11/site-packages/torch/autograd/function.py", line 506, in apply
  return super().apply(*args, **kwargs)  # type: ignore[misc]
File "/Users/diaojunxian/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization.py", line 57, in forward
  weight = extract_weight_to_half(quant_w, scale_w, weight_bit_width)
File "/Users/diaojunxian/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization.py", line 275, in extract_weight_to_half
  func = kernels.int4WeightExtractionHalf
AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionHalf'
```
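A likely reading of the traceback above: `kernels` is `None` inside `quantization.py`, meaning the custom int4 dequantization kernels never loaded. Those kernels require CUDA, which an Apple M2 does not have, so any path that calls `int4WeightExtractionHalf` (half precision, or moving the int4 model to "mps") fails, while `.float()` avoids them. This is a hypothetical sketch of that decision logic, not code from the ChatGLM-6B repo:

```python
# Hypothetical helper (not part of ChatGLM-6B): choose how to load the
# int4-quantized checkpoint given the available hardware.
def pick_load_mode(cuda_available: bool) -> str:
    if cuda_available:
        # The custom int4 CUDA kernels can dequantize weights on the fly.
        return "half().cuda()"
    # Without CUDA the kernels fail to load (kernels is None), so the
    # only safe path is full-precision CPU weights via .float().
    return "float()"

print(pick_load_mode(False))  # on Apple Silicon -> float()
```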
2. Inference with the original (unquantized) model works on the M2, but when actually running fine-tuning there is no usable GPU; I wanted to try torch.device("mps"), but it seems it cannot run?
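Before attempting fine-tuning on MPS, it is worth confirming the backend is actually usable at all (the MPS backend exists in PyTorch since 1.12 and requires macOS 12.3+ on Apple Silicon). A minimal check, with `select_device` being an illustrative helper of my own, assuming nothing about the training scripts:

```python
# Illustrative device-selection sketch; `select_device` is a hypothetical
# helper, not part of ChatGLM-6B or its ptuning scripts.
def select_device(mps_available: bool) -> str:
    return "mps" if mps_available else "cpu"

try:
    import torch
    # Real PyTorch API: True only when the MPS backend is built into this
    # torch install and the current machine supports it.
    mps_ok = torch.backends.mps.is_available()
except ImportError:  # torch not installed in this environment
    mps_ok = False

print(select_device(mps_ok))
```

Note that even when this prints "mps", the int4 checkpoint still cannot move to MPS, because its dequantization kernels are CUDA-only.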
3. The steps below have already been tried, without success:
![image](https://github.com/THUDM/ChatGLM-6B/assets/19700467/a3bac13b-2933-4b3c-b5e8-bf7875c2307f)
### Problems encountered during fine-tuning
1. Downloaded the chatglm-6B-int4 model and fine-tuned on top of it;
2. Runtime environment:
   Apple Mac M2, torch==2.1.0.dev20230507
3. Running bash train.sh fails with an error:
```
Traceback (most recent call last):
  File "/Users/diaojunxian/Documents/agi/ChatGLM-6B/ptuning/main.py", line 433, in
```