First, I trained the `chatglm-6b-int4` model with `train_chat.sh`. Then I tried to load the fine-tuned model following the method described at https://github.com/THUDM/ChatGLM-6B/tree/main/ptuning#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2. When executing `model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)`, I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PrefixEncoder:
        size mismatch for embedding.weight: copying a param with shape torch.Size([8, 229376]) from checkpoint, the shape in current model is torch.Size([128, 229376]).
The full interactive session that reproduces the error:

root@VM-32-16-ubuntu:/mnt/nfs/ChatGLM-6B/ptuning# python3
Python 3.10.6 (main, Nov 2 2022, 18:53:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import torch
>>> from transformers import AutoConfig, AutoModel, AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("/root/chatglm-6b-int4", trust_remote_code=True)
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
>>> config = AutoConfig.from_pretrained("/root/chatglm-6b-int4", trust_remote_code=True, pre_seq_len=128)
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
>>> model = AutoModel.from_pretrained("/root/chatglm-6b-int4", config=config, trust_remote_code=True)
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
No compiled kernel found.
Compiling kernels : /root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization_kernels_parallel.c -shared -o /root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization_kernels_parallel.so
Load kernel : /root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 6
Using quantization cache
Applying quantization to glm layers
Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /root/chatglm-6b-int4 and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
>>> prefix_state_dict = torch.load("/mnt/nfs/chatglm_checkpoint/checkpoint-40/pytorch_model.bin")
>>> new_prefix_state_dict = {}
>>> for k, v in prefix_state_dict.items():
...     if k.startswith("transformer.prefix_encoder."):
...         new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
...
>>> new_prefix_state_dict
{'embedding.weight': tensor([[ 1.9268, 1.4873, 0.9009, ..., -0.9419, -0.3421, -0.3513],
[-0.4839, -0.1821, 1.0518, ..., 0.8515, -2.6099, -0.2716],
[ 0.6895, -0.0231, -0.3374, ..., -1.5180, -0.3101, 1.9832],
...,
[ 0.2471, -0.4341, 0.2673, ..., -0.4657, -0.3695, 0.4011],
[-0.2043, -0.4939, -1.4922, ..., -0.0732, -0.6814, -2.1821],
[ 1.5078, 1.1973, -0.9023, ..., 0.3872, -0.8471, 0.8122]],
device='cuda:0')}
>>> model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PrefixEncoder:
        size mismatch for embedding.weight: copying a param with shape torch.Size([8, 229376]) from checkpoint, the shape in current model is torch.Size([128, 229376]).
>>>
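For what it's worth, my reading of the mismatch (`torch.Size([8, 229376])` in the checkpoint vs. `torch.Size([128, 229376])` in the model) is that the checkpoint was trained with a prefix length of 8, while I am loading with `pre_seq_len=128`. Below is a minimal sketch of the same loading steps with the value matched to the checkpoint; `pre_seq_len=8` is an assumption inferred purely from the checkpoint shape, not something I have verified against my `train_chat.sh`:

```python
import torch
from transformers import AutoConfig, AutoModel

# ASSUMPTION: pre_seq_len=8, inferred from the checkpoint's embedding shape
# torch.Size([8, 229376]); it must match the PRE_SEQ_LEN used at training time.
config = AutoConfig.from_pretrained("/root/chatglm-6b-int4", trust_remote_code=True, pre_seq_len=8)
model = AutoModel.from_pretrained("/root/chatglm-6b-int4", config=config, trust_remote_code=True)

# Same prefix-encoder weight extraction as in the session above.
prefix_state_dict = torch.load("/mnt/nfs/chatglm_checkpoint/checkpoint-40/pytorch_model.bin")
new_prefix_state_dict = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
```

If that is indeed the cause, the `PRE_SEQ_LEN` used by the training script and the `pre_seq_len` passed to `AutoConfig.from_pretrained` simply need to stay consistent.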
Environment
- OS: Ubuntu 22.04
- Python: Python 3.10.6
- Transformers: 4.27.1
- PyTorch: 2.0.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True