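[THUDM/ChatGLM-6B] After fine-tuning the INT4 model on my own data and deploying it with web_demo for inference, a question waited 190 s in the queue and returned nothing. What could be causing this?

1. Fine-tune the INT4 model on my own data.
2. Deploy the fine-tuned model with web_demo.
3. Run a query and check the result.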

Environment

- OS: CentOS
- Python: 3.10
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

dwq370

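I have exactly the same problem, and I am even using the example data, the advertisement-generation dataset.

The original model answers within seconds, but after fine-tuning, even a question taken from the dataset waits more than 200 seconds with no response, and when a response finally arrives it is empty.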

User: 类型#裤版型#显瘦颜色#黑色风格#简约裤长#九分裤

ChatGLM-6B:

User: 类型#裤材质#羊毛裤长#九分裤*裤口#微喇裤

ChatGLM-6B:

User:

intothephone

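With the original model, the process loads the checkpoint.

With the fine-tuned model, it reports that some parameters cannot be loaded from the checkpoint.

The problem is probably here, but I don't know how to fix it.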
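A warning like the one described above can be narrowed down by comparing the parameter names the model expects with the names stored in the checkpoint. The following is a minimal sketch in plain Python; the function name `diff_state_dict_keys` and the example key names are hypothetical and only illustrate the comparison.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) as sorted lists of parameter names.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names in the checkpoint that the model does not have
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = ["transformer.word_embeddings.weight", "lm_head.weight"]
checkpoint_keys = ["transformer.prefix_encoder.embedding.weight", "lm_head.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

In practice the two inputs would come from something like `model.state_dict().keys()` and the keys of the dictionary that `torch.load(...)` returns for the checkpoint file. A long `missing` list would suggest that the checkpoint holds only part of the model.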

dwq370

Same question here.

hhhparty

I followed the approach in https://www.heywhale.com/mw/project/6436d82948f7da1fee2be59e

Running it produces a new error:

AttributeError: 'ChatGLMModel' object has no attribute 'prefix_encoder'

dwq370

Problem solved.

Add:

    config = AutoConfig.from_pretrained("./ptuning/THUDM/chatglm-6b", trust_remote_code=True)
    config.pre_seq_len = 64
    model = AutoModel.from_pretrained("./ptuning/THUDM/chatglm-6b", config=config, trust_remote_code=True)

As long as pre_seq_len matches the value used in training, it runs.
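Since pre_seq_len has to match the training setting, it can help to recover the training value before building the config. The sketch below assumes the output directory is named like `adgen-chatglm-6b-pt-128-2e-2`, as in a checkpoint path quoted elsewhere in this thread, with the prefix length after `-pt-`. That naming convention is an assumption, and `pre_seq_len_from_output_dir` is a hypothetical helper, not part of the ChatGLM-6B repository.

```python
import re


def pre_seq_len_from_output_dir(path):
    """Return the prefix length encoded as '-pt-<len>-' in the path, or None."""
    match = re.search(r"-pt-(\d+)-", path)
    return int(match.group(1)) if match else None


# Path taken from this thread; the naming pattern is an assumption.
checkpoint_dir = "/root/autodl-tmp/ChatGLM-6B/ptuning/output/adgen-chatglm-6b-pt-128-2e-2/checkpoint-3000"
configured_pre_seq_len = 64  # value about to be written into the config

trained_pre_seq_len = pre_seq_len_from_output_dir(checkpoint_dir)
if trained_pre_seq_len is not None and trained_pre_seq_len != configured_pre_seq_len:
    print(f"pre_seq_len mismatch: trained with {trained_pre_seq_len}, "
          f"config uses {configured_pre_seq_len}")
```

If the directory name does not follow this pattern, the helper returns None and the training script or its logged arguments remain the place to look.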

dwq370

Hello, I ran into a similar situation. I did what you described above, but the reply is still empty. How did you solve it? My code is as follows:

    import os
    import torch
    from transformers import AutoConfig, AutoModel, AutoTokenizer

    CheckPoint_Path = "/root/autodl-tmp/ChatGLM-6B/ptuning/output/adgen-chatglm-6b-pt-128-2e-2/checkpoint-3000"

    # Load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained("/root/autodl-tmp/modal/chatglm-6b", trust_remote_code=True)
    config = AutoConfig.from_pretrained("/root/autodl-tmp/modal/chatglm-6b", trust_remote_code=True, pre_seq_len=128)
    model = AutoModel.from_pretrained(CheckPoint_Path, config=config, trust_remote_code=True)

    # Keep only the PrefixEncoder weights and strip the key prefix
    prefix_state_dict = torch.load(os.path.join(CheckPoint_Path, "pytorch_model.bin"))
    new_prefix_state_dict = {}
    for k, v in prefix_state_dict.items():
        if k.startswith("transformer.prefix_encoder."):
            new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
    model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

    model = model.quantize(4)
    model = model.half().cuda()
    model.transformer.prefix_encoder.float()
    model = model.eval()

    response, history = model.chat(tokenizer, "你好", history=[])
    print(response)

lanchongdashygo

Change the third line, model = AutoModel.from_pretrained(CheckPoint_Path, config=config, trust_remote_code=True), to model = AutoModel.from_pretrained("/root/autodl-tmp/modal/chatglm-6b", config=config, trust_remote_code=True), and it will work.

dwq370

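@dwq370 Your case loads only the new checkpoint (which contains just the PrefixEncoder parameters). The original poster's problem is that loading the old checkpoint (which contains both ChatGLM-6B and the PrefixEncoder parameters) produces no output. I have the same problem.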
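The distinction between the two checkpoint kinds can be checked from the parameter names alone. The sketch below is a plain-Python illustration: the `transformer.prefix_encoder.` key prefix is taken from the code posted in this thread, while `split_checkpoint`, `is_prefix_only`, and the example keys and values are hypothetical.

```python
PREFIX = "transformer.prefix_encoder."


def split_checkpoint(state_dict):
    """Split a state dict into (prefix_encoder_part, other_part).

    Keys in prefix_encoder_part have PREFIX stripped, so they match the
    names a standalone prefix encoder module would expect.
    """
    prefix_part, other_part = {}, {}
    for name, value in state_dict.items():
        if name.startswith(PREFIX):
            prefix_part[name[len(PREFIX):]] = value
        else:
            other_part[name] = value
    return prefix_part, other_part


def is_prefix_only(state_dict):
    """True if the checkpoint holds PrefixEncoder weights and nothing else."""
    prefix_part, other_part = split_checkpoint(state_dict)
    return bool(prefix_part) and not other_part


# Placeholder values stand in for tensors; key names are illustrative.
new_style = {"transformer.prefix_encoder.embedding.weight": "tensor"}
old_style = {
    "transformer.prefix_encoder.embedding.weight": "tensor",
    "lm_head.weight": "tensor",
}

print(is_prefix_only(new_style))
print(is_prefix_only(old_style))
```

For a prefix-only checkpoint, the base model would be loaded from the original ChatGLM-6B directory and the first returned dictionary passed to the prefix encoder, as the replies above describe. For a full checkpoint, a different loading path is presumably needed.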

DanLiu0623

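With the original model, the process loads the checkpoint.

With the fine-tuned model, it reports that some parameters cannot be loaded from the checkpoint.

The problem is probably here, but I don't know how to fix it.

I loaded the fine-tuned checkpoint directly and got no response, along with a message that some parameters could not be loaded. Did you find the cause?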

DanLiu0623

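My guess is that with the PrefixEncoder training approach, you need to load the source model first and then load the PrefixEncoder. If you want to load the fine-tuned checkpoint directly, try removing the pre_seq_len parameter during training. If compute resources are insufficient, you may need to use DeepSpeed for fine-tuning.

DeepSpeed always fails on my machine, so I can only use the PrefixEncoder approach.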

dwq370

Problem solved.

Add:

    config = AutoConfig.from_pretrained("./ptuning/THUDM/chatglm-6b", trust_remote_code=True)
    config.pre_seq_len = 64
    model = AutoModel.from_pretrained("./ptuning/THUDM/chatglm-6b", config=config, trust_remote_code=True)

As long as pre_seq_len matches the value used in training, it runs.

Is the model you deployed here the one produced by training?

254288008
