I'd like to ask those of you who have already fine-tuned: roughly how many training examples does P-tuning need before the results are reasonably good?
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
I tried about 20k poetry-generation examples and the results were already quite good.
How do you deal with the forgetting problem?
After training with the standard P-tuning code, catastrophic forgetting is severe.
```python
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

CHECKPOINT_PATH = "output/adgen-chatglm-6b-pt-128-2e-2/checkpoint-3000"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("chatglm6b", trust_remote_code=True)

# Load the base model with the same prefix length used during P-tuning
config = AutoConfig.from_pretrained("chatglm6b", trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained("chatglm6b", config=config, trust_remote_code=True)

# Load only the trained prefix-encoder weights from the checkpoint
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

print("Quantized to 4 bit")
model = model.quantize(4)
model = model.half().cuda()
model.transformer.prefix_encoder.float()
model = model.eval()

# Ask a cooking question ("How do you make ciba fish?")
response, history = model.chat(tokenizer, "糍粑鱼怎么做", history=[])
print(response, history)
```
The output is baffling (a cooking question gets answered with ad copy about a skirt):

('采用材料,使糍粑鱼表面变得光滑,让整条裙子看起来更加美观,同时能够彰显你的个性。整体设计简洁大方,让裙子看起来十分大气。', [('糍粑鱼怎么做', '采用材料,使糍粑鱼表面变得光滑,让整条裙子看起来更加美观,同时能够彰显你的个性。整体设计简洁大方,让裙子看起来十分大气。')])
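For what it's worth, one way to confirm that the degradation comes from the trained prefix rather than from quantization is to run the same prompt through the base model without loading the prefix encoder. A minimal sketch, reusing the tokenizer and model path from the snippet above (the comparison itself is my suggestion, not something from this thread):

```python
# Load the base model WITHOUT pre_seq_len / prefix weights and ask the same question.
base_model = AutoModel.from_pretrained("chatglm6b", trust_remote_code=True)
base_model = base_model.quantize(4).half().cuda().eval()

base_response, _ = base_model.chat(tokenizer, "糍粑鱼怎么做", history=[])
print(base_response)  # expected: a normal cooking answer, unlike the ad-style output above
```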
May I ask how you set max_target_length and the text lengths during training?
Fine-tuning with P-tuning tends to make the model only answer the task it was trained on, and its answers on other tasks get worse; you could try LoRA fine-tuning instead.
Has anyone tried the LoRA fine-tuning approach? Is the logic iterative?
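For anyone who wants to try the LoRA route, a minimal sketch using the peft library (the library choice and all hyperparameter values below are illustrative assumptions, not something taken from this thread):

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

# Load the base ChatGLM-6B model (same local path as in the P-tuning example above)
tokenizer = AutoTokenizer.from_pretrained("chatglm6b", trust_remote_code=True)
model = AutoModel.from_pretrained("chatglm6b", trust_remote_code=True).half().cuda()

# Wrap the model with LoRA adapters: only the small low-rank matrices are trained,
# the original weights stay frozen, which usually limits (but does not eliminate) forgetting.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                 # rank of the low-rank update (illustrative value)
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # attention projection module in ChatGLM-6B
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # should report well under 1% of parameters as trainable
```

After wrapping, training proceeds with a normal Trainer or a hand-written loop, and at inference time only the small adapter weights need to be saved and reloaded on top of the base model.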
LoRA has the same problem: link
How did you set the parameters in your train.sh? I tried 1000 examples and only changed the lr to 2e-5, leaving the other parameters unchanged, but the model learned basically nothing and can only answer with its original knowledge.