[THUDM/ChatGLM-6B][BUG/Help] Is there a way to print the validation loss during full-parameter fine-tuning?

2024-05-21 102 views
2

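Help needed: I want to see how the model does on the validation set. I set --do_eval in ds_train_finetune.sh, but the code keeps raising an error. How did everyone else solve this?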

Runs normally

Traceback (most recent call last):
  File "main.py", line 431, in <module>
    main()
  File "main.py", line 397, in main
    predict_results = trainer.predict(predict_dataset, metric_key_prefix="predict", max_length=512, do_sample=True, top_p=0.7, temperature=0.95)
  File "/mnt/model/dengshuhao1/workspace/temp/ChatGLM-6B/ptuning/trainer_seq2seq.py", line 136, in predict
    return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/mnt/model/dengshuhao1/workspace/temp/ChatGLM-6B/ptuning/trainer.py", line 3020, in predict
    output = eval_loop(
  File "/mnt/model/dengshuhao1/workspace/temp/ChatGLM-6B/ptuning/trainer.py", line 3232, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "main.py", line 328, in compute_metrics
    scores = rouge.get_scores(' '.join(hypothesis) , ' '.join(reference))
  File "/mnt/model/dengshuhao1/local/envs/THUDM-ChatGLM-6B/lib/python3.8/site-packages/rouge_chinese/rouge.py", line 116, in get_scores
    return self._get_scores(hyps, refs)
  File "/mnt/model/dengshuhao1/local/envs/THUDM-ChatGLM-6B/lib/python3.8/site-packages/rouge_chinese/rouge.py", line 129, in _get_scores
    sc = fn(
  File "/mnt/model/dengshuhao1/local/envs/THUDM-ChatGLM-6B/lib/python3.8/site-packages/rouge_chinese/rouge.py", line 54, in <lambda>
    "rouge-1": lambda hyp, ref, k: rouge_score.rouge_n(hyp, ref, 1, k),
  File "/mnt/model/dengshuhao1/local/envs/THUDM-ChatGLM-6B/lib/python3.8/site-packages/rouge_chinese/rouge_score.py", line 253, in rouge_n
    raise ValueError("Hypothesis is empty.")
ValueError: Hypothesis is empty.

Environment
OS: Ubuntu 20.04
Python: 3.8
Transformers: 4.27.1
PyTorch: 2.0
CUDA Support: True

Answers

9

Or can anyone tell me how to print the training loss at every step and the validation loss at every epoch? Thanks, everyone!

0

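ValueError: Hypothesis is empty. This means the model produced an empty output for at least one sample when predicting on the dev set. Try adjusting max_length, or check whether your inputs contain too much padding.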
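To see how often this happens and to keep the metric loop from aborting, one option is to count the empty predictions and substitute a placeholder before scoring. The sketch below is only an illustration under assumptions: the helper names and the `<empty>` placeholder are hypothetical and not part of the repository, and it uses only the standard library, so the returned strings would still need to be passed to `rouge.get_scores` inside `compute_metrics`.

```python
def safe_rouge_inputs(hypothesis_tokens, reference_tokens, placeholder="<empty>"):
    """Join token lists into space-separated strings for a ROUGE scorer.

    If either side has no non-whitespace tokens, return the placeholder
    instead of an empty string (hypothetical workaround, not repo code).
    """
    hyp = " ".join(t for t in hypothesis_tokens if t.strip())
    ref = " ".join(t for t in reference_tokens if t.strip())
    return hyp or placeholder, ref or placeholder


def count_empty_predictions(decoded_preds):
    """Count decoded predictions that are empty or whitespace-only."""
    return sum(1 for p in decoded_preds if not p.strip())
```

A high count from `count_empty_predictions` suggests the generation settings or the data need attention; the placeholder only hides the symptom.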

7

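When you predicted on the dev set, the output contained empty values. Adjust max_

@luolanfeixue Hmm, do you know how to print the validation loss? This code seems to print only the training loss.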
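For reference, the Hugging Face `Trainer` keeps everything it logs in `trainer.state.log_history`, a list of dicts: training entries carry a `loss` key and evaluation entries carry `eval_loss` when the evaluation loop computes a loss. With the standard `TrainingArguments` flags `--do_eval --evaluation_strategy epoch --logging_steps 1`, both kinds of entries would normally appear. Whether the customized trainer under `ptuning/` computes a loss while generating predictions is something to verify; if `eval_loss` never shows up, the custom `prediction_step` is the likely place to look. The sketch below only shows how to separate the two kinds of records; the function name and the example values are made up for illustration.

```python
def split_losses(log_history):
    """Separate training-loss and evaluation-loss records from a Trainer-style log.

    Returns (train, evals): train is a list of (step, loss) tuples,
    evals is a list of (epoch, eval_loss) tuples.
    """
    train, evals = [], []
    for record in log_history:
        if "loss" in record:
            train.append((record.get("step"), record["loss"]))
        if "eval_loss" in record:
            evals.append((record.get("epoch"), record["eval_loss"]))
    return train, evals


# Illustrative records only, not real training results.
example_log = [
    {"loss": 2.5, "step": 1, "epoch": 0.5},
    {"loss": 2.0, "step": 2, "epoch": 1.0},
    {"eval_loss": 2.1, "step": 2, "epoch": 1.0},
]

train_losses, eval_losses = split_losses(example_log)
for step, loss in train_losses:
    print(f"step {step}: train loss {loss}")
for epoch, loss in eval_losses:
    print(f"epoch {epoch}: eval loss {loss}")
```

In a real run, `example_log` would be replaced by `trainer.state.log_history` after `trainer.train()` finishes.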

0

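Has anyone here managed to fine-tune successfully? I keep running into strange errors. Could you share the packages you are using (the output of conda list)? Also, roughly how much GPU memory is needed, how efficient is it, and are four 3090s enough?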

6

Is there a way to print the validation loss?

5

The question-answer pair data contains empty values. Cleaning the data fixes it.
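A minimal cleaning pass could look like the sketch below. It assumes the data is in JSON Lines form with the `content` and `summary` keys used by the repository's example dataset; those key names are an assumption here and should be changed to match your own prompt and response columns. A field counts as empty if it is missing, `None`, or contains only whitespace.

```python
import json


def is_blank(value):
    """True if the value is missing (None) or contains only whitespace."""
    return value is None or not str(value).strip()


def parse_jsonl_lines(lines):
    """Parse JSON Lines text, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def drop_empty_pairs(records, prompt_key="content", response_key="summary"):
    """Keep only records whose prompt and response are both non-blank.

    Returns (kept_records, number_dropped).
    """
    kept = [
        r for r in records
        if not is_blank(r.get(prompt_key)) and not is_blank(r.get(response_key))
    ]
    return kept, len(records) - len(kept)
```

Printing the number of dropped records before retraining gives a quick check on whether empty pairs were really the cause.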

9

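The question-answer pair data contains empty values. Cleaning the data fixes it.

Does "empty value" here mean whitespace, or that there is no data at all?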