Contents of evaluate.sh:

```sh
PRE_SEQ_LEN=128
CHECKPOINT=viewgen0421-chatglm-6b-pt-128-2e-2
STEP=5000

CUDA_VISIBLE_DEVICES=1 python3 main.py \
    --do_predict \
    --validation_file /home/workspace/data/dev.json \
    --test_file /home/workspace/data/dev.json \
    --overwrite_cache \
    --prompt_column content \
    --response_column summary \
    --model_name_or_path /home/workspace/chatglm/chatglm-6B \
    --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
    --output_dir ./output/$CHECKPOINT \
    --overwrite_output_dir \
    --max_source_length 512 \
    --max_target_length 512 \
    --per_device_eval_batch_size 1 \
    --predict_with_generate \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4
```
The log prints a warning:

```
Input length of input_ids is 512, but max_length is set to 512. This can lead to unexpected behavior. You should consider increasing max_new_tokens.
```
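Presumably this is the `max_length` vs. `max_new_tokens` distinction in `transformers`: `max_length` caps the *total* sequence (prompt plus generation), so a 512-token prompt with `max_length=512` leaves zero budget for new tokens. A minimal sketch of the semantics (gpt2 is just a small stand-in model here, not ChatGLM):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("a prompt", return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# max_length caps prompt + generated tokens together, so the model may emit
# at most (max_length - prompt_len) new tokens; with prompt_len == max_length
# (the 512/512 case in the warning) that budget is zero.
out = model.generate(**inputs, max_length=prompt_len + 8)

# max_new_tokens caps only the generated part, independent of prompt length,
# which is why the warning recommends it.
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
```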
This in turn makes `rouge.get_scores` fail with `ValueError: Hypothesis is empty.` at https://github.com/THUDM/ChatGLM-6B/blob/aeced3619b804d20d2396576f6d5bc8dc8226913/ptuning/main.py#L328
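For reference, the error comes from feeding an empty hypothesis into ROUGE; a minimal reproduction (assuming the `rouge_chinese` package that `ptuning/main.py` imports; the plain `rouge` package behaves the same way):

```python
from rouge_chinese import Rouge

rouge = Rouge()
hypothesis = ""               # what you get after decoding zero generated tokens
reference = "一 个 参考 摘要"  # ROUGE expects whitespace-tokenized strings
rouge.get_scores(hypothesis, reference)
# -> ValueError: Hypothesis is empty.
```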
Changing `max_length` to 1025 fixes the problem: https://github.com/THUDM/ChatGLM-6B/blob/aeced3619b804d20d2396576f6d5bc8dc8226913/ptuning/main.py#L397
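Roughly, the change is at the linked `trainer.predict` call in `ptuning/main.py` (a sketch, not the verbatim code; other arguments to the call are elided):

```python
# Sketch of the workaround at the linked line; the exact call in main.py
# may carry additional generation kwargs.
predict_results = trainer.predict(
    predict_dataset,
    metric_key_prefix="predict",
    # 1025 = max_source_length (512) + max_target_length (512) + 1,
    # so the 512-token prompt no longer consumes the entire budget.
    max_length=1025,
)
```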
What is the cause of this?
It can be triggered with the evaluate.sh arguments `--max_source_length 512 --max_target_length 512`.
Environment
- OS: CentOS 8
- Python: 3.9
- Transformers: 4.26.1
- PyTorch: 1.12
- CUDA Support: True