[PaddlePaddle/PaddleOCR] PP-OCRv3 text detection training: teacher model training does not generate the best_accuracy folder

2024-05-14 317 views
  • System Environment: ubuntu20.04
  • Version: Paddle: PaddleOCR: Related components: paddle 1.0.2, paddle-bfloat 0.1.2, paddle2onnx 0.9.7, paddlefsl 1.1.0, paddlenlp 2.4.1, paddleocr 2.6.1.2, paddlepaddle-gpu 2.4.1.post116, pandas 1.1.5, pandocfilters 1.5.0
  • Command Code:
    # Single-GPU training
    python3 tools/train.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_dml_zzszyfp.yml \
    -o Architecture.Models.Student.pretrained=./pretrain_models/ResNet50_vd_ssld_pretrained \
    Architecture.Models.Student2.pretrained=./pretrain_models/ResNet50_vd_ssld_pretraine

    Config file ch_PP-OCRv3_det_dml_zzszyfp: I only reduced epoch_num and eval_batch_step to 120, because 1200 takes too long.

    
    Global:
      use_gpu: true
      #epoch_num: 1200
      epoch_num: 120
      log_smooth_window: 20
      print_batch_step: 2
      save_model_dir: /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/model/det/output/ch_db_mv3/
      #save_epoch_step: 1200
      save_epoch_step: 1200
      # evaluation is run every 5000 iterations after the 4000th iteration
      # eval_batch_step: [3000, 2000]
      eval_batch_step: [120, 120]
      cal_metric_during_train: False
      pretrained_model: ./pretrain_models/MobileNetV3_large_x0_5_pretrained
      checkpoints:
      save_inference_dir:
      use_visualdl: False
      infer_img: doc/imgs_en/img_10.jpg
      save_res_path: ./output/det_db/predicts_db.txt

    Architecture:
      name: DistillationModel
      algorithm: Distillation
      model_type: det
      Models:
        Student:
          return_all_feats: false
          model_type: det
          algorithm: DB
          Backbone:
            name: ResNet_vd
            in_channels: 3
            layers: 50
          Neck:
            name: LKPAN
            out_channels: 256
          Head:
            name: DBHead
            kernel_list: [7,2,2]
            k: 50
        Student2:
          return_all_feats: false
          model_type: det
          algorithm: DB
          Backbone:
            name: ResNet_vd
            in_channels: 3
            layers: 50
          Neck:
            name: LKPAN
            out_channels: 256
          Head:
            name: DBHead
            kernel_list: [7,2,2]
            k: 50

    Loss:
      name: CombinedLoss
      loss_config_list:
      - DistillationDMLLoss:
          model_name_pairs:
          - ["Student", "Student2"]
          maps_name: "thrink_maps"
          weight: 1.0
          # act: None
          model_name_pairs: ["Student", "Student2"]
          key: maps
      - DistillationDBLoss:
          weight: 1.0
          model_name_list: ["Student", "Student2"]
          # key: maps
          name: DBLoss
          balance_loss: true
          main_loss_type: DiceLoss
          alpha: 5
          beta: 10
          ohem_ratio: 3

    Optimizer:
      name: Adam
      beta1: 0.9
      beta2: 0.999
      lr:
        name: Cosine
        learning_rate: 0.001
        warmup_epoch: 2
      regularizer:
        name: 'L2'
        factor: 0

    PostProcess:
      name: DistillationDBPostProcess
      model_name: ["Student", "Student2"]
      key: head_out
      thresh: 0.3
      box_thresh: 0.6
      max_candidates: 1000
      unclip_ratio: 1.5

    Metric:
      name: DistillationMetric
      base_metric_name: DetMetric
      main_indicator: hmean
      key: "Student"

    Train:
      dataset:
        name: SimpleDataSet
        data_dir: /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det
        label_file_list:
          - /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det/train.txt
        ratio_list: [1.0]
        transforms:
          - DecodeImage: # load image
              img_mode: BGR
              channel_first: False
          - DetLabelEncode: # Class handling label
          - CopyPaste:
          - IaaAugment:
              augmenter_args:
                - { 'type': Fliplr, 'args': { 'p': 0.5 } }
                - { 'type': Affine, 'args': { 'rotate': [-10, 10] } }
                - { 'type': Resize, 'args': { 'size': [0.5, 3] } }
          - EastRandomCropData:
              size: [960, 960]
              max_tries: 50
              keep_ratio: true
          - MakeBorderMap:
              shrink_ratio: 0.4
              thresh_min: 0.3
              thresh_max: 0.7
          - MakeShrinkMap:
              shrink_ratio: 0.4
              min_text_size: 8
          - NormalizeImage:
              scale: 1./255.
              mean: [0.485, 0.456, 0.406]
              std: [0.229, 0.224, 0.225]
              order: 'hwc'
          - ToCHWImage:
          - KeepKeys:
              keep_keys: ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask'] # the order of the dataloader list
      loader:
        shuffle: True
        drop_last: False
        # batch_size_per_card: 8
        batch_size_per_card: 2
        num_workers: 4

    Eval:
      dataset:
        name: SimpleDataSet
        data_dir: /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det
        label_file_list:
          - /home/DiskA/zncsPython/picture_ocr/zzszyfp_v1/split_data/det/test.txt
        transforms:
          - DecodeImage: # load image
              img_mode: BGR
              channel_first: False
          - DetLabelEncode: # Class handling label
          - DetResizeForTest:
              image_shape: [736, 1280]
          - NormalizeImage:
              scale: 1./255.
              mean: [0.485, 0.456, 0.406]
              std: [0.229, 0.224, 0.225]
              order: 'hwc'
          - ToCHWImage:
          - KeepKeys:
              keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
      loader:
        shuffle: False
        drop_last: False
        batch_size_per_card: 1 # must be 1
        num_workers: 2


- Complete Error Message:
- The model does not generate best_accuracy
![image](https://user-images.githubusercontent.com/32863094/216932790-70cd4ab9-f092-4772-bcc2-d24e1e3d05a3.png)
![image](https://user-images.githubusercontent.com/32863094/216932791-46d33189-8a71-42a4-b8ab-29d52a517a76.png)

Answers

Change eval_batch_step: [120, 120] to [0, 20].

Re "change eval_batch_step: [120, 120] to [0, 20]": it is still not generated.


eval_batch_step

@LDOUBLEV

After the change, did eval run during training?

@LDOUBLEV It doesn't look like it.

Right now I am only training the teacher model. I really don't have any test data. I'll add some and try again.
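
To make the diagnosis in this exchange concrete, here is a minimal sketch of when an evaluate-during-training check would fire. It assumes the training loop treats `eval_batch_step` as `[start_step, interval]` and skips evaluation entirely when the eval set is empty; the function names are hypothetical and this is a simplified model, not PaddleOCR's actual code.

```python
def should_evaluate(global_step, eval_batch_step, num_eval_batches):
    """Simplified, assumed model of the evaluate-during-training condition."""
    start_step, interval = eval_batch_step
    if num_eval_batches == 0:
        # Assumption: with an empty eval set, evaluation never runs,
        # so no best_accuracy checkpoint can be written.
        return False
    return global_step > start_step and (global_step - start_step) % interval == 0


def eval_steps(total_steps, eval_batch_step, num_eval_batches):
    """List every training step at which evaluation would run."""
    return [
        step
        for step in range(1, total_steps + 1)
        if should_evaluate(step, eval_batch_step, num_eval_batches)
    ]
```

Under these assumptions, `[0, 20]` evaluates every 20 steps from the start, `[120, 120]` first evaluates at step 240, and an empty test set yields no evaluation at all regardless of the setting, which would match the behaviour reported here.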

For PP-OCRv3 detection training, once the student has been trained, is it enough to export best_accuracy as the finetune model?

@LDOUBLEV How do I use the extracted Teacher structure during training?

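After training the teacher model with the DML strategy, extract the teacher model and use it as the pretrained model for the Teacher in the CML config.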
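
The extraction step mentioned above can be sketched as follows. This assumes the DML checkpoint is a flat dict whose keys are prefixed with the sub-model name (`Student.` or `Student2.`) and that either sub-model can serve as the teacher; the helper names and any file paths passed in are hypothetical.

```python
def extract_submodel(all_params, prefix="Student."):
    """Return the entries under `prefix`, with the prefix stripped from each key."""
    return {
        key[len(prefix):]: value
        for key, value in all_params.items()
        if key.startswith(prefix)
    }


def extract_and_save(src_path, dst_path, prefix="Student."):
    """Load a DML checkpoint, keep one sub-model, and save it as teacher weights."""
    # paddle is imported lazily so extract_submodel stays usable without it.
    import paddle

    all_params = paddle.load(src_path)  # e.g. a best_accuracy.pdparams file
    paddle.save(extract_submodel(all_params, prefix), dst_path)
```

The saved file would then be passed to the CML run as the Teacher's pretrained weights.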

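Re "after training the teacher model with the DML strategy, extract it as the pretrained Teacher for the CML config":

Yes. I have also trained the student model. But I don't understand how this connects to step 3, "finetune training based on PP-OCRv3 detection". In actual use it is the Teacher model that gets used, right?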


@LDOUBLEV

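The PP-OCRv3 distillation detection model is trained in these steps:

  1. Train a larger, more accurate model, the Teacher, using the DML config;
  2. Use the Teacher from step 1 as the teacher model for CML distillation, which guides the training of a Student model;
  3. Obtain a Student model that is accurate and smaller.

If your deployment has no constraints on the model, the Teacher model from step 1 (DML) is enough, and it is also more accurate.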
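
As a rough illustration of how these steps chain together, the sketch below only assembles the command lines and does not run them. It assumes the stock `ch_PP-OCRv3_det_dml.yml` and `ch_PP-OCRv3_det_cml.yml` configs and the `tools/train.py` / `tools/export_model.py` entry points; the function name and every directory argument are hypothetical placeholders.

```python
CFG_DIR = "configs/det/ch_PP-OCRv3"


def build_commands(teacher_weights, dml_dir, cml_dir, infer_dir):
    """Assemble (not run) the commands for DML training, CML training, and export."""
    dml_cfg = f"{CFG_DIR}/ch_PP-OCRv3_det_dml.yml"
    cml_cfg = f"{CFG_DIR}/ch_PP-OCRv3_det_cml.yml"
    return {
        # Step 1: train the large Teacher with the DML config.
        "train_teacher_dml": [
            "python3", "tools/train.py", "-c", dml_cfg,
            "-o", f"Global.save_model_dir={dml_dir}",
        ],
        # Step 2: CML distillation, with the extracted Teacher weights as input.
        "train_student_cml": [
            "python3", "tools/train.py", "-c", cml_cfg,
            "-o", f"Architecture.Models.Teacher.pretrained={teacher_weights}",
            f"Global.save_model_dir={cml_dir}",
        ],
        # Optional: export the DML result directly if model size is not a concern.
        "export_teacher": [
            "python3", "tools/export_model.py", "-c", dml_cfg,
            "-o", f"Global.pretrained_model={dml_dir}/best_accuracy",
            f"Global.save_inference_dir={infer_dir}",
        ],
    }
```

Each list can be handed to `subprocess.run` from the PaddleOCR repository root once the placeholder paths are replaced with real ones.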
OK. So I only need to export the Teacher model from step 1 (DML) as an inference model, and then I can pass it to the KIE prediction model, right?


@LDOUBLEV