[PaddlePaddle/PaddleOCR] Reproducing the DocTr++ document rectification network paper

Background

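After collecting requirements (https://github.com/PaddlePaddle/PaddleOCR/issues/10334) and discussing them at the weekly technical meeting (https://github.com/PaddlePaddle/PaddleOCR/issues/10223), we settled on the DocTr++ document rectification task. Rectification matters in important scenarios such as document comparison, keyword extraction, and contract tampering verification. Completing this task should noticeably improve the granularity of OCR results, and it has many practical applications. Through quantitative experiments and qualitative comparisons, the paper's authors verified the performance advantages and generalization ability of DocTr++ and set new best results on several existing and newly proposed benchmarks, which makes it the strongest document rectification approach reported so far. Pretrained weights and training code are not available yet, so the model has to be retrained following the description in the paper.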

Steps

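Port the network architecture and evaluation metrics from the open-source code. Code: https://github.com/fh2019ustc/DocTr-Plus
Following the paper reproduction guide, perform forward and backward alignment and related checks until the metrics in Table 1 of the paper are reached.
Submit a code PR to ppocr following the PR submission guidelines.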
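Forward alignment in the second step generally means feeding the same input to the reference model and to the ported model, then checking that the outputs agree within a small tolerance. A minimal sketch of such a check follows, written in plain Python on nested lists so it runs without any framework. The helper names `max_abs_diff` and `is_aligned` and the tolerance `1e-5` are illustrative assumptions, not part of the reproduction guide or of either codebase.

```python
def max_abs_diff(ref, out):
    """Largest element-wise absolute difference between two nested lists.

    Hypothetical helper: in practice `ref` and `out` would be model outputs
    converted to nested lists (for example with a tensor's `tolist()`).
    """
    ref_is_seq = isinstance(ref, (list, tuple))
    out_is_seq = isinstance(out, (list, tuple))
    if ref_is_seq != out_is_seq:
        raise ValueError("structure mismatch between reference and output")
    if ref_is_seq:
        if len(ref) != len(out):
            raise ValueError("shape mismatch between reference and output")
        # Recurse into each pair of sub-lists; an empty list contributes 0.
        return max((max_abs_diff(r, o) for r, o in zip(ref, out)), default=0.0)
    return abs(ref - out)


def is_aligned(ref, out, atol=1e-5):
    """True when every element of `out` is within `atol` of `ref`.

    The default tolerance is an assumed value; pick one that suits the model.
    """
    return max_abs_diff(ref, out) <= atol
```

A check of this kind is usually run layer by layer first, so that a mismatch can be traced to the block that introduced it rather than only showing up in the final output.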

Datasets:

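Training dataset: obtain the Doc3D dataset, then crop around the document edges so that the images fall into the three categories described in the paper (all boundaries included, some boundaries included, no boundaries included).
Validation dataset: the DocUNet dataset.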
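The three-way cropping described for the training set can be sketched as below. This is only an illustration under stated assumptions, not the paper's procedure: it approximates the document region by an axis-aligned bounding box (a real pipeline would more likely derive it from the Doc3D ground truth), and the names `sample_crop`, `classify_crop`, and the mode strings are hypothetical. A side counts as "cut" when the crop edge falls strictly inside the document box.

```python
import random

SIDES = ("left", "top", "right", "bottom")


def sample_crop(img_w, img_h, doc_box, mode, rng=random):
    """Sample a crop box (x0, y0, x1, y1) for one of three cases.

    doc_box: assumed axis-aligned document bounding box (x0, y0, x1, y1).
    mode: "complete" (all boundaries kept), "partial" (some kept),
          or "none" (crop lies strictly inside the document).
    """
    dx0, dy0, dx1, dy1 = doc_box
    if not (0 <= dx0 and dx1 <= img_w and 0 <= dy0 and dy1 <= img_h):
        raise ValueError("document box must lie inside the image")
    if dx1 - dx0 < 4 or dy1 - dy0 < 4:
        raise ValueError("document box is too small to crop into")

    if mode == "complete":
        cut = set()
    elif mode == "none":
        cut = set(SIDES)
    elif mode == "partial":
        # Cut at least one side and keep at least one side.
        cut = set(rng.sample(SIDES, rng.randint(1, len(SIDES) - 1)))
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    mid_x, mid_y = (dx0 + dx1) // 2, (dy0 + dy1) // 2
    # A cut edge is drawn strictly inside the document, on its own half,
    # so opposite edges can never cross. A kept edge is drawn outside it.
    x0 = rng.randint(dx0 + 1, mid_x - 1) if "left" in cut else rng.randint(0, dx0)
    y0 = rng.randint(dy0 + 1, mid_y - 1) if "top" in cut else rng.randint(0, dy0)
    x1 = rng.randint(mid_x + 1, dx1 - 1) if "right" in cut else rng.randint(dx1, img_w)
    y1 = rng.randint(mid_y + 1, dy1 - 1) if "bottom" in cut else rng.randint(dy1, img_h)
    return x0, y0, x1, y1


def classify_crop(crop_box, doc_box):
    """Report which of the three cases a crop box falls into."""
    x0, y0, x1, y1 = crop_box
    dx0, dy0, dx1, dy1 = doc_box
    kept = sum((x0 <= dx0, y0 <= dy0, x1 >= dx1, y1 >= dy1))
    if kept == 4:
        return "complete"
    return "none" if kept == 0 else "partial"
```

For example, `sample_crop(640, 640, (100, 80, 400, 500), "partial")` returns a box that keeps between one and three sides of the document box. Because the pages in Doc3D are warped, a bounding-box test is only a rough stand-in for checking the true document boundary.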

shiyutang

The training set is extended from the classic Doc3D dataset

This training set is custom-built, so we will have to construct it ourselves.

To construct the training set for unrestricted document image rectification, we randomly crop such distorted document images to meet one of the following three conditions, including (a) with complete document boundaries, (b) with partial document boundaries, and (c) without any document boundaries

GreatV

I am claiming this task; it should take about one month to complete.

GreatV

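The dataset construction has been described in more detail in the issue. If you have any questions, we can keep discussing them here.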

shiyutang

I wrote up a walkthrough of the paper; see DocTr++文档矫正.pdf

shiyutang

The model architecture has been mostly ported: https://github.com/GreatV/DocTrPP

I will write the training part when I have time.

GreatV

Hello, how is this progressing?

zhuxiaobin

@zhuxiaobin You can take a look at this PR:

https://github.com/PaddlePaddle/PaddleOCR/pull/11475

GreatV

Hi, how is this progressing?

Li-Yidong

@Li-Yidong You can take a look at this PR:

https://github.com/PaddlePaddle/PaddleOCR/pull/11475

GreatV

@Li-Yidong You can also take a look at this repository: https://github.com/GreatV/DocTrPP

GreatV

@Li-Yidong You can also take a look at this repository: https://github.com/GreatV/DocTrPP

Thanks for sharing!

Li-Yidong
