What does "use segment_box" mean here? Does XFUND have a corresponding implementation? #6262
Comments
linjieccc
commented
Jun 28, 2023
The dataset contains two kinds of bbox: character-level bboxes and segment-level bboxes (segment_box). Segment-level boxes give a large improvement in NER performance.
aixuedegege
commented
Jun 28, 2023
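Thanks for your answer. When I visualize the data, the segment_box values I draw do not line up with the segments of the corresponding text regions. They look like boxes that were hard-normalized to 1000, but I could not find any code that fully restores a segment_box to the text-block box in the original image. In the figure, the red bbox below is one segment_box, and it does not match any text block: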
aixuedegege
commented
Jun 29, 2023
@linjieccc Could you take another look at what the problem is?
linjieccc
commented
Jun 29, 2023
ernie-layout normalizes the original bbox to a 1000 x 1000 space on input. You can restore it using the width and height of the original image; see https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/utils/doc_parser.py#L272
aixuedegege
commented
Jun 29, 2023
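That is the uie-x data-processing logic from the application, and I have already seen it. Please look at lines 248-257 of this ernielayout NER processing code. My questions are:
1. In the README, use_segment_box is enabled for XFUND-ZH training. With segment_box enabled, the boxes fetched should already be normalized segment bboxes, so why is _scale_same_as_image called afterwards? Doesn't that normalize them twice?
2. What method do you use to process the XFUND segment bboxes? They are already normalized, right? I ask because they do not line up with the text, as I mentioned above. Could this processing step be open-sourced?
3. What exactly is the difference between uie-x and ernielayout? My understanding is that uie-x adds two start/end pointers on top of ernielayout. Does ernielayout question answering work on the same principle?
Thanks again for taking the time to answer.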
aixuedegege
commented
Jul 3, 2023
@linjieccc Could you help answer these questions?
linjieccc
commented
Jul 3, 2023
@aixuedegege
def _normalize_box(box, old_size, new_size):
    """Rescale box [x0, y0, x1, y1] from old_size (w, h) to new_size (w, h)."""
    return [
        int(box[0] * new_size[0] / old_size[0]),
        int(box[1] * new_size[1] / old_size[1]),
        int(box[2] * new_size[0] / old_size[0]),
        int(box[3] * new_size[1] / old_size[1]),
    ]

# Map a box from the 1000 x 1000 normalized space back to image pixels.
new_box = _normalize_box(old_box, [1000, 1000], [img_w, img_h])
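Editor's sketch of the round trip described in this reply, assuming boxes are stored as [x0, y0, x1, y1] and that the normalized space is 1000 x 1000 as stated above. The helper name `rescale_box`, the image size, and the sample box are hypothetical and are not taken from PaddleNLP; the formula is the same one used in the reply.

```python
def rescale_box(box, old_size, new_size):
    """Rescale [x0, y0, x1, y1] from old_size (w, h) to new_size (w, h)."""
    return [
        int(box[0] * new_size[0] / old_size[0]),
        int(box[1] * new_size[1] / old_size[1]),
        int(box[2] * new_size[0] / old_size[0]),
        int(box[3] * new_size[1] / old_size[1]),
    ]


# Hypothetical image size and pixel-space box, for illustration only.
img_w, img_h = 2000, 1000
pixel_box = [200, 100, 1000, 300]

# Pixels -> 1000 x 1000 normalized space.
norm_box = rescale_box(pixel_box, [img_w, img_h], [1000, 1000])

# Normalized space -> pixels, which is the direction needed for visualization.
restored_box = rescale_box(norm_box, [1000, 1000], [img_w, img_h])

print(norm_box)      # [100, 100, 500, 300]
print(restored_box)  # [200, 100, 1000, 300]
```

Because `int()` truncates, a restored box can differ from the original by a pixel or so when the image size does not divide evenly. Drawing a normalized box directly on the image without this rescaling step would put it in the wrong place, which would be consistent with the mismatch described earlier in the thread.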
aixuedegege
commented
Jul 4, 2023
Thanks a lot!
PaddleNLP/model_zoo/ernie-layout/README_ch.md
Line 249 in fda38e4