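Hello, I have a couple of questions after trying out the PDF parsing:
1. Judging by the category_id values, "category_id":1 is plain_text (body paragraph text), and the latex field in "category_id":5 holds the table text. In the parsed JSON output, however, the "category_id":1 entries have no text; only the "category_id":15 (ocr_text) entries carry a text field. Can ocr_text be understood as the plain body text, excluding table content and the like?
2. The order of the ocr_text entries in the parsed JSON differs somewhat from the original document (which is not multi-column). Is any coordinate-based sorting applied?
Thanks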
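To make the second question concrete, here is a minimal sketch of the kind of coordinate-based ordering I have in mind. It is not taken from the project: the `bbox` field (as `[x0, y0, x1, y1]`), the `sort_ocr_text` helper, and the `line_tol` tolerance are all hypothetical names chosen for illustration, and only `category_id` 15 and the `text` field come from the output described above.

```python
from typing import Dict, List

OCR_TEXT = 15  # category_id reported for ocr_text in the parsed JSON


def sort_ocr_text(items: List[Dict], line_tol: float = 5.0) -> List[str]:
    """Return ocr_text strings in top-to-bottom, left-to-right order.

    Assumes each item has a hypothetical "bbox" = [x0, y0, x1, y1].
    """
    spans = [it for it in items if it.get("category_id") == OCR_TEXT]
    # First pass: order by top edge, then by left edge.
    spans.sort(key=lambda it: (it["bbox"][1], it["bbox"][0]))

    # Group spans whose top edges are within line_tol of the line's first span.
    lines: List[List[Dict]] = []
    for it in spans:
        if lines and abs(it["bbox"][1] - lines[-1][0]["bbox"][1]) <= line_tol:
            lines[-1].append(it)
        else:
            lines.append([it])

    # Second pass: inside each line, order strictly left to right.
    ordered: List[str] = []
    for line in lines:
        line.sort(key=lambda it: it["bbox"][0])
        ordered.extend(it["text"] for it in line)
    return ordered
```

A single-column page should come out in reading order with something like this, so if the real output differs I would like to know whether a different ordering (or none) is applied.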