I have followed the instructions to install the dependencies. What should I do next? I tried running pdf_validation.py, and this is the output it gives:
###### Process: end2end_quick_match
Downloading builder script: 100%|██████████| 5.94k/5.94k [00:00<00:00, 11.2MB/s]
Downloading extra modules: 4.07kB [00:00, 6.57MB/s]
Downloading extra modules: 100%|██████████| 3.34k/3.34k [00:00<00:00, 10.3MB/s]
Downloading builder script: 100%|██████████| 7.02k/7.02k [00:00<00:00, 20.7MB/s]
[nltk_data] Downloading package wordnet to /home/jupyter/nltk_data...
[nltk_data] Downloading package punkt_tab to
[nltk_data] /home/jupyter/nltk_data...
[nltk_data] Unzipping tokenizers/punkt_tab.zip.
[nltk_data] Downloading package omw-1.4 to /home/jupyter/nltk_data...
【text_block】
Edit_dist:
------------  --------
ALL_page_avg  0.356126
------------  --------
====================================================================================================
BLEU:
---  --------
all  0.269879
---  --------
====================================================================================================
METEOR:
---  --------
all  0.110879
---  --------
====================================================================================================
----Anno Attribute---------------
Edit_dist:
--------------------------------------  ---------
text_background: multi_colored          0.360388
text_background: single_colored         0.340771
text_background: white                  0.531482
text_language: text_en_ch_mixed         0.318496
text_language: text_english             0.0934457
text_language: text_simplified_chinese  0.734819
text_rotate: horizontal                 0.75
text_rotate: normal                     0.512192
--------------------------------------  ---------
====================================================================================================
sample_count:
--------------------------------------  ---
text_background: multi_colored          16
text_background: single_colored         10
text_background: white                  208
text_language: text_en_ch_mixed         7
text_language: text_english             76
text_language: text_simplified_chinese  149
text_rotate: horizontal                 2
text_rotate: normal                     230
--------------------------------------  ---
====================================================================================================
Edit_dist:
--------------------------------  ----------
ALL                               0.356126
None                              0.58769
colorful_backgroud                0.0645602
data_source: PPT2PDF              0.00378788
data_source: academic_literature  0.123349
data_source: book                 0.430496
data_source: colorful_textbook    0.490101
data_source: exam_paper           0.0387346
data_source: magazine             0.362045
data_source: newspaper            0.562052
data_source: note                 0.581583
data_source: research_report      0.612982
fuzzy_scan                        0.490101
language: en_ch_mixed             0.337711
language: english                 0.0726445
language: simplified_chinese      0.556404
layout: 1andmore_column           0.340281
layout: double_column             0.552618
layout: other_layout              0.741007
layout: single_column             0.212567
layout: three_column              0.124104
--------------------------------  ----------
====================================================================================================
【display_formula】
Edit_dist:
------------  --------
ALL_page_avg  0.492899
------------  --------
====================================================================================================
----Anno Attribute---------------
Edit_dist:
-------------------  --------
formula_type: print  0.419252
-------------------  --------
====================================================================================================
sample_count:
-------------------  --
formula_type: print  17
-------------------  --
====================================================================================================
Edit_dist:
--------------------------------  --------
ALL                               0.492899
None                              0.435644
colorful_backgroud                0.550154
data_source: academic_literature  0.435644
data_source: exam_paper           0.550154
language: english                 0.492899
layout: single_column             0.492899
--------------------------------  --------
====================================================================================================
【table】
TEDS:
---  --------
all  0.783813
---  --------
====================================================================================================
TEDS_structure_only:
---  --------
all  0.911589
---  --------
====================================================================================================
Edit_dist:
------------  --------
ALL_page_avg  0.202719
------------  --------
====================================================================================================
----Anno Attribute---------------
Edit_dist:
----------------------------------  --------
include_background: False           0.241423
include_background: True            0.163629
include_equation: False             0.2225
include_equation: True              0.122632
include_photo: False                0.202526
language: table_en                  0.256255
language: table_simplified_chinese  0.179499
line: fewer_line                    0.174262
line: full_line                     0.244128
line: less_line                     0.177442
table_layout: horizontal            0.202526
with_span: False                    0.238261
with_span: True                     0.166791
with_structured_text: False         0.202526
----------------------------------  --------
====================================================================================================
TEDS:
----------------------------------  --------
include_background: False           0.774105
include_background: True            0.79352
include_equation: False             0.771271
include_equation: True              0.833977
include_photo: False                0.783813
language: table_en                  0.858176
language: table_simplified_chinese  0.751942
line: fewer_line                    0.762137
line: full_line                     0.819911
line: less_line                     0.747795
table_layout: horizontal            0.783813
with_span: False                    0.734281
with_span: True                     0.833344
with_structured_text: False         0.783813
----------------------------------  --------
====================================================================================================
TEDS_structure_only:
----------------------------------  --------
include_background: False           0.907024
include_background: True            0.916153
include_equation: False             0.89667
include_equation: True              0.971264
include_photo: False                0.911589
language: table_en                  0.980843
language: table_simplified_chinese  0.881908
line: fewer_line                    0.928188
line: full_line                     0.928922
line: less_line                     0.759259
table_layout: horizontal            0.911589
with_span: False                    0.920635
with_span: True                     0.902542
with_structured_text: False         0.911589
----------------------------------  --------
====================================================================================================
sample_count:
----------------------------------  --
include_background: False           5
include_background: True            5
include_equation: False             8
include_equation: True              2
include_photo: False                10
language: table_en                  3
language: table_simplified_chinese  7
line: fewer_line                    5
line: full_line                     4
line: less_line                     1
table_layout: horizontal            10
with_span: False                    5
with_span: True                     5
with_structured_text: False         10
----------------------------------  --
====================================================================================================
Edit_dist:
--------------------------------  ----------
ALL                               0.202719
colorful_backgroud                0.0837313
data_source: academic_literature  0.217742
data_source: book                 0.177442
data_source: colorful_textbook    0.00746269
data_source: exam_paper           0.16
data_source: magazine             0.0275229
data_source: note                 0.404524
data_source: research_report      0.212628
fuzzy_scan                        0.00746269
language: en_ch_mixed             0.543561
language: english                 0.128402
language: simplified_chinese      0.179141
layout: 1andmore_column           0.167004
layout: double_column             0.217742
layout: other_layout              0.0275229
layout: single_column             0.24904
--------------------------------  ----------
====================================================================================================
TEDS:
--------------------------------  --------
ALL                               0.800119
colorful_backgroud                0.940929
data_source: academic_literature  0.702065
data_source: book                 0.747795
data_source: colorful_textbook    0.999505
data_source: exam_paper           0.882353
data_source: magazine             0.965889
data_source: note                 0.698893
data_source: research_report      0.752838
fuzzy_scan                        0.999505
language: en_ch_mixed             0.872959
language: english                 0.861308
language: simplified_chinese      0.748837
layout: 1andmore_column           0.759705
layout: double_column             0.702065
layout: other_layout              0.965889
layout: single_column             0.802741
--------------------------------  --------
====================================================================================================
TEDS_structure_only:
--------------------------------  --------
ALL                               0.914552
colorful_backgroud                0.941176
data_source: academic_literature  0.942529
data_source: book                 0.759259
data_source: colorful_textbook    1
data_source: exam_paper           0.882353
data_source: magazine             1
data_source: note                 0.916667
data_source: research_report      0.906746
fuzzy_scan                        1
language: en_ch_mixed             1
language: english                 0.941627
language: simplified_chinese      0.881217
layout: 1andmore_column           0.883637
layout: double_column             0.942529
layout: other_layout              1
layout: single_column             0.904233
--------------------------------  --------
====================================================================================================
【reading_order】
Edit_dist:
------------  --------
ALL_page_avg  0.216996
------------  --------
====================================================================================================
----Anno Attribute---------------
sample_count:
====================================================================================================
Edit_dist:
--------------------------------  ---------
ALL                               0.216996
None                              0.455182
colorful_backgroud                0.025641
data_source: PPT2PDF              0
data_source: academic_literature  0.209559
data_source: book                 0.25
data_source: colorful_textbook    0.285714
data_source: exam_paper           0.0769231
data_source: magazine             0.1
data_source: newspaper            0.5
data_source: note                 0.0384615
data_source: research_report      0.492308
fuzzy_scan                        0.285714
language: en_ch_mixed             0.0769231
language: english                 0.081852
language: simplified_chinese      0.325604
layout: 1andmore_column           0.376923
layout: double_column             0.299107
layout: other_layout              0.6
layout: single_column             0.0839618
layout: three_column              0
--------------------------------  ---------
====================================================================================================
(py38) jupyter@instance-20250308-023555:~/OmniDocBench$
Use this Jupyter notebook to extract the results and organize them into a tabular format.
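The notebook itself is not reproduced in this thread. As a rough stand-in, here is a minimal sketch of how the console output could be turned into rows, assuming the log has been captured as text with one table row per line, as the script prints it. `SAMPLE_LOG`, `parse_log`, and the regular expressions are hypothetical names made up for this illustration; this is not the maintainers' notebook and not part of OmniDocBench.

```python
import re

# Hypothetical excerpt of the console output, one table row per line.
SAMPLE_LOG = """\
【text_block】
Edit_dist:
------------  --------
ALL_page_avg  0.356126
------------  --------
====================
BLEU:
---  --------
all  0.269879
---  --------
====================
【table】
TEDS:
---  --------
all  0.783813
---  --------
"""

SECTION_RE = re.compile(r"^【(.+?)】$")              # e.g. 【text_block】
METRIC_RE = re.compile(r"^([A-Za-z_]+):$")           # e.g. Edit_dist:
ROW_RE = re.compile(r"^(.+?)\s+(-?\d+(?:\.\d+)?)$")  # label followed by a number


def parse_log(text):
    """Return (section, metric, label, value) tuples found in the log text."""
    rows = []
    section = metric = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and rule lines made only of '-', '=' and spaces.
        if not line or set(line) <= set("-= "):
            continue
        m = SECTION_RE.match(line)
        if m:
            section, metric = m.group(1), None
            continue
        m = METRIC_RE.match(line)
        if m:
            metric = m.group(1)
            continue
        m = ROW_RE.match(line)
        # Rows before the first section/metric header (download logs) are ignored.
        if m and section and metric:
            rows.append((section, metric, m.group(1), float(m.group(2))))
    return rows


rows = parse_log(SAMPLE_LOG)
for section, metric, label, value in rows:
    print(f"{section:<12} {metric:<10} {label:<14} {value:.6f}")
```

Each tuple can then be written to CSV or loaded into a DataFrame. Note that the per-attribute and per-page tables reuse the same metric names (for example `Edit_dist`), so this sketch tells them apart only by their row labels.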