Merge branch 'main' of https://github.com/allenai/olmocr into main
jakep-allenai committed Feb 24, 2025
2 parents 813a355 + e8387ec commit d841216
Showing 1 changed file with 18 additions and 15 deletions.
33 changes: 18 additions & 15 deletions README.md
@@ -46,13 +46,6 @@ pip install sgl-kernel==0.0.3.post1 --force-reinstall --no-deps
pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
```

**BETA TESTER NOTE:**

If you are a beta tester, you will need to log in using the Hugging Face CLI
to make sure you have access to https://huggingface.co/allenai/olmOCR-7B-0225-preview

`huggingface-cli login`

### Local Usage Example

For quick testing, try the [web demo](https://olmocr.allen.ai/). To run locally, a GPU is required, as inference is powered by [sglang](https://github.com/sgl-project/sglang) under the hood.
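
Because local inference needs a GPU, it can help to confirm one is visible before installing anything heavy. The following is a minimal sketch, not part of olmOCR itself: the helper name `list_nvidia_gpus` is hypothetical, and it assumes an NVIDIA driver installation that puts `nvidia-smi` on the `PATH`. An empty result only means no GPU was detected this way.

```python
import shutil
import subprocess


def list_nvidia_gpus() -> list[str]:
    """Return GPU names reported by nvidia-smi, or [] if none can be detected.

    Hypothetical helper for illustration; olmOCR does not ship this function.
    """
    # Assumption: the NVIDIA driver install placed `nvidia-smi` on PATH.
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        out = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        # The tool exists but failed (no driver loaded, timeout, etc.).
        return []
    # One GPU name per non-empty output line.
    return [line.strip() for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_nvidia_gpus()
    print("GPUs found:", ", ".join(gpus) if gpus else "none detected")
```

If the list comes back empty, the web demo linked above may be the easier way to try the model.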
@@ -166,11 +159,21 @@ options:
```
#### TODOs for future versions
- Ask model to predict footnotes in a structured format separately
- Add training data for complex tables
- More training augmentations to improve performance
- Fix pages which are all-references sometimes rendering as empty-text
- Automated benchmarking
- More efficient inference with 8-bit KV cache
## Team
<!-- start team -->
**olmOCR** is developed and maintained by the AllenNLP team, backed by [the Allen Institute for Artificial Intelligence (AI2)](https://allenai.org/).
AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.
To learn more about who specifically contributed to this codebase, see [our contributors](https://github.com/allenai/olmocr/graphs/contributors) page.
<!-- end team -->
## License
<!-- start license -->
**olmOCR** is licensed under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
A full copy of the license can be found [on GitHub](https://github.com/allenai/olmocr/blob/main/LICENSE).
<!-- end license -->
