
pdfOCR 1.0.0

@Snipx released this 26 Jun 10:42
· 135 commits to develop since this release
1.0.0

We are proud to announce the first release of pdfOCR, the newest addition to our iText 7 Suite. It lets you OCR your images into fully ISO-compliant PDF or PDF/A-3u files, so that the text they contain can be accessed and processed.

Given that we rely on the open-source Tesseract 4.x project to do the heavy lifting, we couldn't in good conscience keep this add-on closed, so it is open source as well.

You may also notice that we have split the project into two modules: an API module and an implementation module for Tesseract. In essence, this means that you can hook up other OCR engines to iText, and it leaves the door open for us to offer our users more engines to choose from.
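
To illustrate the idea behind the API/implementation split, here is a minimal Java sketch. Every name in it (`TextRecognizer`, `FakeEngineRecognizer`, `recognizeAll`) is hypothetical and chosen only for illustration. It is not the pdfOCR API and it does not call Tesseract. It only shows how code written against an interface can accept any engine implementation.

```java
import java.util.List;

// Hypothetical names throughout: this is NOT the pdfOCR API, only an
// illustration of an API module kept separate from an engine implementation.
public class OcrModuleSketch {

    /** "API module": the contract every OCR engine would have to fulfil. */
    interface TextRecognizer {
        String recognize(String imagePath);
    }

    /** "Implementation module": a stand-in for an engine-backed implementation. */
    static final class FakeEngineRecognizer implements TextRecognizer {
        @Override
        public String recognize(String imagePath) {
            // A real implementation would invoke the OCR engine here.
            return "text from " + imagePath;
        }
    }

    /** Depends only on the interface, so any engine can be plugged in. */
    static String recognizeAll(TextRecognizer recognizer, List<String> imagePaths) {
        StringBuilder out = new StringBuilder();
        for (String path : imagePaths) {
            out.append(recognizer.recognize(path)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        TextRecognizer recognizer = new FakeEngineRecognizer();
        System.out.print(recognizeAll(recognizer, List.of("scan1.png", "scan2.png")));
    }
}
```

Swapping in another engine would, under this assumed design, only require a second class implementing the same interface. The calling code stays unchanged.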