Commit

Update some links
jeroen committed Jan 10, 2022
1 parent b803f45 commit 0a607be
Showing 4 changed files with 15 additions and 8 deletions.
4 changes: 2 additions & 2 deletions R/tesseract.R
@@ -3,7 +3,7 @@
#' Create an OCR engine for a given language and control parameters. This can be used by
#' the [ocr] and [ocr_data] functions to recognize text.
#'
-#' Tesseract [control parameters](https://github.com/tesseract-ocr/tesseract/wiki/ControlParams)
+#' Tesseract [control parameters](https://tesseract-ocr.github.io/tessdoc/ControlParams)
#' can be set either via a named list in the
#' `options` parameter, or in a `config` file text file which contains the parameter name
#' followed by a space and then the value, one per line. Use [tesseract_params()] to list
@@ -21,7 +21,7 @@
#' tesseract config files that live in the tessdata directory. See details.
#' @param options a named list with tesseract parameters. See details.
#' @param cache speed things up by caching engines
-#' @references [tesseract wiki: control parameters](https://github.com/tesseract-ocr/tesseract/wiki/ControlParams)
+#' @references [tesseract wiki: control parameters](https://tesseract-ocr.github.io/tessdoc/ControlParams)
tesseract <- local({
store <- new.env()
function(language = "eng", datapath = NULL, configs = NULL, options = NULL, cache = TRUE){
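
The roxygen text in this hunk describes the config file format as a parameter name, a space, then the value, one per line. As a rough illustration only, the sketch below writes a named list in that format using base R. The helper `write_config()` is hypothetical and not part of the tesseract package; `tessedit_char_whitelist` and `preserve_interword_spaces` are assumed here as example Tesseract parameter names.

```r
# Hypothetical helper (not part of the tesseract package): write a named list
# of control parameters as "name value" lines, one parameter per line.
write_config <- function(params, path = tempfile(fileext = ".txt")) {
  stopifnot(is.list(params), !is.null(names(params)), all(nzchar(names(params))))
  # Coerce each value to a single string so every parameter fits on one line
  values <- vapply(params, function(x) as.character(x)[1], character(1))
  writeLines(paste(names(params), values), path)
  invisible(path)
}

# Example parameters (assumed names, chosen for illustration)
params <- list(
  tessedit_char_whitelist = "0123456789",
  preserve_interword_spaces = "1"
)

cfg <- write_config(params)
readLines(cfg)
```

Passing the same list as `options`, or the resulting path as `configs`, to `tesseract()` should then configure an engine in the two ways the documentation describes, though that step is not exercised here.
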
13 changes: 10 additions & 3 deletions README.md
@@ -1,11 +1,18 @@
# tesseract

-> Extract text from an image. Requires that you have training data for the language you are reading. Works best for images with high contrast, little noise and horizontal text.
+> Bindings to [Tesseract-OCR](https://opensource.google/projects/tesseract):
+a powerful optical character recognition (OCR) engine that supports over 100 languages.
+The engine is highly configurable in order to tune the detection algorithms and
+obtain the best possible results.

[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/tesseract)](https://cran.r-project.org/package=tesseract)
[![CRAN RStudio mirror downloads](http://cranlogs.r-pkg.org/badges/tesseract)](https://cran.r-project.org/package=tesseract)

+- Upstream Tesseract-OCR documentation: https://tesseract-ocr.github.io/tessdoc/
+- Introduction: https://docs.ropensci.org/tesseract/articles/intro.html
+- Reference: https://docs.ropensci.org/tesseract/reference/ocr.html

## Hello World

Simple example
@@ -55,10 +62,10 @@ Installation from source on Linux or OSX requires the `Tesseract` library (see b
sudo apt-get install -y libtesseract-dev libleptonica-dev tesseract-ocr-eng
```

-On __Ubuntu Xenial__ and __Ubuntu Bionic__ you can use this PPA to get the latest version of Tesseract:
+On __Ubuntu__ you can optionally use [this PPA](https://launchpad.net/~alex-p/+archive/ubuntu/tesseract-ocr-devel) to get the latest version of Tesseract:

```
-sudo add-apt-repository ppa:cran/tesseract
+sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
sudo apt-get install -y libtesseract-dev tesseract-ocr-eng
```

4 changes: 2 additions & 2 deletions man/tesseract.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion vignettes/intro.Rmd
@@ -122,7 +122,7 @@ cat(text)

## Tesseract Control Parameters

-Tesseract supports hundreds of [control parameters](https://github.com/tesseract-ocr/tesseract/wiki/ControlParams) which alter the OCR engine. Use `tesseract_params()` to list all parameters with their default value and a brief description. It also has a handy `filter` argument to quickly find parameters that match a particular string.
+Tesseract supports hundreds of [control parameters](https://tesseract-ocr.github.io/tessdoc/ControlParams) which alter the OCR engine. Use `tesseract_params()` to list all parameters with their default value and a brief description. It also has a handy `filter` argument to quickly find parameters that match a particular string.

```{r}
# List all parameters with *colour* in name or description