tesseract
OCR (Optical Character Recognition) engine. More information: <https://github.com/tesseract-ocr/tesseract>.
Install
- All systems
-
curl cmd.cat/tesseract.sh
- Debian
-
apt-get install tesseract-ocr
- Ubuntu
-
apt-get install tesseract-ocr
- Arch Linux
-
pacman -S tesseract
- Kali Linux
-
apt-get install tesseract-ocr
- Fedora
-
dnf install tesseract
- Windows (WSL2)
-
sudo apt-get update
sudo apt-get install tesseract-ocr
- macOS
-
brew install tesseract
- Raspbian
-
apt-get install tesseract-ocr
- Dockerfile
- dockerfile.run/tesseract
-
Recognize text in an image and save it to `output.txt` (the `.txt` extension is added automatically):
tesseract image.png output
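To run the same command over several images, a loop can derive each output base from the image name. This is a minimal sketch, assuming every file name carries an extension; `output_base` and `ocr_all` are hypothetical helper names, not part of tesseract:

```shell
# Hypothetical helper: drop the last extension from an image path,
# e.g. scans/page-01.png becomes scans/page-01.
output_base() {
  printf '%s\n' "${1%.*}"
}

# Hypothetical helper: run tesseract once per image given as an argument.
# tesseract appends ".txt" to the output base itself.
ocr_all() {
  for img in "$@"; do
    [ -f "$img" ] || continue   # skip anything that is not a regular file
    tesseract "$img" "$(output_base "$img")"
  done
}
```

Calling `ocr_all scans/*.png` would then write one `.txt` file next to each image.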
-
Specify a custom language (default is English) with an ISO 639-2 code (e.g. deu = Deutsch = German):
tesseract -l deu image.png output
-
List the ISO 639-2 codes of available languages:
tesseract --list-langs
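The listing can also be used to check that a language is installed before running OCR with it. This is a rough sketch, assuming the listing starts with one header line followed by one language code per line; `lang_in_list` is a hypothetical helper name:

```shell
# Hypothetical helper: read a language listing on stdin and succeed
# if the code given as $1 appears as a whole line after the header.
lang_in_list() {
  tail -n +2 | grep -qxF -- "$1"
}
```

For example, `tesseract --list-langs 2>&1 | lang_in_list deu && tesseract -l deu image.png output` runs the OCR step only when `deu` is listed. The `2>&1` is there because some older releases are assumed to print the listing on stderr.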
-
Specify a custom page segmentation mode (default is 3; older 3.x releases spell the option `-psm`):
tesseract --psm 0_to_13 image.png output
-
List page segmentation modes and their descriptions:
tesseract --help-psm
© tl;dr; authors and contributors