tesseract

OCR (Optical Character Recognition) engine. More information: <https://github.com/tesseract-ocr/tesseract>.

Install

All systems
curl cmd.cat/tesseract.sh
Debian
apt-get install tesseract-ocr
Ubuntu
apt-get install tesseract-ocr
Arch Linux
pacman -S tesseract
Kali Linux
apt-get install tesseract-ocr
Fedora
dnf install tesseract
Windows (WSL2)
sudo apt-get update
sudo apt-get install tesseract-ocr
OS X
brew install tesseract
Raspbian
apt-get install tesseract-ocr

  • Recognize text in an image and save it to `output.txt` (the `.txt` extension is added automatically):
    tesseract image.png output
  • Specify a custom language (default is English) with an ISO 639-2 code (e.g. deu = Deutsch = German):
    tesseract -l deu image.png output
  • List the ISO 639-2 codes of available languages:
    tesseract --list-langs
  • Specify a custom page segmentation mode (default is 3; Tesseract 4 and later use `--psm`, 3.x used `-psm`):
    tesseract --psm 0_to_13 image.png output
  • List page segmentation modes and their descriptions:
    tesseract --help-psm
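
The commands above can be combined to OCR a whole directory. The following is a minimal sketch, not part of tesseract itself: the function names `ocr_base` and `ocr_dir` and the variables `LANG_CODE` and `PSM` are hypothetical, and it assumes Tesseract 4 or later (for `--psm`) and PNG input files.

```shell
#!/bin/sh

# Hypothetical helper: derive the output base name for an image path.
# tesseract appends ".txt" itself, so only the last extension is stripped.
ocr_base() {
    printf '%s\n' "${1%.*}"
}

# Hypothetical helper: OCR every PNG in a directory (default: current directory).
# LANG_CODE and PSM are illustrative shell variables, not tesseract settings.
ocr_dir() {
    dir=${1:-.}
    for img in "$dir"/*.png; do
        # Skip the unexpanded pattern when the directory holds no PNG files.
        [ -e "$img" ] || continue
        tesseract "$img" "$(ocr_base "$img")" -l "${LANG_CODE:-eng}" --psm "${PSM:-3}"
    done
}
```

For example, `LANG_CODE=deu ocr_dir scans` would write `scans/page1.txt` next to `scans/page1.png`, provided the `deu` language data is installed.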

© tl;dr; authors and contributors