hocr2djvused
hOCR to djvused script converter, shipped with ocrodjvu (a tool to perform OCR on DjVu documents)
Install
- All systems
curl cmd.cat/hocr2djvused.sh
- Debian
apt-get install ocrodjvu
- Ubuntu
apt-get install ocrodjvu
- Kali Linux
apt-get install ocrodjvu
- Windows (WSL2)
sudo apt-get update
sudo apt-get install ocrodjvu
- Raspbian
apt-get install ocrodjvu
- Dockerfile
- dockerfile.run/hocr2djvused
ocrodjvu
tool to perform OCR on DjVu documents
Ocrodjvu is a wrapper around the Optical Character Recognition (OCR) systems Cuneiform, Gocr, Ocrad, OCRopus and (standalone) Tesseract. It is designed for OCR on documents in DjVu, a format especially suited to high-quality archiving of books. After processing, the DjVu document contains an embedded text layer. Other programs can then read the document, search it for specific terms, print it, or use the OCR layer to improve the document's accessibility.
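As a rough illustration of the workflow described above, the following sketch wraps a single ocrodjvu call that writes the OCR result to a new file and leaves the input untouched. The helper name ocr_djvu, the choice of the tesseract engine and the eng language code are assumptions made for the example, not defaults stated on this page; check ocrodjvu --list-engines and ocrodjvu --list-languages on your system before relying on them.

```shell
#!/bin/sh
# Minimal sketch: OCR a DjVu file into a new bundled DjVu file.
# ocr_djvu is a hypothetical helper name; engine and language are assumed values.
ocr_djvu() {
    input=$1
    output=$2

    # Stop early with a conventional "command not found" status if the tool is missing.
    if ! command -v ocrodjvu >/dev/null 2>&1; then
        echo "ocrodjvu is not installed" >&2
        return 127
    fi

    # -e selects the OCR engine, -l the recognition language,
    # -o saves the result as a bundled DjVu file instead of modifying the input.
    ocrodjvu -e tesseract -l eng -o "$output" "$input"
}
```

A call might look like ocr_djvu book.djvu book-ocr.djvu, where both file names are placeholders. Writing to a separate output file keeps the original scan intact if the recognition result turns out to be poor.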