dumppdf

PDF parser and analyser (Python3)

Install

All systems
curl cmd.cat/dumppdf.sh
Debian
apt-get install python3-pdfminer
Ubuntu
apt-get install python3-pdfminer
Kali Linux
apt-get install python3-pdfminer
Fedora
dnf install python3-pdfminer
Windows (WSL2)
sudo apt-get update
sudo apt-get install python3-pdfminer
Raspbian
apt-get install python-pdfminer

python3-pdfminer

PDF parser and analyser (Python3)

PDFMiner is a tool for extracting information from PDF documents. It focuses entirely on getting and analyzing text data. It can report the exact location of text portions on a page, as well as other information such as fonts or lines. It includes a PDF converter that can transform PDF files into other text formats (such as HTML), and an extensible PDF parser that can be used for purposes other than text analysis. This package provides the Python3 module and the command-line tools pdf2txt, dumppdf and latin2ascii.
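
As a rough illustration of how the dumppdf tool might be driven from a script, the sketch below builds a command line and runs it only if the tool is found on PATH. The helper names build_dumppdf_cmd and run_dumppdf are hypothetical and not part of PDFMiner. The sketch assumes the executable is installed as "dumppdf" (as the package description above suggests) and that the -a (dump all objects) and -o (output file) options are available in the installed version; check dumppdf's own help output before relying on them.

```python
import shutil
import subprocess


def build_dumppdf_cmd(pdf_path, all_objects=True, outfile=None):
    """Build an argv list for dumppdf (hypothetical helper, not part of PDFMiner)."""
    cmd = ["dumppdf"]
    if all_objects:
        # Assumption: -a asks dumppdf to dump every object in the file.
        cmd.append("-a")
    if outfile is not None:
        # Assumption: -o writes the dump to a file instead of stdout.
        cmd += ["-o", outfile]
    cmd.append(pdf_path)
    return cmd


def run_dumppdf(pdf_path, outfile=None):
    """Run dumppdf and return whatever it printed to stdout."""
    if shutil.which("dumppdf") is None:
        raise FileNotFoundError(
            "dumppdf not found on PATH; install python3-pdfminer first"
        )
    result = subprocess.run(
        build_dumppdf_cmd(pdf_path, outfile=outfile),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling run_dumppdf("example.pdf") would return the dump as a string, where example.pdf is a placeholder file name. Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.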

python-pdfminer

PDF parser and analyser

PDFMiner is a tool for extracting information from PDF documents. It focuses entirely on getting and analyzing text data. It can report the exact location of text portions on a page, as well as other information such as fonts or lines. It includes a PDF converter that can transform PDF files into other text formats (such as HTML), and an extensible PDF parser that can be used for purposes other than text analysis. This package provides the Python module and the command-line tools pdf2txt and dumppdf.