pdftohtml

Convert PDF files into HTML, XML and PNG images. More information: <https://manned.org/pdftohtml>.

Install

All systems
curl cmd.cat/pdftohtml.sh
Debian
apt-get install poppler-utils
Ubuntu
apt-get install poppler-utils
Alpine
apk add poppler-utils
Arch Linux
pacman -S poppler
Kali Linux
apt-get install poppler-utils
CentOS
yum install poppler-utils
Fedora
dnf install poppler-utils
Windows (WSL2)
sudo apt-get update && sudo apt-get install poppler-utils
macOS
brew install pdftohtml
Raspbian
apt-get install poppler-utils
Docker
docker run cmd.cat/pdftohtml pdftohtml

  • Convert a PDF file to an HTML file:
    pdftohtml path/to/file.pdf path/to/output_file.html
  • Ignore images in the PDF file:
    pdftohtml -i path/to/file.pdf path/to/output_file.html
  • Generate a single HTML file that includes all PDF pages:
    pdftohtml -s path/to/file.pdf path/to/output_file.html
  • Convert a PDF file to an XML file:
    pdftohtml -xml path/to/file.pdf path/to/output_file.xml
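
The examples above can be combined into a small batch script. This is a minimal sketch, not part of pdftohtml itself: the helper names `out_name` and `convert_dir` are hypothetical, and it assumes every input file ends in `.pdf` and that each output should sit next to its source. The exact names of the files pdftohtml writes can vary with the options used, so check the output directory after a first run.

```shell
#!/bin/sh
# Hypothetical helper: derive an output name by swapping the .pdf extension.
# $1 = input path, $2 = new extension (for example html or xml)
out_name() {
    printf '%s.%s\n' "${1%.pdf}" "$2"
}

# Hypothetical helper: convert every PDF in a directory (default: the current one),
# ignoring images (-i) and requesting a single HTML document per PDF (-s).
convert_dir() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # If the glob matched nothing, the literal pattern is left over: skip it.
        [ -e "$pdf" ] || continue
        pdftohtml -i -s "$pdf" "$(out_name "$pdf" html)"
    done
}
```

Usage: source the script, then run `convert_dir path/to/directory`. Swapping `-i -s` for `-xml` and passing `xml` to `out_name` would produce XML output instead.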

© tl;dr; authors and contributors