html2xhtml

Parse HTML reliably

Install

All systems
curl cmd.cat/html2xhtml.sh
Debian
apt-get install libhtml-html5-parser-perl
Ubuntu
apt-get install libhtml-html5-parser-perl
Fedora
dnf install perl-HTML-HTML5-Parser
Windows (WSL2)
sudo apt-get update
sudo apt-get install libhtml-html5-parser-perl
Raspbian
apt-get install libhtml-html5-parser-perl

libhtml-html5-parser-perl

Parse HTML reliably

HTML::HTML5::Parser is an HTML parser, similar to the non-CPAN module Whatpm::HTML, with some changes, including:
* Provides an XML::LibXML-like DOM interface. If you usually use XML::LibXML's DOM parser, this should be a drop-in solution for tag soup HTML.
* Constructs an XML::LibXML::Document as the result of parsing.
* Via bundling and modifications, removes external dependencies on non-CPAN packages.
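
To illustrate the "drop-in" claim, here is a minimal sketch of parsing tag soup and walking the result with ordinary XML::LibXML DOM calls. The sample markup and the variable names are assumptions made up for this example; it also assumes the module is installed through one of the packages listed above.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;

# Hypothetical tag-soup input: no doctype, unclosed <p> and <b> elements.
my $html = '<title>Demo</title><p>Unclosed <b>bold<p>Second paragraph';

my $parser = HTML::HTML5::Parser->new;

# parse_string returns an XML::LibXML::Document, so the usual
# XML::LibXML DOM methods can be used on the result.
my $doc = $parser->parse_string($html);

# Collect the paragraphs with a standard DOM call and print their text.
my @paragraphs = $doc->getElementsByTagName('p');
print $_->textContent, "\n" for @paragraphs;

# Serialize the repaired tree as well-formed XML.
print $doc->toString;
```

The parser also offers parse_file and parse_fh for reading from a file name or an open file handle instead of a string.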

perl-HTML-HTML5-Parser

Parse HTML reliably