html2xhtml
parse HTML reliably
Install
- All systems
  curl cmd.cat/html2xhtml.sh
- Debian
  apt-get install libhtml-html5-parser-perl
- Ubuntu
  apt-get install libhtml-html5-parser-perl
- Fedora
  dnf install perl-HTML-HTML5-Parser
- Windows (WSL2)
  sudo apt-get update
  sudo apt-get install libhtml-html5-parser-perl
- Raspbian
  apt-get install libhtml-html5-parser-perl
- Dockerfile
  dockerfile.run/html2xhtml
libhtml-html5-parser-perl
HTML::HTML5::Parser is an HTML parser similar to the non-CPAN module Whatpm::HTML, with some changes, including:
- It provides an XML::LibXML-like DOM interface. If you usually use XML::LibXML's DOM parser, this should be a drop-in solution for tag soup HTML.
- It constructs an XML::LibXML::Document as the result of parsing.
- Through bundling and modifications, it removes external dependencies on non-CPAN packages.
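
As a rough illustration of the DOM interface described above, the following minimal sketch parses a tag soup fragment and queries the result with ordinary XML::LibXML calls. The sample markup and the variable names are made up for this example, and it assumes the module is already installed through one of the packages listed under Install.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;

# Hypothetical tag soup input: unclosed <p> elements and no explicit
# <html>, <head>, or <body> tags.
my $html = '<title>Demo</title><p>First<p>Second';

my $parser = HTML::HTML5::Parser->new;

# parse_string returns an XML::LibXML::Document.
my $doc = $parser->parse_string($html);

# Regular XML::LibXML DOM methods work on the parsed document.
# getElementsByLocalName matches on the local name only, so the
# lookup does not depend on the namespace the parser assigns.
my @paragraphs = $doc->getElementsByLocalName('p');

print scalar(@paragraphs), " paragraph(s) found\n";
print $_->textContent, "\n" for @paragraphs;

# Serialize the repaired tree as XML.
print $doc->toString, "\n";
```

Because the result is a regular XML::LibXML::Document, code already written against XML::LibXML (XPath queries, serialization, node manipulation) should in principle work on it without changes.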