All of these answers are now wrong, because as of PHP 5.4 and Libxml 2.6, loadHTML has an $options parameter that tells Libxml how it should parse the content.
Therefore, if we load the HTML with these options:
$html->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
when calling saveHTML() there will be no doctype, no <html>, and no <body>.
LIBXML_HTML_NOIMPLIED turns off the automatic adding of implied html/body elements.
LIBXML_HTML_NODEFDTD prevents a default doctype from being added when one is not found.
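To see the effect of the two flags together, here is a minimal sketch. The $content string and the $doc / $fragment variable names are made up for illustration, and the input deliberately uses a single wrapping element to keep the case simple.

```php
<?php
// Hypothetical fragment used only for illustration.
$content = '<div><p>Hello</p></div>';

$doc = new DOMDocument();

// NOIMPLIED: do not wrap the input in implied <html>/<body> elements.
// NODEFDTD: do not add a default doctype when the input has none.
$doc->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// saveHTML() appends a trailing newline, so trim it before use.
$fragment = trim($doc->saveHTML());

echo $fragment, PHP_EOL;
```

The serialized string should contain only the markup that was passed in, without a doctype or the html/body wrappers.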
Full documentation of the Libxml parameters is in the PHP manual, under the predefined constants for libxml.
(Note that the loadHTML docs say that Libxml 2.6 is needed, but LIBXML_HTML_NODEFDTD is only available in Libxml 2.7.8 and LIBXML_HTML_NOIMPLIED is only available in Libxml 2.7.7.)
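Because the two flags depend on the Libxml version, code that has to run on older builds may want to check for them first. The sketch below assumes the constants are simply undefined when the linked Libxml is too old; htmlFragmentOptions() is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: builds the options mask from whichever of the
// two flags exist in this PHP/Libxml build (assumption: a flag that is
// unsupported by the linked Libxml is not defined as a constant).
function htmlFragmentOptions()
{
    $options = 0;
    foreach (['LIBXML_HTML_NOIMPLIED', 'LIBXML_HTML_NODEFDTD'] as $name) {
        if (defined($name)) {
            $options |= constant($name);
        }
    }
    return $options;
}

$doc = new DOMDocument();
$doc->loadHTML('<p>Hello</p>', htmlFragmentOptions());

// LIBXML_DOTTED_VERSION reports the Libxml version PHP was built against.
echo 'libxml ', LIBXML_DOTTED_VERSION, ': ', trim($doc->saveHTML()), PHP_EOL;
```

On a build where neither flag exists, the mask is 0 and loadHTML behaves as it did before the $options parameter was added, so the doctype and wrappers would still have to be stripped some other way.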