DOMDocument::loadHTML error

Header, Nav and Section are elements from HTML5. Because the HTML5 authors felt it was too difficult to remember Public and System Identifiers, the DocType declaration is just: <!DOCTYPE html>. In other words, there is no DTD to check, so DOM falls back to the HTML4 Transitional DTD, which doesn’t contain those elements, hence the …
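As a rough illustration of the point above, the following sketch loads a small HTML5 document and collects whatever libxml reports instead of letting it surface as PHP warnings. The sample markup and variable names are made up for the example, and whether any messages are reported at all depends on the installed libxml version. The HTML5 elements still end up in the DOM tree either way.

```php
<?php
// Hypothetical sample markup using HTML5-only elements.
$html = '<!DOCTYPE html><html><body>'
      . '<header><nav>Menu</nav></header><section>Text</section>'
      . '</body></html>';

// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Copy the collected messages, then clear the internal log.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    // Each entry is a LibXMLError object; the message text varies by libxml version.
    echo trim($error->message), "\n";
}

// The unknown tags are still parsed into ordinary element nodes.
$headerCount = $dom->getElementsByTagName('header')->length;
$navCount = $dom->getElementsByTagName('nav')->length;
echo "header: $headerCount, nav: $navCount\n";
```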

loadHTML LIBXML_HTML_NOIMPLIED on an html fragment generates incorrect tags

The rearrangement is done by the LIBXML_HTML_NOIMPLIED option you’re using. It looks like it’s not stable enough for your case. You might also want to avoid it for portability reasons: for example, I’ve got one PHP 5.4.36 with Libxml 2.7.8 at hand that does not support LIBXML_HTML_NOIMPLIED (Libxml >= 2.7.7) but later LIBXML_HTML_NODEFDTD (Libxml >= …
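One commonly used way to avoid the option altogether is to let the parser add its implied html and body elements and then serialize only the nodes you care about. The sketch below is not the approach from the excerpt above, and the helper name roundTripFragment and the wrapper div are hypothetical choices for this example.

```php
<?php
/**
 * Parse an HTML fragment without LIBXML_HTML_NOIMPLIED and return it
 * re-serialized, without the html/body elements the parser adds.
 * (Hypothetical helper; assumes ASCII input to sidestep encoding issues.)
 */
function roundTripFragment(string $fragment): string
{
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // Wrap the fragment so it has exactly one known parent element.
    $dom->loadHTML('<div>' . $fragment . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The wrapper is the first div in document order.
    $wrapper = $dom->getElementsByTagName('div')->item(0);

    $out = '';
    foreach ($wrapper->childNodes as $child) {
        // saveHTML() with a node argument serializes just that subtree.
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$result = roundTripFragment('<p>Hello</p><p>World</p>');
echo $result, "\n";
```

The wrapper costs one extra element during parsing, but it keeps sibling nodes at the top level of the fragment from being rearranged under the first one.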

Debug a DOMDocument Object in PHP

This answer is probably a little late, but I liked your question! PHP has nothing built in that directly solves your problem, so there is no XML dump or anything similar. However, PHP has the RecursiveTreeIterator, which comes pretty close to your output: \-<html> \-<body> \-<p> \-Hello World (it will look better if your X(HT)ML structure looks …
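To make the RecursiveTreeIterator idea concrete, here is a minimal sketch assuming PHP 8.0 or later. The class DomNodeTreeIterator is a hypothetical adapter written for this example, not something PHP ships: RecursiveTreeIterator needs a RecursiveIterator whose current() can be cast to a string, so the adapter returns a label for each node.

```php
<?php
// Hypothetical adapter: exposes a DOMNodeList as a RecursiveIterator.
final class DomNodeTreeIterator implements RecursiveIterator
{
    private int $pos = 0;

    public function __construct(private DOMNodeList $nodes) {}

    // Return a printable label: "<tag>" for elements, trimmed text otherwise.
    public function current(): mixed
    {
        $node = $this->nodes->item($this->pos);
        return $node instanceof DOMElement
            ? '<' . $node->nodeName . '>'
            : trim((string) $node->nodeValue);
    }

    public function key(): mixed { return $this->pos; }
    public function next(): void { $this->pos++; }
    public function rewind(): void { $this->pos = 0; }
    public function valid(): bool { return $this->pos < $this->nodes->length; }

    public function hasChildren(): bool
    {
        return $this->nodes->item($this->pos)->hasChildNodes();
    }

    public function getChildren(): ?RecursiveIterator
    {
        return new self($this->nodes->item($this->pos)->childNodes);
    }
}

$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Hello World</p></body></html>');

// Start below the root element; RecursiveTreeIterator adds the tree prefixes.
$tree = new RecursiveTreeIterator(
    new DomNodeTreeIterator($dom->documentElement->childNodes)
);

$lines = [];
foreach ($tree as $line) {
    $lines[] = $line;
}
echo implode("\n", $lines), "\n";
```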

How to remove an HTML element using the DOMDocument class

In addition to Dave Morgan’s answer, you can use DOMNode::removeChild to remove a child from the list of children:

Removing a child by tag name

    // The following example will delete the table element of an HTML content.
    $dom = new DOMDocument();
    // avoid the whitespace after removing the node
    $dom->preserveWhiteSpace = false;
    // parse html dom elements
    $dom->loadHTML($html_contents);
    // get …
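Since the excerpt stops before the removal itself, here is one way the removeChild step could look. This is a sketch rather than the original answer's code, and the sample markup is made up. The node list returned by getElementsByTagName is live, so it is copied into an array before anything is removed.

```php
<?php
// Hypothetical input containing one table to delete.
$html_contents = '<html><body><p>Keep me</p>'
               . '<table><tr><td>Drop me</td></tr></table>'
               . '</body></html>';

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html_contents);

// Copy the live node list first: removing nodes while iterating
// over the live list would skip elements.
$tables = iterator_to_array($dom->getElementsByTagName('table'));

foreach ($tables as $table) {
    // removeChild must be called on the node's parent.
    $table->parentNode->removeChild($table);
}

$output = $dom->saveHTML();
echo $output;
```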

Detect Document Height Change

Update (Oct 2020): ResizeObserver is a wonderful API

    // create an Observer instance
    const resizeObserver = new ResizeObserver(entries =>
      console.log('Body height changed:', entries[0].target.clientHeight)
    )
    // start observing a DOM node
    resizeObserver.observe(document.body)
    // click anywhere to randomize height
    window.addEventListener('click', () =>
      document.body.style.height = Math.floor((Math.random() * 5000) + 1) + 'px'
    )

click anywhere to …

DOMDocument PHP Memory Leak

Using libxml_use_internal_errors(true); suppresses error output but builds a continuous log of errors that is appended to on each loop. Either disable the internal logging and suppress PHP warnings, or clear the internal log on each loop iteration like this:

    <?php
    libxml_use_internal_errors(true);
    while (true) {
        $dom = new DOMDocument();
        $dom->loadHTML(file_get_contents('ebay.html'));
        unset($dom);
        libxml_use_internal_errors(false);
        libxml_use_internal_errors(true);
        echo memory_get_peak_usage(true) . "\r\n";
        flush();
        …
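The same log-clearing idea can also be written with libxml_clear_errors(), which empties the internal error buffer directly. The sketch below swaps that call in for the off/on toggle and uses a bounded loop and an inline, deliberately malformed HTML string (both assumptions for the example) so that it terminates and needs no external file.

```php
<?php
libxml_use_internal_errors(true);

// Hypothetical malformed markup so the parser has something to complain about.
$html = '<html><body><p>unclosed <b>text</p><unknowntag>x</body></html>';

$iterations = 100;
for ($i = 0; $i < $iterations; $i++) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    unset($dom);

    // Empty the internal error log so it does not grow across iterations.
    libxml_clear_errors();
}

// Nothing should be left in the log after the final clear.
$remaining = count(libxml_get_errors());
echo "errors still buffered: $remaining\n";
echo memory_get_peak_usage(true) . "\n";
```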