Unescape HTML entities in JavaScript?

Most of the answers given here share a serious disadvantage: if the string you are trying to convert isn’t trusted, you will end up with a Cross-Site Scripting (XSS) vulnerability. For the function in the accepted answer, consider the following:

htmlDecode('<img src="https://stackoverflow.com/questions/3700326/dummy" onerror="alert(/xss/)">');

The string here contains an unescaped HTML tag, so instead of decoding anything, the htmlDecode function will actually run the JavaScript code specified inside the string.

This can be avoided by using DOMParser, which is supported in all modern browsers:

function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

console.log(htmlDecode('&lt;img src="https://stackoverflow.com/questions/3700326/myimage.jpg"&gt;'));
// '<img src="https://stackoverflow.com/questions/3700326/myimage.jpg">'

console.log(htmlDecode('<img src="https://stackoverflow.com/questions/3700326/dummy" onerror="alert(/xss/)">'));
// ""

This function is guaranteed not to run any JavaScript code as a side effect. Any HTML tags will be ignored; only the text content will be returned.

Compatibility note: Parsing HTML with DOMParser requires at least Chrome 30, Firefox 12, Opera 17, Internet Explorer 10, Safari 7.1 or Microsoft Edge. All browsers without support are well past their EOL, and as of 2017 the only ones that can still be seen in the wild occasionally are older Internet Explorer and Safari versions (usually these still aren’t numerous enough to bother with).
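
Where DOMParser is not available at all (for example in Node.js without a DOM library), one possible fallback is a plain string replacement that never touches the DOM. The following is only a rough sketch, not part of the approach above: the names htmlDecodeFallback and NAMED_ENTITIES are made up for illustration, the entity table covers only a handful of names (the full HTML list is much longer), and unlike the DOMParser version it leaves unescaped tags in the output instead of dropping them.

```javascript
// Hypothetical lookup table: a tiny subset of HTML named entities.
var NAMED_ENTITIES = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  apos: "'",
  nbsp: "\u00A0"
};

// Hypothetical DOM-free decoder (assumes String.fromCodePoint, i.e. ES2015+).
function htmlDecodeFallback(input) {
  // Matches &#xHH; (hex), &#DD; (decimal) and &name; references.
  var pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);/g;
  return input.replace(pattern, function (match, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
      var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave null, surrogate and out-of-range code points undecoded.
      if (code === 0 || code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF)) {
        return match;
      }
      return String.fromCodePoint(code);
    }
    // Names missing from the table are returned unchanged.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(htmlDecodeFallback("&lt;img src=&quot;myimage.jpg&quot;&gt;"));
// '<img src="myimage.jpg">'
```

Since no markup is ever parsed, nothing in the input can execute here either, but the result is still plain text and should be treated as such (for example assigned via textContent rather than innerHTML).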
