Convert HTML to data:text/html link using JavaScript

Characteristics of a data-URI

A data-URI with MIME-type text/html has to be in one of these formats:

data:text/html,<HTML HERE>
data:text/html;charset=UTF-8,<HTML HERE>

Base-64 encoding is not necessary. If your code contains non-ASCII characters, such as éé, charset=UTF-8 has to be added.

The following characters have to be escaped:

  • # – Firefox and Opera interpret this character as the marker of a hash (as in location.hash).
  • % – This character is used to escape characters. Escape this character to make sure that no side effects occur.

Additionally, if you want to embed the code in an anchor tag, the following characters should also be escaped:

  • " and/or ' – Quotes mark the value of the attribute.
  • & – The ampersand is used to mark HTML entities.
  • < and > do not have to be escaped inside a HTML attribute. However, if you’re going to embed the link in the HTML, these should also be escaped (%3C and %3E)

JavaScript implementation

If you don’t mind the size of the data-URI, the easiest method to do so is using encodeURIComponent:

var html = document.getElementById("html").innerHTML;
var dataURI = 'data:text/html,' + encodeURIComponent(html);

If size matters, you’d better strip out all consecutive white-space (this can safely be done, unless the HTML contains a <pre> element/style). Then, only replace the significant characters:

var html = document.getElementById("html").innerHTML;
html = html.replace(/\s{2,}/g, '')   // <-- Replace all consecutive spaces, 2+
           .replace(/%/g, '%25')     // <-- Escape %
           .replace(/&/g, '%26')     // <-- Escape &
           .replace(/#/g, '%23')     // <-- Escape #
           .replace(/"/g, '%22')     // <-- Escape "
           .replace(/'/g, '%27');    // <-- Escape ' (to be 100% safe)
var dataURI = 'data:text/html;charset=UTF-8,' + html;

Leave a Comment