Removing all script tags from HTML with a JS regular expression

jQuery uses a regex to remove script tags in some cases, and I’m pretty sure its devs had a damn good reason to do so. Probably some browsers execute scripts when they are inserted using innerHTML.

Here’s the regex:

/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi
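As a quick illustration, here is a minimal sketch of how that regex could be applied to an HTML string. The names `SCRIPT_TAG_RE` and `stripScripts` are made up for this example and are not jQuery’s own; only the pattern itself comes from the post.

```javascript
// The regex quoted above, unchanged.
const SCRIPT_TAG_RE = /<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi;

// Hypothetical helper (not part of jQuery): returns the HTML string
// with every matched <script>...</script> block removed.
function stripScripts(html) {
  return html.replace(SCRIPT_TAG_RE, "");
}

const input =
  '<p>Hello</p><script type="text/javascript">alert("hi");</script><p>World</p>';

console.log(stripScripts(input)); // → <p>Hello</p><p>World</p>
```

The `g` flag removes every script block in the string and the `i` flag also catches `<SCRIPT>`. A `<` inside the script body (as in `a < b`) is consumed by the middle group, which only stops when the negative lookahead sees `</script>`.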

And before people start crying “but regexes for HTML are evil”: yes, they are, but script tags are a special case. A <script> section may not contain </script> anywhere except where it is meant to end, so matching it with a regex is easily possible. However, the regex above does not allow whitespace inside the closing tag: it only matches a literal </script>, so a closing tag written as </script > (which browsers accept) will not match, and you’d have to extend the pattern if your input may contain that.
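To make the whitespace caveat concrete, here is a small sketch comparing the original pattern with a loosened variant. The variant (adding `\s*` before the final `>`) is my own assumption about one way to extend it, not something taken from jQuery, and it covers whitespace only, not other junk inside the closing tag.

```javascript
// The regex from the post, as-is: the closing tag must be exactly "</script>".
const STRICT_RE = /<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi;

// Assumed variant: \s* allows whitespace between "script" and ">"
// in the closing tag, both in the lookahead and in the final match.
const LENIENT_RE = /<script\b[^<]*(?:(?!<\/script\s*>)<[^<]*)*<\/script\s*>/gi;

const html = "a<script>alert(1)</script >b";

console.log(html.replace(STRICT_RE, ""));  // → a<script>alert(1)</script >b
console.log(html.replace(LENIENT_RE, "")); // → ab
```

The strict pattern leaves the input untouched because it never finds a literal `</script>`, while the lenient one removes the whole block.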
