Regex for comments in strings, strings in comments, etc

I’ve broken the regex into 4 lines corresponding to the 4 paths in the graph. Don’t keep those line breaks in there if you ever use this.

(['"])(?:(?!\1|\\).|\\.)*\1|
\/(?![*/])(?:[^\\/]|\\.)+\/[igm]*|
\/\/[^\n]*(?:\n|$)|
\/\*(?:[^*]|\*(?!\/))*\*\/

[Figure: regular expression visualization (Debuggex demo)]

This regex grabs 4 types of “blocks” (quoted strings, regex literals, line comments, and block comments), each of which can contain the other 3. You can iterate through the matches and do whatever you want with each one, or discard it if it’s not the kind you want to do anything to.
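As a rough sketch of that iteration: the names `BLOCKS`, `classify`, and `findBlocks` are made up for illustration, and `matchAll` assumes a reasonably recent JavaScript engine. The regex is the same 4 alternatives as above, joined onto one line with the `g` flag added.

```javascript
// The 4 alternatives from above on a single line, plus the global flag.
const BLOCKS = /(['"])(?:(?!\1|\\).|\\.)*\1|\/(?![*/])(?:[^\\/]|\\.)+\/[igm]*|\/\/[^\n]*(?:\n|$)|\/\*(?:[^*]|\*(?!\/))*\*\//g;

// Hypothetical helper: work out which alternative matched
// by looking at how the matched text starts.
function classify(text) {
  if (text.startsWith("//")) return "line-comment";
  if (text.startsWith("/*")) return "block-comment";
  if (text.startsWith("/")) return "regex";
  return "string";
}

// Hypothetical helper: collect every block with its kind and position.
function findBlocks(source) {
  const blocks = [];
  for (const m of source.matchAll(BLOCKS)) {
    blocks.push({ kind: classify(m[0]), text: m[0], index: m.index });
  }
  return blocks;
}

// The "//" inside the string is swallowed by the string match,
// so only the trailing comment is reported as a comment.
const sample = 'var a = "// not a comment"; // real comment\nvar b = 1;';
console.log(findBlocks(sample).map((b) => b.kind));
```

From there you can filter on `kind` to keep only the blocks you care about, for example just the comments if you want to strip them.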

This one is specific to JavaScript, as it’s a language I’m familiar with, but you could easily adapt it to the language of your preference.

Anyone see a way in which this code breaks?

Edit: I have since been notified that the general pattern is described very well here: https://stackoverflow.com/a/23589204/2684660. Neato!
