Regex: match keywords that are not in quotes

Here is one answer:

(?<=^([^"]|"[^"]*")*)text

This means:

(?<=       # preceded by...
^          # the start of the string, then
([^"]      # either not a quote character
|"[^"]*"   # or a complete quoted string
)*         # as many times as you want
)
text       # then the text

You can easily extend this to handle strings containing escapes as well.

In C# code:

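// Requires: using System.Text.RegularExpressions;
// No match here, because the only "text" sits inside the quoted string.
Regex.Match("bla bla bla \"this text is inside a string\"",
            "(?<=^([^\"]|\"[^\"]*\")*)text", RegexOptions.ExplicitCapture);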
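To see which occurrences the pattern accepts and which it rejects, here is a minimal, self-contained sketch. The class name, the helper method MatchIndexes, and the sample input are made up for illustration; only the pattern comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class OutsideQuotesDemo
{
    // Pattern from the answer: "text" preceded only by non-quote
    // characters and complete quoted strings.
    const string Pattern = "(?<=^([^\"]|\"[^\"]*\")*)text";

    // Hypothetical helper: returns the start index of every match.
    public static int[] MatchIndexes(string input) =>
        Regex.Matches(input, Pattern, RegexOptions.ExplicitCapture)
             .Cast<Match>()
             .Select(m => m.Index)
             .ToArray();

    static void Main()
    {
        // Illustrative input: text one "quoted text" more text
        string input = "text one \"quoted text\" more text";

        // The occurrence at index 17 is inside the quotes and is skipped.
        Console.WriteLine(string.Join(", ", MatchIndexes(input))); // prints: 0, 28
    }
}
```

This relies on .NET allowing unbounded repetition inside a lookbehind; many other regex engines reject a pattern like this.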

Added from the comment discussion: an extended version that matches on a per-line basis and handles backslash escapes inside strings. Use RegexOptions.Multiline for this:

(?<=^([^"\r\n]|"([^"\\\r\n]|\\.)*")*)text

In a C# string this looks like:

"(?<=^([^\"\r\n]|\"([^\"\\\\\r\n]|\\\\.)*\")*)text"
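A small sketch of the per-line version in use, assuming RegexOptions.Multiline as described above. The class name, the helper MatchIndexes, and the three-line sample input are hypothetical and only serve to exercise the escape handling and the per-line reset.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class PerLineDemo
{
    // Per-line pattern with escape handling, as a regular C# string.
    const string Pattern = "(?<=^([^\"\r\n]|\"([^\"\\\\\r\n]|\\\\.)*\")*)text";

    // Hypothetical helper: returns the start index of every match.
    public static int[] MatchIndexes(string input) =>
        Regex.Matches(input, Pattern,
                      RegexOptions.Multiline | RegexOptions.ExplicitCapture)
             .Cast<Match>()
             .Select(m => m.Index)
             .ToArray();

    static void Main()
    {
        // Illustrative input, three lines:
        //   "a \" text" text   <- escaped quote inside the string
        //   "open text         <- unterminated string
        //   text               <- new line, so the open quote above is forgotten
        string input = "\"a \\\" text\" text\n\"open text\ntext";

        Console.WriteLine(string.Join(", ", MatchIndexes(input))); // prints: 12, 28
    }
}
```

Only the "text" after the closed string on the first line and the one on the third line are reported. The unterminated quote on the second line hides the rest of that line but does not affect the next one.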

Since you now want to use ** instead of " as the delimiter, here is a version for that:

(?<=^([^*\r\n]|\*(?!\*)|\*\*([^*\\\r\n]|\\.|\*(?!\*))*\*\*)*)text

Explanation:

(?<=       # preceded by
^          # start of line
 (         # either
 [^*\r\n]| #  not a star or line break
 \*(?!\*)| #  or a single star (star not followed by another star)
  \*\*     #  or 2 stars, followed by...
   ([^*\\\r\n] # either: not a star, a backslash or a line break
   |\\.        # or an escaped char
   |\*(?!\*)   # or a single star
   )*          # as many times as you want
  \*\*     # ended with 2 stars
 )*        # as many times as you want
)
text      # then the text

Since this version doesn’t contain " characters, it’s cleaner to write it as a verbatim string literal (prefixed with @), which needs no extra backslash escaping:

@"(?<=^([^*\r\n]|\*(?!\*)|\*\*([^*\\\r\n]|\\.|\*(?!\*))*\*\*)*)text"
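The following sketch exercises the ** version, again assuming RegexOptions.Multiline. The class name, the helper MatchIndexes, and the sample input are invented for illustration; the pattern is the verbatim string shown above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class StarDelimiterDemo
{
    // The ** pattern as a verbatim string literal.
    const string Pattern =
        @"(?<=^([^*\r\n]|\*(?!\*)|\*\*([^*\\\r\n]|\\.|\*(?!\*))*\*\*)*)text";

    // Hypothetical helper: returns the start index of every match.
    public static int[] MatchIndexes(string input) =>
        Regex.Matches(input, Pattern,
                      RegexOptions.Multiline | RegexOptions.ExplicitCapture)
             .Cast<Match>()
             .Select(m => m.Index)
             .ToArray();

    static void Main()
    {
        // A lone star does not open a delimited region; a double star does.
        string input = "a * text **bold text** text";

        // The "text" at index 16 lies between the ** pair and is skipped.
        Console.WriteLine(string.Join(", ", MatchIndexes(input))); // prints: 4, 23
    }
}
```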
