What literal characters should be escaped in a regex?

In many regex implementations, the following rules apply:

Meta characters inside a character class are:

  • ^ (negation)
  • - (range)
  • ] (end of the class)
  • \ (escape char)

So to match any of these literally inside a class, escape it with a backslash. There are some corner cases, though:

  • - needs no escaping if placed at the very start or the very end of the class ([abc-] or [-abc]). In quite a few regex implementations, it also needs no escaping when placed directly after a range ([a-c-abc]) or a shorthand character class ([\w-abc]). This is what you observed.
  • ^ needs no escaping when it’s not at the start of the class: [^a] means any char except a, while [a^] matches either a or ^, which is equivalent to [\^a]
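  • ] needs no escaping if it’s the first character in the class, in implementations that allow this (POSIX, PCRE and Python among them): []] matches the char ], and []a] matches either ] or a. JavaScript is an exception: there [] is an empty class, so ] always has to be escaped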
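
The rules above can be checked with a short script. This is a minimal sketch using Python's built-in re module, so it only shows how that one implementation behaves; the helper class_matches and the constant ESCAPED are names made up for this illustration. The cases that vary between implementations ([a-c-abc] and [\w-abc]) are deliberately left out.

```python
import re

def class_matches(char_class: str, ch: str) -> bool:
    """Return True if the one-character string ch is matched by char_class."""
    return re.fullmatch(char_class, ch) is not None

# All four metacharacters escaped: each one is matched literally.
ESCAPED = r"[\^\-\]\\]"
print(all(class_matches(ESCAPED, ch) for ch in "^-]\\"))

# Corner cases where the escape can be dropped (behaviour of Python's re).
print(class_matches(r"[abc-]", "-"))   # - at the very end of the class
print(class_matches(r"[-abc]", "-"))   # - at the very start of the class
print(class_matches(r"[a^]", "^"))     # ^ not at the start is a literal
print(class_matches(r"[^a]", "a"))     # ^ at the start negates the class
print(class_matches(r"[]]", "]"))      # ] directly after the opening [
```

In Python, every line prints True except the [^a] check, which prints False because the class is negated. Other engines may reject or reinterpret some of these patterns, so it is worth running the equivalent checks in the engine you actually use.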
