How to read RegEx Captures in C#

The C# regex API can be quite confusing. There are groups and captures:

  • A group corresponds to a capturing group in the pattern; it is used to extract a substring from the text.
  • A group can have several captures if it appears inside a quantifier.

The hierarchy is:

  • Match
    • Group
      • Capture

(a match can have several groups, and each group can have several captures)

For example:

Subject: aabcabbc
Pattern: ^(?:(a+b+)c)+$

In this example, there is only one capturing group: (a+b+). It sits inside a quantifier and is matched twice, so it generates two captures: aab and abb:

aabcabbc
^^^ ^^^
Cap1  Cap2
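
As a rough sketch, the example above can be reproduced with the code below. It uses top-level statements (an assumption: C# 9 or later) and the variable names are illustrative only. Note that Group.Value exposes just the last capture of the group, while Group.Captures keeps all of them in match order.

```csharp
using System;
using System.Text.RegularExpressions;

// Subject and pattern taken from the example above.
Match match = Regex.Match("aabcabbc", @"^(?:(a+b+)c)+$");

// Groups[0] is the whole match; Groups[1] is the (a+b+) group.
Group group = match.Groups[1];

// Value only reflects the last capture of the group.
Console.WriteLine(group.Value);

// Captures lists every capture made by the group, in match order.
foreach (Capture capture in group.Captures)
{
    Console.WriteLine($"{capture.Index}: {capture.Value}");
}
```

This should print abb first, then the two captures with their start positions in the subject (0 for aab and 4 for abb).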

When a group is not inside a quantifier, it generates only one capture. In your case, you have three groups, and each group captures once. You can use match.Groups[1].Value, match.Groups[2].Value and match.Groups[3].Value to extract the three substrings you’re interested in, without resorting to the capture notion at all.
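
Here is a minimal sketch of that approach. The input string and the three-group date pattern are hypothetical stand-ins, since your actual pattern is not reproduced here; only the Groups[n].Value access is the point. It again assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input and pattern: three capturing groups, none inside a quantifier.
Match match = Regex.Match("2024-03-15", @"^(\d{4})-(\d{2})-(\d{2})$");

if (!match.Success)
{
    Console.WriteLine("No match");
    return;
}

// Groups are numbered from 1, left to right by opening parenthesis.
string year = match.Groups[1].Value;
string month = match.Groups[2].Value;
string day = match.Groups[3].Value;

Console.WriteLine($"{year} / {month} / {day}");

// Each group captured exactly once, so Captures adds nothing here.
Console.WriteLine(match.Groups[1].Captures.Count);
```

Since none of the groups is repeated, each one has a single capture and Value is all you need.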
