preg_match(): Compilation failed: invalid range in character class at offset

The problem is really old, but there are some new developments related to PHP 7.3 and newer versions that need to be covered. The PHP PCRE engine migrated to PCRE2 in PHP 7.3, and the PCRE library version used in PHP 7.3 is 10.32, which is where these Backward Incompatible Changes originate:

  • Internal library API has changed
  • The ‘S’ modifier has no effect, patterns are studied automatically. No real impact.
  • The ‘X’ modifier is the default behavior in PCRE2. The current patch reverts the behavior to the meaning of ‘X’ how it was in PCRE, but it might be better to go with the new behavior and have ‘X’ turned on by default. So currently no impact, too.
  • Some behavior change due to the newer Unicode engine was sighted. It’s Unicode 10 in PCRE2 vs Unicode 7 in PCRE. Some behavior change can be sighted with invalid patterns.

According to the PCRE2 10.33 changelog:

  1. With PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL set, escape sequences such as \s
    which are valid in character classes, but not as the end of ranges, were being
    treated as literals. An example is [_-\s] (but not [\s-_] because that gave an
    error at the start of a range). Now an “invalid range” error is given
    independently of PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.

Before PHP 7.3, you could put the hyphen at any position in a character class if you escaped it, or if you put it “in a position where it cannot be interpreted as indicating a range”. In PHP 7.3, it seems that PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL is not set. So, from now on, to put a literal hyphen into a character class, either escape it or place it at the very start or the very end of the class.
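As a minimal sketch of the placements described above, the following tries three ways of writing a literal hyphen in a character class. The sample subject string and the array labels are made up for illustration; each pattern should compile under both the old PCRE and PCRE2.

```php
<?php
// Three spellings of "word characters, dots, and a literal hyphen".
$patterns = [
    'hyphen first'   => '/^[-\w.]+$/',
    'hyphen last'    => '/^[\w.-]+$/',
    'hyphen escaped' => '/^[\w\-.]+$/',
];

$subject = 'file-name.txt'; // hypothetical sample input

$results = [];
foreach ($patterns as $label => $pattern) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $results[$label] = preg_match($pattern, $subject);
}
```

Placing the hyphen first or last avoids the backslash altogether, which can be easier to read when the pattern also has to be embedded in a double-quoted string.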

See also this reference:

In simple words,

PCRE2 is stricter in its pattern validation, so after the upgrade, some of your existing patterns may no longer compile.

Here is the simple snippet used on php.net:

preg_match('/[\w-.]+/', ''); // this will not work in PHP 7.3
preg_match('/[\w\-.]+/', ''); // the hyphen needs to be escaped

As you can see from the example above, there is a small but substantial difference between the two lines: the backslash before the hyphen.
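One way to find affected patterns in an existing code base is to test whether each one still compiles. The helper below is a hypothetical sketch (the name pattern_compiles is not part of PHP). It relies on preg_match() returning false when a pattern fails to compile, and assumes no custom error handler is installed that turns warnings into exceptions.

```php
<?php
// Hypothetical helper: true if $pattern compiles under the running PCRE.
function pattern_compiles(string $pattern): bool
{
    // A compile failure raises an E_WARNING and makes preg_match()
    // return false; @ silences the warning so that only the return
    // value is inspected.
    return @preg_match($pattern, '') !== false;
}

$checks = [
    'unescaped' => pattern_compiles('/[\w-.]+/'),
    'escaped'   => pattern_compiles('/[\w\-.]+/'),
];
```

On PHP 7.3 and newer, the 'unescaped' entry is expected to be false and the 'escaped' entry true; on older versions both patterns should compile.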
