Like most regex flavors, java.util.regex.Pattern has its own specific features, with syntax that may not be fully compatible with other flavors. These include character class union, intersection, and subtraction:
- [a-d[m-p]]: a through d, or m through p; equivalent to [a-dm-p] (union)
- [a-z&&[def]]: d, e, or f (intersection)
- [a-z&&[^bc]]: a through z, except for b and c; equivalent to [ad-z] (subtraction)
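The three character class operations can be tried out with a minimal sketch like the one below. The class name CharClassOps and the helper isMatch are hypothetical names chosen for illustration; only java.util.regex.Pattern and its matcher come from the standard library.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helper are illustrative only.
class CharClassOps {
    // Union: a through d, or m through p (same as [a-dm-p])
    static final Pattern UNION = Pattern.compile("[a-d[m-p]]");
    // Intersection: d, e, or f
    static final Pattern INTERSECTION = Pattern.compile("[a-z&&[def]]");
    // Subtraction: a through z, except for b and c (same as [ad-z])
    static final Pattern SUBTRACTION = Pattern.compile("[a-z&&[^bc]]");

    // Returns true if the whole input matches the pattern.
    public static boolean isMatch(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isMatch(UNION, "n"));        // true
        System.out.println(isMatch(UNION, "e"));        // false
        System.out.println(isMatch(INTERSECTION, "e")); // true
        System.out.println(isMatch(INTERSECTION, "a")); // false
        System.out.println(isMatch(SUBTRACTION, "d"));  // true
        System.out.println(isMatch(SUBTRACTION, "b"));  // false
    }
}
```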
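The most important “caveat” of Java regex is that the matches method (String.matches, Pattern.matches, and Matcher.matches) attempts to match a pattern against the whole string, not just a part of it. This is atypical of most engines, and can be a source of confusion at times. To look for a match anywhere in the input, use Matcher.find instead.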
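The difference between whole-string and partial matching can be illustrated with a short sketch. The class MatchesVsFind and its two helper methods are hypothetical names for this example; the calls to matches and find are the standard java.util.regex.Matcher methods.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helpers are illustrative only.
class MatchesVsFind {
    // True only if the entire input matches the regex.
    public static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if any substring of the input matches the regex.
    public static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // String.matches behaves like wholeMatch: "abc123" is not all digits.
        System.out.println("abc123".matches("\\d+"));          // false
        System.out.println(wholeMatch("\\d+", "abc123"));      // false
        // find succeeds because "123" occurs inside the input.
        System.out.println(partialMatch("\\d+", "abc123"));    // true
        // Padding the pattern with .* makes a whole-string match succeed.
        System.out.println(wholeMatch(".*\\d+.*", "abc123"));  // true
    }
}
```

Code ported from an engine whose default is partial matching usually needs either find or an explicit .* on both sides of the pattern.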
See also
On character class subtraction
Subtraction allows you to define, for example, “all consonants” in Java as [a-z&&[^aeiou]].
This syntax is specific to Java. In XML Schema, .NET, JGSoft and RegexBuddy, it’s [a-z-[aeiou]]. Other flavors may not support this feature at all.
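As a rough sketch of the consonant class in use, the example below counts and strips lowercase consonants. The class ConsonantDemo and its methods are hypothetical names introduced for illustration; only the pattern [a-z&&[^aeiou]] comes from the text above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helpers are illustrative only.
class ConsonantDemo {
    // Lowercase ASCII letters minus the vowels, using Java's && syntax.
    static final Pattern CONSONANT = Pattern.compile("[a-z&&[^aeiou]]");

    // Counts how many lowercase consonants occur in the input.
    public static int countConsonants(String input) {
        Matcher m = CONSONANT.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Removes every lowercase consonant from the input.
    public static String stripConsonants(String input) {
        return CONSONANT.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(countConsonants("hello")); // 3
        System.out.println(stripConsonants("hello")); // eo
    }
}
```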
References
- regular-expressions.info/Character Classes in XML Regular Expressions
- MSDN – Regular Expression Character Classes – Subtraction