How do I use a regex in a shell script?

To complement the existing helpful answers:

Using Bash’s own regex-matching operator, =~, is a faster alternative in this case, given that you’re only matching a single value already stored in a variable:

set -- '12-34-5678' # set $1 to sample value

kREGEX_DATE='^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$' # note use of [0-9] to avoid \d
[[ $1 =~ $kREGEX_DATE ]]
echo $? # 0 with the sample value, i.e., a successful match

Note, however, that the caveat about using flavor-specific regex constructs such as \d applies equally here:
While =~ supports EREs (extended regular expressions), it also supports the host platform’s specific extensions, because Bash hands the matching off to the system’s regex library. It’s a rare case of Bash’s behavior being platform-dependent.

To remain portable (in the context of Bash), stick to the POSIX ERE specification.
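As a minimal sketch of what staying within POSIX ERE can look like, the regex below uses the POSIX bracket class [[:digit:]] in place of \d. The names re_date and is_date and the sample values are hypothetical, chosen only for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical names and sample values, for illustration only.
# [[:digit:]] is a POSIX character class, so it does not depend on
# platform-specific extensions the way \d does.
re_date='^[[:digit:]]{2}[-/][[:digit:]]{2}[-/][[:digit:]]{4}$'

# Returns 0 (success) if the first argument matches the regex.
is_date() {
  [[ $1 =~ $re_date ]]
}

for candidate in '12-34-5678' '12/34/5678' '12-34-567'; do
  if is_date "$candidate"; then
    echo "$candidate: match"
  else
    echo "$candidate: no match"
  fi
done
```

The first two sample values should match; the last one is a digit short and should not.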

Note that =~ even allows you to define capture groups (parenthesized subexpressions) whose matches you can later access through Bash’s special ${BASH_REMATCH[@]} array variable.
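To illustrate, here is a small sketch that adds capture groups to the date regex and reads the parts back from BASH_REMATCH. The variable names and the sample value are hypothetical, not part of the original question:

```bash
#!/usr/bin/env bash
# Hypothetical variable names and sample value, for illustration only.
date_str='12-34-5678'
re_date='^([0-9]{2})[-/]([0-9]{2})[-/]([0-9]{4})$'

if [[ $date_str =~ $re_date ]]; then
  whole=${BASH_REMATCH[0]}  # index 0 holds the entire matched text
  part1=${BASH_REMATCH[1]}  # first parenthesized group
  part2=${BASH_REMATCH[2]}  # second parenthesized group
  part3=${BASH_REMATCH[3]}  # third parenthesized group
  echo "whole match: $whole"
  echo "groups: $part1 $part2 $part3"
fi
```

Note that BASH_REMATCH is overwritten by each successful =~ match, so copy out the values you need before matching again.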

Further notes:

- $kREGEX_DATE is used unquoted, which is necessary for the regex to be recognized as such (quoted parts would be treated as literals).
- While not always necessary, it is advisable to store the regex in a variable first, because Bash has trouble with regex literals containing \.
  - E.g., on Linux, where \< is supported to match word boundaries, [[ 3 =~ \<3 ]] && echo yes fails to match, but re="\<3"; [[ 3 =~ $re ]] && echo yes works.
- I’ve renamed REGEX_DATE to kREGEX_DATE (the k signals a conceptual constant) so that the name isn’t all-uppercase: all-uppercase variable names should be avoided to prevent conflicts with special environment and shell variables.
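The effect of quoting the right-hand side can be seen in a short sketch like the following, which assumes Bash 3.2 or later with default compatibility settings. The variable names and values are hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical names and values, for illustration only.
re='a.c'  # as a regex, . matches any single character

# Unquoted: $re is interpreted as a regex, so "abc" matches.
if [[ abc =~ $re ]]; then
  unquoted='match'
else
  unquoted='no match'
fi

# Quoted: "$re" is treated as the literal string a.c, which "abc" does not contain.
if [[ abc =~ "$re" ]]; then
  quoted='match'
else
  quoted='no match'
fi

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

This is why the regex variable is left unquoted on the right-hand side of =~, even though quoting variable references is otherwise the safer habit.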
