A regex for version number parsing

I’d express the format as:

“1-3 dot-separated components, each numeric except that the last one may be *”

As a regexp, that’s:

^(\d+\.)?(\d+\.)?(\*|\d+)$

[Edit to add: this solution is a concise way to validate, but it has been pointed out that extracting the values requires extra work. It’s a matter of taste whether to deal with this by complicating the regexp, or by processing the matched groups.

In my solution, the groups capture the "." characters. This can be dealt with using non-capturing groups as in ajborley’s answer.

Also, the rightmost group will capture the last component, even if there are fewer than three components, and so for example a two-component input results in the first and last groups capturing and the middle one undefined. I think this can be dealt with by non-greedy groups where supported.

Perl code to deal with both issues after the regexp could be something like this:

@version = ();
@groups = ($1, $2, $3);
foreach (@groups) {
    next if !defined;
    s/\.//;
    push @version, $_;
}
($major, $minor, $mod) = (@version, "*", "*");

Which isn’t really any shorter than splitting on "."
]

Leave a Comment