How does the regular expression ‘(?

They are called lookarounds. A lookaround asserts whether a pattern does or does not match at the current position, without consuming any characters, so the asserted text never becomes part of the match. There are four basic lookarounds:

Positive lookarounds: see if we CAN match the pattern…
- (?=pattern) – … to the right of the current position (lookahead)
- (?<=pattern) – … to the left of the current position (lookbehind)

Negative lookarounds: see if we can NOT match the pattern…
- (?!pattern) – … to the right of the current position (negative lookahead)
- (?<!pattern) – … to the left of the current position (negative lookbehind)

As an easy reminder, for a lookaround:

- = is positive, ! is negative
- < is lookbehind; otherwise it's lookahead
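
To make the four forms concrete, here is a minimal sketch using Python's re module. The sample string, the patterns, and the function name demo_lookarounds are illustrative assumptions, not part of the original question. The \b anchors in the negative cases are there so the engine cannot backtrack into part of a number.

```python
import re

def demo_lookarounds(text="$5 and 7 and 10%"):
    """Apply one pattern per lookaround type to a small sample string."""
    return {
        # (?=...)  positive lookahead: digits followed by '%'
        "lookahead": re.findall(r"\d+(?=%)", text),
        # (?<=...) positive lookbehind: digits preceded by '$'
        "lookbehind": re.findall(r"(?<=\$)\d+", text),
        # (?!...)  negative lookahead: whole numbers NOT followed by '%'
        "negative_lookahead": re.findall(r"\b\d+\b(?!%)", text),
        # (?<!...) negative lookbehind: whole numbers NOT preceded by '$'
        "negative_lookbehind": re.findall(r"(?<!\$)\b\d+\b", text),
    }

if __name__ == "__main__":
    for name, matches in demo_lookarounds().items():
        print(f"{name:20} {matches}")
```

In every case the '$' or '%' is only inspected, never included in the returned match.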

References

regular-expressions.info/Lookarounds

But why use lookarounds?

One might argue that the lookarounds in the pattern above aren’t necessary, and that #([^#]+)# will do the job just fine (using the string captured by group 1, \1, to get the text between the # characters).

Not quite. The difference is that a lookaround doesn’t consume the #, so the same # can be “used” again by the next attempt to find a match. Simplistically speaking, lookarounds allow “matches” to overlap.

Consider the following input string:

and #one# and #two# and #three#four#

Now, #([a-z]+)# will give the following matches (as seen on rubular.com):

and #one# and #two# and #three#four#
    \___/     \___/     \_____/

Compare this with (?<=#)[a-z]+(?=#), which matches:

and #one# and #two# and #three#four#
     \_/       \_/       \___/ \__/

Unfortunately this can’t be demonstrated on rubular.com, since it doesn’t support lookbehind. However, it does support lookahead, so we can do something similar with #([a-z]+)(?=#), which matches (as seen on rubular.com):

and #one# and #two# and #three#four#
    \__/      \__/      \____/\___/
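
The three patterns can also be compared without rubular.com. The following is a small sketch in Python, whose re module supports both lookahead and fixed-width lookbehind. The variable names are illustrative assumptions.

```python
import re

text = "and #one# and #two# and #three#four#"

# Consuming both delimiters: the '#' that closes "three" is used up,
# so "four" has no opening '#' left and is missed.
consuming = [m.group(1) for m in re.finditer(r"#([a-z]+)#", text)]

# Lookbehind + lookahead: neither '#' is consumed, so the '#' between
# "three" and "four" can serve both neighbours.
lookaround = re.findall(r"(?<=#)[a-z]+(?=#)", text)

# Lookahead only: the opening '#' is consumed, the closing one is not.
# With a single capture group, findall returns the group contents.
lookahead_only = re.findall(r"#([a-z]+)(?=#)", text)

if __name__ == "__main__":
    print("consuming:     ", consuming)
    print("lookaround:    ", lookaround)
    print("lookahead_only:", lookahead_only)
```

Only the first pattern misses "four", matching the diagrams above.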

References

regular-expressions.info/Flavor Comparison
