What does the "r" in Python's re.compile(r' pattern flags') mean?

As @PauloBu stated, the r string prefix is not specific to regular expressions; it applies to string literals in Python generally.

Normal strings use the backslash character as an escape character for special characters (like newlines):

>>> print('this is \n a test')
this is 
 a test

The r prefix tells the interpreter not to do this:

>>> print(r'this is \n a test')
this is \n a test
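
To make the point concrete, here is a small sketch (the variable names are only illustrative) showing that the r prefix affects how the literal is read in source code, not the kind of object you get back. Both spellings below should produce the same ordinary str:

```python
# Two ways of writing the same two-character string: a backslash followed by "n".
escaped = '\\n'   # backslash escaped by hand
raw = r'\n'       # raw literal, no escape processing

print(escaped == raw)   # True
print(len(raw))         # 2
print(len('\n'))        # 1, a single newline character
print(type(raw))        # <class 'str'>
```

In other words, there is no separate "raw string" type at runtime; the prefix is just a more convenient way to type backslashes.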

This matters in regular expressions because the backslash has to reach the re module intact. For example, \b matches the empty string at the start or end of a word. re expects the two-character string \b, but in a normal string literal '\b' is converted to the ASCII backspace character. So you need to either escape the backslash explicitly ('\\b') or tell Python it is a raw string (r'\b').

>>> import re
>>> re.findall('\b', 'test') # '\b' is turned into a backspace character by Python before re sees it
[]
>>> re.findall('\\b', 'test') # the backslash is explicitly escaped and passed through to the re module
['', '']
>>> re.findall(r'\b', 'test') # often this syntax is easier
['', '']
