Python re.sub back reference not back referencing [duplicate]

You need to use a raw string here so that Python doesn’t process the backslash as an escape character before re.sub ever sees it:

>>> import re
>>> fileText = '<text top="52" left="20" width="383" height="15" font="0"><b>test</b></text>'
>>> fileText = re.sub("<b>(.*?)</b>", r"\1", fileText, flags=re.DOTALL)
>>> fileText
'<text top="52" left="20" width="383" height="15" font="0">test</text>'
>>>

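Notice how "\1" was changed to r"\1". It is a one-character change, but it has a big effect: without the r prefix, Python reads \1 as an octal escape for the character with code 1, so re.sub receives that control character instead of a backreference. Compare: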

>>> "\1"
'\x01'
>>> r"\1"
'\\1'
>>>
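
To make the difference concrete, here is a minimal sketch (assuming Python 3; the variable names are my own, not from the question) that runs the same substitution with the non-raw replacement and with three spellings that should all produce a real backreference to group 1: the raw string, a doubled backslash, and the \g<1> form.

```python
import re

# Sample input taken from the question; the variable names are illustrative.
file_text = '<text top="52" left="20" width="383" height="15" font="0"><b>test</b></text>'
pattern = "<b>(.*?)</b>"

# Without the r prefix, "\1" is the single control character U+0001,
# so re.sub inserts that character instead of the captured group.
broken = re.sub(pattern, "\1", file_text, flags=re.DOTALL)

# Three equivalent ways to pass a real backreference to group 1.
raw = re.sub(pattern, r"\1", file_text, flags=re.DOTALL)
escaped = re.sub(pattern, "\\1", file_text, flags=re.DOTALL)
explicit = re.sub(pattern, r"\g<1>", file_text, flags=re.DOTALL)

print(repr(broken))   # the <b>...</b> span is replaced by '\x01'
print(raw)            # the <b> tags are stripped, "test" is kept
print(raw == escaped == explicit)
```

The \g<1> form is worth knowing when the backreference is followed by a digit, since r"\g<1>0" is unambiguous where r"\10" would be read as group 10.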
