Why does String#gsub double content?

You’re getting tripped up by the specialness of \' inside a regular expression replacement string:

\0, \1, \2, … \9, \&, \`, \', \+
Substitutes the value matched by the nth grouped subexpression, or by the entire match, pre- or postmatch, or the highest group.

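So when you say "\\'", Ruby's string literal turns the double \\ into a single backslash, and gsub receives \' as the replacement. In a replacement string, \' means “the string to the right of the last successful match,” so each quote is replaced by whatever follows it, and that text shows up twice in the result. If you want to replace single quotes with escaped single quotes, you need to escape more to get past the specialness of \':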

s.gsub("'", "\\\\'")

Or avoid the toothpicks and use the block form:

s.gsub("'") { |m| '\\' + m }
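To see the three variants side by side, here is a small sketch. The sample string "it's" and the variable names are made up for illustration; the gsub calls are the ones discussed above.

```ruby
s = "it's"

# "\\'" reaches gsub as the two characters \' (post-match), so the quote
# is replaced by whatever follows it in the string, here "s".
doubled = s.gsub("'", "\\'")

# "\\\\'" reaches gsub as \\' : an escaped backslash followed by a plain quote.
escaped = s.gsub("'", "\\\\'")

# The block's return value is inserted as-is; no backreferences are expanded.
via_block = s.gsub("'") { |m| '\\' + m }

puts doubled    # itss
puts escaped    # it\'s
puts via_block  # it\'s
```

The first call is the "doubling" from the question: the quote disappears and the trailing "s" is inserted in its place, ahead of the original "s".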

You would run into similar issues if you were trying to escape backticks, an ampersand, a plus sign, or even a single digit, since \`, \&, \+, and \0 through \9 are all special in a replacement string.
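As a rough illustration of the same trap with another character, the sketch below shows the backtick case and then a block-based helper that sidesteps it. The helper name backslash_escape and the sample strings are hypothetical, not part of Ruby.

```ruby
# \` in a replacement string expands to the pre-match, so the backtick
# is replaced by the text that came before it ("a").
pre_match = "a`b".gsub("`", "\\`")
puts pre_match  # aab

# Hypothetical helper: prefix every character listed in `chars` with a
# backslash. The block form means the result is never scanned for \`, \+, \1, etc.
def backslash_escape(str, chars)
  pattern = Regexp.union(*chars.chars)  # matches any one of the listed characters
  str.gsub(pattern) { |m| "\\" + m }
end

puts backslash_escape("a`b+c1", "`+1")  # a\`b\+c\1
```

Because the block's return value is used verbatim, the same helper works for any of the special characters without extra layers of backslashes.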

The overall lesson here is to prefer the block form of gsub for anything but the most trivial of substitutions.
