How do I treat an ASCII string as Unicode and unescape the escaped characters in it in Python?

It took me a while to figure this one out, but this page had the best answer:

>>> s="\u003cfoo/\u003e"
>>> s.decode( 'unicode-escape' )
u'<foo/>'
>>> s.decode( 'unicode-escape' ).encode( 'ascii' )
'<foo/>'
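
The session above is Python 2, where byte strings have a decode method. On Python 3 a str has no decode, so a rough equivalent (just a sketch; the helper name unescape_ascii is my own, not something from the linked page) is to encode to ASCII bytes first and then decode with the same codec:

```python
# Hypothetical helper (name is an assumption): interpret literal \uXXXX
# sequences in an ASCII-only str as the characters they name.
def unescape_ascii(text):
    # str -> bytes; raises UnicodeEncodeError if the input is not pure ASCII.
    ascii_bytes = text.encode("ascii")
    # The unicode_escape codec interprets the backslash escapes in the bytes.
    return ascii_bytes.decode("unicode_escape")

s = r"\u003cfoo/\u003e"  # raw literal, so the backslashes stay in the string
print(unescape_ascii(s))
```

Encoding to ASCII first means input that contains real non-ASCII characters raises an error instead of being silently mangled.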

There’s also a ‘raw-unicode-escape’ codec to handle the other way to specify Unicode strings, the raw form in which only \uXXXX escapes are interpreted and other backslash sequences are left alone. Check the “Unicode Constructors” section of the linked page for more details (since I’m not that Unicode-savvy).
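
To make the difference between the two codecs concrete, here is a small sketch (Python 3 syntax, with sample data I made up for illustration) that decodes the same bytes both ways. Both codecs expand the \u003c and \u003e escapes, but only unicode_escape also interprets the trailing \n:

```python
# Sample bytes (made up for illustration): two \uXXXX escapes followed by
# a literal backslash and the letter n.
data = b"\\u003cfoo/\\u003e\\n"

# unicode_escape interprets every Python string escape, including \n.
cooked = data.decode("unicode_escape")

# raw_unicode_escape interprets only \uXXXX and \UXXXXXXXX escapes.
raw = data.decode("raw_unicode_escape")

print(repr(cooked))  # ends with a real newline character
print(repr(raw))     # still ends with a backslash followed by n
```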

EDIT: See also Python Standard Encodings.
