How is Unicode represented internally in Python?
I’m assuming you mean CPython, the standard implementation. Python 2 and Python 3.0-3.2 store Unicode strings as either UCS-2* or UCS-4, that is, with either 2 bytes or 4 bytes per character. Which one is used is a compile-time option. Depending on the build and the platform's byte order, \u2049 is then represented as either \x49\x20 or \x20\x49 or \x49\x20\x00\x00 …
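To make the byte layouts concrete, here is a minimal Python 3 sketch. It does not inspect the interpreter's memory; it uses the `utf-16-*` and `utf-32-*` codecs as stand-ins, which for a character in the Basic Multilingual Plane such as U+2049 produce the same bytes as a 2-byte or 4-byte fixed-width unit in the given byte order. The helper name `byte_layouts` is made up for this illustration.

```python
import sys

ch = "\u2049"  # EXCLAMATION QUESTION MARK, code point U+2049

def byte_layouts(char):
    """Encode char as one 2-byte or 4-byte unit in each byte order.

    Hypothetical helper: the codecs only mimic what a narrow (2-byte)
    or wide (4-byte) build would store for a BMP character.
    """
    return {
        "2 bytes, little-endian": char.encode("utf-16-le"),
        "2 bytes, big-endian": char.encode("utf-16-be"),
        "4 bytes, little-endian": char.encode("utf-32-le"),
        "4 bytes, big-endian": char.encode("utf-32-be"),
    }

layouts = byte_layouts(ch)
for name, raw in layouts.items():
    # Print each byte as two hex digits, e.g. "49 20"
    print(name + ":", " ".join(f"{b:02x}" for b in raw))

# On Python 2 and 3.0-3.2, sys.maxunicode reveals the build type:
# 0xffff on a narrow (2-byte) build, 0x10ffff on a wide (4-byte) build.
# From Python 3.3 on it is always 0x10ffff.
print(hex(sys.maxunicode))
```

On the older interpreters discussed here, checking `sys.maxunicode` is the usual way to tell which compile-time option was chosen.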