UnicodeDecodeError: ‘ascii’ codec can’t decode byte 0xd1 in position 2: ordinal not in range(128)

Unicode is not equal to UTF-8. The latter is just an encoding for the former.

You are doing it the wrong way around. You are reading UTF-8-encoded data, so you have to decode the UTF-8-encoded String into a unicode string.

So just replace .encode with .decode, and it should work (if your .csv is UTF-8-encoded).

Nothing to be ashamed of, though. I bet 3 in 5 programmers had trouble at first understanding this, if not more 😉

Update:
If your input data is not UTF-8 encoded, then you have to .decode() with the appropriate encoding, of course. If nothing is given, python assumes ASCII, which obviously fails on non-ASCII-characters.

Leave a Comment