“SyntaxError: Non-ASCII character …” or “SyntaxError: Non-UTF-8 code starting with …” trying to use non-ASCII text in a Python script

I’d recommend reading the PEP that the error message points to (PEP 263). The problem is that Python 2 reads source files as ASCII by default, but the pound symbol (£) is not an ASCII character. Declare the file as UTF-8 instead: put # -*- coding: utf-8 -*- on the first or second line of your .py file, and make sure the file is actually saved in that encoding. To get more …
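As a minimal sketch of what such a file could look like (the variable name price is made up for illustration, and the file is assumed to be saved as UTF-8):

```python
# -*- coding: utf-8 -*-
# The declaration above follows PEP 263 and must sit on line 1 or 2.
# Python 3 already assumes UTF-8 source, so there it is optional but harmless.

# Hypothetical example value containing a non-ASCII character.
price = u"£5"

# Two characters: the pound sign and the digit.
print(len(price))
```

The u prefix keeps the literal a text string on both Python 2 and Python 3.3+.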

Why does ENcoding a string result in a DEcoding error (UnicodeDecodeError)?

"你好".encode('utf-8')
In Python 2, encode converts a unicode object to a string (byte string) object. But here you have invoked it on a string object (because you don’t have the u prefix). So Python has to convert the string to a unicode object first, and it does the equivalent of
"你好".decode().encode('utf-8')
But the decode fails because the string isn’t valid ASCII. …
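The hidden decode step can be made visible in Python 3, where text and bytes are separate types. This is only an illustrative sketch: the explicit decode("ascii") call stands in for the implicit conversion Python 2 performs, and the variable names are assumptions.

```python
# Text (str) encodes to bytes without any problem.
text = "你好"
data = text.encode("utf-8")

# Decoding those bytes as ASCII is what Python 2 attempts implicitly
# when .encode() is called on a byte string; it fails the same way.
error = None
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc

print(type(error).__name__)
```

Python 3 removes the ambiguity altogether: bytes objects have no encode method, so the mistake surfaces as an AttributeError rather than a confusing decode error.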

UnicodeDecodeError: (‘utf-8’ codec) while reading a csv file [duplicate]

Known encoding
If you know the encoding of the file you want to read in, you can use
pd.read_csv('filename.txt', encoding='encoding')
These are the possible encodings: https://docs.python.org/3/library/codecs.html#standard-encodings
Unknown encoding
If you do not know the encoding, you can try to use chardet; however, this is not guaranteed to work, because it can only make an educated guess.
import …
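When the encoding is unknown, another option is simply to try a short list of candidates in order. The sketch below is not the chardet approach from the answer; it is a standard-library alternative, and the helper name first_working_encoding, the candidate list, and the sample file are all assumptions for illustration.

```python
import os
import tempfile


def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file."""
    for name in candidates:
        try:
            with open(path, encoding=name) as handle:
                handle.read()
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the file, try the next
        return name
    raise ValueError("none of the candidate encodings could decode the file")


# Demo: write a small CSV in cp1252, then detect a usable encoding for it.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write("name,qty\ncafé,1\n".encode("cp1252"))
    sample_path = tmp.name

detected = first_working_encoding(sample_path)
print(detected)
os.remove(sample_path)
```

The returned name can then be passed as the encoding argument of pd.read_csv. Note that latin-1 accepts every byte sequence, so it belongs last, and a successful decode does not prove the guess is right.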

Removing unicode \u2026 like characters in a string in python2.7 [duplicate]

Python 2.x
>>> s
'This is some \\u03c0 text that has to be cleaned\\u2026! it\\u0027s annoying!'
>>> print(s.decode('unicode_escape').encode('ascii','ignore'))
This is some  text that has to be cleaned! it's annoying!
Python 3.x
>>> s="This is some \u03c0 text that has to be cleaned\u2026! it\u0027s annoying!"
>>> s.encode('ascii', 'ignore')
b"This is some  text that has to be …
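For Python 3, both cases (literal backslash escapes in the data, and real non-ASCII characters) can be handled along these lines. This is a sketch using the sample string from the answer; the variable names are assumptions.

```python
# Case 1: the string contains literal backslash-u sequences (as in the
# Python 2 example). Turn them into real characters first.
raw = "This is some \\u03c0 text that has to be cleaned\\u2026! it\\u0027s annoying!"
unescaped = raw.encode("ascii").decode("unicode_escape")

# Case 2: drop everything outside ASCII, then decode back to str so the
# result is text rather than bytes.
cleaned = unescaped.encode("ascii", "ignore").decode("ascii")

print(cleaned)
```

The removed π leaves two consecutive spaces behind; collapse them separately if that matters.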

Python – ‘ascii’ codec can’t decode byte

"你好".encode('utf-8')
In Python 2, encode converts a unicode object to a string (byte string) object. But here you have invoked it on a string object (because you don’t have the u prefix). So Python has to convert the string to a unicode object first, and it does the equivalent of
"你好".decode().encode('utf-8')
But the decode fails because the string isn’t valid ASCII. …

How to print Unicode character in Python?

To include Unicode characters in your Python source code, you can use Unicode escape sequences of the form \u0123 in your string. In Python 2.x, you also need to prefix the string literal with 'u'. Here’s an example running in the Python 2.x interactive console:
>>> print u'\u0420\u043e\u0441\u0441\u0438\u044f'
Россия
In Python 2, prefixing a string …
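In Python 3 every str is already Unicode, so the same escapes work without any prefix. A short sketch follows; the variable names are illustrative, and the comparisons are printed instead of the characters so the output does not depend on the terminal encoding.

```python
# \uXXXX escapes in a normal Python 3 string literal.
word = "\u0420\u043e\u0441\u0441\u0438\u044f"

# Characters can also be written by Unicode name or built from a code point.
pi_by_name = "\N{GREEK SMALL LETTER PI}"
euro = chr(0x20AC)

print(word == "Россия")
print(pi_by_name == "\u03c0")
print(euro == "€")
```

Calling print(word) directly shows Россия as long as the terminal's encoding can represent Cyrillic.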