How to make the Python interpreter correctly handle non-ASCII characters in string operations?

Throw out all characters that can’t be interpreted as ASCII:

def remove_non_ascii(s):
    # Keep only characters whose code point is in the ASCII range (0-127).
    return "".join(c for c in s if ord(c) < 128)

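Keep in mind that this is also safe on UTF-8 encoded byte strings (a Python 2 str): every byte of a multi-byte UTF-8 sequence has its highest bit set to 1, so such characters are removed whole and no stray bytes are left behind. On a Python 3 str (or a Python 2 unicode object) the test is applied to code points, so the encoding does not matter. Note that iterating over Python 3 bytes yields integers, so ord() would raise a TypeError there.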
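To make the UTF-8 remark concrete, here is a minimal Python 3 sketch. The helper remove_non_ascii_bytes and the sample string are not from the original answer; they are assumptions added for illustration only. It checks that every byte of a multi-byte character has its high bit set, then applies the same filter to a str and to its UTF-8 encoded bytes.

```python
def remove_non_ascii(s):
    # Python 3 str: filter by code point.
    return "".join(c for c in s if ord(c) < 128)


def remove_non_ascii_bytes(data):
    # Hypothetical companion for UTF-8 encoded bytes.
    # Iterating over bytes yields ints, so no ord() call is needed.
    return bytes(b for b in data if b < 128)


sample = "naïve café"  # illustrative input, not from the original post
encoded = sample.encode("utf-8")

# Every byte that belongs to a multi-byte character has its high bit set,
# so filtering on "< 128" never leaves half a character behind.
for ch in "ïé":
    assert all(b & 0x80 for b in ch.encode("utf-8"))

print(remove_non_ascii(sample))         # nave caf
print(remove_non_ascii_bytes(encoded))  # b'nave caf'
```

Both calls drop the accented letters entirely. If the goal is to keep a plain "i" or "e" instead of dropping the character, a different technique (such as Unicode normalization before filtering) would be needed.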
