Setting the correct encoding when piping stdout in Python

Your code works when run in a script because Python encodes the output to whatever encoding your terminal application is using. If you are piping, you must encode it yourself.

A rule of thumb: always use Unicode internally. Decode what you receive, and encode what you send.

    # -*- coding: utf-8 -*-
    print u"åäö".encode('utf-8')

…
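The decode-on-receive, encode-on-send rule can be sketched in Python 3, where `str` is already Unicode and the encoding happens at the stream boundary. This is a minimal illustration, not the original answer's code: `write_text` is a hypothetical helper, and an in-memory `io.BytesIO` stands in for a pipe.

```python
import io


def write_text(binary_stream, text, encoding="utf-8"):
    """Encode `text` explicitly while writing to a binary stream.

    `write_text` is a hypothetical helper used for illustration only.
    """
    wrapper = io.TextIOWrapper(binary_stream, encoding=encoding, newline="\n")
    wrapper.write(text + "\n")
    wrapper.flush()
    wrapper.detach()  # keep the underlying binary stream usable


# An in-memory buffer stands in for the pipe (assumption for the demo).
buffer = io.BytesIO()
write_text(buffer, "åäö")

encoded = buffer.getvalue()        # bytes, as a consumer of the pipe sees them
decoded = encoded.decode("utf-8")  # decode what you receive
```

For the real standard output, Python 3.7 and later offer `sys.stdout.reconfigure(encoding="utf-8")`, and the `PYTHONIOENCODING` environment variable sets the encoding from outside the program.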

What is the best way to remove accents (normalize) in a Python unicode string?

Unidecode is the correct answer for this. It transliterates any Unicode string into the closest possible representation in ASCII text. Example:

    import unidecode

    accented_string = u'Málaga'
    # accented_string is of type 'unicode'
    unaccented_string = unidecode.unidecode(accented_string)
    # unaccented_string contains 'Malaga' and is of type 'str'
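When installing a third-party package is not an option, a narrower standard-library approach is to decompose each character and drop the combining marks. This is a different technique from Unidecode, not a replacement for it: it only removes accents and does not transliterate. `strip_accents` is a hypothetical helper name, and the sketch assumes Python 3.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks after NFKD decomposition.

    Hypothetical helper: unlike Unidecode, this does not transliterate
    characters that have no decomposition (for example 'ø').
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


unaccented = strip_accents("Málaga")
```

Characters without a canonical decomposition pass through unchanged, which is where a transliteration library such as Unidecode does more.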

How does Python 2 compare string and int? Why do lists compare as greater than numbers, and tuples greater than lists?

From the Python 2 manual:

CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.

When you order two strings or two numeric types, the ordering is done in the expected way (lexicographic ordering for …
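Python 3 raises `TypeError` for these mixed comparisons, so the rule cannot be run directly there. The following is a rough emulation, written for Python 3, of the type-name ordering described above. It assumes that numbers sort before everything else (CPython 2 treats their type name as empty) and ignores other special cases such as `None`. Both `py2_type_name` and `sort_mixed` are hypothetical helpers.

```python
from numbers import Number


def py2_type_name(obj):
    """Type name used for ordering; numbers share an empty name (assumption
    mirroring CPython 2, where numbers sort before other types)."""
    return "" if isinstance(obj, Number) else type(obj).__name__


def sort_mixed(items):
    """Sort by type name first, then by value within the same type.

    Rough emulation only: None, nested mixed containers, and user-defined
    classes are not handled the way CPython 2 handled them.
    """
    return sorted(items, key=lambda obj: (py2_type_name(obj), obj))


ordered = sort_mixed([(1, 2), "a", [0], 3, 2.5])
# ordered == [2.5, 3, [0], 'a', (1, 2)]
```

The result reflects the alphabetical order of the type names: `list` before `str` before `tuple`, with numbers first.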