How does unicodedata.normalize(form, unistr) work?
I find the documentation pretty clear, but here are a few code examples:

    from unicodedata import normalize

    print('%r' % normalize('NFD', u'\u00C7'))   # decompose: convert Ç to "C + ̧"
    print('%r' % normalize('NFC', u'C\u0327'))  # compose: convert "C + ̧" to Ç

Both 'D' (= decompose) forms convert a single combined character (like ä) into …
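To make the composed versus decomposed distinction concrete, here is a minimal sketch using ä. The variable names are illustrative only, and the ligature line is an added example (not from the answer above) of how the compatibility form NFKD can differ from NFD.

```python
from unicodedata import normalize

composed = u'\u00E4'     # ä as a single precomposed code point
decomposed = u'a\u0308'  # "a" followed by COMBINING DIAERESIS (U+0308)

# NFD splits the precomposed character into base letter + combining mark.
nfd = normalize('NFD', composed)
# NFC recombines the pair into the single precomposed character.
nfc = normalize('NFC', decomposed)

print(len(composed), len(nfd))  # one code point versus two
print(nfd == decomposed)        # True
print(nfc == composed)          # True

# Added illustration: the "fi" ligature (U+FB01) has only a compatibility
# decomposition, so NFD leaves it alone while NFKD expands it to "fi".
ligature = u'\uFB01'
nfd_ligature = normalize('NFD', ligature)
nfkd_ligature = normalize('NFKD', ligature)
print(nfd_ligature == ligature)  # True
print(nfkd_ligature == u'fi')    # True
```

Because the two spellings of ä look identical on screen but compare unequal as strings, normalizing both sides to the same form before comparing is the usual way to avoid surprises.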