How to make Django slugify work properly with Unicode strings?

There is a Python package called unidecode that I've adopted for the askbot Q&A forum. It works well for Latin-based alphabets and even looks reasonable for Greek:

>>> from unidecode import unidecode
>>> unidecode(u'διακριτικός')
'diakritikos'

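With Asian languages the result is less obvious. Chinese characters, for example, are romanized one at a time, and the output ends with a trailing space: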

>>> unidecode(u'影師嗎')
'Ying Shi Ma '

Does this make sense?

In askbot we compute slugs like so:

from unidecode import unidecode
from django.template import defaultfilters
slug = defaultfilters.slugify(unidecode(input_text))
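
To illustrate why the transliteration step matters, and why the trailing space in the unidecode output is harmless, here is a rough standard-library sketch of the ASCII-only slug step. The function approx_slugify is a hypothetical stand-in written for this example. It only approximates what Django's slugify does and is not Django's code. The first two inputs are the unidecode outputs shown above.

```python
import re
import unicodedata


def approx_slugify(text):
    """Hypothetical, simplified stand-in for an ASCII-only slugify step."""
    # Decompose accented characters, then drop anything that is still non-ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase and keep only word characters, whitespace and hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_text.lower())
    # Collapse runs of whitespace and hyphens into one hyphen, then trim the ends.
    return re.sub(r"[-\s]+", "-", cleaned).strip("-_")


# Already transliterated text survives; the trailing space is trimmed away.
print(approx_slugify("diakritikos"))   # diakritikos
print(approx_slugify("Ying Shi Ma "))  # ying-shi-ma

# Without transliteration, non-Latin input is dropped entirely.
print(repr(approx_slugify("影師嗎")))   # ''
```

Under these assumptions, untransliterated Greek or Chinese input collapses to an empty slug, which is the problem that running unidecode first avoids.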
