How to show non-ASCII characters using Python [closed]

gs.translate('this is a pen','bn')

produces a Unicode string. If you just type gs.translate('this is a pen','bn') into the interactive interpreter, it prints the representation of that string, which is

u'\u098f\u0987 \u098f\u0995\u099f\u09bf \u0995\u09b2\u09ae'.
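Those escapes and the readable text are the same eleven code points, just displayed differently. As a rough sketch that does not depend on how a particular interpreter echoes values, the standard unicode_escape codec can produce the escaped form explicitly (the variable names here are only illustrative):

```python
uni = u'\u098f\u0987 \u098f\u0995\u099f\u09bf \u0995\u09b2\u09ae'

# unicode_escape yields ASCII bytes holding the \uXXXX escapes; decoding
# them as ASCII gives a text string that is safe to print on any terminal.
escaped = uni.encode('unicode_escape').decode('ascii')

print(escaped)   # the backslash-escaped form of the text
print(len(uni))  # number of code points, counting the two spaces
```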

But when you type print(gs.translate('this is a pen','bn')), the Unicode data is encoded into a stream of bytes so that it can be printed. The encoding used is that of standard output (sys.stdout.encoding), which appears to be UTF-8 here.
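To check which encoding print would pick on a given machine, something like the following sketch may help. It assumes the encoding is exposed as sys.stdout.encoding and falls back to UTF-8 when that attribute is missing or None; the fallback is a choice made for this example, not something print itself does.

```python
import sys

uni = u'\u098f\u0987 \u098f\u0995\u099f\u09bf \u0995\u09b2\u09ae'

# sys.stdout.encoding can be None (for example, in Python 2 when output
# is piped), so fall back to UTF-8 as an assumed default.
encoding = getattr(sys.stdout, 'encoding', None) or 'utf-8'

# 'replace' keeps this sketch from raising UnicodeEncodeError if the
# terminal encoding cannot represent Bengali characters.
data = uni.encode(encoding, 'replace')

print(encoding)   # the encoding name reported for standard output
print(len(data))  # number of bytes produced for this terminal
```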

You can perform that encoding explicitly:

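uni = u'\u098f\u0987 \u098f\u0995\u099f\u09bf \u0995\u09b2\u09ae'
# Encode the Unicode string into a UTF-8 byte string.
s = uni.encode('utf-8')
# In Python 2, print writes these bytes to the terminal unchanged.
print(s)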

Output:

এই একটি কলম

Note that the representation of s is the following byte string:

'\xe0\xa6\x8f\xe0\xa6\x87 \xe0\xa6\x8f\xe0\xa6\x95\xe0\xa6\x9f\xe0\xa6\xbf \xe0\xa6\x95\xe0\xa6\xb2\xe0\xa6\xae'

so that’s what would get printed in the interactive interpreter if you typed s at the prompt.
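To see how the text maps onto those bytes, here is a small sketch: each Bengali code point takes three bytes in UTF-8 and each space takes one, and decoding the bytes recovers the original string.

```python
uni = u'\u098f\u0987 \u098f\u0995\u099f\u09bf \u0995\u09b2\u09ae'
s = uni.encode('utf-8')

# 9 Bengali code points at 3 bytes each, plus 2 single-byte spaces.
byte_count = len(s)

# The first three bytes are the UTF-8 encoding of U+098F.
first_char_bytes = s[:3]

# Decoding with the same codec gives back the original Unicode string.
round_trip = s.decode('utf-8')

print(byte_count)         # 29
print(round_trip == uni)  # True
```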

You can’t get the interpreter to print এই একটি কলম simply by typing a variable name or simple expression, since the Python 2 interpreter always shows the representation of the variable or expression. So if you want to see the actual Bengali text in the interactive interpreter, you need to use print (or sys.stdout.write) to tell it to print the UTF-8 encoded data.
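The sys.stdout.write route could look roughly like the sketch below. The helper name write_utf8 and its stream parameter are hypothetical, introduced only for this example; the code writes to an in-memory buffer so the result can be inspected without relying on the terminal.

```python
import io
import sys


def write_utf8(text, stream=None):
    """Encode text as UTF-8 and write the raw bytes to a byte stream."""
    if stream is None:
        # Python 3 exposes the underlying byte stream as sys.stdout.buffer;
        # Python 2's sys.stdout accepts byte strings directly.
        stream = getattr(sys.stdout, 'buffer', sys.stdout)
    data = text.encode('utf-8') + b'\n'
    stream.write(data)
    return len(data)


uni = u'\u098f\u0987 \u098f\u0995\u099f\u09bf \u0995\u09b2\u09ae'

# Capture the bytes in memory instead of sending them to the terminal.
captured = io.BytesIO()
written = write_utf8(uni, captured)

print(written)  # 30: the 29 encoded bytes plus the newline
```

Calling write_utf8(uni) with no stream argument sends the bytes to standard output. Whether they render as Bengali depends on the terminal being set to UTF-8 and having a suitable font.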
