>>> u'aあä'.encode('ascii', 'ignore')
'a'
Decode the string you get back, using either the charset in the the appropriate meta
tag in the response or in the Content-Type
header, then encode.
The method encode(encoding, errors)
accepts custom handlers for errors. The default values, besides ignore
, are:
>>> u'aあä'.encode('ascii', 'replace')
b'a??'
>>> u'aあä'.encode('ascii', 'xmlcharrefreplace')
b'aあä'
>>> u'aあä'.encode('ascii', 'backslashreplace')
b'a\\u3042\\xe4'
See https://docs.python.org/3/library/stdtypes.html#str.encode