Converting UTF-8 to ISO-8859-1 in Java – how to keep it as single byte

If you’re dealing with character encodings other than UTF-16, you shouldn’t be using java.lang.String or the char primitive; you should only be using byte[] arrays or ByteBuffer objects. Then, you can use java.nio.charset.Charset to convert between encodings:

    Charset utf8charset = Charset.forName("UTF-8");
    Charset iso88591charset = Charset.forName("ISO-8859-1");
    ByteBuffer inputBuffer = ByteBuffer.wrap(new byte[]{(byte)0xC3, (byte)0xA2});
    // decode UTF-8 …
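The excerpt stops before the conversion itself. As a minimal sketch of how the byte-level approach could be carried through (not necessarily the original answer's code), the helper below decodes UTF-8 bytes into a CharBuffer and re-encodes them as ISO-8859-1. The class name Utf8ToLatin1 and the method utf8ToLatin1 are hypothetical; only the java.nio.charset calls are standard API. Note that Charset.decode and Charset.encode silently substitute malformed or unmappable input rather than reporting it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8ToLatin1 {

    // Hypothetical helper: transcodes UTF-8 bytes to ISO-8859-1 bytes.
    public static byte[] utf8ToLatin1(byte[] utf8Bytes) {
        Charset utf8 = StandardCharsets.UTF_8;
        Charset latin1 = StandardCharsets.ISO_8859_1;

        // Decode the UTF-8 bytes into UTF-16 chars.
        CharBuffer chars = utf8.decode(ByteBuffer.wrap(utf8Bytes));

        // Encode those chars as ISO-8859-1, one byte per character.
        ByteBuffer encoded = latin1.encode(chars);

        // Copy only the bytes that were actually written.
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        return result;
    }

    public static void main(String[] args) {
        // 0xC3 0xA2 is the UTF-8 encoding of U+00E2 (a with circumflex).
        byte[] latin1 = utf8ToLatin1(new byte[]{(byte) 0xC3, (byte) 0xA2});
        System.out.println(latin1.length);               // prints 1
        System.out.printf("0x%02X%n", latin1[0] & 0xFF); // prints 0xE2
    }
}
```

The two-byte UTF-8 sequence collapses to the single byte 0xE2, which is what "keeping it as single byte" amounts to for characters in the ISO-8859-1 range.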

How do I convert between ISO-8859-1 and UTF-8 in Java?

In general, you can’t do this. UTF-8 is capable of encoding any Unicode code point. ISO-8859-1 can handle only a tiny fraction of them. So, transcoding from ISO-8859-1 to UTF-8 is no problem. Going backwards from UTF-8 to ISO-8859-1 will cause “replacement characters” (�) to appear in your text when unsupported characters are found. To …
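The asymmetry described above can be sketched in Java as follows. The class Latin1RoundTrip and its method names are hypothetical; the calls themselves are standard java.nio.charset API. One detail specific to Java: when String.getBytes meets a character that ISO-8859-1 cannot represent, the substitute it writes is the byte for "?", so the lossy direction shows up as question marks. A CharsetEncoder set to CodingErrorAction.REPORT throws instead, which is one way to detect the loss rather than hide it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {

    // ISO-8859-1 -> UTF-8: always safe, every Latin-1 byte maps to a code point.
    public static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // UTF-8 -> ISO-8859-1, lenient: unmappable characters become '?'.
    public static byte[] utf8ToLatin1Lenient(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // UTF-8 -> ISO-8859-1, strict: throws if any character cannot be encoded.
    public static byte[] utf8ToLatin1Strict(byte[] utf8Bytes)
            throws CharacterCodingException {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer encoded = encoder.encode(CharBuffer.wrap(text));
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result);
        return result;
    }

    public static void main(String[] args) {
        // U+20AC (euro sign) is E2 82 AC in UTF-8 and has no ISO-8859-1 byte.
        byte[] euroUtf8 = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC};

        // The lenient path silently substitutes '?' (byte value 63).
        System.out.println(Arrays.toString(utf8ToLatin1Lenient(euroUtf8))); // prints [63]

        // The strict path reports the problem instead.
        try {
            utf8ToLatin1Strict(euroUtf8);
        } catch (CharacterCodingException e) {
            System.out.println("not representable in ISO-8859-1");
        }
    }
}
```

Whether to substitute or to fail is a design choice: the lenient path suits display-only output, while the strict path is safer when the data will be stored or compared later.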

HTML encoding issues – “Â” character showing up instead of “&nbsp;”

“Somewhere in that mess, the non-breaking spaces from the HTML template (the &nbsp;s) are encoding as ISO-8859-1 so that they show up incorrectly as an “Â” character.” That’d be encoding to UTF-8 then, not ISO-8859-1. The non-breaking space character is byte 0xA0 in ISO-8859-1; when encoded to UTF-8 it’d be 0xC2,0xA0, which, if you (incorrectly) …
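The byte arithmetic here is easy to reproduce. The sketch below (class and method names are hypothetical) encodes a non-breaking space as UTF-8 and then deliberately decodes those bytes with the wrong charset, ISO-8859-1. The result is two characters: U+00C2 (Â) followed by the non-breaking space itself.

```java
import java.nio.charset.StandardCharsets;

public class NbspMojibake {

    // Hypothetical helper: encodes text as UTF-8, then wrongly decodes the
    // bytes as ISO-8859-1, simulating the mismatch described above.
    public static String misdecodeUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0"; // non-breaking space

        // In UTF-8 the character takes two bytes: 0xC2 0xA0.
        byte[] utf8 = nbsp.getBytes(StandardCharsets.UTF_8);
        System.out.printf("%02X %02X%n", utf8[0] & 0xFF, utf8[1] & 0xFF); // prints C2 A0

        // Read back as ISO-8859-1, each byte becomes its own character.
        String wrong = misdecodeUtf8AsLatin1(nbsp);
        System.out.println(wrong.length());                      // prints 2
        System.out.printf("%04X %04X%n",
                (int) wrong.charAt(0), (int) wrong.charAt(1));   // prints 00C2 00A0
    }
}
```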