“Unmappable character for encoding UTF-8” error

Your source file has an encoding problem. It is probably saved as ISO-8859-1, but the compiler is set to read it as UTF-8. This results in errors for any character whose byte representation differs between UTF-8 and ISO-8859-1. That is the case for every character outside ASCII, for example ¬ (NOT SIGN).
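To make the byte difference concrete, here is a small sketch that prints the bytes of ¬ in both encodings. The class name `EncodingBytes` and the helper `hex` are made up for this illustration; only the standard `String.getBytes(Charset)` call is doing the real work.

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes {
    // Hypothetical helper: format a byte array as space-separated uppercase hex.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // NOT SIGN written as a Unicode escape, so the encoding of this file does not matter.
        String notSign = "\u00AC";
        // One byte in ISO-8859-1: prints "ISO-8859-1: AC"
        System.out.println("ISO-8859-1: " + hex(notSign.getBytes(StandardCharsets.ISO_8859_1)));
        // Two bytes in UTF-8: prints "UTF-8: C2 AC"
        System.out.println("UTF-8: " + hex(notSign.getBytes(StandardCharsets.UTF_8)));
        // An ASCII character is the same single byte in both: prints "ASCII a: 61 / 61"
        System.out.println("ASCII a: " + hex("a".getBytes(StandardCharsets.ISO_8859_1))
                + " / " + hex("a".getBytes(StandardCharsets.UTF_8)));
    }
}
```

A lone byte `AC` is not a valid UTF-8 sequence, which is why a UTF-8 decoder cannot map it to a character.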

You can simulate this with the following program. It takes your line of source code, encodes it to an ISO-8859-1 byte array, and then decodes those bytes “wrongly” as UTF-8. You can see at which position the line gets corrupted. I added 2 spaces in front of your source line so that position 74 lands on ¬ (NOT SIGN), the only character in the line whose bytes differ between ISO-8859-1 and UTF-8. I guess this matches the indentation in the real source file.

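 import java.nio.charset.StandardCharsets;

 // The original source line as a string, padded so that ¬ sits at index 74.
 String reg = "      String reg = \"^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$\";";
 // Encode as ISO-8859-1, then decode the same bytes "wrongly" as UTF-8.
 String corrupt = new String(reg.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
 System.out.println(corrupt + ": " + corrupt.charAt(74));
 System.out.println(reg + ": " + reg.charAt(74));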

which results in the following output:

       String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=�.,-])(?=[^\s]+$).{8,24}$";: �
       String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=¬.,-])(?=[^\s]+$).{8,24}$";: ¬
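If you would rather get the offending position directly instead of spotting the replacement character by eye, a strict decoder can report it. This is only a sketch of the idea, not what the compiler does internally; the class `FindBadByte` and the method `firstInvalidUtf8Offset` are names invented here. It relies on `CharsetDecoder` with `CodingErrorAction.REPORT`, which stops at the first malformed byte instead of silently replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FindBadByte {
    // Hypothetical helper: offset of the first byte that is not valid UTF-8,
    // or -1 if the whole array decodes cleanly.
    public static int firstInvalidUtf8Offset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never decodes to more chars than it has bytes, so this buffer cannot overflow.
        CharBuffer out = CharBuffer.allocate(bytes.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On an error result the input position is left at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        // Same line as in the answer, with NOT SIGN written as an escape.
        String line = "      String reg = \"^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=\u00AC.,-])(?=[^\\s]+$).{8,24}$\";";
        byte[] latin1 = line.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(firstInvalidUtf8Offset(latin1)); // prints 74
    }
}
```

The byte offset equals the character index here only because everything before ¬ is ASCII, one byte per character.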

See “live” at https://ideone.com/ShZnB

To fix this, save the source files with UTF-8 encoding.
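Most editors and IDEs can re-save a file in a different encoding. If you prefer to do it in code, the following sketch re-encodes one file in place. It assumes the file really is ISO-8859-1; running it on a file that is already UTF-8 would corrupt the non-ASCII characters, so keep a backup. The class `ReencodeSource` and the method `latin1ToUtf8` are illustrative names, not part of any library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReencodeSource {
    // Hypothetical helper: read the file as ISO-8859-1 and write it back as UTF-8.
    public static void latin1ToUtf8(Path file) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        // Every byte value is a valid ISO-8859-1 character, so this decode cannot fail.
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ReencodeSource path/to/File.java
        latin1ToUtf8(Paths.get(args[0]));
    }
}
```

Alternatively, you can leave the file as it is and tell the compiler its actual encoding with `javac -encoding ISO-8859-1`.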
