Problems converting byte array to string and back to byte array

It is not a good idea to store encrypted data in Strings because they are for human-readable text, not for arbitrary binary data. For binary data it’s best to use byte[].

However, if you must do it you should use an encoding that has a 1-to-1 mapping between bytes and characters, that is, where every byte sequence can be mapped to a unique sequence of characters, and back. One such encoding is ISO-8859-1, that is:

    String decoded = new String(encryptedByteArray, "ISO-8859-1");
    System.out.println("decoded:" + decoded);

    byte[] encoded = decoded.getBytes("ISO-8859-1"); 
    System.out.println("encoded:" + java.util.Arrays.toString(encoded));

    String decryptedText = encrypter.decrypt(encoded);

Other common encodings that don’t lose data are hexadecimal and base64, but sadly you need a helper library for them. The standard API doesn’t define classes for them.

With UTF-16 the program would fail for two reasons:

  1. String.getBytes(“UTF-16”) adds a byte-order-marker character to the output to identify the order of the bytes. You should use UTF-16LE or UTF-16BE for this to not happen.
  2. Not all sequences of bytes can be mapped to characters in UTF-16. First, text encoded in UTF-16 must have an even number of bytes. Second, UTF-16 has a mechanism for encoding unicode characters beyond U+FFFF. This means that e.g. there are sequences of 4 bytes that map to only one unicode character. For this to be possible the first 2 bytes of the 4 don’t encode any character in UTF-16.

Leave a Comment