What is the character encoding of String in Java?

  1. Java stores strings as UTF-16 internally.

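  2. “Default encoding” isn’t quite the right term. Through its API, a Java String is always a sequence of UTF-16 code units. What varies is the encoding used externally, the “system default encoding”, which applies when you convert between strings and bytes without naming a charset. It differs from platform to platform and can even be altered by things like environment variables on some platforms (since JDK 18 it is UTF-8 unless overridden).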

    ASCII is a subset of Latin-1, which is a subset of Unicode, and UTF-16 is one way of encoding Unicode. So if you run your int i = 'x' test on any character in the ASCII range, you’ll get its ASCII value. UTF-16 can represent far more characters than ASCII, however.

  3. From the java.lang.Character docs:

    The Java 2 platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes.

    So the use of UTF-16 for these classes is defined as part of the Java 2 platform itself.
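
The points made in the answers above can be checked with a short program. This is only a minimal sketch: the class name StringEncodingDemo and the helpers codeUnit, encodedLength and codePoints are made up for illustration, while the library calls (String.getBytes(Charset), String.codePointCount, Character.toChars, Charset.defaultCharset) are standard JDK methods. It assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StringEncodingDemo {

    // A char is a UTF-16 code unit; widening it to int gives its numeric value.
    static int codeUnit(char c) {
        return c;
    }

    // Number of bytes produced when the string is encoded with the given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    // Number of Unicode code points, which can be smaller than length().
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Inside the ASCII range the UTF-16 code unit equals the ASCII value.
        System.out.println(codeUnit('x')); // 120

        // U+00E9 (e with acute accent) is in Latin-1 but not in ASCII.
        String eAcute = "\u00e9";
        System.out.println(encodedLength(eAcute, StandardCharsets.ISO_8859_1)); // 1
        System.out.println(encodedLength(eAcute, StandardCharsets.UTF_8));      // 2
        System.out.println(encodedLength(eAcute, StandardCharsets.UTF_16BE));   // 2

        // U+1D11E (musical G clef) lies outside the Basic Multilingual Plane,
        // so UTF-16 needs a surrogate pair: two chars for one code point.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(clef.length());    // 2
        System.out.println(codePoints(clef)); // 1

        // The external default depends on the platform and JDK version,
        // so no expected output is given for this line.
        System.out.println(Charset.defaultCharset());
    }
}
```

The in-memory view (chars as UTF-16 code units) is the same everywhere, whereas the byte counts depend entirely on which charset you ask for. Passing an explicit Charset to getBytes, as done here, avoids relying on the system default.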
