Difference between compact strings and compressed strings in Java 9

Compressed strings (Java 6) and compact strings (Java 9) share a motivation (strings often contain only Latin-1 characters, so half the space of a char[] is wasted) and a goal (make those strings smaller), but their implementations differ considerably.

Compressed Strings

In an interview Aleksey Shipilëv (who was in charge of implementing the Java 9 feature) had this to say about compressed strings:

UseCompressedStrings feature was rather conservative: while distinguishing between char[] and byte[] case, and trying to compress the char[] into byte[] on String construction, it done most String operations on char[], which required to unpack the String. Therefore, it benefited only a special type of workloads, where most strings are compressible (so compression does not go to waste), and only a limited amount of known String operations are performed on them (so no unpacking is needed). In great many workloads, enabling -XX:+UseCompressedStrings was a pessimization.

[…] UseCompressedStrings implementation was basically an optional feature that maintained a completely distinct String implementation in alt-rt.jar, which was loaded once the VM option is supplied. Optional features are harder to test, since they double the number of option combinations to try.

Compact Strings

In Java 9, on the other hand, compact strings are fully integrated into the JDK source. A String is always backed by a byte[]: if all of its characters are Latin-1, each one takes a single byte; otherwise every character takes two. Most operations check which is the case, e.g. charAt:

public char charAt(int index) {
    if (isLatin1()) {
        return StringLatin1.charAt(value, index);
    } else {
        return StringUTF16.charAt(value, index);
    }
}
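
To make the one-byte-or-two decision more concrete, here is a minimal sketch of how a string could be packed into a byte[] at construction time. The class CompactEncodingSketch and its methods coderFor and encode are hypothetical names made up for illustration, and the fixed high-byte-first order is an assumption; this is not the JDK's actual code.

```java
/** Hypothetical, simplified model of compact string storage; not the JDK's String. */
final class CompactEncodingSketch {

    // Illustrative markers for the two storage widths (assumed names).
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Returns LATIN1 if every char fits into one byte, otherwise UTF16. */
    static byte coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    /** Encodes the whole string with a single width: one byte per char or two. */
    static byte[] encode(String s) {
        if (coderFor(s) == LATIN1) {
            byte[] value = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                value[i] = (byte) s.charAt(i);
            }
            return value;
        }
        byte[] value = new byte[s.length() * 2];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            value[2 * i] = (byte) (c >> 8); // high byte first (assumed order)
            value[2 * i + 1] = (byte) c;    // low byte second
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(encode("hello").length);      // 5: pure ASCII
        System.out.println(encode("h\u00e9llo").length); // 5: 'é' is still Latin-1
        System.out.println(encode("h\u20acllo").length); // 10: '€' forces two bytes per char
    }
}
```

Note that a single non-Latin-1 character doubles the storage for the entire string, not just for that character.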

Compact strings are enabled by default and can be partially disabled (with -XX:-CompactStrings) – “partially” because strings are then still backed by a byte[], and operations returning chars must still assemble them from two separate bytes (because of intrinsics, it is hard to say whether this has a performance impact).
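
To illustrate what assembling a char from two separate bytes involves, here is a small sketch. The class Utf16BytesSketch and its helpers putChar and getChar are hypothetical, and the fixed high-byte-first layout is an assumption chosen for readability; the JDK's own code may order the bytes differently and is typically replaced by intrinsics.

```java
/** Hypothetical helpers for a two-bytes-per-char layout; not the JDK's implementation. */
final class Utf16BytesSketch {

    /** Stores c as two bytes at char position index (high byte first, an assumed order). */
    static void putChar(byte[] value, int index, char c) {
        value[2 * index] = (byte) (c >> 8);
        value[2 * index + 1] = (byte) c;
    }

    /** Reassembles the char at char position index from its two bytes. */
    static char getChar(byte[] value, int index) {
        int hi = value[2 * index] & 0xFF;     // mask to undo sign extension
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        byte[] value = new byte[4]; // room for two chars
        putChar(value, 0, '\u20ac'); // '€'
        putChar(value, 1, 'A');
        System.out.println(getChar(value, 0) == '\u20ac'); // true
        System.out.println(getChar(value, 1) == 'A');      // true
    }
}
```

The masking with 0xFF matters because Java bytes are signed; without it, a byte such as 0xAC would be sign-extended and corrupt the reassembled char.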

More

If you’re interested in more background on compact strings, I recommend reading the interview I linked to above and/or watching this great talk by the same Aleksey Shipilëv (which also explains the new string concatenation).
