The timing of string literals being loaded into the StringTable in the Java HotSpot VM

There is a corner case that makes it possible to check, from within a Java application, whether a string existed in the pool prior to the test, but the check can be done only once per string. Combined with string literals of the same content, it can be used to detect the lazy loading:

public class Test {
    public static void main(String[] args) {
        test('h', 'e', 'l', 'l', 'o');
        test('m', 'a', 'i', 'n');
    }
    static void test(char... arg) {
        String s1 = new String(arg), s2 = s1.intern();
        System.out.println('"'+s1+'"'
            +(s1!=s2? " existed": " did not exist")+" in the pool before");
        System.out.println("is the same as \"hello\": "+(s2=="hello"));
        System.out.println("is the same as \"main\": "+(s2=="main"));
        System.out.println();
    }
}

The test first creates a new string instance which does not exist in the pool. Then it calls intern() on it and compares the references. There are three possible scenarios:

  1. If a string with the same contents already exists in the pool, that string is returned. It must be a different object than our string, which is not in the pool.

  2. Our string is added to the pool and returned. In this case, the two references are identical.

  3. A new string with the same contents will be created and added to the pool. Then, the returned reference will be different.

We can’t distinguish between 1 and 3, so if a JVM generally adds new strings to the pool in intern(), we are out of luck. But if it adds the instance we’re calling intern() on, we can identify scenario 2 and know for sure that the string wasn’t in the pool, but has been added as a side effect of our test.
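The "only once per string" limitation follows from scenario 2: after the first probe the string is in the pool, so any later probe of the same contents ends up in scenario 1. Here is a minimal sketch of that, assuming a JVM that pools the very instance intern() is called on (as current HotSpot versions do). The names InternProbe, wasAbsent and PINNED are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InternProbe {
    // Keeps every probed string reachable, so its pool entry cannot be
    // garbage collected between two probes of the same contents.
    private static final List<String> PINNED = new ArrayList<>();

    /**
     * Returns true if intern() handed back the very instance we created,
     * i.e. no equal string was in the pool before this call (scenario 2).
     * Side effect: a string with these contents is in the pool afterwards.
     */
    static boolean wasAbsent(char... chars) {
        String fresh = new String(chars);   // a new object, never the pooled one
        String pooled = fresh.intern();
        PINNED.add(pooled);
        return pooled == fresh;             // identity comparison on purpose
    }

    public static void main(String[] args) {
        // Built at run time, so no literal with these contents exists anywhere.
        char[] probe = ("probe-" + System.nanoTime()).toCharArray();
        System.out.println("first probe:  absent = " + wasAbsent(probe));
        System.out.println("second probe: absent = " + wasAbsent(probe));
    }
}
```

The second probe reports false regardless of the scenario, because an equal string is pooled by then. Only the first result says anything about the state of the pool before the test, and only on a JVM that behaves as in scenario 2.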

On my machine, it prints:

"hello" did not exist in the pool before
is the same as "hello": true
is the same as "main": false

"main" existed in the pool before
is the same as "hello": false
is the same as "main": true

The same output appears on Ideone.

This shows that "hello" did not exist when the test method was entered the first time, even though there is a string literal "hello" later in the code. So this proves that the string literal is resolved lazily. Since we already added a "hello" string manually, the string literal with the same contents resolves to that same instance.

In contrast, the "main" string already exists in the pool, which is easy to explain. The Java launcher searches for the main method to execute and, as a side effect, adds that string to the pool.

If we swap the order of the tests to test('m', 'a', 'i', 'n'); test('h', 'e', 'l', 'l', 'o'); the "hello" string literal is used in the first test invocation and remains in the pool, so when we test it in the second invocation, the string already exists.
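The swapped order can be sketched as follows. SwappedOrder and existedBefore are illustrative names, not part of the original test. The method returns its result in addition to printing it, so the two invocations can be compared directly.

```java
class SwappedOrder {
    /** Returns true if an equal string was already in the pool before this call. */
    static boolean existedBefore(char... arg) {
        String s1 = new String(arg), s2 = s1.intern();
        boolean existed = s1 != s2;
        // Evaluating this comparison resolves the "hello" literal of this class.
        // From then on, a "hello" string stays in the pool.
        boolean sameAsHello = s2 == "hello";
        System.out.println('"' + s1 + '"' + (existed ? " existed" : " did not exist")
            + " in the pool before; is the same as \"hello\": " + sameAsHello);
        return existed;
    }

    public static void main(String[] args) {
        existedBefore('m', 'a', 'i', 'n');       // first invocation resolves the literal
        existedBefore('h', 'e', 'l', 'l', 'o');  // so "hello" is reported as existing
    }
}
```

Unlike the original order, the result for "hello" here does not depend on lazy resolution: by the time the second invocation runs, the literal has been evaluated, so a pooled "hello" must exist.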
