String Deduplication feature of Java 8

@assylias answer basiclly tells you how it work and is very good answer. I have tested a production application with String Deduplication and have some results. The web app heavily uses Strings so i think the advantage is pretty clear.

To enable String Deduplication you have to add these JVM params (you need at least Java 8u20):

-XX:+UseG1GC -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics

The last one is optional but like the name says it shows you the String Deduplication statistics. Here are mine:

[GC concurrent-string-deduplication, 2893.3K->2672.0B(2890.7K), avg 97.3%, 0.0175148 secs]
   [Last Exec: 0.0175148 secs, Idle: 3.2029081 secs, Blocked: 0/0.0000000 secs]
      [Inspected:           96613]
         [Skipped:              0(  0.0%)]
         [Hashed:           96598(100.0%)]
         [Known:                2(  0.0%)]
         [New:              96611(100.0%)   2893.3K]
      [Deduplicated:        96536( 99.9%)   2890.7K( 99.9%)]
         [Young:                0(  0.0%)      0.0B(  0.0%)]
         [Old:              96536(100.0%)   2890.7K(100.0%)]
   [Total Exec: 452/7.6109490 secs, Idle: 452/776.3032184 secs, Blocked: 11/0.0258406 secs]
      [Inspected:        27108398]
         [Skipped:              0(  0.0%)]
         [Hashed:        26828486( 99.0%)]
         [Known:            19025(  0.1%)]
         [New:           27089373( 99.9%)    823.9M]
      [Deduplicated:     26853964( 99.1%)    801.6M( 97.3%)]
         [Young:             4732(  0.0%)    171.3K(  0.0%)]
         [Old:           26849232(100.0%)    801.4M(100.0%)]
   [Table]
      [Memory Usage: 2834.7K]
      [Size: 65536, Min: 1024, Max: 16777216]
      [Entries: 98687, Load: 150.6%, Cached: 415, Added: 252375, Removed: 153688]
      [Resize Count: 6, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
      [Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
      [Age Threshold: 3]
   [Queue]
      [Dropped: 0]

These are the results after running the app for 10 minutes. As you can see String Deduplication was executed 452 times and “deduplicated” 801.6 MB Strings. String Deduplication inspected 27 000 000 Strings. When i compared my memory consumption from Java 7 with the standard Parallel GC to Java 8u20 with the G1 GC and enabled String Deduplication the heap dropped approximatley 50%:

Java 7 Parallel GC

Java 7 Parallel GC

Java 8 G1 GC with String Deduplication

Java 8 G1 GC with String Deduplication

Leave a Comment