For a detailed explanation of compressed oops, see the “Compressed oops in the Hotspot JVM” article by John Rose @ Oracle.
The TL;DR version is:
- on modern computer architectures, memory addresses are byte addresses,
- Java object references are addresses that point to the start of a word [1],
- on a 64-bit machine, word alignment means that the bottom 3 bits of an object reference / address are zero [2],
- so, by shifting an address 3 bits to the right, we can “compress” up to 35 bits of a 64-bit address into a 32-bit word,
- and, decompression can be done by shifting 3 bits to the left, which puts those 3 zero bits back,
- 35 bits of addressing allows us to represent object pointers for up to 32 GB of heap memory using compressed oops that fit in 32-bit (half-)words on a 64-bit machine.
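The shift-based scheme in the list above can be sketched in a few lines of Java. This is only an illustration, not how HotSpot is implemented: the class and method names are hypothetical, and the sketch treats an address as a plain offset from zero and assumes the default 8-byte alignment.

```java
// Illustrative sketch only; names are hypothetical and this is not HotSpot code.
class CompressedOopSketch {
    // Assumption: 8-byte object alignment, so the low 3 bits of an address are zero.
    static final int SHIFT = 3;

    // Compress an 8-byte-aligned address of at most 35 bits into 32 bits.
    static int compress(long address) {
        if ((address & ((1L << SHIFT) - 1)) != 0) {
            throw new IllegalArgumentException("address is not 8-byte aligned");
        }
        if ((address >>> (32 + SHIFT)) != 0) {
            throw new IllegalArgumentException("address needs more than 35 bits");
        }
        return (int) (address >>> SHIFT);
    }

    // Decompress by treating the 32 bits as unsigned and restoring the 3 zero bits.
    static long decompress(int compressed) {
        return (compressed & 0xFFFF_FFFFL) << SHIFT;
    }

    public static void main(String[] args) {
        long address = 0x7_FFFF_FFF8L; // highest 8-byte-aligned address below 32 GB
        int oop = compress(address);
        System.out.printf("address=0x%X compressed=0x%X restored=0x%X%n",
                address, oop, decompress(oop));
    }
}
```

The mask in `decompress` matters because Java's `int` is signed: without it, a compressed value with the top bit set would be sign-extended before the shift.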
Note that this only works on a 64-bit JVM. We still need to be able to address the memory containing that (up to) 32 GB heap [1], and that means 64-bit hardware addresses (on modern CPUs / computer architectures).
Note also that there is a small penalty in doing this; i.e. the shift instructions required to translate between regular and compressed references. However, the flip side is that less actual memory is consumed [3], and memory caches are typically more effective as a consequence.
1 – This is because modern computer architectures are optimized for word-aligned memory access.
2 – This assumes that you haven’t used -XX:ObjectAlignmentInBytes to increase the alignment from its default (and minimum) value of 8 bytes.
3 – In fact, the memory saving is application specific. It depends on the average object alignment wastage, ratios of reference to non-reference fields and so on. It gets more complicated if you consider tuning the object alignment.
To simplify the problem, how can we address up to 2^4 memory addresses using just 2 bits? What would be a possible encoding/decoding for such an address scheme?
You can’t address 2^4 byte addresses. But you can address 2^2 word addresses (assuming 32-bit words) using 2-bit word addresses. If you can assume that all byte addresses are word-aligned, then you can compress a 4-bit byte address into a 2-bit word address by shifting it right by 2 bit positions, and decompress it by shifting it left by 2 bit positions.
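As a rough illustration of that toy scheme, the following sketch maps the four word-aligned 4-bit byte addresses (0, 4, 8 and 12) to 2-bit word addresses and back. The class and method names are made up for this example.

```java
// Illustrative sketch of the toy 4-bit / 2-bit scheme; names are hypothetical.
class TinyAddressSketch {
    // Encode: a word-aligned 4-bit byte address becomes a 2-bit word address.
    static int toWordAddress(int byteAddress) {
        if (byteAddress < 0 || byteAddress > 0xF) {
            throw new IllegalArgumentException("byte address must fit in 4 bits");
        }
        if ((byteAddress & 0b11) != 0) {
            throw new IllegalArgumentException("byte address is not word-aligned");
        }
        return byteAddress >>> 2; // drop the two zero bits
    }

    // Decode: a 2-bit word address becomes the 4-bit byte address of that word.
    static int toByteAddress(int wordAddress) {
        if (wordAddress < 0 || wordAddress > 0b11) {
            throw new IllegalArgumentException("word address must fit in 2 bits");
        }
        return wordAddress << 2; // put the two zero bits back
    }

    public static void main(String[] args) {
        for (int byteAddress = 0; byteAddress < 16; byteAddress += 4) {
            int word = toWordAddress(byteAddress);
            System.out.println("byte " + byteAddress + " -> word " + word
                    + " -> byte " + toByteAddress(word));
        }
    }
}
```

Byte addresses such as 1, 2 or 3 have no 2-bit encoding here, which is exactly why the scheme depends on every address being word-aligned.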