Slow AES GCM encryption and decryption with Java 8u20

Micro-benchmarking aside, the performance of the GCM implementation in JDK 8 (at least up to 1.8.0_25) is crippled.

I can consistently reproduce the 3MB/s figure (on a Haswell i7 laptop) with a more mature micro-benchmark.
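For anyone wanting a quick sanity check on their own JDK, something along these lines gives a rough throughput figure. This is only a minimal sketch, not the benchmark used for the numbers in this post: the class name, buffer size and round counts are arbitrary choices of mine, and a proper harness such as JMH will give more trustworthy results.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.SecureRandom;

// Rough AES/GCM encryption throughput check (illustrative only).
class GcmThroughput {

    // Encrypts `rounds` buffers of `size` bytes and returns the throughput in MB/s.
    static double encryptMBps(int size, int rounds) throws Exception {
        byte[] keyBytes = new byte[16];
        new SecureRandom().nextBytes(keyBytes);
        SecretKeySpec key = new SecretKeySpec(keyBytes, "AES");

        byte[] plain = new byte[size];
        byte[] out = new byte[size + 16]; // ciphertext plus 128-bit tag
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");

        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            // GCM must never reuse an IV with the same key, so derive one from the counter.
            byte[] iv = ByteBuffer.allocate(12).putInt(8, i).array();
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            cipher.doFinal(plain, 0, size, out, 0);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return (double) size * rounds / (1024 * 1024) / seconds;
    }

    public static void main(String[] args) throws Exception {
        encryptMBps(64 * 1024, 200); // warm-up so the JIT has compiled the hot path
        System.out.printf("AES/GCM encrypt: %.1f MB/s%n", encryptMBps(64 * 1024, 2000));
    }
}
```

Swapping the transformation for "AES/CBC/PKCS5Padding" (with an IvParameterSpec and a suitably sized output buffer) should give a comparable figure for plain AES on the same machine.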

From a code dive, this appears to be due to a naive multiplier implementation and no hardware acceleration for the GCM calculations.
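To give a feel for what a naive multiplier costs, here is a sketch of the bit-by-bit GF(2^128) multiplication as described in NIST SP 800-38D (Algorithm 1). It is not the JDK source, just an illustration written from the specification: the class name and the two-long block representation are my own assumptions. Every 16-byte block of data needs one of these multiplications, so 128 loop iterations per block add up quickly.

```java
// Bit-by-bit GF(2^128) multiply in GCM bit order (bit 0 is the leftmost bit).
// A block is held as two longs: element [0] is the first 8 bytes, [1] the last 8.
class GhashMultiply {

    // R = 11100001 followed by 120 zero bits, the GCM reduction constant.
    private static final long R_HI = 0xE100000000000000L;

    static long[] multiply(long[] x, long[] y) {
        long z0 = 0, z1 = 0;      // accumulator Z
        long v0 = y[0], v1 = y[1]; // V starts as Y and is "doubled" each round

        for (int i = 0; i < 128; i++) {
            // Pick bit i of X, counting from the leftmost bit.
            long word = i < 64 ? x[0] : x[1];
            long bit = (word >>> (63 - (i & 63))) & 1L;
            if (bit != 0) {
                z0 ^= v0;
                z1 ^= v1;
            }
            // V = V >> 1, folding in R when a bit falls off the right-hand end.
            boolean carry = (v1 & 1L) != 0;
            v1 = (v1 >>> 1) | (v0 << 63);
            v0 >>>= 1;
            if (carry) {
                v0 ^= R_HI;
            }
        }
        return new long[] {z0, z1};
    }

    public static void main(String[] args) {
        long[] a = {0x0123456789abcdefL, 0xfedcba9876543210L};
        long[] b = {0x66e94bd4ef8a2c3bL, 0x884cfa59ca342b2eL};
        long[] z = multiply(a, b);
        System.out.printf("%016x%016x%n", z[0], z[1]);
    }
}
```

Note that the data-dependent branches make this version non-constant-time as well as slow; table-driven or carry-less-multiply (PCLMULQDQ) based approaches avoid the per-bit loop.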

By comparison AES (in ECB or CBC mode) in JDK 8 uses an AES-NI accelerated intrinsic and is (for Java at least) very quick (in the order of 1GB/s on the same hardware), but the overall AES/GCM performance is completely dominated by the broken GCM performance.

There are plans to implement hardware acceleration, and there have been third-party submissions to improve the performance, but these haven’t made it to a release yet.

Something else to be aware of is that the JDK GCM implementation also buffers the entire plaintext on decryption until the authentication tag at the end of the ciphertext is verified, which cripples it for use with large messages.
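The buffering is easy to observe by feeding ciphertext to the cipher in chunks and counting how much plaintext comes back from update() compared with doFinal(). The following is a minimal sketch; the class and method names are hypothetical, and the all-zero key and IV are for demonstration only and must never be used for real data.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Shows where plaintext is released when decrypting AES/GCM in chunks.
class GcmDecryptBuffering {

    // Returns {bytes released by update(), bytes released by doFinal()}.
    static int[] decryptInChunks(int size, int chunk) throws Exception {
        SecretKeySpec key = new SecretKeySpec(new byte[16], "AES");      // demo key only
        GCMParameterSpec spec = new GCMParameterSpec(128, new byte[12]); // demo IV only

        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, spec);
        byte[] ciphertext = enc.doFinal(new byte[size]);

        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, spec);

        int fromUpdate = 0;
        for (int off = 0; off < ciphertext.length; off += chunk) {
            int len = Math.min(chunk, ciphertext.length - off);
            byte[] part = dec.update(ciphertext, off, len); // may be null if nothing is released
            if (part != null) {
                fromUpdate += part.length;
            }
        }
        int fromFinal = dec.doFinal().length; // tag is verified here
        return new int[] {fromUpdate, fromFinal};
    }

    public static void main(String[] args) throws Exception {
        int[] counts = decryptInChunks(1 << 20, 4096);
        System.out.println("plaintext bytes from update():  " + counts[0]);
        System.out.println("plaintext bytes from doFinal(): " + counts[1]);
    }
}
```

If the provider buffers as described above, all of the plaintext should show up in the doFinal() count, which means the cipher is holding the whole message in memory until the end.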

Bouncy Castle has (at the time of writing) faster GCM implementations (and OCB, if you’re writing open source software or are not encumbered by software patent laws).


Updated July 2015 – 1.8.0_45 and JDK 9

JDK 8+ will get an improved (and constant time) Java implementation (contributed by Florian Weimer of Red Hat) – this has landed in JDK 9 EA builds, but apparently not yet in 1.8.0_45.
JDK 9 (since EA b72 at least) also has GCM intrinsics. AES/GCM speed on b72 is 18MB/s without intrinsics enabled and 25MB/s with intrinsics enabled, both of which are disappointing. For comparison, the fastest (not constant time) BC implementation is ~60MB/s and the slowest (constant time, not fully optimised) is ~26MB/s.


Updated Jan 2016 – 1.8.0_72

Some performance fixes landed in JDK 1.8.0_60, and performance on the same benchmark is now 18MB/s – a 6x improvement on the original 3MB/s, but still much slower than the BC implementations.
