Best Compression algorithm for a sequence of integers

We have written recent research papers that survey the best schemes for this problem. Please see:

Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second through vectorization,Software: Practice & Experience 45 (1), 2015.
http://arxiv.org/abs/1209.2137

Daniel Lemire, Nathan Kurz, Leonid Boytsov, SIMD Compression and the Intersection of Sorted Integers, Software: Practice and Experience (to appear) http://arxiv.org/abs/1401.6399

They include an extensive experimental evaluation.

You can find a complete implementation of all techniques in C++11 online:
https://github.com/lemire/FastPFor and https://github.com/lemire/SIMDCompressionAndIntersection

There are also C libraries: https://github.com/lemire/simdcomp and https://github.com/lemire/MaskedVByte

If you prefer Java, please see https://github.com/lemire/JavaFastPFOR

Leave a Comment