Python ungzipping stream of bytes?

Yes, you can use the zlib module to decompress byte streams:

import zlib

def stream_gzip_decompress(stream):
    dec = zlib.decompressobj(32 + zlib.MAX_WBITS)  # offset 32 to skip the header
    for chunk in stream:
        rv = dec.decompress(chunk)
        if rv:
            yield rv

The offset of 32 signals to the zlib header that the gzip header is expected but skipped.

The S3 key object is an iterator, so you can do:

for data in stream_gzip_decompress(k):
    # do something with the decompressed data

Leave a Comment