The issue with the gzip module is not that it can’t decompress the partial file, the error occurs only at the end when it tries to verify the checksum of the decompressed content. (The original checksum is stored at the end of the compressed file so the verification will never, ever work with a partial file.)
The key is to trick gzip into skipping the verification. The answer by caesar0301 does this by modifying the gzip source code, but it’s not necessary to go that far, simple monkey patching will do. I wrote this context manager to temporarily replace gzip.GzipFile._read_eof
while I decompress the partial file:
import contextlib
@contextlib.contextmanager
def patch_gzip_for_partial():
"""
Context manager that replaces gzip.GzipFile._read_eof with a no-op.
This is useful when decompressing partial files, something that won't
work if GzipFile does it's checksum comparison.
"""
_read_eof = gzip.GzipFile._read_eof
gzip.GzipFile._read_eof = lambda *args, **kwargs: None
yield
gzip.GzipFile._read_eof = _read_eof
An example usage:
from cStringIO import StringIO
with patch_gzip_for_partial():
decompressed = gzip.GzipFile(StringIO(compressed)).read()