How do I gzip compress a string in Python?

If you want to produce a complete gzip-compatible binary string, with the header etc., you could use gzip.GzipFile together with an in-memory buffer (StringIO on Python 2.7, io.BytesIO on Python 3.x, because GzipFile writes bytes):
    try:
        from StringIO import StringIO as BytesIO  # Python 2.7
    except ImportError:
        from io import BytesIO  # Python 3.x
    import gzip
    out = BytesIO()
    with gzip.GzipFile(fileobj=out, mode="w") as f:
        f.write("This is mike number one, isn't this …
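For Python 3 only, the same idea can be sketched as a small round trip. The helper names gzip_string and gunzip_string and the UTF-8 default are assumptions made for illustration; they are not part of the original answer.

```python
import gzip
import io


def gzip_string(text, encoding="utf-8"):
    """Return a complete gzip stream (header and trailer included) for text."""
    buf = io.BytesIO()  # GzipFile writes bytes, so the buffer must be binary
    with gzip.GzipFile(fileobj=buf, mode="wb") as f:
        f.write(text.encode(encoding))
    # Closing the GzipFile flushes the trailer but leaves buf open.
    return buf.getvalue()


def gunzip_string(data, encoding="utf-8"):
    """Inverse of gzip_string: decompress and decode back to str."""
    return gzip.decompress(data).decode(encoding)
```

If no file-like object is needed, gzip.compress(text.encode("utf-8")) should produce an equivalent stream in one call.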

How do I enable gzip compression when using MVC3 on IIS7?

You can configure compression through your web.config file as follows:
    <system.webServer>
      <urlCompression doStaticCompression="true" doDynamicCompression="true" />
    </system.webServer>
You can find documentation of this configuration element at iis.net/ConfigReference. This is the equivalent of:
    1. Opening Internet Information Services (IIS Manager)
    2. Navigating through the tree-view on the left until you reach the virtual directory you wish to modify
    3. Selecting …

Parsing compressed xml feed into ElementTree

You can pass the value returned by urlopen() directly to GzipFile(), and in turn pass that to ElementTree methods such as iterparse():
    #!/usr/bin/env python3
    import xml.etree.ElementTree as etree
    from gzip import GzipFile
    from urllib.request import urlopen, Request
    with urlopen(Request("http://smarkets.s3.amazonaws.com/oddsfeed.xml",
                         headers={"Accept-Encoding": "gzip"})) as response, \
         GzipFile(fileobj=response) as xml_file:
        for elem in getelements(xml_file, 'interesting_tag'):
            process(elem)
    …
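The excerpt does not show how getelements() is defined. A minimal sketch of what such a helper might look like, assuming it should yield each element with a given tag as soon as it has been fully parsed, is below; iter_elements is a hypothetical stand-in, not the original author's function.

```python
import gzip
import io
import xml.etree.ElementTree as etree


def iter_elements(xml_file, tag):
    """Yield every element named tag from a file-like object, incrementally.

    Hypothetical stand-in for the getelements() helper used in the answer.
    """
    for _event, elem in etree.iterparse(xml_file, events=("end",)):
        if elem.tag == tag:
            yield elem
            elem.clear()  # drop the element's children and text once the caller is done with it
```

It can be tried without a network connection by wrapping gzip-compressed bytes in io.BytesIO and passing the result through gzip.GzipFile(fileobj=...), exactly as the response object is wrapped above.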

Find the size of the file inside a GZIP file

There is no truly reliable way, other than gunzipping the stream. You do not need to save the result of the decompression, so you can determine the size by simply reading and decoding the entire file without taking up space with the decompressed result. There is an unreliable way to determine the uncompressed size, which …
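The reliable approach, decompressing the stream and discarding the output, can be sketched in Python as follows. The function name uncompressed_size and the 64 KiB chunk size are assumptions chosen for illustration.

```python
import gzip


def uncompressed_size(path, chunk_size=64 * 1024):
    """Count the decompressed bytes in a gzip file without storing them."""
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty read means end of stream
                break
            total += len(chunk)  # keep only the running count, not the data
    return total
```

Memory use stays bounded by the chunk size, at the cost of decompressing the whole file once.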

Reading in multiple files compressed in tar.gz archive into Spark [duplicate]

A solution is given in "Read whole text files from a compression in Spark". Using the code sample provided, I was able to create a DataFrame from the compressed archive like so:
    val jsonRDD = sc.binaryFiles("gzarchive/*").
      flatMapValues(x => extractFiles(x).toOption).
      mapValues(_.map(decode()))
    val df = sqlContext.read.json(jsonRDD.map(_._2).flatMap(x => x))
This method works fine for tar archives of …
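The extractFiles helper comes from the linked answer and is not shown in this excerpt. As a rough Python analogue of the per-archive step, assuming each archive arrives as one bytes object the way binaryFiles delivers it, the standard tarfile module can unpack a tar.gz entirely in memory. The name extract_files is hypothetical.

```python
import io
import tarfile


def extract_files(archive_bytes):
    """Return {member name: content bytes} for regular files in a tar.gz blob."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar:
            if member.isfile():  # skip directories, links and other special entries
                files[member.name] = tar.extractfile(member).read()
    return files
```

This is only a sketch of the extraction step, not a Spark job. Each archive is loaded fully into memory, so it suits archives that fit comfortably in a single worker's memory.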

How to force Apache to use manually pre-compressed gz file of CSS and JS files?

Some RewriteRule directives should handle that quite well. In a Drupal configuration file I found:
    # AddEncoding allows you to have certain browsers uncompress information on the fly.
    AddEncoding gzip .gz
    # Serve gzip compressed CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.css $1\.css\.gz [QSA]
    # Serve …
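The rules above only take effect when a non-empty .gz sibling already exists next to each file. One way to produce those siblings, sketched here in Python, is shown below; the function name precompress and the default suffix list are assumptions, and any build step that writes style.css.gz next to style.css would serve equally well.

```python
import gzip
import shutil
from pathlib import Path


def precompress(root, suffixes=(".css", ".js")):
    """Write a gzip-compressed sibling (name + '.gz') for each matching file."""
    written = []
    # Collect the paths first so files created below are never revisited.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            target = path.with_name(path.name + ".gz")
            with open(path, "rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

The output name matches what the `RewriteCond %{REQUEST_FILENAME}\.gz -s` test looks for: the requested filename with .gz appended.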

Download and decompress gzipped file in memory?

You need to seek to the beginning of compressedFile after writing to it but before passing it to gzip.GzipFile(). Otherwise the gzip module will start reading at the end of the buffer, and the file will appear to be empty. See below:
    #!/usr/bin/env python
    import urllib2
    import StringIO
    import gzip
    baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
    filename = …
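The code in the answer targets Python 2 (urllib2, StringIO). A minimal Python 3 sketch of the same rewind step, assuming the download is exposed as any binary file-like object, might look like this; decompress_stream is a hypothetical name.

```python
import gzip
import io
import shutil


def decompress_stream(response):
    """Buffer a gzip-compressed stream in memory and return the decompressed bytes."""
    compressed = io.BytesIO()
    shutil.copyfileobj(response, compressed)  # leaves the position at the end of the buffer
    compressed.seek(0)  # rewind, otherwise GzipFile starts reading at the end and finds no data
    with gzip.GzipFile(fileobj=compressed, mode="rb") as f:
        return f.read()
```

With a real download, the response object returned by urllib.request.urlopen() can be passed in directly, for example inside a `with urlopen(url) as response:` block.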