Requests with multiple connections

You can use the HTTP Range header to fetch just part of a file (already covered for Python here).

Just start several threads, fetch a different range with each, and you’re done 😉

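import threading
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 equivalent

url = 'http://www.python.org/'
chunk_size = 1024  # bytes per request (example value, tune as needed)

parts = {}

def download(url, start):
    req = urllib2.Request(url)
    # Range is inclusive on both ends, so stop one byte before the next chunk
    req.headers['Range'] = 'bytes=%s-%s' % (start, start + chunk_size - 1)
    f = urllib2.urlopen(req)
    parts[start] = f.read()

threads = []

# Initialize threads
for i in range(0, 10):
    t = threading.Thread(target=download, args=(url, i * chunk_size))
    t.start()
    threads.append(t)

# Join threads back (order doesn't matter, you just want them all)
for t in threads:
    t.join()

# Sort parts by offset and you're done
result = b"".join(parts[i] for i in sorted(parts.keys()))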
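
The loop above hard-codes ten chunks. If you know the file size in advance (for example from a Content-Length header), you can work out the ranges up front instead. Here is a minimal sketch of that arithmetic; `byte_ranges` and `range_header` are hypothetical helper names, not part of any library:

```python
def byte_ranges(total_size, chunk_size):
    """Return inclusive (start, end) byte pairs that cover total_size bytes."""
    ranges = []
    for start in range(0, total_size, chunk_size):
        # The last chunk may be shorter than chunk_size
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
    return ranges


def range_header(start, end):
    """Format an inclusive byte pair as a Range header value."""
    return 'bytes=%d-%d' % (start, end)
```

For a 10-byte file and 4-byte chunks, `byte_ranges(10, 4)` gives `[(0, 3), (4, 7), (8, 9)]`, so you would start one thread per pair.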

Also note that not every server supports the Range header. In particular, servers where PHP scripts are responsible for serving the data often don’t implement it.
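
A server that honors the request answers with status 206 and a Content-Range header, while one that ignores it typically sends 200 with the whole file. A rough check could look like this; `range_was_honored` is a hypothetical helper, and it assumes you pass in the status code and a header mapping from your response:

```python
def range_was_honored(status, headers, start):
    """True if the response is a partial one beginning at byte `start`."""
    if status != 206:
        # 200 usually means the server ignored Range and sent everything
        return False
    # Content-Range looks like "bytes 0-1023/5000"
    content_range = headers.get('Content-Range', '')
    return content_range.startswith('bytes %d-' % start)
```

With the response object `f` from `download`, that would be something like `range_was_honored(f.getcode(), f.headers, start)`. If it returns False, it is probably safer to fall back to a single plain download than to join the parts.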
