Submitting to a web form using Python

If you want to pass q as a parameter in the URL using requests, use the params argument, not data (see Passing Parameters In URLs):

    r = requests.get('http://stackoverflow.com', params=data)

This will request https://stackoverflow.com/?q=%5Bpython%5D, which isn't what you are looking for. You really want to POST to a form. Try this:

    r = requests.post('https://stackoverflow.com/search', data=data)

…
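To make the difference between the two calls concrete, here is a minimal sketch that builds both requests without sending them. It uses the standard library's urllib as a stand-in for requests, so nothing touches the network; the dictionary contents (`{"q": "[python]"}`) are an assumption inferred from the encoded URL quoted above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed form data, inferred from the "?q=%5Bpython%5D" URL in the text.
data = {"q": "[python]"}
encoded = urlencode(data)

# GET: the parameters travel in the query string of the URL.
get_req = Request("https://stackoverflow.com/?" + encoded)

# POST: the same parameters travel in the request body instead.
post_req = Request("https://stackoverflow.com/search", data=encoded.encode("ascii"))

# Nothing is sent here; we only inspect what would go on the wire.
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

With requests, `params=` corresponds to the first form and `data=` to the second.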

How to download any(!) webpage with the correct charset in Python?

When you download a file with urllib or urllib2 (Python 2), you can find out whether a charset header was transmitted:

    fp = urllib2.urlopen(request)
    charset = fp.headers.getparam('charset')

You can use BeautifulSoup to locate a meta element in the HTML:

    soup = BeautifulSoup.BeautifulSoup(data)
    meta = soup.findAll('meta', {'http-equiv': lambda v: v.lower() == 'content-type'})

If neither is available, browsers typically fall back to user …
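The lookup order described above (HTTP header first, then a meta element, then a fallback) could be sketched in Python 3 roughly as follows. This is an illustration under stated assumptions, not the original answer's code: the function name `detect_charset` is hypothetical, a simple regular expression stands in for BeautifulSoup, and the final default of UTF-8 is an assumption, since the excerpt is cut off before it names the fallback.

```python
import re
from email.message import Message

# Rough pattern for <meta ... charset=...>; a regex stand-in for an HTML parser.
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def detect_charset(content_type, body, default="utf-8"):
    """Guess the charset of a page (hypothetical helper).

    content_type: value of the Content-Type header, or None if absent.
    body: the raw response bytes.
    default: assumed fallback when nothing is declared.
    """
    # 1. Prefer the charset parameter of the HTTP Content-Type header.
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset:
            return charset

    # 2. Otherwise look for a meta declaration near the start of the document.
    match = META_CHARSET.search(body[:4096])
    if match:
        return match.group(1).decode("ascii").lower()

    # 3. Nothing declared: fall back to the assumed default.
    return default
```

In Python 3, a response returned by `urllib.request.urlopen` exposes the header step directly as `response.headers.get_content_charset()`.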

Get size of a file before downloading in Python

I have reproduced what you are seeing:

    import urllib, os
    link = "http://python.org"
    print "opening url:", link
    site = urllib.urlopen(link)
    meta = site.info()
    print "Content-Length:", meta.getheaders("Content-Length")[0]
    f = open("out.txt", "r")
    print "File on disk:", len(f.read())
    f.close()
    f = open("out.txt", "w")
    f.write(site.read())
    site.close()
    f.close()
    f = open("out.txt", "r")
    print "File on disk after download:", len(f.read())
    f.close()
    print "os.stat().st_size …
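The snippet above is Python 2. A rough Python 3 sketch of reading the size before downloading might look like this; it assumes the server answers HEAD requests and sends a Content-Length header, neither of which is guaranteed. The helper names `parse_content_length` and `remote_size` are hypothetical.

```python
from urllib.request import Request, urlopen


def parse_content_length(value):
    """Turn a Content-Length header value into an int, or None if unusable."""
    if value is None:
        return None
    try:
        size = int(value.strip())
    except ValueError:
        return None
    return size if size >= 0 else None


def remote_size(url, timeout=10):
    """Ask the server for the headers only (HEAD) and report the declared size.

    Returns None when the server does not declare a usable Content-Length.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers.get("Content-Length"))
```

Calling `remote_size("http://python.org")` would then return the declared size in bytes, or None if the header is missing. The declared value describes the bytes sent over the wire, so it can differ from the size of the saved file if the response is compressed or rewritten on the way to disk.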