Using an HTTP PROXY – Python [duplicate]

You can do it even without the HTTP_PROXY environment variable. Try this sample:

import urllib2
proxy_support = urllib2.ProxyHandler({"http": "http://61.233.25.166:80"})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)
html = urllib2.urlopen("http://www.google.com").read()
print html

In your case it really seems that the proxy server is refusing the connection. Something more to try:

import urllib2
#proxy = "61.233.25.166:80"
proxy = "YOUR_PROXY_GOES_HERE"
proxies = … Read more
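For reference, a minimal sketch of the same idea on Python 3, where urllib2 has become urllib.request (the proxy address below is a placeholder, not a working server):

import urllib.request

# Placeholder proxy address; substitute a proxy you actually control.
proxy_support = urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

# All subsequent urlopen() calls are routed through the installed opener.
html = urllib.request.urlopen("http://www.google.com").read()
print(html[:200])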

Get page generated with JavaScript in Python

You could use Selenium WebDriver:

#!/usr/bin/env python
from contextlib import closing
from selenium.webdriver import Firefox  # pip install selenium
from selenium.webdriver.support.ui import WebDriverWait

# use firefox to get page with javascript generated content
with closing(Firefox()) as browser:
    browser.get(url)
    button = browser.find_element_by_name('button')
    button.click()
    # wait for the page to load
    WebDriverWait(browser, timeout=10).until(
        lambda x: x.find_element_by_id('someId_that_must_be_on_new_page'))
    # … Read more
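In newer Selenium releases the find_element_by_* helpers are gone in favour of By locators, so a sketch of the same flow against the Selenium 4 API looks roughly like this (the URL, element name, and element id are placeholders taken from the snippet above):

#!/usr/bin/env python
from contextlib import closing
from selenium.webdriver import Firefox
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "http://example.com/"  # placeholder URL

with closing(Firefox()) as browser:
    browser.get(url)
    # locate and click the button by its name attribute
    browser.find_element(By.NAME, "button").click()
    # block until an element that only exists on the new page shows up
    WebDriverWait(browser, timeout=10).until(
        EC.presence_of_element_located((By.ID, "someId_that_must_be_on_new_page")))
    html = browser.page_source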

Does python urllib2 automatically uncompress gzip data fetched from webpage?

How can I tell if the data at a URL is gzipped? This checks if the content is gzipped and decompresses it:

from StringIO import StringIO
import gzip
import urllib2

request = urllib2.Request('http://example.com/')
request.add_header('Accept-encoding', 'gzip')
response = urllib2.urlopen(request)
if response.info().get('Content-Encoding') == 'gzip':
    buf = StringIO(response.read())
    f = gzip.GzipFile(fileobj=buf)
    data = f.read()

Does urllib2 automatically uncompress the data … Read more
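For what it's worth, a Python 3 sketch of the same check with urllib.request and gzip.decompress (example.com stands in for any URL); urllib still does not decompress for you, while the requests library does:

import gzip
import urllib.request

req = urllib.request.Request("http://example.com/",
                             headers={"Accept-Encoding": "gzip"})
with urllib.request.urlopen(req) as resp:
    raw = resp.read()
    # Only decompress if the server actually sent a gzipped body.
    if resp.headers.get("Content-Encoding") == "gzip":
        raw = gzip.decompress(raw)
print(raw[:200])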

Changing user agent on urllib2.urlopen

I answered a similar question a couple of weeks ago. There is example code in that question, but basically you can do something like this (note the capitalization of User-Agent, as of RFC 2616, section 14.43):

opener = urllib2.build_opener()
opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
response = opener.open('http://www.stackoverflow.com')
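On Python 3 the same header can be set directly on the Request object rather than on an opener; a minimal sketch (the URL and User-Agent string are just examples):

import urllib.request

req = urllib.request.Request(
    "http://www.stackoverflow.com",
    headers={"User-Agent": "Mozilla/5.0"},  # header sent with this request only
)
with urllib.request.urlopen(req) as resp:
    html = resp.read()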

Downloading a picture via urllib and Python

Python 2, using urllib.urlretrieve:

import urllib
urllib.urlretrieve("http://www.gunnerkrigg.com//comics/00000001.jpg", "00000001.jpg")

Python 3, using urllib.request.urlretrieve (part of Python 3's legacy interface, works exactly the same):

import urllib.request
urllib.request.urlretrieve("http://www.gunnerkrigg.com//comics/00000001.jpg", "00000001.jpg")
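If you'd rather avoid the legacy urlretrieve interface, a minimal sketch that streams the response into a file with urlopen and shutil.copyfileobj (same example URL, assumed reachable):

import shutil
import urllib.request

url = "http://www.gunnerkrigg.com//comics/00000001.jpg"
# Stream the response body straight into a local file.
with urllib.request.urlopen(url) as resp, open("00000001.jpg", "wb") as out:
    shutil.copyfileobj(resp, out)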

Import error: No module named urllib2

As stated in the urllib2 documentation:

The urllib2 module has been split across several modules in Python 3 named urllib.request and urllib.error. The 2to3 tool will automatically adapt imports when converting your sources to Python 3.

So you should instead be saying:

from urllib.request import urlopen
html = urlopen("http://www.google.com/").read()
print(html)

Your current, now-edited code sample … Read more
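For code that still has to run on both Python 2 and 3, a common compatibility shim is to try the Python 3 import first and fall back to urllib2; a minimal sketch (the module names here are the standard ones, nothing project-specific):

try:
    # Python 3
    from urllib.request import urlopen
except ImportError:
    # Python 2
    from urllib2 import urlopen

html = urlopen("http://www.google.com/").read()
print(html)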

What are the differences between the urllib, urllib2, urllib3 and requests module?

I know it's been said already, but I'd highly recommend the requests Python package. If you've come to Python from other languages, you might think urllib and urllib2 are easy to use, need little code, and are highly capable; that's how I used to think. But the requests package is so unbelievably useful and short that everyone … Read more
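For comparison, a minimal requests sketch covering the same ground as the urllib2 snippets above (pip install requests; the URL and User-Agent string are just examples):

import requests

# requests handles redirects, connection pooling, and gzip/deflate
# decompression automatically.
response = requests.get(
    "http://www.google.com",
    headers={"User-Agent": "Mozilla/5.0"},  # custom header in one line
    timeout=10,
)
response.raise_for_status()
print(response.text[:200])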