How do I prevent Python’s urllib(2) from following a redirect

You could do a couple of things:

- Build your own HTTPRedirectHandler that intercepts each redirect.
- Create an instance of HTTPCookieProcessor and install that opener so that you have access to the cookiejar.

This is a quick little thing that shows both:

    import urllib2

    #redirect_handler = urllib2.HTTPRedirectHandler()

    class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
        def http_error_302(self, req, fp, code, msg, headers):
            …
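Since the excerpt above is cut off, here is a rough Python 3 sketch of the same two ideas using urllib.request rather than urllib2. It is not the original answer's code: the names NoRedirectHandler, build_no_redirect_opener and fetch_without_redirects are made up for illustration, and it overrides redirect_request instead of http_error_302.

```python
import http.cookiejar
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Illustrative handler (hypothetical name) that refuses every redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means no follow-up request is built, so urllib
        # falls through to its default error handler and raises HTTPError.
        return None


def build_no_redirect_opener():
    """Return (opener, cookie_jar); the jar collects cookies the opener sees."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        NoRedirectHandler,
        urllib.request.HTTPCookieProcessor(cookie_jar),
    )
    return opener, cookie_jar


def fetch_without_redirects(url):
    """Return (status_code, Location header or None) without following redirects."""
    opener, _ = build_no_redirect_opener()
    try:
        with opener.open(url) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        # Any HTTP error lands here; for a 3xx the target is in Location.
        return err.code, err.headers.get("Location")
```

With this opener, a URL that answers with a 302 should surface as a status code plus its Location target instead of the redirected page.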

Using MultipartPostHandler to POST form-data with Python

It seems that the easiest and most compatible way to get around this problem is to use the 'poster' module.

    # test_client.py
    from poster.encode import multipart_encode
    from poster.streaminghttp import register_openers
    import urllib2

    # Register the streaming http handlers with urllib2
    register_openers()

    # Start the multipart/form-data encoding of the file "DSC0001.jpg"
    # "image1" is the name …
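The poster module targets Python 2, so as a hedged alternative here is a minimal hand-rolled sketch of multipart/form-data encoding with only the Python 3 standard library. This is a different technique from poster (no streaming, the whole body is built in memory), the function name encode_multipart is hypothetical, and it assumes field names and filenames contain no quotes or line breaks.

```python
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body.

    fields: {name: str}
    files:  {name: (filename, data_bytes, content_type)}
    Returns (body_bytes, headers_dict).
    """
    boundary = uuid.uuid4().hex  # random marker that separates the parts
    body = bytearray()

    for name, value in fields.items():
        body += f"--{boundary}\r\n".encode()
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        body += value.encode("utf-8") + b"\r\n"

    for name, (filename, data, content_type) in files.items():
        body += f"--{boundary}\r\n".encode()
        body += (
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
        ).encode()
        body += f"Content-Type: {content_type}\r\n\r\n".encode()
        body += data + b"\r\n"

    body += f"--{boundary}--\r\n".encode()  # closing boundary

    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return bytes(body), headers
```

The returned body and headers could then be passed to urllib.request.Request(url, data=body, headers=headers) to send the POST.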

Python Urllib2 SSL error

To summarize the comments about the cause of the problem and explain the real problem in more detail: if you check the trust chain for the OpenSSL client, you get the following:

    [0] 54:7D:B3:AC:BF:… /CN=*.s3.amazonaws.com
    [1] 5D:EB:8F:33:9E:… /CN=VeriSign Class 3 Secure Server CA - G3
    [2] F4:A8:0A:0C:D1:… /CN=VeriSign Class 3 Public Primary Certification Authority - …

Python: download files from google drive using url

If by "drive's url" you mean the shareable link of a file on Google Drive, then the following might help:

    import requests

    def download_file_from_google_drive(id, destination):
        URL = "https://docs.google.com/uc?export=download"

        session = requests.Session()

        response = session.get(URL, params = { 'id' : id }, stream = True)
        token = get_confirm_token(response)

        if token:
            params = { 'id' : id, …

A good way to get the charset/encoding of an HTTP response in Python

To parse an HTTP header you could use cgi.parse_header():

    _, params = cgi.parse_header('text/html; charset=utf-8')
    print params['charset'] # -> utf-8

Or using the response object:

    response = urllib2.urlopen('http://example.com')
    response_encoding = response.headers.getparam('charset')
    # or in Python 3: response.headers.get_content_charset(default)

In general the server may lie about the encoding or not report it at all (the default depends on …
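For Python 3, where the cgi module is no longer available in recent versions, the same lookup can be sketched with email.message, which is the class family that urllib's response headers are built on. The helper name charset_from_content_type and the "utf-8" fallback are assumptions chosen for illustration.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value.

    Falls back to `default` (an assumed fallback, not a standard one)
    when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value it finds.
    return msg.get_content_charset(failobj=default)


print(charset_from_content_type("text/html; charset=UTF-8"))  # -> utf-8
print(charset_from_content_type("text/html", default="latin-1"))  # -> latin-1
```

On a live response the equivalent call would be response.headers.get_content_charset(default), as the excerpt notes.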

Python: URLError:

The error code 10060 means it cannot connect to the remote peer. This might be caused by a network problem or, more often, by a problem with your own settings, such as the proxy configuration. You could try connecting to the same host with other tools (such as ncat) and/or from another PC on the same local network to find out where …
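If a tool like ncat is not at hand, a similar reachability check can be sketched in Python itself. This is only a rough diagnostic under the assumption that a plain TCP connection is what fails; the function name can_connect and the 5-second timeout are illustrative choices.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution failures.
        return False
```

Running can_connect with the failing host and port 80 from two different machines on the same network can help show whether the problem is local to one machine.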

How can I use a SOCKS 4/5 proxy with urllib2?

You can use the SocksiPy module. Simply copy the file "socks.py" to your Python's lib/site-packages directory, and you're ready to go. (Alternatively, try pip install PySocks.) You must set up socks before urllib2. For example:

    import socks
    import socket
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 8080)
    socket.socket = socks.socksocket
    import urllib2
    print urllib2.urlopen('http://www.google.com').read()

You can also try pycurl lib and …