You are using the decoded Unicode value. Use the raw response data, r.raw, instead:
r = requests.get(url, params=payload, stream=True)
r.raw.decode_content = True
etree.parse(r.raw)
This reads the data directly from the response; note the stream=True option to .get().
Setting the r.raw.decode_content = True flag ensures that the raw stream gives you the decompressed content even if the response is gzip- or deflate-compressed.
You don’t have to stream the response; for smaller XML documents it is fine to use the r.content attribute, which holds the response body as bytes, not decoded to text:
r = requests.get(url, params=payload)
xml = etree.fromstring(r.content)
XML parsers always expect bytes as input, because the XML format itself dictates how the parser is to decode those bytes to Unicode text.
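To illustrate why bytes are the right input, here is a minimal sketch. It assumes the standard library's xml.etree.ElementTree as a stand-in for the etree module used above, and a made-up Latin-1 document in place of a real response body. The parser reads the encoding declaration and decodes the bytes itself, whereas guessing an encoding beforehand can fail outright:

```python
import xml.etree.ElementTree as ET

# Hypothetical payload standing in for r.content: the XML declaration
# states that the bytes are ISO-8859-1 (Latin-1) encoded.
xml_text = '<?xml version="1.0" encoding="ISO-8859-1"?><name>café</name>'
xml_bytes = xml_text.encode("iso-8859-1")

# Given bytes, the parser honours the declared encoding.
root = ET.fromstring(xml_bytes)
name = root.text

# Decoding with a guessed encoding before parsing can go wrong:
# these bytes are not valid UTF-8.
try:
    xml_bytes.decode("utf-8")
    guess_failed = False
except UnicodeDecodeError:
    guess_failed = True
```

Here name ends up as the correctly decoded text, while the manual UTF-8 decode raises an error, which is the kind of mismatch you avoid by handing the parser the undecoded bytes.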