python-2.x - w3toppers.com

How can I read large text files line by line, without loading them into memory?

I provided this answer because Keith’s, while succinct, doesn’t close the file explicitly:

    with open("log.txt") as infile:
        for line in infile:
            do_something_with(line)

What is the difference between range and xrange functions in Python 2.X?

In Python 2.x: range creates a list, so if you do range(1, 10000000) it creates a list in memory with 9999999 elements. xrange is a sequence object that evaluates lazily. In Python 3: range does the equivalent of Python 2’s xrange. To get the list, you have to explicitly use list(range(…)). xrange no longer exists.
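The Python 3 side of this can be checked directly. The following is a minimal sketch, assuming it runs under Python 3 (under Python 2 the first line would build the full list instead); the variable names are illustrative only.

```python
import sys

# In Python 3, range is lazy: no list of 9999999 integers is built here.
lazy = range(1, 10000000)
count = len(lazy)                # length is computed, not stored element by element
size_lazy = sys.getsizeof(lazy)  # stays small regardless of the range's length

# A real list has to be requested explicitly.
first_five = list(range(1, 6))
```

The range object's size does not grow with its length, which is the practical meaning of "evaluates lazily".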

Why should we NOT use sys.setdefaultencoding("utf-8") in a py script?

As per the documentation: This allows you to switch from the default ASCII to other encodings such as UTF-8, which the Python runtime will use whenever it has to decode a string buffer to unicode. This function is only available at Python start-up time, when Python scans the environment. It has to be called in …

Setting the correct encoding when piping stdout in Python

Your code works when run in a script because Python encodes the output to whatever encoding your terminal application is using. If you are piping, you must encode it yourself. A rule of thumb is: always use Unicode internally. Decode what you receive, and encode what you send.

    # -*- coding: utf-8 -*-
    print u"åäö".encode('utf-8')

…
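The "decode what you receive, encode what you send" rule can be sketched as below. This is only an illustration: write_encoded is a hypothetical helper, and an in-memory byte buffer stands in for a piped stdout so the result can be inspected. The code is written to run under both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import io


def write_encoded(stream, text, encoding="utf-8"):
    """Hypothetical helper: encode Unicode text explicitly before sending it."""
    stream.write(text.encode(encoding))


# Bytes as they might arrive from a file or pipe (UTF-8 for "åäö").
raw = b"\xc3\xa5\xc3\xa4\xc3\xb6"

# Decode what you receive: from here on, work with Unicode text.
text = raw.decode("utf-8")

# Encode what you send: the byte stream stands in for a piped stdout.
buf = io.BytesIO()
write_encoded(buf, text + u"\n")
```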

How can I concatenate str and int objects?

The problem here is that the + operator has (at least) two different meanings in Python. For numeric types, it means “add the numbers together”:

    >>> 1 + 2
    3
    >>> 3.4 + 5.6
    9.0

… and for sequence types, it means “concatenate the sequences”:

    >>> [1, 2, 3] + [4, 5, 6]
    [1, 2, …
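Because + will not guess which of the two meanings applies, mixing a str with an int raises a TypeError. A minimal sketch of the failure and two common ways around it follows; the variable names are illustrative only.

```python
count = 3
error = None

try:
    message = "Items: " + count  # str + int: neither "add" nor "concatenate" applies
except TypeError as exc:
    error = str(exc)

# Convert explicitly so that + means "concatenate" ...
message = "Items: " + str(count)

# ... or let string formatting do the conversion.
formatted = "Items: {}".format(count)
```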

Why does the division get rounded to an integer?

You’re using Python 2.x, where dividing two integers floors the result instead of producing a floating point number.

    >>> 1 / 2
    0

You should make one of them a float:

    >>> float(10 - 20) / (100 - 10)
    -0.1111111111111111

or use from __future__ import division, which forces / to adopt Python 3.x’s behavior that always returns …

What is the best way to remove accents (normalize) in a Python unicode string?

Unidecode is the correct answer for this. It transliterates any Unicode string into the closest possible representation in ASCII text. Example:

    import unidecode

    accented_string = u'Málaga'
    # accented_string is of type 'unicode'
    unaccented_string = unidecode.unidecode(accented_string)
    # unaccented_string contains 'Malaga' and is of type 'str'
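Where installing a third-party package is not an option, a rougher standard-library approach is sometimes used. The sketch below is not Unidecode and does not transliterate: it only decomposes characters and drops combining marks, so characters without a decomposition (such as ø or ß) pass through unchanged. strip_accents is a hypothetical helper name.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical helper: drop combining marks after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    return u"".join(ch for ch in decomposed if not unicodedata.combining(ch))


result = strip_accents(u"M\u00e1laga")  # u"\u00e1" is "á"
```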

How does Python 2 compare string and int? Why do lists compare as greater than numbers, and tuples greater than lists?

From the Python 2 manual: CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address. When you order two strings or two numeric types the ordering is done in the expected way (lexicographic ordering for …
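The type-name rule in the quoted passage can be made concrete by sorting the names themselves. This sketch does not perform a Python 2 mixed-type comparison (Python 3 raises TypeError for those); it only shows the alphabetical order of the type names that the quoted rule refers to. Numbers are left out because the manual lists them as an exception.

```python
# One sample object each of tuple, str and list.
samples = [(), "text", []]

# Sort the type names alphabetically, as the quoted rule describes.
names = sorted(type(obj).__name__ for obj in samples)
```

The resulting order, list then str then tuple, matches the question's observation that tuples compare as greater than lists.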

How can I force division to be floating point? Division keeps rounding down to 0?

In Python 2, division of two ints produces an int. In Python 3, it produces a float. We can get the new behaviour by importing from __future__.

    >>> from __future__ import division
    >>> a = 4
    >>> b = 6
    >>> c = a / b
    >>> c
    0.66666666666666663
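If the __future__ import is not wanted, the two behaviours can also be requested explicitly. The following sketch runs the same way under Python 2 and Python 3; the variable names are illustrative only.

```python
a = 4
b = 6

# // always floors, whatever / is configured to do.
floored = a // b

# Converting one operand to float forces floating point division.
as_float = float(a) / b

# Flooring is not the same as truncating once negatives are involved.
negative_floor = -7 // 2            # rounds toward negative infinity
negative_truncated = int(-7 / 2.0)  # int() rounds toward zero
```

Using // where an integer result is intended keeps the code's meaning stable when it is later moved to Python 3.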