What is the idiomatic way to iterate over a binary file?

Try:

chunk_size = 4 * 1024 * 1024  # MB

with open('large_file.dat','rb') as f:
    for chunk in iter(lambda: f.read(chunk_size), b''):
        handle(chunk)

iter needs a function with zero arguments.

  • a plain f.read would read the whole file, since the size parameter is missing;
  • f.read(1024) means call a function and pass its return value (data loaded from file) to iter, so iter does not get a function at all;
  • (lambda:f.read(1234)) is a function that takes zero arguments (nothing between lambda and :) and calls f.read(1234).

Leave a Comment