How do I re.search or re.match on a whole file without reading it all into memory?

You can use the mmap module to map the file into memory. The mapped contents can then be searched much like an ordinary string:

import re, mmap

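# Open read-only in binary mode; ACCESS_READ means no write permission is needed
with open('/var/log/error.log', 'rb') as f:
  # Length 0 maps the whole file
  with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as data:
    # In Python 3 a mmap is bytes-like, so the pattern must be bytes too
    mo = re.search(rb'error: (.*)', data)
    if mo:
      print("found error", mo.group(1).decode())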

This also works for big files: the operating system loads the file content from disk on demand, so the whole file is never read into memory at once.
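
To report every match rather than just the first, re.finditer can walk the mapped file lazily. Below is a minimal sketch, assuming Python 3, where the pattern has to be a bytes object because a mmap is bytes-like. The helper name find_errors is hypothetical, and decoding as UTF-8 is an assumption about the log's encoding:

```python
import mmap
import re


def find_errors(path, pattern=rb"error: (.*)"):
    """Yield the first capture group of each match in the file at `path`.

    Hypothetical helper for illustration; `pattern` must be a bytes regex
    with at least one capture group.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as data:
            # finditer scans the mapping incrementally, one match at a time.
            for mo in re.finditer(pattern, data):
                # Assumption: the log is UTF-8; undecodable bytes are replaced.
                yield mo.group(1).decode("utf-8", errors="replace")
```

Called as "for msg in find_errors('/var/log/error.log'): print(msg)", this prints each error message in turn. Note that mmap raises ValueError for an empty file, so a caller may want to check the file size first.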
