Reading rather large JSON files [duplicate]

The issue here is that JSON is normally parsed in full and then held in memory, which is clearly impractical for a file this large.

The solution is to work with the data as a stream: read part of the file, process it, and then repeat.
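To make the read-process-repeat idea concrete, here is a rough standard-library sketch. It assumes the file holds one top-level JSON array; the function name iter_array_items and the chunk_size parameter are made up for illustration, and the input checking is deliberately minimal.

```python
import io
import json


def iter_array_items(fp, chunk_size=64 * 1024):
    """Yield the elements of a top-level JSON array one at a time.

    Assumes fp is a text-mode file object whose content is a single array.
    """
    decoder = json.JSONDecoder()
    buf = fp.read(chunk_size).lstrip()
    if not buf.startswith("["):
        raise ValueError("expected a top-level JSON array")
    buf = buf[1:]

    while True:
        buf = buf.lstrip(" \t\r\n,")  # skip whitespace and separators
        if buf.startswith("]"):
            return  # end of the array
        try:
            item, end = decoder.raw_decode(buf)
            # Accept the value only if "," or "]" follows it; otherwise it
            # may be a number that was cut off at the chunk boundary.
            complete = buf[end:].lstrip()[:1] in (",", "]")
        except json.JSONDecodeError:
            complete = False  # value is incomplete so far
        if complete:
            yield item
            buf = buf[end:]
            continue
        chunk = fp.read(chunk_size)
        if not chunk:
            raise ValueError("truncated or malformed JSON array")
        buf += chunk


# Tiny in-memory demo; a real file opened in text mode works the same way.
sample = io.StringIO('[{"id": 1}, {"id": 2}, {"id": 3}]')
for record in iter_array_items(sample, chunk_size=8):
    print(record["id"])  # prints 1, 2, 3 on separate lines
```

Only the current chunk and the element being decoded are kept in memory. An element that spans many chunks gets re-parsed after each read, so this is fine as an illustration but a dedicated streaming parser will be faster.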

The best option appears to be something like ijson, a module that works with JSON as a stream rather than loading the whole file as a single block.

Edit: Also worth a look – kashif’s comment about json-streamer and Henrik Heino’s comment about bigjson.
