Python: Trying to Deserialize Multiple JSON objects in a file with each object spanning multiple but consistently spaced number of lines

Load 6 extra lines instead, and pass the string to json.loads():

with open(file) as f:
    for line in f:
        # slice the next 6 lines from the iterable, as a list.
        lines = [line] + list(itertools.islice(f, 6))
        jfile = json.loads(''.join(lines))

        # do something with jfile

json.load() will slurp up more than just the next object in the file, and islice(f, 0, 7) would read only the first 7 lines, rather than read the file in 7-line blocks.

You can wrap reading a file in blocks of size N in a generator:

from itertools import islice, chain

def lines_per_n(f, n):
    for line in f:
        yield ''.join(chain([line], itertools.islice(f, n - 1)))

then use that to chunk up your input file:

with open(file) as f:
    for chunk in lines_per_n(f, 7):
        jfile = json.loads(chunk)

        # do something with jfile

Alternatively, if your blocks turn out to be of variable length, read until you have something that parses:

with open(file) as f:
    for line in f:
        while True:
            try:
                jfile = json.loads(line)
                break
            except ValueError:
                # Not yet a complete JSON value
                line += next(f)

        # do something with jfile

Leave a Comment