You can do this using two applications of itertools.groupby: one to group by ID, and one to group by date.
The code below uses a triple-nested list comprehension, which is compact but not so easy to read. A longer version using for loops follows it.
from itertools import groupby
from operator import itemgetter
data = '''\
ID date product
A 01/01/2018 1
A 01/01/2018 2
A 02/01/2018 2
B 01/01/2018 3
B 01/01/2018 4
B 02/01/2018 2
B 04/01/2018 1
B 04/01/2018 2
B 04/01/2018 3
'''
data = (row.split() for row in data.splitlines())
# Skip the header row
next(data)
result = [[[u[-1] for u in group]
           for _, group in groupby(row, itemgetter(1))]
          for _, row in groupby(data, itemgetter(0))]
print(result)
Output:
[[['1', '2'], ['2']], [['3', '4'], ['2'], ['1', '2', '3']]]
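Note that groupby only merges consecutive rows that share a key, so the code above relies on the rows already being ordered by ID and date, as they are in the sample data. A minimal sketch of what you could do if that were not the case, assuming the rows arrive as tuples in arbitrary order (the `rows` list and the `by_id_and_date` name are made up for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted rows: (ID, date, product)
rows = [
    ("B", "01/01/2018", "3"),
    ("A", "01/01/2018", "1"),
    ("B", "01/01/2018", "4"),
    ("A", "02/01/2018", "2"),
    ("A", "01/01/2018", "2"),
]

# Sort by (ID, date) so that equal keys become adjacent.
# The dates are compared as plain strings here, which is enough for
# grouping but is not a true chronological ordering.
# sorted() is stable, so rows with the same ID and date keep their
# original relative order.
by_id_and_date = itemgetter(0, 1)
ordered = sorted(rows, key=by_id_and_date)

result = [[[u[-1] for u in group]
           for _, group in groupby(id_rows, itemgetter(1))]
          for _, id_rows in groupby(ordered, itemgetter(0))]
print(result)
```

If the dates need to come out in calendar order, you would have to parse them (for example with datetime.strptime) in the sort key instead of comparing the raw strings.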
Here’s a version (mostly) using traditional for loops. It also converts the product numbers from strings to integers.
from itertools import groupby
from operator import itemgetter
data = '''\
ID date product
A 01/01/2018 1
A 01/01/2018 2
A 02/01/2018 2
B 01/01/2018 3
B 01/01/2018 4
B 02/01/2018 2
B 04/01/2018 1
B 04/01/2018 2
B 04/01/2018 3
'''
data = (row.split() for row in data.splitlines())
# Skip the header row
next(data)
ig1 = itemgetter(1)
result = []
# Outer grouping: one sublist per ID
for _, row in groupby(data, itemgetter(0)):
    sublist = []
    # Inner grouping: one list of products per date
    for _, group in groupby(row, ig1):
        sublist.append([int(u[-1]) for u in group])
    result.append(sublist)
print(result)
Output:
[[[1, 2], [2]], [[3, 4], [2], [1, 2, 3]]]
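Both versions throw away the group keys (the `_` variables), so the result no longer says which ID or date each sublist came from. If you need that, the same two-level grouping can keep the keys. This is only a sketch, assuming a nested dict is an acceptable output shape; the `nested`, `id_rows` and `date_rows` names are illustrative:

```python
from itertools import groupby
from operator import itemgetter

data = '''\
ID date product
A 01/01/2018 1
A 01/01/2018 2
A 02/01/2018 2
B 01/01/2018 3
B 01/01/2018 4
B 02/01/2018 2
B 04/01/2018 1
B 04/01/2018 2
B 04/01/2018 3
'''

rows = (line.split() for line in data.splitlines())
# Skip the header row
next(rows)

nested = {}
# Outer grouping by ID, keeping the key this time
for id_, id_rows in groupby(rows, itemgetter(0)):
    # Inner grouping by date: {date: [products as int]}
    nested[id_] = {
        date: [int(r[-1]) for r in date_rows]
        for date, date_rows in groupby(id_rows, itemgetter(1))
    }
print(nested)
```

Calling `[list(d.values()) for d in nested.values()]` on this gives back the plain nested-list form shown above, since dicts preserve insertion order.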