Replace NaN with empty list in a pandas dataframe

This works using isnull and loc to mask the series:

In [90]:
d.loc[d.isnull()] = d.loc[d.isnull()].apply(lambda x: [])
d

Out[90]:
0    [1, 2, 3]
1       [1, 2]
2           []
3           []
dtype: object

In [91]:
d.apply(len)

Out[91]:
0    3
1    2
2    0
3    0
dtype: int64

You have to do this using apply in order for the list object to not be interpreted as an array to assign back to the df which will try to align the shape back to the original series

EDIT

Using your updated sample the following works:

In [100]:
d.loc[d['x'].isnull(),['x']] = d.loc[d['x'].isnull(),'x'].apply(lambda x: [])
d

Out[100]:
           x  y
0  [1, 2, 3]  1
1     [1, 2]  2
2         []  3
3         []  4

In [102]:    
d['x'].apply(len)

Out[102]:
0    3
1    2
2    0
3    0
Name: x, dtype: int64

Leave a Comment