Pandas: Find rows which don’t exist in another DataFrame by multiple columns

Since 0.17.0 there is a new indicator param you can pass to merge which will tell you whether the rows are only present in left, right or both:

In [5]:
merged = df.merge(other, how='left', indicator=True)
merged

Out[5]:
   col1 col2  extra_col     _merge
0     0    a       this  left_only
1     1    b         is       both
2     1    c       just  left_only
3     2    b  something  left_only

In [6]:    
merged[merged['_merge']=='left_only']

Out[6]:
   col1 col2  extra_col     _merge
0     0    a       this  left_only
2     1    c       just  left_only
3     2    b  something  left_only

So you can now filter the merged df by selecting only 'left_only' rows

Leave a Comment