There is no difference between df[condition1 & condition2] and df[(condition1) & (condition2)]. The difference arises when you write the comparisons inline, because the & operator binds more tightly than the comparison operators:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 10, size=(5, 3)), columns=list('abc'))
df
Out:
   a  b  c
0  5  0  3
1  3  7  9
2  3  5  2
3  4  7  6
4  8  8  1
condition1 = df['a'] > 3
condition2 = df['b'] < 5
df[condition1 & condition2]
Out:
   a  b  c
0  5  0  3
df[(condition1) & (condition2)]
Out:
   a  b  c
0  5  0  3
However, if you type it like this you’ll see an error:
df[df['a'] > 3 & df['b'] < 5]
Traceback (most recent call last):
  File "<ipython-input-7-9d4fd21246ca>", line 1, in <module>
    df[df['a'] > 3 & df['b'] < 5]
  File "/home/ayhan/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py", line 892, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
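This is because & has higher precedence than the comparison operators, so 3 & df['b'] is evaluated first (this corresponds to False & df.col2.isnull() in your example). Python then reads what is left as the chained comparison df['a'] > (3 & df['b']) < 5, which it evaluates as two comparisons joined by the and keyword. That and needs a single truth value for a Series, hence the ValueError. So you need to group the conditions in parentheses: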
df[(df['a'] > 3) & (df['b'] < 5)]
Out:
   a  b  c
0  5  0  3
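If you want to check the precedence behaviour yourself, here is a minimal sketch. It hard-codes the values from the frame printed above (the original was random, so this is only a stand-in for it), and the variable names tree, raised and result are just illustrative. It uses the standard ast module to show how Python parses the unparenthesized expression, then compares both ways of writing the mask.

```python
import ast

import pandas as pd

# Same values as the example frame above, hard-coded for reproducibility.
df = pd.DataFrame({'a': [5, 3, 3, 4, 8],
                   'b': [0, 7, 5, 7, 8],
                   'c': [3, 9, 2, 6, 1]})

# Parse the unparenthesized expression: it is ONE chained comparison
# (a > ... < 5) whose middle operand is the bitwise-and (3 & b).
tree = ast.parse("a > 3 & b < 5", mode="eval").body
is_chained = isinstance(tree, ast.Compare) and len(tree.ops) == 2
middle = tree.comparators[0]
middle_is_bitand = isinstance(middle, ast.BinOp) and isinstance(middle.op, ast.BitAnd)

# A chained comparison needs the truth value of its first half,
# which is a Series here, so pandas raises ValueError.
try:
    df[df['a'] > 3 & df['b'] < 5]
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison produces a boolean Series first,
# and & then combines the two masks element-wise.
result = df[(df['a'] > 3) & (df['b'] < 5)]

print(is_chained, middle_is_bitand, raised)
print(result)
```

Only the row with index 0 satisfies both conditions in this data, which matches the output shown above.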