This is a copy of the explanation on GitHub.
There is no guarantee that an inplace operation is actually faster. Often it is the same operation performed on a copy, with the top-level reference then reassigned.
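As a small illustration of the two spellings (a sketch only: the frame, the column name x and the choice of fillna are made up for this example and do not come from the question), both leave you with the same data:

```python
import pandas as pd

# Hypothetical toy frame; the None becomes NaN in a float column.
df_a = pd.DataFrame({"x": [1.0, None, 3.0]})
df_b = df_a.copy()

# Spelling 1: in-place, the name df_a keeps pointing at the modified frame.
df_a.fillna(0.0, inplace=True)

# Spelling 2: build a new frame and rebind the name to it.
df_b = df_b.fillna(0.0)

# Both routes end with identical contents.
print(df_a.equals(df_b))
```

This only shows that the results match. It says nothing about which one is faster, which depends on the operation and the pandas version.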
The reason for the difference in performance in this case is as follows.
The (df1-df2).dropna() call creates a slice of the dataframe. When you apply a new operation, this triggers a SettingWithCopy check because it could be a copy (but often is not).
This check must perform a garbage collection to clear some cached references before it can tell whether the object is a copy. Unfortunately, Python syntax makes this unavoidable.
You can avoid this by simply making a copy first.
df = (df1-df2).dropna().copy() followed by an inplace operation will be as performant as before.
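A minimal sketch of that pattern, assuming made-up data and using drop(columns=..., inplace=True) as a stand-in for whatever in-place operation you actually run (the frame sizes, column names and NaN placement are all hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: two 1000x4 frames of random floats.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((1000, 4)), columns=list("abcd"))
df2 = pd.DataFrame(rng.random((1000, 4)), columns=list("abcd"))
df1.iloc[::10, 0] = np.nan  # put NaN in every 10th row so dropna() removes rows

# Take an explicit copy so df is unambiguously its own object.
df = (df1 - df2).dropna().copy()

# The in-place operation now runs on that independent frame.
df.drop(columns=["d"], inplace=True)

# Same result written without inplace, for comparison.
expected = (df1 - df2).dropna().drop(columns=["d"])
print(df.shape, df.equals(expected))
```

To see the timing difference on your own pandas version, time the in-place call with and without the .copy() step. No timings are quoted here because they vary by version and data size.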
My personal opinion: I never use in-place operations. The syntax is harder to read and it does not offer any advantages.