pd.Timestamp versus np.datetime64: are they interchangeable for selected uses?

In my opinion, you should always prefer a pd.Timestamp – it can easily be converted back into a numpy datetime64 whenever one is needed.

numpy.datetime64 is essentially a thin wrapper around an int64. It has almost no date/time-specific functionality.
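To make the "thin wrapper" point concrete, here is a minimal sketch. The variable names are mine, and I pin the unit to nanoseconds explicitly so the integer is easy to read.

```python
import numpy as np

# One day after the Unix epoch, stored with nanosecond resolution.
one_day = np.datetime64('1970-01-02', 'ns')

# The underlying value is just a count of nanoseconds since the epoch.
as_int = one_day.astype('int64')

# Going the other way: an integer plus a unit gives back the datetime64.
back = np.datetime64(int(as_int), 'ns')

# There are no calendar accessors such as .year on the numpy scalar.
has_year = hasattr(one_day, 'year')
```

Here `as_int` is 86400 * 10**9, and `has_year` comes out False, which is what I mean by "almost no date/time-specific functionality".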

pd.Timestamp is a wrapper around a numpy.datetime64. It is backed by the same int64 value, but supports the entire datetime.datetime interface, along with useful pandas-specific functionality.
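A rough illustration of what the Timestamp box adds, and of the round trip back to numpy. The sample date and variable names are only for illustration, and I cast to nanoseconds before comparing integers because newer pandas versions may store a Timestamp at a coarser resolution.

```python
import datetime
import numpy as np
import pandas as pd

ts = pd.Timestamp('2011-01-02 03:04:05')

# Timestamp behaves like a datetime.datetime ...
is_datetime = isinstance(ts, datetime.datetime)
year, hour = ts.year, ts.hour

# ... and adds pandas-specific helpers on top.
floored = ts.floor('D')      # midnight of the same day
weekday_name = ts.day_name()

# Converting to numpy and back does not lose anything.
dt64 = ts.to_datetime64()
round_trip = pd.Timestamp(dt64)

# Both sides agree on the nanoseconds-since-epoch integer.
ns_from_pandas = ts.value
ns_from_numpy = dt64.astype('datetime64[ns]').astype('int64')
```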

The in-array representation of these two is identical – it is a contiguous array of int64s. pd.Timestamp is a scalar box that makes working with individual values easier.
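You can check this yourself with a small made-up index (the `idx` below is a stand-in, not the data from the question). I avoid asserting a particular unit, since the default resolution can differ between pandas versions.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2011-01-01', periods=3, freq='D')

# The backing store is a plain numpy datetime64 array, 8 bytes per element.
raw = idx.values
item_size = raw.dtype.itemsize

# Reinterpreting the same buffer as int64 needs no copy.
as_ints = raw.view('int64')
same_buffer = np.shares_memory(raw, as_ints)

# Pulling one element out of the index boxes it as a Timestamp,
# while indexing the raw array gives a numpy scalar.
boxed = idx[0]
unboxed = raw[0]
```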

Going back to the linked answer, you could write it like this, which is shorter and happens to be faster.

%timeit (df.index.values >= pd.Timestamp('2011-01-02').to_datetime64()) & \
        (df.index.values < pd.Timestamp('2011-01-03').to_datetime64())
192 µs ± 6.78 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
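If you want to try the filter outside IPython, here is a self-contained sketch. The `df` is a hypothetical frame I made up with a 6-hourly index, since the original data is not shown here; it also checks that comparing against the Timestamp directly produces the same mask.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three days of 6-hourly rows starting 2011-01-01.
index = pd.date_range('2011-01-01', periods=12, freq=pd.Timedelta(hours=6))
df = pd.DataFrame({'x': np.arange(12)}, index=index)

start = pd.Timestamp('2011-01-02')
stop = pd.Timestamp('2011-01-03')

# Same expression as in the timing above: raw numpy array vs. datetime64 scalars.
mask_numpy = (df.index.values >= start.to_datetime64()) & \
             (df.index.values < stop.to_datetime64())

# Equivalent comparison that stays in pandas and uses the Timestamps as-is.
mask_pandas = (df.index >= start) & (df.index < stop)

selected = df[mask_numpy]
```

Both masks pick out the four rows that fall on 2011-01-02. I have not timed the two variants on this toy frame, so the number above applies only to the original data.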
