Query for count of distinct values in a rolling date range

Test case:

CREATE TABLE tbl (date date, email text);

INSERT INTO tbl VALUES
  ('2012-01-01', '[email protected]')
, ('2012-01-01', '[email protected]')
, ('2012-01-01', '[email protected]')
, ('2012-01-02', '[email protected]')
, ('2012-01-02', '[email protected]')
, ('2012-01-03', '[email protected]')
, ('2012-01-04', '[email protected]')
, ('2012-01-05', '[email protected]')
, ('2012-01-05', '[email protected]')
, ('2012-01-06', '[email protected]')
, ('2012-01-06', '[email protected]')
, ('2012-01-06', '[email protected]')
;

Query – returns only days where …
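
For readers who want to check the idea outside the database, here is a minimal Python sketch of a rolling distinct count. The sample rows, the placeholder addresses, the helper name rolling_distinct and the 3-day window are all assumptions made for illustration; none of them come from the answer, whose query is cut off above.

```python
from datetime import date, timedelta

# Hypothetical sample rows (date, email); the addresses are placeholders.
rows = [
    (date(2012, 1, 1), "a@example.com"),
    (date(2012, 1, 1), "b@example.com"),
    (date(2012, 1, 2), "a@example.com"),
    (date(2012, 1, 4), "c@example.com"),
    (date(2012, 1, 5), "a@example.com"),
    (date(2012, 1, 5), "d@example.com"),
]


def rolling_distinct(rows, days=3):
    """Map each date present in rows to the number of distinct emails
    seen in the trailing window [d - (days - 1), d]."""
    result = {}
    for d in sorted({day for day, _ in rows}):
        start = d - timedelta(days=days - 1)
        # A set comprehension removes duplicate emails inside the window.
        result[d] = len({email for day, email in rows if start <= day <= d})
    return result


counts = rolling_distinct(rows, days=3)
for day, n in counts.items():
    print(day, n)
```

Like the query described above, this only reports days that actually appear in the data; days with no rows (2012-01-03 in the sample) are skipped. The nested scan is quadratic, which is fine for a sanity check but not for large tables.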

Pandas rolling apply using multiple columns

How about this:

def masscenter(ser):
    print(df.loc[ser.index])
    return 0

rol = df.price.rolling(window=2)
rol.apply(masscenter, raw=False)

It uses the rolling logic to get subsets from an arbitrary column. The raw=False option provides you with index values for those subsets (which are given to you as Series), then you use those index values to get multi-column slices from your …
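
To make the index trick concrete, here is a small sketch that computes a volume-weighted price over each window. The DataFrame contents, the volume column and the function name weighted_price are assumptions for illustration; only the price column and the rolling(...).apply(..., raw=False) pattern come from the answer.

```python
import pandas as pd

# Hypothetical data: the volume column and all values are assumptions.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "volume": [1.0, 3.0, 1.0, 1.0],
})


def weighted_price(ser):
    # With raw=False, ser is a Series, so ser.index holds the row labels
    # of the current window; use them to pull every column for that window.
    window = df.loc[ser.index]
    return (window["price"] * window["volume"]).sum() / window["volume"].sum()


vwap = df["price"].rolling(window=2).apply(weighted_price, raw=False)
print(vwap)
```

The first value is NaN because the window of 2 is not yet full. The function reads df from the enclosing scope, so this approach is convenient but slow on large frames, since it runs a Python function and a label lookup for every window.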

Python – rolling functions for GroupBy object

For the Googlers who come upon this old question:

Regarding @kekert's comment on @Garrett's answer, which suggests using the new

df.groupby('id')['x'].rolling(2).mean()

rather than the now-deprecated

df.groupby('id')['x'].apply(pd.rolling_mean, 2, min_periods=1)

Curiously, it seems that the new .rolling().mean() approach returns a multi-indexed series, indexed by the groupby column first and then by the original index, whereas the old approach would simply …
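
A short sketch of the multi-index behaviour and one common way to get back to the original row labels. The sample data and the x_roll column name are assumptions; min_periods=1 is carried over from the old call quoted above, and dropping the group level with reset_index is one option among several, not necessarily what the answer goes on to recommend.

```python
import pandas as pd

# Hypothetical data; the column names id and x follow the answer.
df = pd.DataFrame({
    "id": ["a", "a", "b", "b", "a"],
    "x": [1.0, 3.0, 10.0, 20.0, 5.0],
})

# The result is indexed by (id, original row label).
rolled = df.groupby("id")["x"].rolling(2, min_periods=1).mean()
print(rolled.index.nlevels)

# Drop the group level so the values align with df's own index again.
df["x_roll"] = rolled.reset_index(level=0, drop=True)
print(df)
```

Because the assignment aligns on index labels, the rows end up in their original order even though the grouped result lists all of group a before group b.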

Pandas: rolling mean by time interval

In the meantime, a time-window capability was added. See this link.

In [1]: df = DataFrame({'B': range(5)})

In [2]: df.index = [Timestamp('20130101 09:00:00'),
   ...:             Timestamp('20130101 09:00:02'),
   ...:             Timestamp('20130101 09:00:03'),
   ...:             Timestamp('20130101 09:00:05'),
   ...:             Timestamp('20130101 09:00:06')]

In [3]: df
Out[3]:
                     B
2013-01-01 09:00:00  0
2013-01-01 09:00:02  1
2013-01-01 09:00:03  2
2013-01-01 09:00:05  3
2013-01-01 09:00:06  4

…
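
As a self-contained sketch of the time-window feature, the same frame can be rolled over a 2-second window by passing an offset string to rolling. The "2s" window length and the variable name rolling_mean are assumptions chosen for illustration; the excerpt is cut off before it shows which window the answer uses.

```python
import pandas as pd

# Same data as the session above, built with explicit pandas imports.
df = pd.DataFrame(
    {"B": range(5)},
    index=pd.to_datetime([
        "2013-01-01 09:00:00",
        "2013-01-01 09:00:02",
        "2013-01-01 09:00:03",
        "2013-01-01 09:00:05",
        "2013-01-01 09:00:06",
    ]),
)

# An offset string makes the window time-based: each row averages all rows
# whose timestamp lies in the 2 seconds up to and including its own
# (the left edge is excluded by default). Requires a sorted datetime index.
rolling_mean = df.rolling("2s").mean()
print(rolling_mean)
```

Unlike an integer window, the number of rows per window varies here: the row at 09:00:03 averages two observations, while the row at 09:00:05 only sees itself.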

How to calculate rolling / moving average using python + NumPy / SciPy?

If you just want a straightforward non-weighted moving average, you can easily implement it with np.cumsum, which may be faster than FFT-based methods.

EDIT: Corrected an off-by-one indexing error in the code, spotted by Bean.

def moving_average(a, n=3):
    ret = np.cumsum(a, dtype=float)
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] …
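
The excerpt cuts off inside the return statement, so the following is a runnable sketch of the cumulative-sum technique rather than a copy of the answer's code. The final division by n is my own completion (a mean of n values needs it), and the np.convolve cross-check is added only for verification.

```python
import numpy as np


def moving_average(a, n=3):
    # Running totals: csum[i] = a[0] + ... + a[i]
    csum = np.cumsum(a, dtype=float)
    # Subtracting the total from n positions earlier leaves the sum of
    # the last n elements at every position i >= n.
    csum[n:] = csum[n:] - csum[:-n]
    # Positions before n - 1 do not cover a full window; drop them.
    # Dividing by n is an assumed completion of the truncated line.
    return csum[n - 1:] / n


values = np.arange(1, 7)  # 1, 2, 3, 4, 5, 6
result = moving_average(values, n=3)

# Cross-check against a direct convolution with a uniform kernel.
expected = np.convolve(values, np.ones(3) / 3, mode="valid")
print(result)
print(np.allclose(result, expected))
```

The output has len(a) - n + 1 elements, one per full window. For very long inputs or large values, the running total can lose floating-point precision, which is a trade-off of this approach compared with summing each window directly.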