group-by
Get the row corresponding to the max in pandas GroupBy [duplicate]
Try sort_values followed by drop_duplicates:

    df.sort_values('B').drop_duplicates(['A'], keep='last')
    Out[127]:
       A  B  C
    1  1  1  b
    3  2  3  d
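A minimal, self-contained sketch of the approach above. The sample frame is an assumption, chosen so that it reproduces the output shown. The `idxmax` variant is a commonly used alternative and is not part of the original answer.

```python
import pandas as pd

# Hypothetical data, chosen so the result matches the output shown above.
df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [0, 1, 2, 3], "C": list("abcd")})

# Sort by B so the largest B in each A group comes last, then keep that last row.
by_sort = df.sort_values("B").drop_duplicates(["A"], keep="last")

# Alternative: find the index label of the max B per group and select those rows.
by_idxmax = df.loc[df.groupby("A")["B"].idxmax()]

print(by_sort)
```

Both variants return one row per group and keep the original index labels.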
LINQ to Entities group-by failure using .date
Use the EntityFunctions.TruncateTime method:

    var myQuery = from p in dbContext.Trends
                  group p by EntityFunctions.TruncateTime(p.UpdateDateTime) into g
                  select new { k = g.Key, ud = g.Max(p => p.Amount) };
sql query – how to apply limit within group by
Here's a fairly portable query to do what you want:

    SELECT *
    FROM table1 a
    WHERE a."ROWID" IN (
        SELECT b."ROWID"
        FROM table1 b
        WHERE b."Score" >= 20
          AND b."ROWID" IS NOT NULL
          AND a."CID" = b."CID"
        ORDER BY b."CID", b."SortKey"
        LIMIT 2
    )
    ORDER BY a."CID", a."SortKey";

The query uses a correlated subquery with …
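To try the query without a database server, here is a small sketch that runs it against an in-memory SQLite database using Python's standard `sqlite3` module. The table contents are made up for illustration, and SQLite is only one engine that accepts `LIMIT` inside a correlated `IN` subquery; other engines may reject it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 ("CID" INTEGER, "Score" INTEGER, "SortKey" INTEGER)')

# Hypothetical rows: (CID, Score, SortKey)
rows = [
    (1, 10, 0), (1, 25, 1), (1, 30, 2), (1, 40, 3),
    (2, 22, 1), (2, 5, 2),
    (3, 50, 1), (3, 60, 2), (3, 70, 3),
]
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)

# For each CID, keep at most two rows with Score >= 20, lowest SortKey first.
query = '''
SELECT a."CID", a."Score", a."SortKey"
FROM table1 a
WHERE a."ROWID" IN (
    SELECT b."ROWID"
    FROM table1 b
    WHERE b."Score" >= 20
      AND b."ROWID" IS NOT NULL
      AND a."CID" = b."CID"
    ORDER BY b."CID", b."SortKey"
    LIMIT 2
)
ORDER BY a."CID", a."SortKey"
'''
top_two = conn.execute(query).fetchall()
print(top_two)
```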
group by range in mysql
Here is general code to group by range, since a CASE statement gets pretty cumbersome. Use the FLOOR function to find the bottom of the range (not ROUND, as Bohemian used), and add the amount (19 in the example below) to find the top of the range. Remember to not overlap the …
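The original query is cut off here, so the following is only an illustration of the floor arithmetic in Python, not the author's SQL. The bucket width of 20 is an assumption inferred from the "19" mentioned above, and the scores are invented.

```python
import math
from collections import Counter

def bucket(value, width=20):
    """Return the (bottom, top) of the range containing value."""
    bottom = math.floor(value / width) * width  # floor, not round
    top = bottom + width - 1                    # e.g. bottom + 19 for width 20
    return bottom, top

# Hypothetical scores to group.
scores = [3, 19, 20, 25, 39, 40, 41, 59]

counts = Counter(bucket(s) for s in scores)
range_counts = {f"{bottom}-{top}": n for (bottom, top), n in sorted(counts.items())}
print(range_counts)
```

Because each top is one less than the next bottom, the ranges do not overlap.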
Why doesn’t Oracle SQL allow us to use column aliases in GROUP BY clauses?
It isn't just Oracle SQL; in fact, I believe it conforms to the ANSI SQL standard (though I don't have a reference for that). The reason is that the SELECT clause is logically processed after the GROUP BY clause, so at the time the GROUP BY is done the aliases don't yet exist. Perhaps …
how to convert monthly data to quarterly in pandas
You can use pd.PeriodIndex(..., freq='Q') in conjunction with groupby(..., axis=1):

    In [63]: df
    Out[63]:
       1996-04  1996-05  2000-07  2000-08  2010-10  2010-11  2010-12
    0        1        2        3        4        1        1        1
    1       25       19       37       40        1        2        3
    2       10       20       30       40        4        4        5

    In [64]: df.groupby(pd.PeriodIndex(df.columns, freq='Q'), axis=1).mean()
    Out[64]:
       1996Q2  2000Q3  2010Q4
    0 …
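Note that `axis=1` in `DataFrame.groupby` was deprecated in pandas 2.1. A sketch of the same idea that avoids it is to transpose, group the rows, and transpose back. The frame below rebuilds the sample data shown above; the transpose approach is a substitute for the original call, not the answer's own code.

```python
import pandas as pd

# Sample data as shown above: monthly columns, three rows.
df = pd.DataFrame(
    [[1, 2, 3, 4, 1, 1, 1],
     [25, 19, 37, 40, 1, 2, 3],
     [10, 20, 30, 40, 4, 4, 5]],
    columns=["1996-04", "1996-05", "2000-07", "2000-08",
             "2010-10", "2010-11", "2010-12"],
)

# Map each monthly column label to its quarter.
quarters = pd.PeriodIndex(df.columns, freq="Q")

# Group the transposed rows (the original columns) by quarter, then flip back.
quarterly = df.T.groupby(quarters).mean().T
print(quarterly)
```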
Pandas groupby and aggregation output should include all the original columns (including the ones not aggregated on)
agg with a dict of functions

Create a dict of functions and pass it to agg. You'll also need as_index=False to prevent the group columns from becoming the index in your output.

    f = {'NET_AMT': 'sum', 'QTY_SOLD': 'sum', 'UPC_DSC': 'first'}
    df.groupby(['month', 'UPC_ID'], as_index=False).agg(f)

         month  UPC_ID UPC_DSC  NET_AMT  QTY_SOLD
    0  2017.02     111   desc1       10         2
    1 …
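A runnable sketch of the same call. The input rows are invented for illustration (the original question's data is not shown here), and `month` is stored as a string purely to keep the example simple.

```python
import pandas as pd

# Hypothetical input; only the column names come from the answer above.
df = pd.DataFrame({
    "month": ["2017.02", "2017.02", "2017.03"],
    "UPC_ID": [111, 111, 222],
    "UPC_DSC": ["desc1", "desc1", "desc2"],
    "NET_AMT": [4, 6, 7],
    "QTY_SOLD": [1, 1, 3],
})

# Sum the numeric columns; carry the description along by taking the first value.
f = {"NET_AMT": "sum", "QTY_SOLD": "sum", "UPC_DSC": "first"}
out = df.groupby(["month", "UPC_ID"], as_index=False).agg(f)
print(out)
```

Using `'first'` for a non-aggregated column is safe only when that column is constant within each group, as the description is here.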
Pandas groupby and qcut
    import pandas as pd
    df = pd.DataFrame({'A': 'foo foo foo bar bar bar'.split(),
                       'B': [0.1, 0.5, 1.0]*2})

    df['C'] = df.groupby(['A'])['B'].transform(
        lambda x: pd.qcut(x, 3, labels=range(1,4)))
    print(df)

yields

         A    B  C
    0  foo  0.1  1
    1  foo  0.5  2
    2  foo  1.0  3
    3  bar  0.1  1
    4  bar  0.5  2
    5  bar  1.0  3
What is the correct way to do a HAVING in a MongoDB GROUP BY?
New answer using Mongo aggregation framework

After this question was asked and answered, 10gen released MongoDB version 2.2 with an aggregation framework. The new best way to do this query is:

    db.col.aggregate(
        [
            { $group: { _id: { userId: "$userId", name: "$name" }, count: { $sum: 1 } } },
            { $match: { count: { …
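The pipeline above is cut off, so its actual `$match` condition is unknown. As a rough illustration of the group-then-filter idea (the equivalent of SQL's HAVING), here is a plain-Python sketch over in-memory documents. The documents and the `MIN_COUNT` threshold are hypothetical, and this does not call MongoDB.

```python
from collections import Counter

# Hypothetical documents standing in for the collection.
docs = [
    {"userId": 1, "name": "ann"},
    {"userId": 1, "name": "ann"},
    {"userId": 2, "name": "bob"},
    {"userId": 3, "name": "cy"},
    {"userId": 3, "name": "cy"},
    {"userId": 3, "name": "cy"},
]

MIN_COUNT = 1  # hypothetical threshold; the original condition is elided

# "$group" step: count documents per (userId, name) pair.
counts = Counter((d["userId"], d["name"]) for d in docs)

# "$match" step: keep only the groups whose count exceeds the threshold.
having = [
    {"_id": {"userId": user_id, "name": name}, "count": n}
    for (user_id, name), n in counts.items()
    if n > MIN_COUNT
]
print(having)
```

The key point is the order of the stages: the filter runs on the grouped results, not on the input documents.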