How to group by a Calculated Field

Sure, just add the same calculation to the GROUP BY clause:

    select dateadd(day, -7, Convert(DateTime, mwspp.DateDue) + (7 - datepart(weekday, mwspp.DateDue))),
           sum(mwspp.QtyRequired)
    from manufacturingweekshortagepartpurchasing mwspp
    where mwspp.buildScheduleSimID = 10109
      and mwspp.partID = 8366
    group by dateadd(day, -7, Convert(DateTime, mwspp.DateDue) + (7 - datepart(weekday, mwspp.DateDue)))
    order by dateadd(day, -7, Convert(DateTime, mwspp.DateDue) + (7 - datepart(weekday, mwspp.DateDue)))

…
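As a rough, runnable illustration of repeating the calculated expression in GROUP BY, here is a sketch using Python's built-in sqlite3 module rather than SQL Server. The `shortage` table, its columns, and the sample rows are hypothetical, and the week-start expression uses SQLite's date functions, so it only approximates the `dateadd`/`datepart` calculation above.

```python
import sqlite3

# Hypothetical in-memory table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shortage (date_due TEXT, qty_required INTEGER)")
conn.executemany(
    "INSERT INTO shortage VALUES (?, ?)",
    [
        ("2024-01-01", 5),  # a Monday
        ("2024-01-03", 7),  # Wednesday of the same week
        ("2024-01-08", 2),  # Monday of the following week
    ],
)

# strftime('%w', d) is the weekday number (0 = Sunday); subtracting that
# many days yields the Sunday on or before d.
week_start = "date(date_due, '-' || strftime('%w', date_due) || ' days')"

# The same expression appears in SELECT, GROUP BY and ORDER BY.
rows = conn.execute(
    f"""
    SELECT {week_start} AS week_start, SUM(qty_required) AS total
    FROM shortage
    GROUP BY {week_start}
    ORDER BY {week_start}
    """
).fetchall()

for week, total in rows:
    print(week, total)
```

Keeping the expression in one Python string avoids the three copies drifting apart; in plain SQL, a derived table or CTE that computes the column once serves the same purpose.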

How to get number of groups in a groupby object in pandas?

Simple, Fast, and Pandaic: ngroups

Newer versions of the groupby API (pandas >= 0.23) provide this (undocumented) attribute, which stores the number of groups in a GroupBy object.

    # setup
    import pandas as pd
    df = pd.DataFrame({'A': list('aabbcccd')})
    dfg = df.groupby('A')

    # call `.ngroups` on the GroupBy object
    dfg.ngroups
    # 4

Note that this is different from GroupBy.groups which …
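For comparison, here is a small sketch that counts the groups three ways on the same toy frame. The `len()` and `nunique()` variants are additions for illustration, not part of the answer above, and the variable names are made up.

```python
import pandas as pd

df = pd.DataFrame({"A": list("aabbcccd")})
dfg = df.groupby("A")

# Attribute on the GroupBy object, as described above.
n_attr = dfg.ngroups

# len() of a GroupBy object also reports the number of groups.
n_len = len(dfg)

# Counting distinct key values directly on the column, without grouping.
n_unique = df["A"].nunique()

print(n_attr, n_len, n_unique)
```

All three agree here because the key column has no missing values; with NaN keys the results depend on the `dropna` settings, so it is worth checking on real data.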

Pandas: groupby forward fill with datetime index

One way is to use the transform function to fill the value column after group by:

    import pandas as pd
    a['value'] = a.groupby('company')['value'].transform(lambda v: v.ffill())
    a
    #           company  value
    #level_1
    #2010-01-01       a    1.0
    #2010-01-01       b   12.0
    #2011-01-01       a    2.0
    #2011-01-01       b   12.0
    #2012-01-01       a    2.0
    #2012-01-01       b   14.0

To compare, the original data frame looks …
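A self-contained sketch of the per-group forward fill follows. The input frame is a guess at what the unfilled data might look like (the excerpt does not show it), and it calls `ffill()` directly on the grouped column, which should give the same result as the `transform` version for this case.

```python
import numpy as np
import pandas as pd

# Hypothetical input: one row per company per date, with gaps in `value`.
idx = pd.to_datetime(
    ["2010-01-01", "2010-01-01", "2011-01-01",
     "2011-01-01", "2012-01-01", "2012-01-01"]
)
a = pd.DataFrame(
    {
        "company": ["a", "b", "a", "b", "a", "b"],
        "value": [1.0, 12.0, 2.0, np.nan, np.nan, 14.0],
    },
    index=idx,
)

# Forward fill within each company, so a gap in "b" is never filled
# from a row belonging to "a" (and vice versa).
filled = a.groupby("company")["value"].ffill()

print(filled)
```

A plain `a['value'].ffill()` would instead carry values across companies, which is usually not what is wanted when rows for several groups are interleaved by date.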

SQL query with avg and group by

If I understand what you need, try this:

    SELECT id, pass, AVG(val) AS val_1
    FROM data_r1
    GROUP BY id, pass;

Or, if you want just one row for every id, this:

    SELECT d1.id,
        (SELECT IFNULL(ROUND(AVG(d2.val), 4), 0)
         FROM data_r1 d2
         WHERE d2.id = d1.id AND pass = 1) as val_1,
        (SELECT IFNULL(ROUND(AVG(d2.val), 4), 0)
         FROM …
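To try both shapes of result quickly, here is a sketch against an in-memory SQLite database. The sample rows and the `val_2` column are made up, and the one-row-per-id query uses conditional aggregation (`AVG(CASE WHEN ...)`) as an alternative to the correlated subqueries above, not as a copy of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_r1 (id INTEGER, pass INTEGER, val REAL)")
conn.executemany(
    "INSERT INTO data_r1 VALUES (?, ?, ?)",
    [(1, 1, 2.0), (1, 1, 4.0), (1, 2, 10.0), (2, 1, 6.0)],  # hypothetical data
)

# One row per (id, pass) pair.
per_pair = conn.execute(
    "SELECT id, pass, AVG(val) AS val_1 "
    "FROM data_r1 GROUP BY id, pass ORDER BY id, pass"
).fetchall()

# One row per id: AVG ignores NULLs, so each CASE picks out a single pass.
# IFNULL turns "no rows for that pass" into 0.
per_id = conn.execute(
    """
    SELECT id,
           IFNULL(ROUND(AVG(CASE WHEN pass = 1 THEN val END), 4), 0) AS val_1,
           IFNULL(ROUND(AVG(CASE WHEN pass = 2 THEN val END), 4), 0) AS val_2
    FROM data_r1
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(per_pair)
print(per_id)
```

Conditional aggregation scans the table once, whereas each correlated subquery is evaluated per outer row; which one reads better or runs faster depends on the data and the database.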

Aggregated query without GROUP BY

A change was made in MySQL 5.7.5: by default, it now rejects queries that aggregate using a function (sum, avg, max, etc.) in the SELECT clause but fail to put the non-aggregated fields in the GROUP BY clause. This behavior is part and parcel of every other RDBMS and MySQL is …

Scala-Spark Dynamically call groupby and agg with parameter values

Your code is almost correct, with two issues:

1. The return type of your function is DataFrame, but the last line is aggregated.show(), which returns Unit. Remove the call to show to return aggregated itself, or just return the result of agg immediately.
2. DataFrame.groupBy expects arguments as follows: col1: String, cols: String*, so you …