Pandas groupby to to_csv

Try doing this:

    week_grouped = df.groupby('week')
    week_grouped.sum().reset_index().to_csv('week_grouped.csv')

That'll write the entire dataframe to the file. If you only want those two columns, then:

    week_grouped = df.groupby('week')
    week_grouped.sum().reset_index()[['week', 'count']].to_csv('week_grouped.csv')

Here's a line-by-line explanation of the original code:

    # This creates a "groupby" object (not a dataframe object)
    # and you store it in the …
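A minimal sketch of the same idea, assuming a small made-up dataframe: only the `week` and `count` column names come from the answer, while the `other` column and all values are hypothetical. It writes to an in-memory buffer rather than a file, and passes `index=False` (an addition not in the original answer) so the default integer index is not written as an extra unnamed column.

```python
import io

import pandas as pd

# Hypothetical data: 'week' and 'count' follow the answer; 'other' and the
# values are invented for illustration.
df = pd.DataFrame({
    "week": [1, 1, 2, 2],
    "count": [3, 4, 5, 6],
    "other": [10, 20, 30, 40],
})

week_grouped = df.groupby("week")           # a GroupBy object, not a DataFrame
summed = week_grouped.sum().reset_index()   # aggregate, then make 'week' a column again

# Write to an in-memory buffer instead of a file so the sketch has no side effects.
buffer = io.StringIO()
summed[["week", "count"]].to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Replacing `buffer` with a file name such as `'week_grouped.csv'` gives the file-based version from the answer.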

Bar graph from dataframe groupby

Copy the data from the OP and run df = pd.read_clipboard(), then plot using pandas.DataFrame.plot. Updated to pandas v1.2.4 and matplotlib v3.3.4. Then, using your code:

    df = df.replace(np.nan, 0)
    dfg = df.groupby(['home_team'])['arrests'].mean()
    dfg.plot(kind='bar', title='Arrests', ylabel='Mean Arrests', xlabel='Home Team', figsize=(6, 5))
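One detail of that snippet worth seeing in isolation is the effect of replacing NaN with 0 before taking the mean. The sketch below uses made-up team names and values (only the `home_team` and `arrests` column names come from the answer) and leaves out the plotting call so it needs pandas and NumPy only.

```python
import numpy as np
import pandas as pd

# Hypothetical values; only the column names come from the answer.
df = pd.DataFrame({
    "home_team": ["Arsenal", "Arsenal", "Chelsea", "Chelsea"],
    "arrests": [4.0, np.nan, 2.0, 6.0],
})

# Without filling, mean() skips NaN, so Arsenal is averaged over one row.
mean_skipping_nan = df.groupby("home_team")["arrests"].mean()

# After replacing NaN with 0, the missing row counts as zero arrests.
mean_with_zeros = df.replace(np.nan, 0).groupby("home_team")["arrests"].mean()

print(mean_skipping_nan)
print(mean_with_zeros)
```

Either series can then be passed to `.plot(kind='bar', ...)` as in the answer. Which mean is appropriate depends on whether a missing value really means zero arrests.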

pandas groupby where you get the max of one column and the min of another column

Use groupby + agg with a dict, then order the columns, either by selecting a subset or with reindex (the original answer used reindex_axis, which was removed in pandas 1.0). Finally, add reset_index to convert the index to a column if necessary.

    df = a.groupby('user').agg({'num1':'min', 'num2':'max'})[['num1','num2']].reset_index()
    print (df)
      user  num1  num2
    0    a     1     3
    1    b     4     5

This is the same as:

    df = (a.groupby('user')
           .agg({'num1':'min', 'num2':'max'})
           .reindex(columns=['num1','num2'])
           .reset_index())
    print …
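A runnable sketch of the dict-based aggregation, assuming input data that is not shown in the excerpt: the frame `a` below is hypothetical and was chosen only so that the result matches the printed output. For comparison it also shows named aggregation (available since pandas 0.25), which is an alternative and not part of the original answer.

```python
import pandas as pd

# Hypothetical input chosen to reproduce the output shown in the excerpt.
a = pd.DataFrame({
    "user": ["a", "a", "b", "b"],
    "num1": [1, 2, 4, 5],
    "num2": [3, 2, 5, 4],
})

# Dict-based aggregation: min of num1 and max of num2 per user.
result = (
    a.groupby("user")
     .agg({"num1": "min", "num2": "max"})[["num1", "num2"]]
     .reset_index()
)

# Alternative: named aggregation fixes the output column order explicitly.
named = (
    a.groupby("user")
     .agg(num1=("num1", "min"), num2=("num2", "max"))
     .reset_index()
)

print(result)
```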

Pandas groupby with categories with redundant nan

Since pandas 0.23.0, the groupby method accepts an observed parameter, which fixes this issue when set to True (the default at that release was False). Below is the exact same code as in the question with just observed=True added:

    import pandas as pd

    group_cols = ['Group1', 'Group2', 'Group3']
    df = pd.DataFrame([['A', 'B', 'C', 54.34], …
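Because the code from the question is cut off above, here is a separate minimal sketch of what `observed` changes. The data, the `Value` column, and the use of only two grouping columns are assumptions made for illustration; `observed` is passed explicitly in both calls so the result does not depend on the installed version's default.

```python
import pandas as pd

# Hypothetical data: two categorical grouping columns and one value column.
df = pd.DataFrame({
    "Group1": pd.Categorical(["A", "A", "B"], categories=["A", "B"]),
    "Group2": pd.Categorical(["X", "X", "Y"], categories=["X", "Y"]),
    "Value": [1.0, 2.0, 3.0],
})

# observed=False keeps every category combination, including the
# unobserved (A, Y) and (B, X), whose mean is NaN.
all_combos = df.groupby(["Group1", "Group2"], observed=False)["Value"].mean()

# observed=True keeps only the combinations that occur in the data.
observed_only = df.groupby(["Group1", "Group2"], observed=True)["Value"].mean()

print(all_combos)
print(observed_only)
```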

Group dataframe and get sum AND count?

Try this:

    In [110]: (df.groupby('Company Name')
       .....:    .agg({'Organisation Name':'count', 'Amount': 'sum'})
       .....:    .reset_index()
       .....:    .rename(columns={'Organisation Name':'Organisation Count'})
       .....: )
    Out[110]:
              Company Name   Amount  Organisation Count
    0  Vifor Pharma UK Ltd  4207.93                   5

or, if you don't want to reset the index:

    df.groupby('Company Name')['Amount'].agg(['sum','count'])

or

    df.groupby('Company Name').agg({'Amount': ['sum','count']})

Demo:

    In [98]: df.groupby('Company Name')['Amount'].agg(['sum','count'])
    Out[98]:
    sum count Company …
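The same calls as a self-contained script, assuming a small invented dataset: the column names come from the answer, but the company names, organisation names, and amounts are hypothetical.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the answer.
df = pd.DataFrame({
    "Company Name": ["Acme Ltd", "Acme Ltd", "Beta plc"],
    "Organisation Name": ["Org 1", "Org 2", "Org 3"],
    "Amount": [100.0, 50.5, 20.0],
})

# Count organisations and sum amounts per company, as flat columns.
summary = (
    df.groupby("Company Name")
      .agg({"Organisation Name": "count", "Amount": "sum"})
      .reset_index()
      .rename(columns={"Organisation Name": "Organisation Count"})
)

# Sum and count of Amount only, keeping the company name as the index.
per_amount = df.groupby("Company Name")["Amount"].agg(["sum", "count"])

print(summary)
print(per_amount)
```

Note that `count` ignores missing values, so counting `Organisation Name` and counting `Amount` can differ when either column contains NaN.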

Transform vs. aggregate in Pandas

Consider the dataframe df:

    df = pd.DataFrame(dict(A=list('aabb'), B=[1, 2, 3, 4], C=[0, 9, 0, 9]))

groupby followed by an aggregation is the standard usage; it returns one row per group:

    df.groupby('A').mean()

Maybe you want these values broadcast across the whole group, returning something with the same index as what you started with. Use transform:

    df.groupby('A').transform('mean')
    df.set_index('A').groupby(level="A").transform('mean')

agg is used when you have specific …
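A short runnable comparison using the dataframe from the answer. The final step, subtracting the group mean from each row, is an added illustration of why the broadcast shape is useful and is not part of the original answer; `B_centered` is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame(dict(A=list("aabb"), B=[1, 2, 3, 4], C=[0, 9, 0, 9]))

# Aggregation: one row per group, indexed by the group key.
aggregated = df.groupby("A").mean()

# Transform: group means broadcast back onto the original index.
transformed = df.groupby("A").transform("mean")

# Illustration (not from the answer): center B within each group.
B_centered = df["B"] - df.groupby("A")["B"].transform("mean")

print(aggregated)
print(transformed)
print(B_centered)
```

The aggregated result has two rows (one per value of `A`), while the transformed result has four rows that line up with `df`, which is what makes row-wise arithmetic against the original columns possible.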