Pandas Groupby / List to Multiple Rows

IIUC, I think you can do it like this:

    dfg = df.groupby(['AccountID', 'Last Name',
                      df.groupby(['AccountID', 'Last Name']).cumcount() + 1]).first().unstack()
    dfg.columns = [f'{i}{j}' for i, j in dfg.columns]
    df_out = dfg.sort_index(axis=1, key=lambda x: x.str[-1])
    df_out.reset_index()

Output:

    AccountID Last Name Contract1 First Name1 Address1 City1 State1 Contract2 First Name2 Address2 City2 State2 Contract3 First Name3 Address3 City3 …
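
To make the reshaping above easier to follow, here is a minimal sketch on a small made-up frame. The sample rows and the reduced column set (Contract, City) are assumptions for illustration, not the asker's data. The stable `kind="mergesort"` argument is also an addition, used so that columns sharing a suffix keep their original relative order.

```python
import pandas as pd

# Hypothetical sample data: account 1 has two contracts, account 2 has one.
df = pd.DataFrame({
    "AccountID": [1, 1, 2],
    "Last Name": ["Smith", "Smith", "Jones"],
    "Contract": ["A", "B", "C"],
    "City": ["Austin", "Boston", "Chicago"],
})

# Number the rows within each (AccountID, Last Name) group: 1, 2, ...
n = df.groupby(["AccountID", "Last Name"]).cumcount() + 1

# Use that counter as a third grouping key, then pivot it into the columns.
dfg = df.groupby(["AccountID", "Last Name", n]).first().unstack()

# Flatten the (column, counter) MultiIndex into names such as "Contract1".
dfg.columns = [f"{col}{i}" for col, i in dfg.columns]

# Order the columns by their trailing digit so each record's fields sit together.
df_out = dfg.sort_index(
    axis=1, key=lambda x: x.str[-1], kind="mergesort"
).reset_index()

print(df_out)
```

Note that sorting on `x.str[-1]` looks only at the last character, so this sketch assumes fewer than ten rows per account.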

Python Pandas Conditional Sum with Groupby

First groupby the key1 column:

    In [11]: g = df.groupby('key1')

and then for each group take the subDataFrame where key2 equals 'one' and sum the data1 column:

    In [12]: g.apply(lambda x: x[x['key2'] == 'one']['data1'].sum())
    Out[12]:
    key1
    a    0.093391
    b    1.468194
    dtype: float64

To explain what's going on let's look at the 'a' group:

    In [21]: …
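
The conditional sum above can be reproduced on fixed data, which makes the result easy to check by hand. The values below are assumptions (the excerpt's random data is not shown). Selecting the `key2` and `data1` columns before `apply` is a small deviation from the excerpt, used here to avoid version-dependent warnings about grouping columns. The filter-then-group variant at the end is an alternative that the excerpt does not mention.

```python
import pandas as pd

# Hypothetical data with easily checked sums.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "key2": ["one", "two", "one", "two", "one"],
    "data1": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Per group: keep the rows where key2 == "one", then sum data1.
per_group = df.groupby("key1")[["key2", "data1"]].apply(
    lambda x: x[x["key2"] == "one"]["data1"].sum()
)

# Alternative: filter first, then group and sum.
filtered = df[df["key2"] == "one"].groupby("key1")["data1"].sum()

print(per_group)
print(filtered)
```

The two forms differ for a group with no matching rows: the `apply` version reports 0 for it, while the filtered version omits the group entirely.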

Python Pandas Group by date using datetime data

You can group by the dates of the Date_Time column using dt.date:

    df = df.groupby([df['Date_Time'].dt.date]).mean()

Sample:

    df = pd.DataFrame({'Date_Time': pd.date_range('10/1/2001 10:00:00', periods=3, freq='10H'),
                       'B': [4, 5, 6]})

    print (df)
       B           Date_Time
    0  4 2001-10-01 10:00:00
    1  5 2001-10-01 20:00:00
    2  6 2001-10-02 06:00:00

    print (df['Date_Time'].dt.date)
    0    2001-10-01
    1    2001-10-01
    2    2001-10-02
    Name: Date_Time, dtype: object

    df = df.groupby([df['Date_Time'].dt.date])['B'].mean()
    print(df)
    …
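
As a self-contained sketch of the grouping above, the snippet below builds the same three timestamps explicitly (rather than with a frequency string) and computes the daily mean of B. The `dt.normalize()` variant is an alternative not taken from the excerpt; it keeps a datetime64 index instead of Python `date` objects.

```python
import pandas as pd

# The three sample timestamps from the excerpt, written out explicitly.
df = pd.DataFrame({
    "Date_Time": pd.to_datetime([
        "2001-10-01 10:00:00",
        "2001-10-01 20:00:00",
        "2001-10-02 06:00:00",
    ]),
    "B": [4, 5, 6],
})

# Group by calendar date (datetime.date objects) and average column B.
daily_mean = df.groupby(df["Date_Time"].dt.date)["B"].mean()

# Alternative: truncate each timestamp to midnight, keeping a datetime64 index.
daily_mean_norm = df.groupby(df["Date_Time"].dt.normalize())["B"].mean()

print(daily_mean)
print(daily_mean_norm)
```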

How to drop duplicates based on two or more subsets criteria in Pandas data-frame

Your syntax is wrong. Here's the correct way:

    df.drop_duplicates(subset=['bio', 'center', 'outcome'])

Or in this specific case, just simply:

    df.drop_duplicates()

Both return the following:

       bio center outcome
    0    1    one       f
    2    1    two       f
    3    4  three       f

Take a look at the df.drop_duplicates documentation for syntax details. subset should be a sequence of column …
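
The excerpt does not show the input frame, so the rows below are an assumption, chosen only to be consistent with the printed result (row 1 duplicates row 0). Under that assumption, this sketch shows that both calls keep the same rows.

```python
import pandas as pd

# Hypothetical input: row 1 repeats row 0 in all three columns.
df = pd.DataFrame({
    "bio": [1, 1, 1, 4],
    "center": ["one", "one", "two", "three"],
    "outcome": ["f", "f", "f", "f"],
})

# Deduplicate on an explicit list of columns.
by_subset = df.drop_duplicates(subset=["bio", "center", "outcome"])

# With no subset, all columns are compared, which is the same set here.
by_all = df.drop_duplicates()

print(by_subset)
```

By default the first occurrence of each duplicate is kept, which is why index 0 survives and index 1 is dropped.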

Use pandas.shift() within a group

Pandas' grouped objects have a groupby.DataFrameGroupBy.shift method, which will shift a specified column in each group n periods, just like the regular dataframe's shift method:

    df['prev_value'] = df.groupby('object')['value'].shift()

For the following example dataframe:

    print(df)
       object  period  value
    0       1       1     24
    1       1       2     67
    2       1       4     89
    3       2       4      5
    4       2 …
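
A minimal sketch of the grouped shift, using only the four complete rows visible in the excerpt (the truncated fifth row is left out rather than guessed):

```python
import pandas as pd

# The first four rows of the example frame from the excerpt.
df = pd.DataFrame({
    "object": [1, 1, 1, 2],
    "period": [1, 2, 4, 4],
    "value": [24, 67, 89, 5],
})

# Shift value down by one row within each object group.
# The first row of every group has no predecessor, so it becomes NaN.
df["prev_value"] = df.groupby("object")["value"].shift()

print(df)
```

Because the shift is applied per group, the last value of object 1 does not leak into the first row of object 2.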