Combine two pandas Data Frames (join on a common column)

You can use merge to combine two dataframes into one:

import pandas as pd
pd.merge(restaurant_ids_dataframe, restaurant_review_frame, on='business_id', how='outer')

where on specifies field name that exists in both dataframes to join on, and how
defines whether its inner/outer/left/right join, with outer using ‘union of keys from both frames (SQL: full outer join).’ Since you have ‘star’ column in both dataframes, this by default will create two columns star_x and star_y in the combined dataframe. As @DanAllan mentioned for the join method, you can modify the suffixes for merge by passing it as a kwarg. Default is suffixes=('_x', '_y'). if you wanted to do something like star_restaurant_id and star_restaurant_review, you can do:

 pd.merge(restaurant_ids_dataframe, restaurant_review_frame, on='business_id', how='outer', suffixes=('_restaurant_id', '_restaurant_review'))

The parameters are explained in detail in this link.

Leave a Comment