How to drop duplicates based on two or more subset criteria in a Pandas DataFrame

Your syntax is wrong. Here’s the correct way:

df.drop_duplicates(subset=['bio', 'center', 'outcome'])

Or, in this specific case, where the subset covers every column, simply:

df.drop_duplicates()

Both return the following:

  bio center outcome
0   1    one       f
2   1    two       f
3   4  three       f

See the df.drop_duplicates documentation for syntax details. The subset argument should be a column label or a sequence of column labels.
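
For reference, here is a minimal, self-contained sketch of both calls. The input frame is an assumption: the original data is not shown here, so the rows below were chosen only so that the result matches the output above. The last call, which uses a two-column subset with keep='last', is an extra illustration rather than part of the original answer.

```python
import pandas as pd

# Hypothetical input: the original frame is not shown, so these rows are
# assumed values picked to reproduce the output above (rows 1 and 4 are
# exact duplicates of rows 0 and 3).
df = pd.DataFrame({
    "bio": [1, 1, 1, 4, 4],
    "center": ["one", "one", "two", "three", "three"],
    "outcome": ["f", "f", "f", "f", "f"],
})

# Pass the column labels as a list to `subset`.
by_subset = df.drop_duplicates(subset=["bio", "center", "outcome"])

# With no `subset`, every column is compared, which is equivalent here
# because the subset above already names every column.
all_columns = df.drop_duplicates()

# Illustration only: compare on two columns and keep the last occurrence
# of each duplicate group instead of the first.
last_kept = df.drop_duplicates(subset=["bio", "center"], keep="last")

print(by_subset)
print(by_subset.equals(all_columns))
print(last_kept)
```

Note that drop_duplicates returns a new DataFrame and keeps the original index labels, which is why the result is indexed 0, 2, 3. Chain .reset_index(drop=True) if you want a fresh 0..n-1 index.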
