Deleting reversed duplicates with R

mydf <- read.table(text="gene_x    gene_y
AT1       AT2
AT3       AT4
AT1       AT2
AT1       AT3
AT2       AT1", header=TRUE, stringsAsFactors=FALSE)

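Here’s one strategy using apply, sort, paste, and duplicated. Sorting each row puts a pair and its reverse in the same order, paste turns the sorted pair into a single key, and duplicated flags every repeat of a key after its first occurrence:

# A separator in collapse keeps distinct pairs from pasting to the same key
mydf[!duplicated(apply(mydf,1,function(x) paste(sort(x),collapse="|"))),]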
  gene_x gene_y
1    AT1    AT2
2    AT3    AT4
4    AT1    AT3
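
To see why this works, it can help to look at the intermediate values. The sketch below rebuilds the same example data with data.frame and splits the one-liner into steps; the names pair_key, is_dup, and dedup are illustrative and not part of the original answer, and the "|" separator is an assumption that the gene names never contain that character.

```r
# Same example data as above, built directly so the block is self-contained
mydf <- data.frame(
  gene_x = c("AT1", "AT3", "AT1", "AT1", "AT2"),
  gene_y = c("AT2", "AT4", "AT2", "AT3", "AT1"),
  stringsAsFactors = FALSE
)

# One order-independent key per row: sort the pair, then join it with "|"
pair_key <- apply(mydf, 1, function(x) paste(sort(x), collapse = "|"))

# TRUE for every row whose key has already appeared further up
is_dup <- duplicated(pair_key)

# Keep only the first occurrence of each unordered pair
dedup <- mydf[!is_dup, ]
```

Rows 3 and 5 share the key of row 1, so they are the ones dropped.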

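And here’s a slightly different solution. It transposes the data frame so that each original row becomes one list element, sorts each element, and calls duplicated on the resulting list: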

mydf[!duplicated(lapply(as.data.frame(t(mydf), stringsAsFactors=FALSE), sort)),]
  gene_x gene_y
1    AT1    AT2
2    AT3    AT4
4    AT1    AT3
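
For exactly two columns, a vectorized variant is also possible. This is a different technique from the two answers above, offered only as a sketch: pmin and pmax put the alphabetically smaller and larger name of each pair into fixed positions, and duplicated is then applied to that normalized two-column data frame. The names lo, hi, and keep are illustrative.

```r
# Same example data as above, built directly so the block is self-contained
mydf <- data.frame(
  gene_x = c("AT1", "AT3", "AT1", "AT1", "AT2"),
  gene_y = c("AT2", "AT4", "AT2", "AT3", "AT1"),
  stringsAsFactors = FALSE
)

# Element-wise smaller and larger name of each pair (works on character vectors)
lo <- pmin(mydf$gene_x, mydf$gene_y)
hi <- pmax(mydf$gene_x, mydf$gene_y)

# A pair and its reverse now normalize to the same (lo, hi) row
keep <- !duplicated(data.frame(lo, hi))

result <- mydf[keep, ]
```

Because it avoids a row-by-row apply, this form may scale better on large data frames, but it does not generalize beyond two columns without extra work.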
