subset - w3toppers.com

Subset dataframe by multiple logical conditions of rows to remove

Try this:

    subset(data, !(v1 %in% c("b", "d", "e")))

How do I extract a single column from a data.frame as a data.frame?

Use drop=FALSE:

    > x <- df[, 1, drop=FALSE]
    > x
       A
    1 10
    2 20
    3 30

From the documentation (see ?"["), you can find: if drop=TRUE, the result is coerced to the lowest possible dimension.
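As a minimal sketch of the difference, assuming a small hypothetical data frame `df` with a numeric column `A` (the names and values here are illustrative, not taken from the original question):

```r
# Hypothetical example data; column names and values are assumptions.
df <- data.frame(A = c(10, 20, 30), B = c("a", "b", "c"))

v <- df[, 1]                # default drop=TRUE: result is a plain vector
x <- df[, 1, drop = FALSE]  # drop=FALSE: result stays a one-column data.frame
y <- df["A"]                # list-style indexing also returns a data.frame

class(v)
class(x)
class(y)
```

Single-bracket indexing with one argument (`df["A"]`) never drops to a vector, so it can be a convenient alternative when the column is selected by name.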

How to replace NA with mean by group / subset?

This is not my own technique; I saw it on the boards a while back:

    dat <- read.table(text = "id taxa length width
    101 collembola 2.1 0.9
    102 mite 0.9 0.7
    103 mite 1.1 0.8
    104 collembola NA NA
    105 collembola 1.5 0.5
    106 mite NA NA", header=TRUE)
    library(plyr)
    impute.mean <- function(x) replace(x, is.na(x), mean(x, na.rm = …
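The excerpt above is cut off, so the following is not a continuation of it. It is a separate base R sketch of the same idea, using `ave()` instead of plyr; the helper name `impute_group_mean` is hypothetical, and the data are the rows shown in the excerpt.

```r
# Same example data as in the excerpt above.
dat <- read.table(text = "id taxa length width
101 collembola 2.1 0.9
102 mite 0.9 0.7
103 mite 1.1 0.8
104 collembola NA NA
105 collembola 1.5 0.5
106 mite NA NA", header = TRUE)

# Hypothetical helper: replace NAs in a vector with the mean of its non-NA values.
impute_group_mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

# ave() applies the helper within each taxa group and returns a vector
# aligned with the original rows.
dat$length <- ave(dat$length, dat$taxa, FUN = impute_group_mean)
dat$width  <- ave(dat$width,  dat$taxa, FUN = impute_group_mean)

dat
```

If every value in a group is NA, the group mean is NaN and the gap stays unfilled, so that case would need separate handling.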

Select groups with more than one distinct value

There are several possibilities; here's my favorite:

    library(data.table)
    setDT(df)[, if(+var(number)) .SD, by = from]
    #    from number
    # 1:    2      1
    # 2:    2      2

Basically, for each group we check whether there is any variance; if there is, we return the group's values.

With base R, I would go with:

    df[as.logical(with(df, ave(number, from, FUN = var))), …
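A related base R sketch counts distinct values per group directly rather than testing the variance. The data frame below is hypothetical (the original `df` is not shown in the excerpt), and this is an alternative technique, not the truncated answer's code. Counting also copes with single-row groups, for which `var()` returns NA.

```r
# Hypothetical data: group 2 is the only one with more than one distinct value.
df <- data.frame(from = c(1, 1, 2, 2, 3),
                 number = c(1, 1, 1, 2, 5))

# Number of distinct `number` values within each `from` group,
# repeated for every row of that group.
n_distinct <- ave(df$number, df$from, FUN = function(x) length(unique(x)))

# Keep only the rows whose group has more than one distinct value.
result <- df[n_distinct > 1, ]
result
```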

Find all possible subset combos in an array?

After stealing this JavaScript combination generator, I added a parameter to supply the minimum length, resulting in:

    var combine = function(a, min) {
        var fn = function(n, src, got, all) {
            if (n == 0) {
                if (got.length > 0) {
                    all[all.length] = got;
                }
                return;
            }
            for (var j = 0; j < src.length; …

Subset / filter rows in a data frame based on a condition in a column

Here are the two main approaches. I prefer this one for its readability:

    bar <- subset(foo, location == "there")

Note that you can string together many conditionals with & and | to create complex subsets.

The second is the indexing approach. You can index rows in R with either numeric or boolean slices. foo$location == …
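Both approaches can be sketched side by side on a small hypothetical `foo` (the column `x` and all values are assumptions added for illustration):

```r
# Hypothetical data frame; only the `location` column appears in the original.
foo <- data.frame(location = c("here", "there", "there"),
                  x = 1:3,
                  stringsAsFactors = FALSE)

# Approach 1: subset() evaluates the condition inside the data frame.
bar_subset <- subset(foo, location == "there")

# Approach 2a: logical (boolean) indexing on rows.
bar_logical <- foo[foo$location == "there", ]

# Approach 2b: numeric indexing, using which() to get the row positions.
bar_numeric <- foo[which(foo$location == "there"), ]

# Conditions can be combined with & and |.
bar_combined <- subset(foo, location == "there" & x > 2)
```

One behavioural difference worth knowing: `subset()` drops rows where the condition is NA, whereas plain logical indexing returns an all-NA row for each of them; wrapping the condition in `which()` avoids that.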

How to drop columns by name in a data frame

You should use either indexing or the subset function. For example:

    R> df <- data.frame(x=1:5, y=2:6, z=3:7, u=4:8)
    R> df
      x y z u
    1 1 2 3 4
    2 2 3 4 5
    3 3 4 5 6
    4 4 5 6 7
    5 5 6 7 8

Then you can use the …
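Since the excerpt is truncated, here is a separate minimal sketch of the two options it names, applied to the same `df`. The choice of `z` and `u` as the columns to drop is an assumption made for illustration.

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# Columns to remove (chosen arbitrarily for this sketch).
drop_cols <- c("z", "u")

# Indexing: keep every column whose name is not in drop_cols.
# drop = FALSE keeps the result a data.frame even if one column remains.
by_index <- df[, !(names(df) %in% drop_cols), drop = FALSE]

# subset(): a negative select removes the named columns.
by_subset <- subset(df, select = -c(z, u))

names(by_index)
names(by_subset)
```

The indexing form takes the names as a character vector, which is convenient when they are computed at run time; the `subset()` form is shorter for interactive use.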

Opposite of %in%: exclude rows with values specified in a vector

You can use the ! operator to turn every TRUE into FALSE and every FALSE into TRUE. So:

    D2 = subset(D1, !(V1 %in% c('B', 'N', 'T')))

EDIT: You can also make an operator yourself:

    '%!in%' <- function(x,y)!('%in%'(x,y))
    c(1,3,11) %!in% 1:10
    [1] FALSE FALSE TRUE
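A short sketch of the same exclusion on a hypothetical `D1` (its columns and values are assumptions), together with a variant that builds the negated operator using base R's `Negate()`; the operator name `%notin%` is made up for this example.

```r
# Hypothetical data; only the column name V1 comes from the original.
D1 <- data.frame(V1 = c("A", "B", "N", "C", "T"),
                 V2 = 1:5,
                 stringsAsFactors = FALSE)

exclude <- c("B", "N", "T")

# Negate the %in% test to keep rows whose V1 is NOT in the vector.
D2 <- D1[!(D1$V1 %in% exclude), ]

# Alternative: define a negated operator once with Negate().
`%notin%` <- Negate(`%in%`)
D3 <- D1[D1$V1 %notin% exclude, ]
```

Because `%in%` always returns TRUE or FALSE and never NA, its negation is safe to use directly as a row index.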

How to subset matrix to one column, maintain matrix data type, maintain row/column names?

Use the drop=FALSE argument to [.

    m <- matrix(1:10,5,2)
    rownames(m) <- 1:5
    colnames(m) <- 1:2
    m[,1]            # vector
    m[,1,drop=FALSE] # matrix