How to replace NA with mean by group / subset?

This is not my own technique; I saw it on the boards a while back:

    dat <- read.table(text = "id taxa length width
    101 collembola 2.1 0.9
    102 mite 0.9 0.7
    103 mite 1.1 0.8
    104 collembola NA NA
    105 collembola 1.5 0.5
    106 mite NA NA", header=TRUE)

    library(plyr)
    impute.mean <- function(x) replace(x, is.na(x), mean(x, na.rm = …
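Since the excerpt is cut off before the grouping step, here is a minimal base R sketch of the same idea, not the original answer's plyr code. It assumes `na.rm = TRUE` inside the helper and uses `ave()` to apply it per `taxa`; the name `impute_mean` is my own choice for illustration.

```r
# Same example data as in the answer, one record per line.
dat <- read.table(text = "id taxa length width
101 collembola 2.1 0.9
102 mite 0.9 0.7
103 mite 1.1 0.8
104 collembola NA NA
105 collembola 1.5 0.5
106 mite NA NA", header = TRUE)

# Hypothetical helper: replace the NAs in x with the mean of the non-missing values.
impute_mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

# ave() splits each column by taxa, applies the helper within each group,
# and returns the values in the original row order.
dat$length <- ave(dat$length, dat$taxa, FUN = impute_mean)
dat$width  <- ave(dat$width,  dat$taxa, FUN = impute_mean)

print(dat)
```

Each missing value is filled with the mean of its own group only, so the collembola row gets the collembola mean and the mite row gets the mite mean.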

Select groups with more than one distinct value

There are several possibilities; here's my favorite:

    library(data.table)
    setDT(df)[, if(+var(number)) .SD, by = from]
    #    from number
    # 1:    2      1
    # 2:    2      2

Basically, for each group we check whether there is any variance; if there is, we return that group's values.

With base R, I would go with

    df[as.logical(with(df, ave(number, from, FUN = var))), …
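The base R line above is truncated, so as a rough sketch of the same filtering idea, here is a variant that counts distinct values per group instead of testing the variance. The data frame is made up for illustration (only the column names `from` and `number` come from the answer), and `n_distinct` is a hypothetical variable name.

```r
# Hypothetical example data; the original question's df is not shown in the excerpt.
df <- data.frame(from   = c(1, 1, 2, 2, 3),
                 number = c(1, 1, 1, 2, 5))

# For every row, the number of distinct `number` values within its `from` group.
n_distinct <- ave(df$number, df$from, FUN = function(x) length(unique(x)))

# Keep only the rows whose group has more than one distinct value.
res <- df[n_distinct > 1, ]
print(res)
```

Counting distinct values also copes with one-row groups, where `var()` returns `NA`.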