Automatically create formulas for all possible linear models

Say we work with this deliberately trivial example:

DF <- data.frame(Class=1:10,A=1:10,B=1:10,C=1:10)

Then you get the names of the predictor columns, dropping the response Class:

Cols <- names(DF)
Cols <- Cols[! Cols %in% "Class"]
n <- length(Cols)

You construct all possible combinations of the column indices:

id <- unlist(
        lapply(1:n,
              function(i)combn(1:n,i,simplify=FALSE)
        )
      ,recursive=FALSE)

You paste them into formula strings:

Formulas <- sapply(id,function(i)
              paste("Class~",paste(Cols[i],collapse="+"))
            )
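
As a variation on the pasting step, base R's reformulate() can build formula objects directly instead of strings. The following is only a minimal sketch: the column names are hard-coded to match the toy example, and the name FormulaObjs is made up for illustration.

```r
# Predictor names hard-coded to match the toy example (assumption)
Cols <- c("A", "B", "C")
n <- length(Cols)

# All non-empty subsets of the column indices, as in the step above
id <- unlist(
  lapply(seq_len(n), function(i) combn(seq_len(n), i, simplify = FALSE)),
  recursive = FALSE
)

# reformulate() assembles "Class ~ A + B" style formulas as formula objects,
# so no as.formula() call is needed later
FormulaObjs <- lapply(id, function(i) reformulate(Cols[i], response = "Class"))

FormulaObjs[[length(FormulaObjs)]]  # the full model with all three predictors
```

Either form can be passed to lm(); the string version is easier to print and to use as names, the object version skips the conversion.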

And you loop over them to fit the models:

lapply(Formulas,function(i)
    lm(as.formula(i),data=DF))
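
The call above returns an unnamed list, so it may help to store it and name each fit by its formula. A small sketch, using only two hand-written formulas to stay short; the name Models is an assumption, not part of the original code.

```r
# Toy data from the example
DF <- data.frame(Class = 1:10, A = 1:10, B = 1:10, C = 1:10)

# Two formulas written by hand for brevity (the full set would come from the paste step)
Formulas <- c("Class ~ A", "Class ~ A + B")

# Fit each model and keep the results
Models <- lapply(Formulas, function(f) lm(as.formula(f), data = DF))

# Name each fit by its formula so it can be looked up later
names(Models) <- Formulas

coef(Models[["Class ~ A + B"]])
```

Because all columns in the toy data are identical, the coefficient for B comes back as NA here; with real data that would not normally happen.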

Be warned though: if you have more than a handful of columns, this will quickly become very heavy on memory and result in literally thousands of models. You get 2^n - 1 different models, with n being the number of predictor columns.
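
To get a feel for how fast that grows, the count can be tabulated for a few values of n. This is just an illustrative sketch; n_models is a made-up helper name.

```r
# Number of non-empty predictor subsets for n columns
n_models <- function(n) 2^n - 1

# Sanity check against direct enumeration: sum of "n choose k" for k = 1..n
stopifnot(sum(choose(3, 1:3)) == n_models(3))

n_cols <- c(3, 5, 10, 15, 20)
growth <- data.frame(columns = n_cols, models = n_models(n_cols))
print(growth)
# models column: 7, 31, 1023, 32767, 1048575
```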

Make very sure that is what you want. In general, this kind of model comparison is strongly advised against, and you can forget about any kind of inference when you do this.
