How to quickly form groups (quartiles, deciles, etc) by ordering column(s) in a data frame

There’s a handy ntile function in package dplyr. It’s flexible in the sense that you can very easily define the number of *tiles or “bins” you want to create.

Load the package (install first if you haven’t) and add the quartile column:

library(dplyr)
temp$quartile <- ntile(temp$value, 4)  

Or, if you want to use dplyr syntax:

temp <- temp %>% mutate(quartile = ntile(value, 4))

Result in both cases is:

temp
#   name       value quartile
#1     a -0.56047565        1
#2     b -0.23017749        2
#3     c  1.55870831        4
#4     d  0.07050839        2
#5     e  0.12928774        3
#6     f  1.71506499        4
#7     g  0.46091621        3
#8     h -1.26506123        1
#9     i -0.68685285        1
#10    j -0.44566197        2
#11    k  1.22408180        4
#12    l  0.35981383        3

data:

Note that you don’t need to create the “quartile” column in advance and use set.seed to make the randomization reproducible:

set.seed(123)
temp <- data.frame(name=letters[1:12], value=rnorm(12))

Leave a Comment