`levels

The answers here are good, but they are missing an important point. Let me try to describe it.

R is a functional language and does not like to mutate its objects. But it does allow assignment statements, using replacement functions:

levels(x) <- y

is equivalent to

x <- `levels<-`(x, y)

The trick is that this rewriting is done by <-, not by levels<-. levels<- is just a regular function that takes an input and returns an output; it does not mutate anything.
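Here is a small sketch to check that equivalence. The factor x and the new labels y are made-up example data, not anything from the question:

```r
# Hypothetical example data
x <- factor(c("a", "b", "a"))
y <- c("low", "high")

# Replacement syntax: `<-` rewrites this into a call to `levels<-`
x1 <- x
levels(x1) <- y

# The same thing written out by hand
x2 <- `levels<-`(x, y)

identical(x1, x2)  # both routes give the same factor
levels(x)          # x itself still has its original levels
```

Note that calling `levels<-` directly never touched x; only the explicit assignment to x2 stored anything.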

One consequence of that is that, according to the above rule, <- must be recursive:

levels(factor(x)) <- y

is

factor(x) <- `levels<-`(factor(x), y)

is

x <- `factor<-`(x, `levels<-`(factor(x), y))

It’s kind of beautiful that this pure-functional transformation (up until the very end, where the assignment happens) is equivalent to what an assignment would be in an imperative language. If I remember correctly, this construct in functional languages is called a lens.
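Base R does not define a `factor<-` function, so the expansion above is only illustrative and will not run as written. The same recursive rewriting can be checked with a nested target that does have a replacement function, such as `$<-`. A minimal sketch, with a made-up data frame df:

```r
# Hypothetical example data
df <- data.frame(f = factor(c("a", "b", "a")))
y <- c("low", "high")

# Nested replacement syntax
df1 <- df
levels(df1$f) <- y

# Roughly what `<-` expands it to: the inner replacement function
# builds the new column, the outer one puts it back into the data frame
df2 <- `$<-`(df, "f", `levels<-`(df$f, y))

identical(df1, df2)
```

Everything on the right-hand side of the last assignment is an ordinary function call; the only real assignment is the final `<-`.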

But then, once you have defined replacement functions like levels<-, you get another, unexpected windfall: you don’t just have the ability to make assignments, you have a handy function that takes in a factor, and gives out another factor with different levels. There’s really nothing “assignment” about it!
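To illustrate that "handy function" view: because `levels<-` is an ordinary function, it can be passed around like any other, for example to lapply. The list of factors below is made up for the sketch:

```r
# Hypothetical example data: two factors that share the levels "a" and "b"
factors <- list(factor(c("a", "b")), factor(c("b", "a", "a")))

# Relabel every factor in one go; no assignment syntax involved
relabelled <- lapply(factors, `levels<-`, c("x", "y"))

as.character(relabelled[[1]])  # "x" "y"
as.character(relabelled[[2]])  # "y" "x" "x"
```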

So, the code you’re describing is just making use of this other interpretation of levels<-. I admit that the name levels<- is a little confusing because it suggests an assignment, but this is not what is going on. The code is simply setting up a sort of pipeline:

  • Start with dat$product

  • Convert it to a factor

  • Change the levels

  • Store that in res
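The steps above can be sketched as follows. The contents of dat and the level names are my own assumptions for illustration, not the data from the question; the list form of the new levels maps each new name to the old levels it should replace.

```r
# Hypothetical stand-in for the data in the question
dat <- data.frame(product = c(1, 2, 3, 4, 5, 6))

res <- `levels<-`(
  factor(dat$product),               # start with dat$product, convert it to a factor
  list(first = 1:3, second = 4:6)    # change the levels: new name = old levels
)                                    # store the result in res

levels(res)        # "first" "second"
as.character(res)  # "first" "first" "first" "second" "second" "second"
```

dat$product is not modified at any point; the only thing that gets assigned is res.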

Personally, I think that line of code is beautiful 😉
