Issue when passing variable with dollar sign notation ($) to aes() in combination with facet_grid() or facet_wrap()

tl;dr

Never use [ or $ inside aes().


Consider this illustrative example where the facetting variable f is purposely in a non-obvious order with respect to x

d <- data.frame(x=1:10, f=rev(letters[gl(2,5)]))

Now contrast what happens with these two plots,

p1 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(x, y=0, label=x, colour=f)) +
  ggtitle("good mapping") 

p2 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(d$x, y=0, label=x, colour=f)) +
  ggtitle("$ corruption") 

enter image description here

We can get a better idea of what’s happening by looking at the data.frame created internally by ggplot2 for each panel,

 ggplot_build(p1)[["data"]][[1]][,c("x","PANEL")]

    x PANEL
1   6     1
2   7     1
3   8     1
4   9     1
5  10     1
6   1     2
7   2     2
8   3     2
9   4     2
10  5     2

 ggplot_build(p2)[["data"]][[1]][,c("x", "PANEL")]

    x PANEL
1   1     1
2   2     1
3   3     1
4   4     1
5   5     1
6   6     2
7   7     2
8   8     2
9   9     2
10 10     2

The second plot has the wrong mapping, because when ggplot creates a data.frame for each panel, it picks x values in the “wrong” order.

This occurs because the use of $ breaks the link between the various variables to be mapped (ggplot must assume it’s an independent variable, which for all it knows could come from an arbitrary, disconnected source). Since the data.frame in this example is not ordered according to the factor f, the subset data.frames used internally for each panel assume the wrong order.

Leave a Comment