Collapsing data frame by selecting one row per group

Maybe duplicated() can help:

R> d[ !duplicated(d$x), ]
  x  y  z
1 1 10 20
3 2 12 18
4 4 13 17
R> 

Edit: Shucks, never mind. This picks the first row in each block of repetitions, but you wanted the last. So here is another attempt using plyr:

R> ddply(d, "x", function(z) tail(z,1))
  x  y  z
1 1 11 19
2 2 12 18
3 4 13 17
R> 

Here plyr does the hard work of finding the unique subsets of x, looping over them, and applying the supplied function. That function simply returns the last row of each block z, using tail(z, 1).
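For comparison, here is a minimal base-R sketch of the same split, apply, and combine steps done by hand, in case plyr is not available. The data frame d is an assumption: it is reconstructed from the outputs shown above, since the original input is not given here. The names pieces, last_rows, by_hand and via_dup are purely illustrative. The last line also shows that duplicated() accepts fromLast = TRUE, which keeps the last row of each block instead of the first.

```r
# Assumed input, reconstructed from the printed outputs above.
d <- data.frame(x = c(1, 1, 2, 4),
                y = c(10, 11, 12, 13),
                z = c(20, 19, 18, 17))

# Split: one data frame per distinct value of x.
pieces <- split(d, d$x)

# Apply: keep only the last row of each piece.
last_rows <- lapply(pieces, function(z) tail(z, 1))

# Combine: stack the one-row pieces back into a single data frame
# and reset the row names to 1, 2, 3.
by_hand <- do.call(rbind, last_rows)
rownames(by_hand) <- NULL

# Shorter alternative: scan for duplicates from the end, so the
# last row of each block of x is the one that is kept.
via_dup <- d[!duplicated(d$x, fromLast = TRUE), ]
```

With the assumed d, both by_hand and via_dup should hold the same three rows as the ddply() result. The only difference is that via_dup keeps the original row names.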
