Collapsing data frame by selecting one row per group

Maybe duplicated() can help:

R> d[ !duplicated(d$x), ]
  x  y  z
1 1 10 20
3 2 12 18
4 4 13 17
R>

Edit: Shucks, never mind. This picks the first row in each block of repetitions, but you wanted the last. So here is another attempt using plyr:

R> ddply(d, "x", function(z) tail(z,1))
  x  y  z
1 1 11 19
2 2 12 18
3 4 13 17
R>

Here plyr does the hard work of finding the unique subsets, looping over them, and applying the supplied function, which simply returns the last row of each block z using tail(z, 1).
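For anyone who would rather stay in base R, here is a rough sketch of the same split-apply-combine steps done by hand, plus a shortcut using the fromLast argument of duplicated(). The data frame d is not shown in the original, so the one below is reconstructed from the printed output and should be treated as an assumption; the names pieces, last_rows, by_hand and via_dup are only illustrative.

```r
# Example data reconstructed from the printed output above (an assumption).
d <- data.frame(x = c(1, 1, 2, 4),
                y = c(10, 11, 12, 13),
                z = c(20, 19, 18, 17))

# Split-apply-combine by hand, roughly the steps ddply() takes care of:
pieces    <- split(d, d$x)                            # one data frame per value of x
last_rows <- lapply(pieces, function(z) tail(z, 1))   # keep the last row of each block
by_hand   <- do.call(rbind, last_rows)                # stack the pieces back together
rownames(by_hand) <- NULL                             # drop the group labels used as row names

# Shortcut: scan for duplicates from the end, so the last row per x is kept.
via_dup <- d[!duplicated(d$x, fromLast = TRUE), ]
rownames(via_dup) <- NULL

print(by_hand)
print(via_dup)
```

Both approaches select the same rows. One difference to keep in mind: split() returns the groups sorted by x, while the duplicated() version keeps the rows in their original order, so the two results can be ordered differently when the data is not already sorted by x.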
