Sample random rows within each group in a data.table

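Maybe something like this? Sample row indices within each group, and cap the sample size at the group size with min(3, .N) so that groups with fewer than 3 rows don't cause an error: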

> DT[,.SD[sample(.N, min(3,.N))],by = a]
   a   b
1: 1 744
2: 1 497
3: 1 167
4: 2 888
5: 2 950
6: 2 343

(Thanks to Josh for the correction, below.)
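
For anyone who wants to try this out, here is a minimal self-contained sketch. The table `DT` below is made-up illustration data (not the data from the question), and `set.seed()` is only there so the run is repeatable. It also shows a variant based on `.I`, which collects the sampled row numbers first and subsets once; that form is often suggested for large tables, though I have not benchmarked it here.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

# Hypothetical data: group 1 has 5 rows, group 2 has 2 rows, group 3 has 1 row
DT <- data.table(a = c(rep(1, 5), rep(2, 2), 3), b = 1:8)

# Up to 3 random rows per group.
# sample(.N, k) draws k distinct positions from 1..N within the group;
# min(3, .N) keeps k from exceeding the group size.
res <- DT[, .SD[sample(.N, min(3, .N))], by = a]

# Variant: gather the sampled global row numbers via .I, then subset DT once
idx  <- DT[, .I[sample(.N, min(3, .N))], by = a]$V1
res2 <- DT[idx]

# Number of rows kept per group
counts <- res[, .N, by = a]
print(counts)
```

Groups with at least 3 rows contribute exactly 3 rows, smaller groups contribute all of their rows (in random order).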
