Efficient way to filter one data frame by ranges in another

Non-equi joins have been implemented in the data.table package since v1.9.8. Building on them, I’ve created a wrapper function, inrange(), for exactly this kind of operation: the task is to find whether a point lies in any of the provided intervals, returning TRUE if it does and FALSE otherwise.

library(data.table) # v >= 1.9.8
# columns 2:3 of spans_to_filter are used as the lower and upper bounds (both inclusive)
setDT(main_data)[Day %inrange% spans_to_filter[, 2:3]]
#     Day
#  1:   1
#  2:   2
#  3:   3
#  4:   4
#  5:   5
#  6:   7
#  7:   8
#  8:   9
#  9:  10
# 10:  12
# 11:  13
# 12:  14
# 13:  15
# 14:  16
# 15:  17
# 16:  18
# 17:  23
# 18:  24
# 19:  25
# 20:  26
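
For anyone who wants to try this without the data from the question, here is a minimal self-contained sketch. The tables `points` and `spans` and their column names are made up for illustration; only `%inrange%`, `inrange()` and its `incbounds` argument come from data.table itself.

```r
library(data.table)  # needs v1.9.8 or later for inrange()

# Hypothetical toy data (NOT the data from the question)
points <- data.table(Day = 1:12)
spans  <- data.table(id    = c("a", "b"),
                     start = c(2L, 8L),
                     end   = c(4L, 9L))

# %inrange% returns a logical vector: TRUE where Day falls inside
# at least one [start, end] interval, bounds included
inclusive <- points[Day %inrange% spans[, .(start, end)]]$Day
print(inclusive)

# The function form exposes incbounds; FALSE makes both bounds exclusive
exclusive <- points[inrange(Day, spans$start, spans$end, incbounds = FALSE)]$Day
print(exclusive)
```

With these toy intervals, the inclusive filter should keep days 2, 3, 4, 8 and 9, while the exclusive one keeps only day 3, since (8, 9) contains no integer.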

See ?inrange for more.
