Merge dataframes of different sizes

From your description I understand that you want to replace the z values in d1 with the z values in d2 when x & y match.

Using base R:

d3 <- merge(d1, d2, by = c("x","y"), all.x = TRUE)
d3[is.na(d3$z.y),"z.y"] <- d3[is.na(d3$z.y),"z.x"]
d3 <- d3[,-3]
names(d3)[3] <- "z"

which gives:

> d3
   x  y   z
1 10 10 100
2 10 12   6
3 11 10 200
4 11 12   2
5 12 10   1
6 12 12 400

Using the data.table-package:

library(data.table)

setDT(d1) # convert the data.frame to a data.table
setDT(d2) # idem

# join the two data.table's and replace the values
d1[d2, on = .(x, y), z := i.z]

or in one go:

setDT(d1)[setDT(d2), on = .(x, y), z := i.z]

which gives:

> d1
    x  y   z
1: 10 10 100
2: 10 12   6
3: 11 10 200
4: 11 12   2
5: 12 10   1
6: 12 12 400

Using the dplyr package:

d3 <- left_join(d1, d2, by = c("x","y")) %>%
  mutate(z.y = ifelse(is.na(z.y), z.x, z.y)) %>%
  select(-z.x) %>%
  rename(z = z.y)

Since release 0.5.0 you can also use the coalesce-function for this (thx to Laurent Hostert for bringing it to my attention):

d3 <- left_join(d1, d2, by = c("x","y")) %>% 
  mutate(z = coalesce(z.y, z.x)) %>% 
  select(-c(z.x, z.y))

Leave a Comment