Ways to implement data versioning in MongoDB

The first big question when diving in to this is “how do you want to store changesets”?

  1. Diffs?
  2. Whole record copies?

My personal approach would be to store diffs. Because the display of these diffs is really a special action, I would put the diffs in a different “history” collection.

I would use the different collection to save memory space. You generally don’t want a full history for a simple query. So by keeping the history out of the object you can also keep it out of the commonly accessed memory when that data is queried.

To make my life easy, I would make a history document contain a dictionary of time-stamped diffs. Something like this:

{
    _id : "id of address book record",
    changes : { 
                1234567 : { "city" : "Omaha", "state" : "Nebraska" },
                1234568 : { "city" : "Kansas City", "state" : "Missouri" }
               }
}

To make my life really easy, I would make this part of my DataObjects (EntityWrapper, whatever) that I use to access my data. Generally these objects have some form of history, so that you can easily override the save() method to make this change at the same time.

UPDATE: 2015-10

It looks like there is now a spec for handling JSON diffs. This seems like a more robust way to store the diffs / changes.

Leave a Comment