Database Historization

Here is one way to do it:

enter image description here

Essentially, we never modify or delete the existing data. We “modify” it by creating a new version. We “delete” it by setting the DELETED flag.

For example:

  • If product changes the price, we insert a new row into PRODUCT_VERSION while old orders are kept connected to the old PRODUCT_VERSION and the old price.
  • When buyer changes the address, we simply insert a new row in CUSTOMER_VERSION and link new orders to that, while keeping the old orders linked to the old version.
  • If product is deleted, we don’t really delete it – we simply set the PRODUCT.DELETED flag, so all the orders historically made for that product stay in the database.
  • If customer is deleted (e.g. because (s)he requested to be unregistered), set the CUSTOMER.DELETED flag.

Caveats:

  • If product name needs to be unique, that can’t be enforced declaratively in the model above. You’ll either need to “promote” the NAME from PRODUCT_VERSION to PRODUCT, make it a key there and give-up ability to “evolve” product’s name, or enforce uniqueness on only latest PRODUCT_VER (probably through triggers).
  • There is a potential problem with the customer’s privacy. If a customer is deleted from the system, it may be desirable to physically remove its data from the database and just setting CUSTOMER.DELETED won’t do that. If that’s a concern, either blank-out the privacy-sensitive data in all the customer’s versions, or alternatively disconnect existing orders from the real customer and reconnect them to a special “anonymous” customer, then physically delete all the customer versions.

This model uses a lot of identifying relationships. This leads to “fat” foreign keys and could be a bit of a storage problem since MySQL doesn’t support leading-edge index compression (unlike, say, Oracle), but on the other hand InnoDB always clusters the data on PK and this clustering can be beneficial for performance. Also, JOINs are less necessary.

Equivalent model with non-identifying relationships and surrogate keys would look like this:

enter image description here

Leave a Comment