A column as primary key or two foreign keys as primary key

My question is about EReportReq table. I’m not sure if I need a column as primary key (EReportReqId) or I can use eReportId and requirementId as primary key.

You can use either of them – none of them is absolutely “better”. Just be careful that if you decide to use the first approach, also create a UNIQUE constraint on {eReportId, requirementId}.

The fist approach (with non-identifying relationship and surrogate key) leads to:

  • “leaner” foreign keys in child tables (which is EReportReqImg in this case) – as you already noted,
  • the cascading ON UPDATE doesn’t propagate to children (so if you update EReport.eReportId, only EReportReq.eReportId is cascade-updated, but not EReportReqImg.eReportId)
  • and can be more friendly to ORMs.

On the other hand, the second approach (with identifying relationship and natural keys):

  • has potentially less need for JOINs (e.g. you don’t need to EReportReqImg JOIN EReportReq just to find-out requirementId – you have it directly in EReportReqImg.requirementId),
  • is better suited for clustered tables (e.g. EReportReq rows with the same eReportId will be stored physically “close”, which may significantly benefit some queries)
  • avoids additional index on the surrogate key.

Since you have a small number of child tables, “fat” FKs don’t matter much and since we are dealing with IDs, they are unlikely to change and cascading ON UPDATE is unlikely to be a problem. So, my instinct is to go with the second approach, but you might have some other considerations that might tip your decision in a different direction…

Leave a Comment