My question is about
EReportReq
table. I’m not sure if I need a column as primary key (EReportReqId
) or I can useeReportId
andrequirementId
as primary key.
You can use either of them – none of them is absolutely “better”. Just be careful that if you decide to use the first approach, also create a UNIQUE constraint on {eReportId, requirementId}
.
The fist approach (with non-identifying relationship and surrogate key) leads to:
- “leaner” foreign keys in child tables (which is
EReportReqImg
in this case) – as you already noted, - the cascading ON UPDATE doesn’t propagate to children (so if you update
EReport.eReportId
, onlyEReportReq.eReportId
is cascade-updated, but notEReportReqImg.eReportId
) - and can be more friendly to ORMs.
On the other hand, the second approach (with identifying relationship and natural keys):
- has potentially less need for JOINs (e.g. you don’t need to
EReportReqImg JOIN EReportReq
just to find-outrequirementId
– you have it directly inEReportReqImg.requirementId
), - is better suited for clustered tables (e.g.
EReportReq
rows with the sameeReportId
will be stored physically “close”, which may significantly benefit some queries) - avoids additional index on the surrogate key.
Since you have a small number of child tables, “fat” FKs don’t matter much and since we are dealing with IDs, they are unlikely to change and cascading ON UPDATE is unlikely to be a problem. So, my instinct is to go with the second approach, but you might have some other considerations that might tip your decision in a different direction…