In a StackOverflow clone, what relationship should a Comments table have to Questions and Answers?

I’d go with the Posts approach. This is the best way to ensure referential integrity.

If you need additional columns for Answers and Questions respectively, put them in additional tables with a one-to-one relationship with Posts.

For example, in MySQL syntax:

CREATE TABLE Posts (
  post_id     SERIAL PRIMARY KEY,
  post_type   CHAR(1),              -- must be 'Q' or 'A'
  -- other columns common to both types of Post
  UNIQUE KEY (post_id, post_type) -- to support foreign keys
) ENGINE=InnoDB;

CREATE TABLE Comments (
  comment_id  SERIAL PRIMARY KEY, 
  post_id     BIGINT UNSIGNED NOT NULL,
  -- other columns for comments (e.g. date, who, text)
  FOREIGN KEY (post_id) REFERENCES Posts(post_id)
) ENGINE=InnoDB; 

CREATE TABLE Questions (
  post_id     BIGINT UNSIGNED PRIMARY KEY,
  post_type   CHAR(1),              -- must be 'Q'
  -- other columns specific to Questions
  FOREIGN KEY (post_id, post_type) REFERENCES Posts(post_id, post_type)
) ENGINE=InnoDB;

CREATE TABLE Answers (
  post_id     BIGINT UNSIGNED PRIMARY KEY,
  post_type   CHAR(1),              -- must be 'A'
  question_id BIGINT UNSIGNED NOT NULL,
  -- other columns specific to Answers
  FOREIGN KEY (post_id, post_type) REFERENCES Posts(post_id, post_type)
  FOREIGN KEY (question_id) REFERENCES Questions(post_id)
) ENGINE=InnoDB;

This is called Class Table Inheritance. There’s a nice overview of modeling inheritance with SQL in this article: “Inheritance in relational databases.”

It can be helpful to use post_type so a given Post can be only one answer or one question. You don’t want both an Answer and a Question to reference one given Post. So this is the purpose of the post_type column above. You can use CHECK constraints to enforce the values in post_type, or else use a trigger if your database doesn’t support CHECK constraints.

I also did a presentation that may help you. The slides are up at http://www.slideshare.net/billkarwin/sql-antipatterns-strike-back. You should read the sections on Polymorphic Associations and Entity-Attribute-Value.


If you use Single Table Inheritance, as you said you’re using Ruby on Rails, then the SQL DDL would look like this:

CREATE TABLE Posts (
  post_id     SERIAL PRIMARY KEY,
  post_type   CHAR(1),              -- must be 'Q' or 'A'
  -- other columns for both types of Post
  -- Question-specific columns are NULL for Answers, and vice versa.
) ENGINE=InnoDB;

CREATE TABLE Comments (
  comment_id  SERIAL PRIMARY KEY, 
  post_id     BIGINT UNSIGNED NOT NULL,
  -- other columns for comments (e.g. date, who, text)
  FOREIGN KEY (post_id) REFERENCES Posts(post_id)
) ENGINE=InnoDB; 

You can use a foreign key constraint in this example, and I recommend that you do! 🙂

Rails philosophy tends to favor putting enforcement of the data model into the application layer. But without constraints enforcing integrity at in the database, you have the risk that bugs in your application, or ad hoc queries from a query tool, can harm data integrity.

Leave a Comment