How to get matching data from another SQL table for two different columns: Inner Join and/or Union?

(The following applies when every row is SQL DISTINCT, and outside SQL code similarly treats NULL like just another value.)

Every base table has a statement template, aka predicate, parameterized by column names, by which we put a row in or leave it out. We can use a (standard predicate logic) shorthand for the predicate that is like its SQL declaration.

-- facilitator [facilID] is named [facilFname] [facilLname]
facilitator(facilID, facilLname, facilFname)
-- class [classID] named [className] has prime [primeFacil] & backup [secondFacil]
class(classID, className, primeFacil, secondFacil)

Plugging a row into a predicate gives a statement aka proposition. The rows that make a true proposition go in a table and the rows that make a false proposition stay out. (So a table states the proposition of each present row and states NOT the proposition of each absent row.)

-- facilitator f1 is named Jane Doe
facilitator(f1, 'Jane', 'Doe')
-- class c1 named CSC101 has prime f1 & backup f8
class(c1, 'CSC101', f1, f8)

But every table expression value has a predicate per its expression. SQL is designed so that if tables T and U hold the (NULL-free non-duplicate) rows where T(…) and U(…) (respectively) then:

  • T CROSS JOIN U holds rows where T(…) AND U(…)
  • T INNER JOIN U ONcondition holds rows where T(…) AND U(…) AND condition
  • T LEFT JOIN U ONcondition holds rows where (for U-only columns U1,…)
        T(…) AND U(…) AND condition
    OR T(…)
        AND NOT there EXISTS values for U1,… where [U(…) AND condition]
        AND U1 IS NULL AND …
  • T WHEREcondition holds rows where T(…) AND condition
  • T INTERSECT U holds rows where T(…) AND U(…)
  • T UNION U holds rows where T(…) OR U(…)
  • T EXCEPT U holds rows where T(…) AND NOT U(…)
  • SELECT DISTINCT * FROM T holds rows where T(…)
  • SELECT DISTINCTcolumns to keepFROM T holds rows where
    there EXISTS values for columns to drop where T(…)
  • VALUES (C1, C2, ...)((v1,v2, ...), ...) holds rows where
    C1 = v1 AND C2 = v2 AND … OR …

Also:

  • (...) IN T means T(…)
  • scalar= T means T(scalar)
  • T(…, X, …) AND X = Y means T(…, Y, …) AND X = Y

So to query we find a way of phrasing the predicate for the rows that we want in natural language using base table predicates, then in shorthand using base table predicates, then in shorthand using aliases in column names except for output columns, then in SQL using base table names plus ON & WHERE conditions etc. If we need to mention a base table twice then we give it aliases.

-- natural language
there EXISTS values for classID, primeFacil & secondFacil where
    class [classID] named [className]
        has prime [primeFacil] & backup [secondFacil]
AND facilitator [primeFacil] is named [pf.facilFname] [pf.facilLname]
AND facilitator [secondFacil] is named [sf.facilFname] [sf.facilLname]

-- shorthand
there EXISTS values for classID, primeFacil & secondFacil where
    class(classID, className, primeFacil, secondFacil)
AND facilitator(pf.facilID, pf.facilLname, pf.facilFname)
AND pf.facilID = primeFacil
AND facilitator(sf.facilID, sf.facilLname, sf.facilFname)
AND sf.facilID = secondFacil

-- shorthand using aliases everywhere but result
-- use # to distinguish same-named result columns in specification
there EXISTS values for c.*, pf.*, sf.* where
    className = c.className
AND facilLname#1 = pf.facilLname AND facilFname#1 = pf.facilFname
AND facilLname#2 = sf.facilLname AND facilFname#2 = sf.facilFname
AND class(c.classID, c.className, c.primeFacil, c.secondFacil)
AND facilitator(pf.facilID, pf.facilLname, pf.facilFname)
AND pf.facilID = c.primeFacil
AND facilitator(sf.facilID, sf.facilLname, sf.facilFname)
AND sf.facilID = c.secondFacil

-- table names & SQL (with MS Access parentheses)
SELECT className, pf.facilLname, pf.facilFname, sf.facilLname, sf.facilFname
FROM (class JOIN facilitator AS pf ON pf.facilID = primeFacil)
JOIN facilitator AS sf ON sf.facilID = secondFacil

OUTER JOIN would be used when a class doesn’t always have both facilitators or something doesn’t always have all names. (Ie if a column can be NULL.) But you haven’t given the specific predicates for your base table and query or the business rules about when things might be NULL so I have assumed no NULLs.

Is there any rule of thumb to construct SQL query from a human-readable description?

(Re MS Access JOIN parentheses see this from SO and this from MS.)

Leave a Comment