Sum results of a few queries and then find top 5 in SQL

To UNION the resulting rows of all three queries and then pick the 5 rows with the highest amount:

(SELECT event_id, count(*) AS amount
FROM   pageview 
GROUP  BY event_id
ORDER  BY amount DESC, rand()
LIMIT  1000)

UNION ALL
(SELECT event_id, count(*)
FROM   upvote
GROUP  BY event_id
ORDER  BY count(*) DESC, rand()
LIMIT  1000)

UNION ALL
(SELECT event_id, count(*)
FROM   attending
GROUP  BY event_id
ORDER  BY count(*) DESC, rand()
LIMIT  1000)

ORDER  BY 2 DESC
LIMIT  5;

The manual:

To apply ORDER BY or LIMIT to an individual SELECT, place the
clause inside the parentheses that enclose the SELECT.

Use UNION ALL to keep duplicates.
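
To illustrate the rule quoted from the manual, here is a minimal sketch with made-up literal rows (the derived tables t1 and t2 and the column n are placeholders, not part of the question). The ORDER BY and LIMIT inside each pair of parentheses apply to that SELECT only; the trailing ORDER BY and LIMIT apply to the combined result:

```sql
-- Each parenthesized SELECT keeps its own top 2 rows.
(SELECT n
 FROM  (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) t1
 ORDER  BY n DESC
 LIMIT  2)      -- keeps 3, 2

UNION ALL
(SELECT n
 FROM  (SELECT 10 AS n UNION ALL SELECT 20 UNION ALL SELECT 30) t2
 ORDER  BY n DESC
 LIMIT  2)      -- keeps 30, 20

-- These two clauses apply to the whole UNION result.
ORDER  BY n DESC
LIMIT  3;       -- returns 30, 20, 3
```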


To add the counts for every event_id:

SELECT event_id, sum(amount) AS total
FROM (
   (SELECT event_id, count(*) AS amount
    FROM   pageview 
    GROUP  BY event_id
    ORDER  BY amount DESC, rand()
    LIMIT  1000)
    
    UNION ALL
    (SELECT event_id, count(*)
    FROM   upvote
    GROUP  BY event_id
    ORDER  BY count(*) DESC, rand()
    LIMIT  1000)
    
    UNION ALL
    (SELECT event_id, count(*)
    FROM   attending
    GROUP  BY event_id
    ORDER  BY count(*) DESC, rand()
    LIMIT  1000)
    ) x
GROUP  BY 1
ORDER  BY sum(amount) DESC
LIMIT  5;

The tricky part here is that not every event_id will be present in all three base queries. A solution based on JOIN would have to make sure that no rows are lost and that additions do not turn out NULL.
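
A small sketch of that pitfall, using made-up literal rows instead of the real tables (p and u stand in for the pageview and upvote counts). Event 2 has no upvotes, so the plain addition yields NULL unless the missing count is replaced with 0:

```sql
SELECT p.event_id
      ,p.amount + u.amount              AS naive_total  -- NULL for event 2
      ,p.amount + COALESCE(u.amount, 0) AS safe_total   -- 7 for both events
FROM  (SELECT 1 AS event_id, 4 AS amount
       UNION ALL
       SELECT 2, 7) p
LEFT   JOIN
      (SELECT 1 AS event_id, 3 AS amount) u ON u.event_id = p.event_id;
```

With a plain JOIN instead of LEFT JOIN, event 2 would disappear from the result. And even the LEFT JOIN would lose any event that exists only in u.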

Use UNION ALL, not UNION. You don’t want to remove identical rows, you want to add them up.
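
A minimal sketch of the difference, with made-up literal rows: two base queries that happen to report the same count for the same event.

```sql
-- UNION folds the two identical rows into one: total = 3
SELECT event_id, sum(amount) AS total
FROM  (SELECT 7 AS event_id, 3 AS amount
       UNION
       SELECT 7, 3) x
GROUP  BY event_id;

-- UNION ALL keeps both rows: total = 6
SELECT event_id, sum(amount) AS total
FROM  (SELECT 7 AS event_id, 3 AS amount
       UNION ALL
       SELECT 7, 3) x
GROUP  BY event_id;
```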

x is a table alias, shorthand for AS x. A subquery in the FROM clause is required to have a name. It can be any other name here.

The SQL feature FULL OUTER JOIN is not implemented in MySQL (last time I checked), so you have to make do with UNION. FULL OUTER JOIN would join all three base queries without losing rows.
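
For comparison, this is roughly what the FULL OUTER JOIN variant could look like in a database that supports it. The sketch assumes PostgreSQL syntax and will not run in MySQL. The VALUES lists are made-up stand-ins for the three aggregated base queries:

```sql
-- PostgreSQL syntax assumed; p, u, a stand in for the three base queries.
SELECT event_id
      ,COALESCE(p.amount, 0)
     + COALESCE(u.amount, 0)
     + COALESCE(a.amount, 0) AS total
FROM  (VALUES (1, 4), (2, 7)) AS p(event_id, amount)
FULL   OUTER JOIN (VALUES (1, 3), (3, 5)) AS u(event_id, amount) USING (event_id)
FULL   OUTER JOIN (VALUES (3, 1), (4, 2)) AS a(event_id, amount) USING (event_id)
ORDER  BY total DESC;
-- Events 1 and 2 total 7, event 3 totals 6, event 4 totals 2.
```

All four events survive, although none of them is present in all three sources. COALESCE keeps the missing counts from turning the sum into NULL.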

Answer to follow-up question

SELECT event_id, sum(amount) AS total
FROM (
   (SELECT event_id, count(*) / 100 AS amount
    FROM   pageview ... )
    
    UNION ALL
    (SELECT event_id, count(*) * 5 
    FROM   upvote ... )
    
    UNION ALL
    (SELECT event_id, count(*) * 10
    FROM   attending ... )
    ) x
GROUP  BY 1
ORDER  BY  sum(amount) DESC
LIMIT  5;

Or, to use the base counts in multiple ways:

SELECT event_id
      ,sum(CASE source
              WHEN 'p' THEN amount / 100
              WHEN 'u' THEN amount * 5
              WHEN 'a' THEN amount * 10
              ELSE 0
           END)  AS total
FROM (
   (SELECT event_id, 'p' AS source, count(*) AS amount
    FROM   pageview ... )
    
    UNION ALL
    (SELECT event_id, 'u', count(*)
    FROM   upvote ... )
    
    UNION ALL
    (SELECT event_id, 'a', count(*)
    FROM   attending ... )
    ) x
GROUP  BY 1
ORDER  BY 2 DESC
LIMIT  5;
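
To check the weighting by hand, here is a small sketch that replaces the three base queries with made-up literal rows (the counts are invented for illustration only):

```sql
SELECT event_id
      ,sum(CASE source
              WHEN 'p' THEN amount / 100
              WHEN 'u' THEN amount * 5
              WHEN 'a' THEN amount * 10
              ELSE 0
           END)  AS total
FROM (
    SELECT 1 AS event_id, 'p' AS source, 200 AS amount  -- 200 pageviews
    UNION ALL
    SELECT 1, 'u', 3                                     -- 3 upvotes
    UNION ALL
    SELECT 1, 'a', 1                                     -- 1 attendee
    UNION ALL
    SELECT 2, 'p', 1000                                  -- pageviews only
    ) x
GROUP  BY event_id
ORDER  BY total DESC;
-- event 1: 200/100 + 3*5 + 1*10 = 27
-- event 2: 1000/100             = 10
```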
