Join a count query on generate_series() and retrieve Null values as '0'

Untangled, simplified and fixed, it might look like this:

SELECT to_char(s.tag,'yyyy-mm') AS monat
     , count(t.id) AS eintraege
FROM  (
   SELECT generate_series(min(date_from)::date
                        , max(date_from)::date
                        , interval '1 day'
          )::date AS tag
   FROM   mytable t
   ) s
LEFT   JOIN mytable t ON t.date_from::date = s.tag AND t.version = 1   
GROUP  BY 1
ORDER  BY 1;

db<>fiddle here

Among all the noise, misleading identifiers and unconventional format the actual problem was hidden here:

WHERE version = 1

You made correct use of RIGHT [OUTER] JOIN. But adding a WHERE clause that requires an existing row from mytable converts the RIGHT [OUTER] JOIN to an [INNER] JOIN effectively.

Move that filter into the JOIN condition to make it work.

I simplified some other things while being at it.

Better, yet

SELECT to_char(mon, 'yyyy-mm') AS monat
     , COALESCE(t.ct, 0) AS eintraege
FROM  (
   SELECT date_trunc('month', date_from)::date AS mon
        , count(*) AS ct
   FROM   mytable
   WHERE  version = 1     
   GROUP  BY 1
   ) t
RIGHT JOIN (
   SELECT generate_series(date_trunc('month', min(date_from))
                        , max(date_from)
                        , interval '1 mon')::date
   FROM   mytable
   ) m(mon) USING (mon)
ORDER  BY mon;

db<>fiddle here

It’s much cheaper to aggregate first and join later – joining one row per month instead of one row per day.

It’s cheaper to base GROUP BY and ORDER BY on the date value instead of the rendered text.

count(*) is a bit faster than count(id), while equivalent in this query.

generate_series() is a bit faster and safer when based on timestamp instead of date. See:

Generating time series between two dates in PostgreSQL

Join a count query on generate_series() and retrieve Null values as ‘0’

Better, yet

Leave a Comment Cancel reply

Better, yet

More Related Contents:

Leave a Comment Cancel reply