Sql – Join a count query on generate_series() and retrieve Null values as ‘0’

generate-series, join, postgresql, postgresql-9.1, sql

I want to count IDs per month using generate_series(). This query runs in PostgreSQL 9.1:

    SELECT (to_char(serie,'yyyy-mm')) AS year, sum(amount)::int AS eintraege
    FROM (
        SELECT
            COUNT(mytable.id) as amount,
            generate_series::date as serie
        FROM mytable
        RIGHT JOIN generate_series(
            (SELECT min(date_from) FROM mytable)::date,
            (SELECT max(date_from) FROM mytable)::date,
            interval '1 day') ON generate_series = date(date_from)
        WHERE version = 1
        GROUP BY generate_series
        ) AS foo
    GROUP BY Year
    ORDER BY Year ASC;

This is my output:

    "2006-12" | 4
    "2007-02" | 1
    "2007-03" | 1

But what I want to get is this output ('0' value in January):

    "2006-12" | 4
    "2007-01" | 0
    "2007-02" | 1
    "2007-03" | 1

Months without any id should still be listed.
Any ideas on how to solve this?

Sample data:

    drop table if exists mytable;
    create table mytable(id bigint, version smallint, date_from timestamp);
    insert into mytable(id, version, date_from) values
    (4084036, 1, '2006-12-22 22:46:35'),
    (4084938, 1, '2006-12-23 16:19:13'),
    (4084938, 2, '2006-12-23 16:20:23'),
    (4084939, 1, '2006-12-23 16:29:14'),
    (4084954, 1, '2006-12-23 16:28:28'),
    (4250653, 1, '2007-02-12 21:58:53'),
    (4250657, 1, '2007-03-12 21:58:53');

Best Solution

Untangled, simplified and fixed, it might look like this:

    SELECT to_char(s.tag,'yyyy-mm') AS monat
         , count(t.id) AS eintraege
    FROM  (
       SELECT generate_series(min(date_from)::date
                            , max(date_from)::date
                            , interval '1 day'
              )::date AS tag
       FROM   mytable t
       ) s
    LEFT   JOIN mytable t ON t.date_from::date = s.tag AND t.version = 1
    GROUP  BY 1
    ORDER  BY 1;


Among all the noise, misleading identifiers and unconventional format, the actual problem was hidden here:

    WHERE version = 1

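You made correct use of RIGHT [OUTER] JOIN. But adding a WHERE clause that requires an existing row from mytable effectively converts the RIGHT [OUTER] JOIN to an [INNER] JOIN. For days without a matching row, version is NULL, so version = 1 is not true and the row is filtered out after the join.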

Move that filter into the JOIN condition to make it work.
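To see the effect in isolation, here is a minimal sketch. The temp tables demo_days and demo_rows are hypothetical and made up for illustration; they are not part of the question's schema.

```sql
-- Hypothetical demo tables, not part of the question's schema.
CREATE TEMP TABLE demo_days(d date);
INSERT INTO demo_days VALUES ('2007-01-01'), ('2007-02-01');

CREATE TEMP TABLE demo_rows(d date, version int);
INSERT INTO demo_rows VALUES ('2007-02-01', 1);

-- Filter in WHERE: the NULL-extended row for 2007-01-01 has
-- t.version IS NULL, fails "t.version = 1" and is dropped.
SELECT s.d, count(t.d) AS ct
FROM   demo_days s
LEFT   JOIN demo_rows t ON t.d = s.d
WHERE  t.version = 1
GROUP  BY s.d
ORDER  BY s.d;
-- 2007-02-01 | 1

-- Filter in the join condition: the outer join still keeps
-- every row from demo_days, and count() of the missing match is 0.
SELECT s.d, count(t.d) AS ct
FROM   demo_days s
LEFT   JOIN demo_rows t ON t.d = s.d AND t.version = 1
GROUP  BY s.d
ORDER  BY s.d;
-- 2007-01-01 | 0
-- 2007-02-01 | 1
```

The only difference between the two statements is where the version filter lives.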

I simplified some other things while I was at it.

Better yet

    SELECT to_char(mon, 'yyyy-mm') AS monat
         , COALESCE(t.ct, 0) AS eintraege
    FROM  (
       SELECT date_trunc('month', date_from)::date AS mon
            , count(*) AS ct
       FROM   mytable
       WHERE  version = 1
       GROUP  BY 1
       ) t
    RIGHT JOIN (
       SELECT generate_series(date_trunc('month', min(date_from))
                            , max(date_from)
                            , interval '1 mon')::date
       FROM   mytable
       ) m(mon) USING (mon)
    ORDER  BY mon;


It's much cheaper to aggregate first and join later - joining one row per month instead of one row per day.
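To get a feel for the difference, this sketch counts the rows each series produces for the date range of the sample data. The bounds are hard-coded literals taken from the sample rows, purely for illustration.

```sql
-- One row per day vs. one row per month over the same range.
SELECT (SELECT count(*)
        FROM   generate_series(timestamp '2006-12-22'
                             , timestamp '2007-03-12'
                             , interval '1 day')) AS day_rows      -- 81
     , (SELECT count(*)
        FROM   generate_series(timestamp '2006-12-01'
                             , timestamp '2007-03-12'
                             , interval '1 month')) AS month_rows; -- 4
```

With real data spanning years, the gap between the two grows accordingly.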

It's cheaper to base GROUP BY and ORDER BY on the date value instead of the rendered text.
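A minimal sketch of that idea against the sample table from the question: group and sort on the truncated date, and render the text only in the SELECT list. The subquery alias sub is arbitrary.

```sql
-- GROUP BY / ORDER BY operate on the date column "mon";
-- to_char() only formats the final output.
SELECT to_char(mon, 'yyyy-mm') AS monat
     , count(*) AS eintraege
FROM  (
   SELECT date_trunc('month', date_from)::date AS mon
   FROM   mytable
   WHERE  version = 1
   ) sub
GROUP  BY mon
ORDER  BY mon;
```

This variant has no generate_series(), so months without rows are still missing; it only illustrates the grouping key.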

count(*) is a bit faster than count(id), while equivalent in this query.
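The two only differ when the counted column can be NULL, which is why they are interchangeable in the aggregating subquery as long as id is never NULL (an assumption; the sample table does not declare it NOT NULL). A small sketch with made-up values:

```sql
-- count(*) counts rows, count(id) counts non-NULL values of id.
SELECT count(*)  AS all_rows      -- 3
     , count(id) AS non_null_ids  -- 2
FROM  (VALUES (1), (NULL), (3)) v(id);
```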

generate_series() is a bit faster and safer when based on timestamp instead of date.
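A short sketch of the timestamp-based form, with literal bounds taken from the sample data. As far as I know there is no generate_series() variant that takes date arguments directly, so date input is resolved to one of the timestamp variants; the second statement lets you check which type you actually get on your installation.

```sql
-- timestamp in, timestamp out; cast to date only at the end.
SELECT generate_series(timestamp '2006-12-01'
                     , timestamp '2007-03-12'
                     , interval '1 month')::date AS mon;

-- Check what date input resolves to.
SELECT pg_typeof(g) AS series_type
FROM   generate_series(date '2006-12-01'
                     , date '2007-03-12'
                     , interval '1 month') g
LIMIT  1;
```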