I have three tables; let's say their names are:
- table1 has over 40 million rows in total, but is partitioned with declarative partitioning (https://www.postgresql.org/docs/10/static/ddl-partitioning.html) based on timestamp (one month per partition, ~5 million rows per partition)
- table2 has ~200k rows
- table3 has ~1,000 rows
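The partitioning setup looks roughly like this (a simplified sketch; the column types and the exact column list are assumptions, not my real schema):

```sql
-- Parent table, range-partitioned on the timestamp column
CREATE TABLE table1 (
    id        bigint,
    hash      text,
    amount    numeric,
    timestamp timestamptz
) PARTITION BY RANGE (timestamp);

-- One partition per month, e.g. for January 2015:
CREATE TABLE table1_2015_01 PARTITION OF table1
    FOR VALUES FROM ('2015-01-01') TO ('2015-02-01');
```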
My query is as follows:
```sql
SELECT table1.timestamp::DATE,
       table2.segment,
       COUNT(table1.timestamp::DATE) AS count,
       ABS(SUM(table1.amount)) AS sum,
       table3.category,
       table3.text
FROM table1
LEFT JOIN table3 ON table1.id = table3.id
LEFT JOIN table2 ON table1.hash = table2.hash
WHERE table1.timestamp::DATE >= '2015-01-01'
  AND table1.timestamp::DATE < '2015-02-01'
  AND table2.segment IN (1, 2)
GROUP BY table1.timestamp::DATE, table2.segment, table3.category, table3.text
ORDER BY table1.timestamp::DATE, sum DESC;
```
What would be the best way to optimize this query? I have indexes on the following columns:
The query currently takes over 2 minutes to complete. When table1 had 5 million rows, this query ran in under 5 seconds, but once it grew to 40 million rows it became slow. So I partitioned it, assuming that with the indexes and with table1 split into ~5-million-row partitions the performance would improve, but it didn't. It is as if it makes no difference whether the table is partitioned or not.
To my knowledge, the WHERE clause should be applied first to limit the result set, so that the JOIN operations only have to join an already reduced number of rows. But this does not seem to happen?
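To see what is actually going on, I look at the plan with EXPLAIN (a sketch using a stripped-down version of my query, to check whether only the January 2015 partition is scanned and where the filters are applied):

```sql
-- ANALYZE actually runs the query; BUFFERS adds I/O statistics
EXPLAIN (ANALYZE, BUFFERS)
SELECT table1.timestamp::DATE, table2.segment
FROM table1
LEFT JOIN table2 ON table1.hash = table2.hash
WHERE table1.timestamp::DATE >= '2015-01-01'
  AND table1.timestamp::DATE < '2015-02-01'
  AND table2.segment IN (1, 2);
```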