It's me again. Sorry, as you can guess, I'm very new at this, but I'm going to try and give you all the info at once this time.
I have a table called stats, where I store every action on a certain website (basically when the user clicks a link, data is saved in the db). The data is modelled with postgres as
( id integer NOT NULL DEFAULT nextval('stats_id_seq'::regclass), company integer NOT NULL, layout integer NOT NULL, browsername character varying(50) COLLATE pg_catalog."default", lang character varying(5) COLLATE pg_catalog."default" NOT NULL, ...)
there are other fields that I don't think are relevant. Basically what I want to do is see which are the most used browsers for a certain company, by counting how many pages are visited by that browser and how many users there are for a given browser.
My query is:
select browsername, count(DISTINCT phpsessid) as visitors, count(*) as pages from public.stats where company=1 and createdate>'2017-01-01' and createdate<'2018-12-12' group by browsername order by visitors desc;
the stats table contains around 80 millions rows. Now, this query is extremely fast if I set company=1 or company=222 (it's the company ID). It takes less than half a second to fetch around 20k rows. However, it is EXTREMELY slow if I set company=13549, for instance (we're talking about literal hours here). Obviously something is wrong, either in the data modelling or in the way I query.
How come there's such a difference for different companies? The DB was not done by me so I apologize if I left something useful out, and feel free to ask.
The indexes are:
CREATE INDEX stats_new_company_13549_index ON public.stats USING btree (company, createdate) TABLESPACE pg_default WHERE company = 13549 CREATE INDEX stats_new_company_14863_index ON public.stats USING btree (company, createdate) TABLESPACE pg_default WHERE company = 14863 CREATE INDEX stats_new_company_createdate_cet_index ON public.stats USING btree (company, date(timezone('CET'::text, createdate))) TABLESPACE pg_default; CREATE INDEX stats_new_company_createdate_index_1 ON public.stats USING btree (company, createdate) TABLESPACE pg_default;
Here's the plan for the fast query: https://explain.depesz.com/s/uKmJ
And here's the plan for the slow one: https://explain.depesz.com/s/wysA
Just by looking at the plans you can see it took several minutes for the second one to even query the explain.
I also noticed that after running the explain a couple of times, it got done in reasonable amounts the third time. From that moment onwards, the query would also drastically reduce its execution time, from hours to a mere 4 seconds. It's the second time it happens, and I swear I'm not crazy. If I change the company ID once again, the query takes hours again. I'm at lost here: is there some index problem?
The fast query retrieves and sorts 26229 rows - for that small number of rows, the sorting can be done in memory, so obviously this is going to be quick. First because retrieving the data only takes ~500ms and then the sorting is done in 50ms.
The slow query retrieves 560135 rows (20 times as many as the first query) but the time it wook - 149939ms - seems quite slow. Maybe your table (or index) is bloated - the number of blocks needed to read that number of rows is way too high I think.
You can run
vacuum full analyze public.stats; and see if performance gets better after that.
Or maybe you simply have a very slow harddisk.
The sorting was done on disk, but that only added another 3 seconds to the total runtime.
The question why the same query is sometimes fast and sometimes quick (especially when run the second time) is more often that not answered with: caching effects. When you re-run the second query you will probably see a lot of the "shared read=318147" information turn to "shared hit=..." which means those blocks were already in the cache.