Why full text search on table with GIN index is still very slow

by AdjunctProfessorFalcon   Last Updated October 19, 2018 06:06 AM

With what I've been able to gather thus far, if you need to run a full text search on a table in a PostgreSQL database with lots of entries (say on the order 1.2M+), the recommended approach is to create an index for that table (in this case we've created a GIN index) which should allow you to run a query like this:

SELECT * FROM speech WHERE speech_tsv @@ plainto_tsquery('a text string')

In addition to the results of this query sometimes not containing anything related search string, it often will take anywhere from 8-10 seconds.

The db is deployed on a fairly sizable, multicore EC2 instance, so I'm thinking, is it possible there's something else we could do to the db to help these queries run faster?

Or this the query execution time about reasonable given the huge amount of files and text we're asking it to search through (even via indexing)?

Here's what the table looks like:

                                     Table "public.speech"
    Column     |            Type             |                      Modifiers                      
---------------+-----------------------------+-----------------------------------------------------
 speech_id     | integer                     | not null default nextval('speech_id_seq'::regclass)
 speechtype_id | smallint                    | not null
 title         | character varying           | not null default ''::character varying
 speechdate    | date                        | default now()
 location      | character varying           | not null default ''::character varying
 source        | character varying           | not null default ''::character varying
 speechtext    | text                        | not null
 url           | character varying           | not null default ''::character varying
 release_id    | smallint                    | 
 created       | timestamp without time zone | 
 modified      | timestamp without time zone | 
 speech_tsv    | tsvector                    | 
 key           | boolean                     | 
 summary       | text                        | 
 quote         | text                        | 
Indexes:
    "speech_pk" PRIMARY KEY, btree (speech_id)
    "speech__release_id" btree (release_id)
    "speech__speech_tsv" gin (speech_tsv)
    "speech__speechdate" btree (speechdate)
    "speech__speechtype_id" btree (speechtype_id)

(speechtext would be the column holding all the text to be searched, obviously)



Related Questions





PostgreSQL: Index Trigger / Hook? For Synonyms

Updated September 14, 2017 17:06 PM