Monday, July 25

Function based indexes in Postgres

I actually found something that PostgreSQL does better than Oracle (or at least they
take a different approach that proves quite good in our use case).

Lets say w have a table with ticket numbers and most of the queries run against the table is to determine if a ticket is valid or not and the condition is that VALID='Y' and that REDEEM_DATE is not set. Simple, lets create an index for just those conditions. I'm not claiming this can't be done in Oracle, but to be honest, the approach in Postgres is so much simpler for such a basic scenario.

Example DDL
CREATE TABLE tickets (
ID serial,
TICKET_CODE char(10),
ISSUE_DATE timestamp,
VALID char(1),
REDEEM_DATE timestamp,
CAMPAIGN_ID int
);

[ insert data (90.000 rows in this case) ]

CREATE INDEX tickets_valid_idx ON tickets (redeem_date,valid)
WHERE valid='Y' AND redeem_date is null;

ANALYZE TICKETS;

Ok, lets see the execution plans (hail GEQO).
Compare the cost figures for the first (heavily used) query with the negation of the second query (which is never run except in this test case).
testdb=> explain select count(*) from tickets where 
VALID='Y' and redeem_date is null;


QUERY PLAN
-------------------------------------------------------
Aggregate (cost=192.91..192.91 rows=1 width=0)
-> Index Scan using tickets_valid_idx on tickets
(cost=0.00..192.79 rows=48 width=0)
Filter: (("valid" = 'Y'::bpchar)
AND (redeem_date IS NULL))
(3 rows)

testdb=> explain select count(*) from tickets where
VALID='N' and redeem_date is null;


QUERY PLAN
-------------------------------------------------------
Aggregate (cost=2715.14..2715.14 rows=1 width=0)
-> Seq Scan on tickets
(cost=0.00..2714.12 rows=403 width=0)
Filter: (("valid" = 'N'::bpchar)
AND (redeem_date IS NULL))
(3 rows)
The downside is of course as with all indexes that updates and insert to the table will be a bit slower. We do bulk inserts so that's not a huge problem. One concern tho is that when a ticket is used we set VALID to U and redeem_date to now() so that the row in question must be removed from the index, this could lead to quite fragmented indexes. Might be worth keeping an eye on it for the first few months.

1 comment:

KajMagnus said...

Thanks for your post. I found it useful, when converting my database from Oracle to PG.

Do you know how to create function based indexes in Oracle? I suppose there's *someone* who don't know. So here is an example of a function based index, in Oracle, that indexes only certain rows. It's a unique index that I use to ensure there's only one canonical host, per tenant, in my table of tenants and hosts.

-- Make sure there's only one canonical host per tenant.
create unique index DW1_TNTHSTS_TENANT_CNCL__U on DW1_TENANT_HOSTS(
case when CANONICAL = 'T' then TENANT else NULL end,
case when CANONICAL = 'T' then 'T' else NULL end);

Here is the same index in Postgre SQL, I agree Postgre SQL is simpler in this case:

create unique index DW1_TNTHSTS_TENANT_CNCL__U on DW1_TENANT_HOSTS(TENANT)
where CANONICAL = 'T';

Best regards, KajMagnus