- Notifications
You must be signed in to change notification settings - Fork67
Partitioning by expression
As of release 1.4, pg_pathman supports partitioning by expression (both RANGE and HASH partitioning schemes).
/* we're going to store something like { "key": 100, "value": "abcdef...." }*/createtabletest(col jsonbnot null);/* create 10 partitions, each will contain 10.000 unique keys*/select create_range_partitions('test','(col->>''key'')::int8',1,10000,10);/* insert some data*/insert into testselect format('{"key": %s, "date": "%s", "value": "%s"}', i,current_date, md5(i::text))::jsonbfrom generate_series(1,10000*10)as g(i);
Now that we've created a table, let's look at the contents ofpathman_config
:
select partrel, expr, parttype, range_intervalfrom pathman_config; partrel | expr | parttype | range_interval---------+---------------------------------+----------+---------------- test | ((col->>'key'::text))::bigint |2 |10000(1 row)
As you can see, theexpr
column contains a textual representation of our partitioning expression. The expression's AST is also stored inpathman_config
, but it's not supposed to be human-readable:
select cooked_exprfrom pathman_config;-[ RECORD1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------cooked_expr | {COERCEVIAIO :arg {OPEXPR :opno3477 :opfuncid3214 :opresulttype25 :opretset false :opcollid100 :inputcollid100 :args ({VAR :varno1 :varattno1 :vartype3802 :vartypmod-1 :varcollid0 :varlevelsup0 :varnoold1 :varoattno1 :location9} {CONST :consttype25 :consttypmod-1 :constcollid100 :constlen-1 :constbyval false :constisnull false :location15 :constvalue7 [28000107101121 ]}) :location12} :resulttype20 :resultcollid0 :coerceformat1 :location21}
Now let's take a glance at partitions:
table pathman_partition_list ; parent | partition | parttype | expr | range_min | range_max--------+-----------+----------+---------------------------------+-----------+----------- test | test_1 |2 | ((col->>'key'::text))::bigint |1 |10001 test | test_2 |2 | ((col->>'key'::text))::bigint |10001 |20001 test | test_3 |2 | ((col->>'key'::text))::bigint |20001 |30001 test | test_4 |2 | ((col->>'key'::text))::bigint |30001 |40001 test | test_5 |2 | ((col->>'key'::text))::bigint |40001 |50001 test | test_6 |2 | ((col->>'key'::text))::bigint |50001 |60001 test | test_7 |2 | ((col->>'key'::text))::bigint |60001 |70001 test | test_8 |2 | ((col->>'key'::text))::bigint |70001 |80001 test | test_9 |2 | ((col->>'key'::text))::bigint |80001 |90001 test | test_10 |2 | ((col->>'key'::text))::bigint |90001 |100001(10 rows)
Each partition has a tricky check constraint that stores partition's configuration (bounds / hash + number):
select conname, pg_get_constraintdef(oid) condeffrom pg_constraintwhere conrelid='test_1'::regclass;-[ RECORD1 ]---------------------------------------------------------------------------------------------------------------conname | pathman_test_1_checkcondef |CHECK (((((col->>'key'::text))::bigint>='1'::bigint)AND (((col->>'key'::text))::bigint<'10001'::bigint)))
It's relatively easy to figure out whether pg_pathman is going to optimize a query... provided that expression consists of a single column:
createtableabc(valintnot null);select create_hash_partitions('abc','val',5);/* partition by 'val'*/explainselect*from abcwhere val=1; QUERY PLAN------------------------------------------------------------- Append (cost=0.00..41.88 rows=13 width=4)-> Seq Scanon abc_0 (cost=0.00..41.88 rows=13 width=4) Filter: (val=1)(3 rows)
Things get much more complicated once you use an intricate composite expression (like the one we've introduced in previous sections). pg_pathman won't be able to optimize a query unless we reference theexact same expression we used to create partitions:
explainselect*from testwhere ((col->>'key'::text))::int=4; QUERY PLAN----------------------------------------------------------------- Append (cost=0.00..6620.00 rows=500 width=32)-> Seq Scanon test_1 (cost=0.00..662.00 rows=50 width=91) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_2 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_3 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_4 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_5 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_6 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_7 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_8 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_9 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)-> Seq Scanon test_10 (cost=0.00..662.00 rows=50 width=92) Filter: (((col->>'key'::text))::integer=4)(21 rows)
Something is wrong, isn't int? Looks like we used a wrong expression:
select exprfrom pathman_configwhere partrel='test'::regclass; expr--------------------------------- ((col->>'key'::text))::bigint/* we used ((col ->> 'key'::text))::integer*/(1 row)
Everything works just as expected once we change the expression:
explainselect*from testwhere ((col->>'key'::text))::bigint=4; QUERY PLAN---------------------------------------------------------------- Append (cost=0.00..662.00 rows=50 width=32)-> Seq Scanon test_1 (cost=0.00..662.00 rows=50 width=91) Filter: (((col->>'key'::text))::bigint=4)(3 rows)
There are several limitations:
- Expression must reference at least one column of partitioned table;
- All referenced columns must be marked NOT NULL;
- Expression may not reference system attributes (oid, xmin, xmax etc);
- Expression may not include sub-queries;
- All functions used by expression must be marked IMMUTABLE;