Movatterモバイル変換


[0]ホーム

URL:



Facebook
Postgres Pro
Facebook
Downloads
F.37. pg_pathman
Prev UpAppendix F. Additional Supplied Modules and Extensions Shipped inpostgrespro-std-14-contribHome Next

F.37. pg_pathman

Important

Starting fromPostgres Pro 12, usingpg_pathman is not recommended. Use vanilla declarative partitioning instead, as described inSection 5.11.

Thepg_pathman is aPostgres Pro extension that provides an optimized partitioning solution for large and distributed databases. Usingpg_pathman, you can:

  • Partition large databases without downtime.

  • Speed up query execution for partitioned tables.

  • Manage existing partitions and add new partitions on the fly.

  • Add foreign tables as partitions.

  • Join partitioned tables for read and write operations.

The extension is compatible withPostgres Pro 9.5 or higher.

F.37.1. Installation and Setup

Thepg_pathman extension is included into thePostgres Pro. Once you havePostgres Pro installed, complete the following steps to enablepg_pathman:

  1. Addpg_pathman to theshared_preload_libraries variable in thepostgresql.conf file:

    shared_preload_libraries = 'pg_pathman'

    Important

    pg_pathman may have conflicts with other extensions that use the same hook functions. For example,pg_pathman may interfere with thepg_stat_statements extension as they both useProcessUtility_hook. To avoid such issues,pg_pathman must always be the last in the list of libraries:shared_preload_libraries = 'pg_stat_statements, pg_pathman'

  2. Restart thePostgres Pro instance for the settings to take effect.

  3. Create thepg_pathman extension as follows:

    CREATE SCHEMA pathman;GRANT USAGE ON SCHEMA pathman TO PUBLIC;CREATE EXTENSION pg_pathman WITH SCHEMA pathman;

    Important

    To ensure that your calls topg_pathman's functions are always secure againstsearch_path-based attacks (seeCREATE EXTENSION for details), install it only into a clean schema where nobody except superusers has theCREATE privilege for database objects.

Oncepg_pathman is enabled, you can start partitioning tables.

Note

During installation,pg_pathman creates a few RLS policies to restrict access to its own tables.Postgres Pro core, however, does not support dump/restore of databases where extensions issuingCREATE POLICY statements are installed. Therefore, when restoring a dump of a database wherepg_pathman is installed, you will get error messages such as:

ERROR: policy "allow_select" for table "pathman_config" already exists

Ignore them since they do not affect whether the data being restored is complete.

Tip

You can also buildpg_pathman from source code by executing the following command in thepg_pathman directory:

make install USE_PGXS=1

When this operation is complete, follow the steps described above to complete the setup.

In addition, do not forget to set thePG_CONFIG variable if you want to testpg_pathman on a custom build ofPostgres Pro. For details, seeBuilding and Installing PostgreSQL Extension Modules.

You can togglepg_pathman or its specific custom nodes on and off usingGUC variables. For details, seeSection F.37.5.1.

If you want to permanently disablepg_pathman for a previously partitioned table, use thedisable_pathman_for() function:

SELECT disable_pathman_for('range_rel');

All sections and data will remain unchanged and will be handled by the standard Postgres Pro inheritance mechanism.

F.37.1.1. Updating pg_pathman

If you already have a previous version ofpg_pathman installed, complete the following steps to upgrade to a newer version:

  1. InstallPostgres Pro.

  2. Restart yourPostgres Pro cluster.

  3. If you are running a previous major version ofpg_pathman (the second digit in the version number is different), complete the update as follows:

    ALTER EXTENSION pg_pathman UPDATE TOversion;SET pg_pathman.enable = t;

    whereversion is thepg_pathman major version number, such as 1.5.

    You can check the currentpg_pathman version by running thepathman_version() function.

F.37.2. Usage

Choosing Partitioning Strategies

Running Non-Blocking Data Migration

Partitioning by a Single Expression

Partitioning by Composite Key

Running Multilevel Partitioning

Managing Partitions

As your database grows, indexing mechanisms may become inefficient and cause high latency as you run queries. To improve performance, ensure scalability, and optimize database administration processes you can use partitioning — splitting a large table into smaller pieces, with each row moved to a single partition according to the partitioning key.

Traditionally,Postgres Pro has supported partitioning via table inheritance, with each partition created as a child table with a CHECK constraint. InPostgres Pro 10, support fordeclarative partitioning was added, which also relies on inheritance. With these approaches, the query planner has to perform an exhaustive search and check constraints on each partition to build a query plan, which may slow down queries for tables with a large number of partitions. Thepg_pathman extension uses an optimized planning algorithms and partitioning functions based on the internal structure of the partitioned tables, which allows to achieve better performance results. For details onpg_pathman implementation specifics, seeSection F.37.4.

F.37.2.1. Choosing Partitioning Strategies

Thepg_pathman extension supports the following partitioning strategies:

  • Hash — maps rows to partitions using a generic hash function. Choose this strategy if most of your queries will be of the exact-match type.

  • Range — maps rows to partitions based on partitioning key ranges assigned to each partition. Choose this strategy if your database contains numeric data that you are likely to query or manage by ranges. For example, you may want to query historical data by years, or review experiment results by specific numeric ranges. To achieve performance gains,pg_pathman uses the binary search algorithm.

By default,pg_pathman migrates all data from the parent table to the newly created partitions at once (blocking partitioning). This approach enables you to restructure the table in a single transaction, but may cause downtime if you have a lot of data. If it is critical to avoid downtime, you can useconcurrent partitioning. In this case,pg_pathman writes all the updates to the newly created partitions, but keeps the original data in the parent table until you explicitly migrate it. This enables you to partition large databases without downtime, as you can choose convenient time for migration and copy data in small batches without blocking other transactions. For details on concurrent partitioning, seeSection F.37.2.2.

F.37.2.1.1. Setting up Hash Partitioning

To perform hash partitioning withpg_pathman, run thecreate_hash_partitions() function:

create_hash_partitions(parent_relid     REGCLASS,                       expression       TEXT,                       partitions_count INTEGER,                       partition_data   BOOLEAN DEFAULT TRUE,                       partition_names  TEXT[] DEFAULT NULL,                       tablespaces      TEXT[] DEFAULT NULL)

Thepg_pathman module creates the specified number of partitions based on the hash function. Optionally, you can specify partition names and tablespaces by settingpartition_names andtablespaces options, respectively.

You cannot add or remove partitions after the parent table is split. If required, you can replace the specified partition with another table:

replace_hash_partition(old_partition       REGCLASS,                       new_partition       REGCLASS,                       lock_parent         BOOL DEFAULT TRUE);

When set totrue,lock_parent parameter will prevent anyINSERT/UPDATE/ALTER TABLE queries to parent table.

If you omit the optionalpartition_data parameter or set it totrue, all the data from the parent table gets migrated to partitions. Thepg_pathman module blocks the table for other transactions until data migration completes. To avoid downtime, you can set thepartition_data parameter tofalse and later use thepartition_table_concurrently() function to migrate your data to partitions without blocking other queries. For details, see theSection F.37.2.2.

F.37.2.1.2. Setting up Range Partitioning

Thepg_pathman module provides thecreate_range_partitions() for range partitioning. This function creates partitions based on the specified interval and the initial partitioning key value. New partitions are created automatically when you insert data outside of the already covered range.

create_range_partitions(parent_relid   REGCLASS,                        expression     TEXT,                        start_value    ANYELEMENT,                        p_interval     ANYELEMENT | INTERVAL,                        p_count        INTEGER DEFAULT NULL,                        partition_data BOOLEAN DEFAULT TRUE)

Thepg_pathman module creates partitions based on the specified parameters. If you omit the optionalp_count parameter,pg_pathman calculates the required number of partitions based on the specified start value and interval. If you insert new data outside of the existing partition range,pg_pathman creates new partitions automatically, keeping the specified interval. This approach ensures that all partitions are of the same size, which can improve query performance and facilitate database management.

Alternatively, you can specify an array defining the bounds of partitions to be created using thebounds parameter:

create_range_partitions(parent_relid    REGCLASS,                        expression      TEXT,                        bounds          ANYARRAY,                        partition_names TEXT[] DEFAULT NULL,                        tablespaces     TEXT[] DEFAULT NULL,                        partition_data  BOOLEAN DEFAULT TRUE)

If required, you can also usepartition management functions to add partitions manually. For example, if there is a gap between the created partitions,pg_pathman cannot fill it with a new partition in an automated mode.

By default, all the data from the parent table gets migrated to the specified number of partitions. Thepg_pathman module blocks the table for other transactions until data migration completes. To avoid downtime, you can set thepartition_data parameter tofalse and later use thepartition_table_concurrently() function to migrate your data to partitions without blocking other queries. For details, see theSection F.37.2.2.

F.37.2.2. Running Non-Blocking Data Migration

If it is critical to avoid downtime, you can perform concurrent partitioning by setting thepartition_data parameter of the partitioning function tofalse. In this case,pg_pathman creates empty partitions, keeping all the original data in the parent table. At the same time, all the database updates are written to the newly created partitions. You can later migrate the original data to partitions without blocking other queries using thepartition_table_concurrently() function:

partition_table_concurrently(relation   REGCLASS,                             batch_size INTEGER DEFAULT 1000,                             sleep_time FLOAT8 DEFAULT 1.0)

where:

  • relation is the parent table.

  • batch_size is the number of rows to copy from the parent table to partitions at a time. You can set this parameter to any integer value from 1 to 10000.

  • sleep_time is the time interval between migration attempts, in seconds.

Thepg_pathman module starts a background worker to move the data from the parent table to partitions in small batches of the specifiedbatch_size. If one or more rows in the batch are locked by other queries,pg_pathman waits for the specifiedsleep_time and tries again, up to 60 times. You can monitor the migration process in thepathman_concurrent_part_tasks view that shows the number of rows migrated so far:

[user]postgres: select * from pathman_concurrent_part_tasks ; userid |  pid  | dbid  | relid | processed | status--------+-------+-------+-------+-----------+--------- user   | 20012 | 12413 | test  |    334000 | working(1 row)

If you need to stop data migration, run thestop_concurrent_part_task() function at any time:

SELECT stop_concurrent_part_task(relation REGCLASS);

pg_pathman completes the migration of the current batch and terminates the migration process.

Tip

Whenpg_pathman migrates all the data from the parent table, you can exclude the parent table from the query plan. See theset_enable_parent() function description for details.

F.37.2.3. Partitioning by a Single Expression

For both range and hash partitioning strategies,pg_pathman supports partitioning by expression that returns a single scalar value. The partitioning expression can reference a table column, as well as calculate the partitioning key based on one or more column values.

Tip

If you would like to partition a table by a tuple, seeSection F.37.2.4.

To partition a table by expression, usepg_pathmanpartitioning functions. The partitioning expression must satisfy the following conditions:

  • Expression must reference at least one column of the partitioned table.

  • All referenced columns must be marked asNOT NULL.

  • Expression cannot reference system attributes, such asoid,xmin,xmax, etc.

  • Expression cannot include subqueries.

  • All functions used by expression must be marked asIMMUTABLE.

As the expression can return a value of virtually any type, make sure to convert it to the type you need for partitioning.

To access a partition, you must use the exact expression used for partitioning. Otherwise,pg_pathman cannot optimize the query. You can view the partitioning expression for each partitioned table in thepathman_config table.

F.37.2.3.1. Examples

Suppose you have thetest table that stores somejsonb data:

CREATE TABLE test(col jsonb NOT NULL);INSERT INTO testSELECT format('{"key": %s, "date": "%s", "value": "%s"}',              i, current_date, md5(i::text))::jsonbFROM generate_series(1, 10000 * 10) as g(i);

To partition this data by range of thekey value, you need to extract this value from thejsonb object and convert it to a numeric type, such asbigint:

SELECT create_range_partitions('test', '(col->>''key'')::bigint', 1, 10000, 10);

pg_pathman splits the parent table into ten partitions, with each partition storing 10000 rows:

SELECT * FROM pathman_partition_list; parent | partition | parttype |              expr               | range_min | range_max--------+-----------+----------+---------------------------------+-----------+----------- test   | test_1    |        2 | ((col ->> 'key'::text))::bigint | 1         | 10001 test   | test_2    |        2 | ((col ->> 'key'::text))::bigint | 10001     | 20001 test   | test_3    |        2 | ((col ->> 'key'::text))::bigint | 20001     | 30001 test   | test_4    |        2 | ((col ->> 'key'::text))::bigint | 30001     | 40001 test   | test_5    |        2 | ((col ->> 'key'::text))::bigint | 40001     | 50001 test   | test_6    |        2 | ((col ->> 'key'::text))::bigint | 50001     | 60001 test   | test_7    |        2 | ((col ->> 'key'::text))::bigint | 60001     | 70001 test   | test_8    |        2 | ((col ->> 'key'::text))::bigint | 70001     | 80001 test   | test_9    |        2 | ((col ->> 'key'::text))::bigint | 80001     | 90001 test   | test_10   |        2 | ((col ->> 'key'::text))::bigint | 90001     | 100001(10 rows)

F.37.2.4. Partitioning by Composite Key

Usingpg_pathman, you can also perform range partitioning by composite key. A composite key consists of two or more comma-separated values, which can be columns or expressions extracting the values from the table. The expressions defining the composite key must satisfy the conditions described inSection F.37.2.3.

Althoughpg_pathman does not support automatic partition creation by composite key, you can add partitions using theadd_range_partition() function. A typical workflow is as follows:

  1. Enable automatic partition naming for your table by running thecreate_naming_sequence() function.

  2. Create a composite partitioning key.

  3. Register a table you are going to partition withpg_pathman using theadd_to_pathman_config() function.

  4. Add a partition based on the defined composite partitioning key using theadd_range_partition() function.

F.37.2.4.1. Examples

Suppose you have thetest table that stores some temporal data:

CREATE TABLE test (logdate date NOT NULL, comment text);

To partition this data by month and year, you have to create a composite key:

CREATE TYPE test_key AS (year float8, month float8);

To enable automatic partition naming, run thecreate_naming_sequence() function passing the table name as an argument:

SELECT create_naming_sequence('test');

Register thetest table withpg_pathman, specifying the partitioning key you are going to use:

SELECT add_to_pathman_config('test',                             '( extract(year from logdate),                                extract(month from logdate) )::test_key',                             NULL);

Create a partition that includes all the data in the range of ten years, starting from January of the current year:

SELECT add_range_partition('test',                           (extract(year from current_date), 1)::test_key,                           (extract(year from current_date + '10 years'::interval), 1)::test_key);

F.37.2.5. Running Multilevel Partitioning

pg_pathman supports multilevel partitioning for both hash and range partitioning strategies. You can use partitioning strategies in any combination: a hash- or range-partitioned table can be further partitioned by both hash or range.

To split an existing partition into several child ones, use the regularpg_pathman partitioning functions as explained inSection F.37.2.1, passing the name of the partition to be split as theparent_relid parameter. You can check the exact partition names in thepathman_partition_list view.

When opting for the range-range partitioning combination, you can either choose a different partitioning expression, or use the same expression as for the parent table. In the latter case, if the selected range is larger than that of the parent partition, only those child partitions that intersect with the parent range will be in use. Other child partitions will remain empty unless their parent is merged with an adjacent partition that covers at least a part of their range.

F.37.2.5.1. Examples

Suppose you have thejournal table with some logs, which is partitioned by month:

-- create an empty tableCREATE TABLE journal (id      SERIAL,dt      TIMESTAMP NOT NULL,level   INTEGER,msg     TEXT);-- generate some log data into the tableINSERT INTO journal (dt, level, msg)SELECT g, random() * 6, md5(g::text)FROM generate_series('2015-01-01'::date, '2015-12-31'::date, '1 minute') as g;-- partition the table by rangeSELECT create_range_partitions('journal', 'dt', '2015-01-01'::date, '1 month'::interval);

If having smaller partitions makes more sense at some point, you can further split the partitions by hash or range. For example, to split thejournal_1 partition into subpartitions by day, run:

SELECT create_range_partitions('journal_1', 'dt', '2015-01-01'::date, '1 day'::interval);

Similarly, you can use hash partitioning to create child partitions. For example, split thejournal_2 partition into five partitions by hash using theid column as the partitioning key:

SELECT create_hash_partitions('journal_2', 'id', '5');

F.37.2.6. Managing Partitions

pg_pathman provides multiple functions for easy partition management. For details, seeSection F.37.5.3.4.

F.37.3. Examples

F.37.3.1. Common Tips

  • You can addpartition column containing the names of the underlying partitions using the system attribute calledtableoid:

    SELECT tableoid::regclass AS partition, * FROM partitioned_table;
  • Though indices on a parent table are not particularly useful (since the parent table is supposed to be empty), they act as prototypes for indices on partitions. For each index on the parent table,pg_pathman creates a similar index on each partition.

  • All running concurrent partitioning tasks can be listed using thepathman_concurrent_part_tasks view:

    SELECT * FROM pathman_concurrent_part_tasks;userid  | pid  | dbid  | relid | processed | status--------+------+-------+-------+-----------+---------user    | 7367 | 16384 | test  |    472000 | working(1 row)
  • Thepathman_partition_list in conjunction withdrop_range_partition() can be used to drop range partitions in a more flexible way compared toDROP TABLE:

    SELECT drop_range_partition(partition, false) /* move data to parent */FROM pathman_partition_listWHERE parent = 'part_test'::regclass AND range_min::int < 500;NOTICE:  1 rows copied from part_test_11NOTICE:  100 rows copied from part_test_1NOTICE:  100 rows copied from part_test_2drop_range_partition ----------------------dummy_test_11dummy_test_1dummy_test_2(3 rows)
  • You can turn foreign tables into partitions using theattach_range_partition() function. Rows that were meant to be inserted into the parent will be redirected to foreign partitions usingPartitionFilter. By default, it is only allowed to insert rows into partitions provided bypostgres_fdw. This setting is controlled by thepg_pathman.insert_into_fdw variable. You must have superuser rights to change this setting.

F.37.3.2. Hash Partitioning

Consider an example of hash partitioning. First create a table with an integer column:

CREATE TABLE items (id       SERIAL PRIMARY KEY,name     TEXT,code     BIGINT);INSERT INTO items (id, name, code)SELECT g, md5(g::text), random() * 100000FROM generate_series(1, 100000) as g;

Now run thecreate_hash_partitions() function with appropriate arguments:

SELECT create_hash_partitions('items', 'id', 100);

This will create new partitions and move the data from the parent table to partitions.

Here is an example of the query performing filtering by partitioning key:

SELECT * FROM items WHERE id = 1234;  id  |               name               | code------+----------------------------------+------ 1234 | 81dc9bdb52d04dc20036dbd8313ed055 | 1855(1 row)EXPLAIN SELECT * FROM items WHERE id = 1234;QUERY PLAN------------------------------------------------------------------------------------Append  (cost=0.28..8.29 rows=0 width=0)->  Index Scan using items_34_pkey on items_34  (cost=0.28..8.29 rows=0 width=0)Index Cond: (id = 1234)

Notice that theAppend node contains only one child scan, which corresponds to theWHERE clause.

Important

Pay attention to the fact thatpg_pathman excludes the parent table from the query plan.

To access the parent table, use theONLY modifier:

EXPLAIN SELECT * FROM ONLY items;QUERY PLAN------------------------------------------------------Seq Scan on items  (cost=0.00..0.00 rows=1 width=45)

F.37.3.3. Range Partitioning

Consider an example of range partitioning. Let's create a table containing some dummy logs:

CREATE TABLE journal (id      SERIAL,dt      TIMESTAMP NOT NULL,level   INTEGER,msg     TEXT);-- similar index will also be created for each partitionCREATE INDEX ON journal(dt);-- generate some dataINSERT INTO journal (dt, level, msg)SELECT g, random() * 6, md5(g::text)FROM generate_series('2015-01-01'::date, '2015-12-31'::date, '1 minute') as g;

Run thecreate_range_partitions() function to create partitions so that each partition would contain the data for one day:

SELECT create_range_partitions('journal', 'dt', '2015-01-01'::date, '1 day'::interval);

It will create 364 partitions and move the data from the parent table to partitions.

New partitions are appended automatically by insert trigger, but it can be done manually with the following functions:

-- add new partition with specified rangeSELECT add_range_partition('journal', '2016-01-01'::date, '2016-01-07'::date);-- append new partition with default rangeSELECT append_range_partition('journal');

The first one creates a partition with specified range. The second one creates a partition with default interval and appends it to the partition list. It is also possible to attach an existing table as partition. For example, we may want to attach an archive table (or even foreign table from another server) for some outdated data:

CREATE FOREIGN TABLE journal_archive (id      INTEGER NOT NULL,dt      TIMESTAMP NOT NULL,level   INTEGER,msg     TEXT)SERVER archive_server;SELECT attach_range_partition('journal', 'journal_archive', '2014-01-01'::date, '2015-01-01'::date);

Important

The attached table must have the same columns as the partitioned table, except for the dropped columns. The attached columns must have the same type, collation, andNOT NULL settings as the original columns.

To merge two adjacent partitions, use themerge_range_partitions() function:

SELECT merge_range_partitions('journal_archive', 'journal_1');

To split partition by value, use thesplit_range_partition() function:

SELECT split_range_partition('journal_366', '2016-01-03'::date);

To detach partition, use thedetach_range_partition() function:

SELECT detach_range_partition('journal_archive');

Here is an example of the query performing filtering by partitioning key:

SELECT * FROM journal WHERE dt >= '2015-06-01' AND dt < '2015-06-03';id      |         dt          | level |               msg--------+---------------------+-------+----------------------------------217441  | 2015-06-01 00:00:00 |     2 | 15053892d993ce19f580a128f87e3dbf217442  | 2015-06-01 00:01:00 |     1 | 3a7c46f18a952d62ce5418ac2056010c217443  | 2015-06-01 00:02:00 |     0 | 92c8de8f82faf0b139a3d99f2792311d...(2880 rows)EXPLAIN SELECT * FROM journal WHERE dt >= '2015-06-01' AND dt < '2015-06-03';QUERY PLAN------------------------------------------------------------------Append  (cost=0.00..58.80 rows=0 width=0)->  Seq Scan on journal_152  (cost=0.00..29.40 rows=0 width=0)->  Seq Scan on journal_153  (cost=0.00..29.40 rows=0 width=0)(3 rows)

F.37.4. Internals

pg_pathman stores partitioning configuration in thepathman_config table; each row contains a single entry for a partitioned table (relation name, partitioning column and its type). During the initialization stage thepg_pathman module caches some information about child partitions in the shared memory, which is used later for plan construction. Before aSELECT query is executed,pg_pathman traverses the condition tree in search of expressions like:

VARIABLE OP CONST

whereVARIABLE is a partitioning key,OP is a comparison operator (supported operators are =, <, <=, >, >=),CONST is a scalar value. For example:

WHERE id = 150

Based on the partitioning type and condition's operator,pg_pathman searches for the corresponding partitions and builds the plan.

F.37.4.1. Custom Plan Nodes

pg_pathman provides a couple ofcustom plan nodes which aim to reduce execution time, namely:

  • RuntimeAppend (overridesAppend plan node)

  • RuntimeMergeAppend (overridesMergeAppend plan node)

  • PartitionFilter (drop-in replacement for INSERT triggers)

  • PartitionRouter for cross-partition UPDATE queries instead of triggers

PartitionFilter acts as aproxy node for INSERT's child scan, which means it can redirect output tuples to the corresponding partition:

EXPLAIN (COSTS OFF)INSERT INTO partitioned_tableSELECT generate_series(1, 10), random();               QUERY PLAN----------------------------------------- Insert on partitioned_table   ->  Custom Scan (PartitionFilter)         ->  Subquery Scan on "*SELECT*"               ->  Result(4 rows)

PartitionRouter is another proxy node used in conjunction withPartitionFilter to enable cross-partitionUPDATE operations, for example, when you update any column of a partitioning key.

Important

ThePartitionRouter node transforms cross-partitionUPDATE commands intoDELETE +INSERT. OnPostgres Pro versions prior to 11, this operation is unsafe aspg_pathman cannot determine whether the updated row has been deleted or moved to another partition.

By default,PartitionRouter is disabled to avoid undesirable side effects. To enable this node, set thepg_pathman.enable_partitionrouter toon.

EXPLAIN (COSTS OFF)UPDATE partitioned_tableSET value = value + 1 WHERE value = 2;                    QUERY PLAN                     --------------------------------------------------- Update on partitioned_table_0   ->  Custom Scan (PartitionRouter)         ->  Custom Scan (PartitionFilter)               ->  Seq Scan on partitioned_table_0                     Filter: (value = 2)(5 rows)

RuntimeAppend andRuntimeMergeAppend have much in common: they come in handy in a case when WHERE condition takes form of:

VARIABLE OP PARAM

This kind of expressions can no longer be optimized at planning time since the parameter's value is not known until the execution stage takes place. The problem can be solved by embedding theWHERE condition analysis routine into the originalAppend's code, thus making it pick only required scans out of a whole bunch of planned partition scans. This effectively boils down to creation of a custom node capable of performing such a check.

There are at least several cases that demonstrate usefulness of these nodes:

/* create table we're going to partition */CREATE TABLE partitioned_table(id INT NOT NULL, payload REAL);/* insert some data */INSERT INTO partitioned_tableSELECT generate_series(1, 1000), random();/* perform partitioning */SELECT create_hash_partitions('partitioned_table', 'id', 100);/* create ordinary table */CREATE TABLE some_table AS SELECT generate_series(1, 100) AS VAL;
  • id = (select ... limit 1)

    EXPLAIN (COSTS OFF, ANALYZE) SELECT * FROM partitioned_tableWHERE id = (SELECT * FROM some_table LIMIT 1);                                             QUERY PLAN---------------------------------------------------------------------------------------------------- Custom Scan (RuntimeAppend) (actual time=0.030..0.033 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Limit (actual time=0.011..0.011 rows=1 loops=1)           ->  Seq Scan on some_table (actual time=0.010..0.010 rows=1 loops=1)   ->  Seq Scan on partitioned_table_70 partitioned_table (actual time=0.004..0.006 rows=1 loops=1)         Filter: (id = $0)         Rows Removed by Filter: 9 Planning time: 1.131 ms Execution time: 0.075 ms(9 rows)/* disable RuntimeAppend node */SET pg_pathman.enable_runtimeappend = f;EXPLAIN (COSTS OFF, ANALYZE) SELECT * FROM partitioned_tableWHERE id = (SELECT * FROM some_table LIMIT 1);                                    QUERY PLAN---------------------------------------------------------------------------------- Append (actual time=0.196..0.274 rows=1 loops=1)   InitPlan 1 (returns $0)     ->  Limit (actual time=0.005..0.005 rows=1 loops=1)           ->  Seq Scan on some_table (actual time=0.003..0.003 rows=1 loops=1)   ->  Seq Scan on partitioned_table_0 (actual time=0.014..0.014 rows=0 loops=1)         Filter: (id = $0)         Rows Removed by Filter: 6   ->  Seq Scan on partitioned_table_1 (actual time=0.003..0.003 rows=0 loops=1)         Filter: (id = $0)         Rows Removed by Filter: 5         ... /* more plans follow */ Planning time: 1.140 ms Execution time: 0.855 ms(306 rows)

  • id = ANY (select ...)

    EXPLAIN (COSTS OFF, ANALYZE) SELECT * FROM partitioned_tableWHERE id = any (SELECT * FROM some_table limit 4);                                                QUERY PLAN----------------------------------------------------------------------------------------------------------- Nested Loop (actual time=0.025..0.060 rows=4 loops=1)   ->  Limit (actual time=0.009..0.011 rows=4 loops=1)         ->  Seq Scan on some_table (actual time=0.008..0.010 rows=4 loops=1)   ->  Custom Scan (RuntimeAppend) (actual time=0.002..0.004 rows=1 loops=4)         ->  Seq Scan on partitioned_table_70 partitioned_table (actual time=0.001..0.001 rows=10 loops=1)         ->  Seq Scan on partitioned_table_26 partitioned_table (actual time=0.002..0.003 rows=9 loops=1)         ->  Seq Scan on partitioned_table_27 partitioned_table (actual time=0.001..0.002 rows=20 loops=1)         ->  Seq Scan on partitioned_table_63 partitioned_table (actual time=0.001..0.002 rows=9 loops=1) Planning time: 0.771 ms Execution time: 0.101 ms(10 rows)/* disable RuntimeAppend node */SET pg_pathman.enable_runtimeappend = f;EXPLAIN (COSTS OFF, ANALYZE) SELECT * FROM partitioned_tableWHERE id = any (SELECT * FROM some_table limit 4);                                       QUERY PLAN----------------------------------------------------------------------------------------- Nested Loop Semi Join (actual time=0.531..1.526 rows=4 loops=1)   Join Filter: (partitioned_table.id = some_table.val)   Rows Removed by Join Filter: 3990   ->  Append (actual time=0.190..0.470 rows=1000 loops=1)         ->  Seq Scan on partitioned_table (actual time=0.187..0.187 rows=0 loops=1)         ->  Seq Scan on partitioned_table_0 (actual time=0.002..0.004 rows=6 loops=1)         ->  Seq Scan on partitioned_table_1 (actual time=0.001..0.001 rows=5 loops=1)         ->  Seq Scan on partitioned_table_2 (actual time=0.002..0.004 rows=14 loops=1)... /* 96 scans follow */   ->  Materialize (actual time=0.000..0.000 rows=4 loops=1000)         ->  Limit (actual time=0.005..0.006 rows=4 loops=1)               ->  Seq Scan on some_table (actual time=0.003..0.004 rows=4 loops=1) Planning time: 2.169 ms Execution time: 2.059 ms(110 rows)

  • NestLoop involving a partitioned table, which is omitted since it's occasionally shown above.

To learn more about custom nodes, see Alexander Korotkov'sblog.

F.37.5. Reference

F.37.5.1. GUC Variables

There are several user-accessibleGUC variables designed to togglepg_pathman or its specific custom nodes on and off.

  • pg_pathman.enable — enable/disable thepg_pathman module.

    Default:on

  • pg_pathman.enable_runtimeappend — toggle theRuntimeAppend custom node on/off.

    Default:on

  • pg_pathman.enable_runtimemergeappend — toggle theRuntimeMergeAppend custom node on/off.

    Default:on

  • pg_pathman.enable_partitionfilter — toggle thePartitionFilter custom node on/off to enable/disable cross-partitionINSERT operations.

    Default:on

  • pg_pathman.enable_partitionrouter — toggle thePartitionRouter custom node on/off to enable/disable cross-partitionUPDATE operations.

    Default:off

  • pg_pathman.enable_auto_partition — toggle automatic partition creation on/off (per session).

    Default:on

  • pg_pathman.enable_bounds_cache — toggle bounds cache on/off.

    Default:on

  • pg_pathman.insert_into_fdw — allowINSERT operations into various foreign-data wrappers. Possible values:disabled,postgres, andany_fdw.

    Default:postgres

  • pg_pathman.override_copy — toggleCOPY statement hooking on/off.

    Default:on

F.37.5.2. Views and Tables

F.37.5.2.1. pathman_config

This table stores the list of partitioned tables. This is the main configuration storage.

CREATE TABLE IF NOT EXISTS pathman_config (    partrel         REGCLASS NOT NULL PRIMARY KEY,    attname         TEXT NOT NULL,    parttype        INTEGER NOT NULL,    range_interval  TEXT);
F.37.5.2.2. pathman_config_params

This table stores optional parameters that override standardpg_pathman behavior.

CREATE TABLE IF NOT EXISTS pathman_config_params (    partrel        REGCLASS NOT NULL PRIMARY KEY,    enable_parent  BOOLEAN NOT NULL DEFAULT TRUE,    auto           BOOLEAN NOT NULL DEFAULT TRUE,    init_callback  REGPROCEDURE NOT NULL DEFAULT 0,    spawn_using_bgw BOOLEAN NOT NULL DEFAULT FALSE);
F.37.5.2.3. pathman_concurrent_part_tasks

This view lists all currently running concurrent partitioning tasks.

-- helper SRF functionCREATE OR REPLACE FUNCTION show_concurrent_part_tasks()RETURNS TABLE (    userid     REGROLE,    pid        INT,    dbid       OID,    relid      REGCLASS,    processed  INT,    status     TEXT)AS 'pg_pathman', 'show_concurrent_part_tasks_internal'LANGUAGE C STRICT;CREATE OR REPLACE VIEW pathman_concurrent_part_tasksAS SELECT * FROM show_concurrent_part_tasks();
F.37.5.2.4. pathman_partition_list

This view lists all existing partitions, as well as their parents and range boundaries (NULL for hash partitions).

-- helper SRF functionCREATE OR REPLACE FUNCTION show_partition_list()RETURNS TABLE (    parent     REGCLASS,    partition  REGCLASS,    parttype   INT4,    expr       TEXT,    range_min  TEXT,    range_max  TEXT)AS 'pg_pathman', 'show_partition_list_internal'LANGUAGE C STRICT;CREATE OR REPLACE VIEW pathman_partition_listAS SELECT * FROM show_partition_list();

F.37.5.3. Functions

F.37.5.3.1. Partitioning Functions
create_hash_partitions(parent_relid     REGCLASS,                       expression       TEXT,                       partitions_count INTEGER,                       partition_data   BOOLEAN DEFAULT TRUE,                       partition_names  TEXT[] DEFAULT NULL,                       tablespaces      TEXT[] DEFAULT NULL)

Performs hash partitioning forrelation by integer keyexpression. Thepartitions_count parameter specifies the number of partitions to create; it cannot be changed afterwards. Ifpartition_data istrue, all the data will be automatically migrated from the parent table to partitions. Note that data migration may take a while to finish and the table will be locked until transaction commits. Seepartition_table_concurrently() for a lock-free way to migrate data. Partition creation callback is invoked for each partition if set beforehand (seeset_init_callback()).

create_range_partitions(relation       REGCLASS,                        expression     TEXT,                        start_value    ANYELEMENT,                        p_interval     ANYELEMENT,                        p_count        INTEGER DEFAULT NULL,                        partition_data BOOLEAN DEFAULT TRUE)create_range_partitions(relation       REGCLASS,                        expression     TEXT,                        start_value    ANYELEMENT,                        p_interval     INTERVAL,                        p_count        INTEGER DEFAULT NULL,                        partition_data BOOLEAN DEFAULT TRUE)create_range_partitions(relation        REGCLASS,                        expression      TEXT,                        bounds          ANYARRAY,                        partition_names TEXT[] DEFAULT NULL,                        tablespaces     TEXT[] DEFAULT NULL,                        partition_data  BOOLEAN DEFAULT TRUE)

Performs range partitioning forrelation by partitioning key defined byexpression. Thestart_value argument specifies the initial value,p_interval sets the default range for automatically created partitions or partitions created withappend_range_partition() orprepend_range_partition(). Ifp_interval is set toNULL, automatic partition creation is disabled.p_count is the number of premade partitions. Ifp_count is not set, thanpg_pathman tries to determine the number of partitions based on theexpression value. Thebounds array defines the bounds for partitions to be created. You can build this array using thegenerate_range_bounds() function. Partition creation callback is invoked for each partition if set beforehand.

F.37.5.3.2. Data Migration Functions
partition_table_concurrently(relation REGCLASS,                             batch_size INTEGER DEFAULT 1000,                             sleep_time FLOAT8 DEFAULT 1.0)

Starts a background worker to move data from parent table to partitions. The worker utilizes short transactions to copy small batches of data (up to 10K rows per transaction) and thus doesn't significantly interfere with user's activity.

stop_concurrent_part_task(relation REGCLASS)

Stops a background worker performing a concurrent partitioning task. Note: worker will exit after it finishes relocating a current batch.

F.37.5.3.3. Triggers

Triggers are no longer required forINSERT and cross-partitionUPDATE operations. However, user-supplied triggers are supported:

  • Each inserted row results in execution of BEFORE/AFTER INSERT trigger functions of a corresponding partition.

  • Each updated row results in execution of BEFORE/AFTER UPDATE trigger functions of a corresponding partition.

  • Each moved row (cross-partition update) results in execution of BEFORE UPDATE + BEFORE/AFTER DELETE + BEFORE/AFTER INSERT trigger functions of corresponding partitions.

F.37.5.3.4. Partition Management Functions
replace_hash_partition(old_partition       REGCLASS,                       new_partition       REGCLASS,                       lock_parent         BOOLEAN DEFAULT TRUE)

Replaces the specified partition of hash-partitioned table with another table. When set totrue, thelock_parent parameter prevents anyINSERT/UPDATE/ALTER TABLE queries to the parent table.

split_range_partition(partition_relid  REGCLASS,                      split_value      ANYELEMENT,                      partition_name   TEXT DEFAULT NULL,                      tablespace       TEXT DEFAULT NULL)

Split rangepartition in two byvalue, with the specifiedvalue included into the second partition. Partition creation callback is invoked for a new partition if available.

merge_range_partitions(variadic partitions REGCLASS[])

Merge several adjacent range partitions. Partitions are automatically ordered by increasing bounds. All the data will be accumulated in the first partition, while other merged partitions are removed. If the remaining partition has any child partitions, new child partitions for the merged data will be created as required using the same partitioning expression.

append_range_partition(parent_relid   REGCLASS,                       partition_name TEXT DEFAULT NULL,                       tablespace     TEXT DEFAULT NULL)

Append new range partition withpathman_config.range_interval as interval.

prepend_range_partition(parent_relid   REGCLASS,                        partition_name TEXT DEFAULT NULL,                        tablespace     TEXT DEFAULT NULL)

Prepend new range partition withpathman_config.range_interval as interval.

add_range_partition(parent_relid   REGCLASS,                    start_value    ANYELEMENT,                    end_value      ANYELEMENT,                    partition_name TEXT DEFAULT NULL,                    tablespace     TEXT DEFAULT NULL)

Create a new range partition forrelation with the specified range bounds. If thestart_value or theend_value is NULL, the corresponding range bound will be infinite.

drop_range_partition(partition_relid TEXT, delete_data BOOLEAN DEFAULT TRUE)

Drop range partition and all of its data ifdelete_data is true.

attach_range_partition(parent_relid    REGCLASS,                       partition_relid REGCLASS,                       start_value     ANYELEMENT,                       end_value       ANYELEMENT)

Attach partition to the existing range-partitioned relation. The attached table must have exactly the same structure as the parent table, including the dropped columns. Partition creation callback is invoked if set (seeSection F.37.5.2.2).

detach_range_partition(partition_relid REGCLASS)

Detach partition from the existing range-partitioned relation.

disable_pathman_for(parent_relid REGCLASS)

Permanently disablepg_pathman partitioning mechanism for the specified parent table and remove the insert trigger if it exists. All partitions and data remain unchanged.

drop_partitions(parent_relid REGCLASS,                delete_data  BOOLEAN DEFAULT FALSE)

Drop partitions of the parent table (both foreign and local relations). Ifdelete_data isfalse, the data is copied to the parent table first. Default isfalse.

F.37.5.3.5. Additional Functions
pathman_version()

Returns thepg_pathman version number.

set_interval(relation REGCLASS, value ANYELEMENT)

Update range-partitioned table interval. Note that interval must not be negative and it must not be trivial, i.e. its value should be greater than zero for numeric types, at least 1 microsecond fortimestamp and at least 1 day fordate.

set_enable_parent(relation REGCLASS, value BOOLEAN)

Include/exclude parent table into/from query plan. In original Postgres Pro planner parent table is always included into query plan even if it's empty, which can lead to additional overhead. You can usedisable_parent() if you are never going to use parent table as a storage. Default value depends on thepartition_data parameter specified during initial partitioning with thecreate_range_partitions() function. If thepartition_data parameter wastrue, then all data have already been migrated to partitions and the parent table is disabled. Otherwise, it is enabled.

set_auto(relation REGCLASS, value BOOLEAN)

Enable/disable auto partition propagation (only for range partitioning). It is enabled by default.

set_init_callback(relation REGCLASS, callback REGPROCEDURE DEFAULT 0)

Set partition creation callback to be invoked for each attached or created partition (both hash and range). If callback is marked withSECURITY INVOKER, it is executed with the privileges of the user who produced a statement that has led to creation of a new partition. For example:

INSERT INTO partitioned_table VALUES (-5)

The callback must have the following signature:part_init_callback(args JSONB) RETURNS VOID. Parameterarg consists of several fields whose presence depends on partitioning type:

/* Range-partitioned table abc (child abc_4) */{    "parent":    "abc",    "parttype":  "2",    "partition": "abc_4",    "range_max": "401",    "range_min": "301"}/* Hash-partitioned table abc (child abc_0) */{    "parent":    "abc",    "parttype":  "1",    "partition": "abc_0"}
set_spawn_using_bgw(relation REGCLASS, value BOOLEAN)

When inserting new data beyond the partitioning range, use SpawnPartitionsWorker to create new partitions in a separate transaction.

create_naming_sequence(parent_relid REGCLASS)

Enable automatic partition naming for the specifiedrelation table. You must run this function when partitioning this table by composite key.

add_to_pathman_config(parent_relid     REGCLASS,                      expression       TEXT,                      range_interval   TEXT)add_to_pathman_config(parent_relid     REGCLASS,                      expression       TEXT)

Register the specifiedrelation table withpg_pathman to enable partitioning by the providedexpression. For range partitioning, therange_interval argument is mandatory. You can set it toNULL if you are going to add partition manually.

generate_range_bounds(p_start     ANYELEMENT,                      p_interval  INTERVAL,                      p_count     INTEGER)generate_range_bounds(p_start     ANYELEMENT,                      p_interval  ANYELEMENT,                      p_count     INTEGER)

Build thebounds array that defines the bounds for partitions to be created. You can pass this array as an argument to thecreate_range_partitions() function.

F.37.6. Authors

  • Ildar Musin

  • Alexander Korotkov

  • Dmitry Ivanov


Prev Up Next
F.36. pg_freespacemap Home F.38. pg_proaudit
pdfepub
Go to Postgres Pro Standard 14
By continuing to browse this website, you agree to the use of cookies. Go toPrivacy Policy.

[8]ページ先頭

©2009-2025 Movatter.jp