postgrespro/pg_query_statePublic

NotificationsYou must be signed in to change notification settings
Fork28
Star158

Tool for query progress monitoring in PostgreSQL

License

View license

158 stars 28 forks Branches Tags Activity

Star

Notifications

You must be signed in to change notification settings

Branches Tags

Folders and files

Name		Name	Last commit message	Last commit date
Latest commit History 196 Commits
expected		expected
patches		patches
specs		specs
t		t
tests		tests
.dockerignore		.dockerignore
.gitignore		.gitignore
.travis.yml		.travis.yml
Dockerfile.tmpl		Dockerfile.tmpl
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml
init.sql		init.sql
meson.build		meson.build
mk_dockerfile.sh		mk_dockerfile.sh
pg_query_state--1.0--1.1.sql		pg_query_state--1.0--1.1.sql
pg_query_state--1.1--1.2.sql		pg_query_state--1.1--1.2.sql
pg_query_state.c		pg_query_state.c
pg_query_state.control		pg_query_state.control
pg_query_state.h		pg_query_state.h
run_tests.sh		run_tests.sh
signal_handler.c		signal_handler.c
test.conf		test.conf

Repository files navigation

pg_query_state

Thepg_query_state module provides facility to know the current state of query execution on working backend. To enable this extension you have to patch the stable version of PostgreSQL, recompile it and deploy new binaries. All patch files are located inpatches/ directory and tagged with suffix of PostgreSQL version number.

Overview

Each nonutility query statement (SELECT/INSERT/UPDATE/DELETE) after optimization/planning stage is translated into plan tree which is kind of imperative representation of SQL query execution algorithm. EXPLAIN ANALYZE request allows to demonstrate execution statistics gathered from each node of plan tree (full time of execution, number rows emitted to upper nodes, etc). But this statistics is collected after execution of query. This module allows to show actual statistics of query running gathered from external backend. At that, format of resulting output is almost identical to ordinal EXPLAIN ANALYZE. Thus users are able to track of query execution in progress.

In fact, this module is able to explore external backend and determine its actual state. Particularly it's helpful when backend executes a heavy query and gets stuck.

Use cases

Using this module there can help in the following things:

detect a long query (along with other monitoring tools)
overwatch the query execution

Installation

To installpg_query_state, please apply corresponding patchescustom_signal_(PG_VERSION).patch andruntime_explain_(PG_VERSION).patch (orruntime_explain.patch for PG version <= 10.0) from thepatches/ directory to reqired stable version of PostgreSQL and rebuild PostgreSQL.

To do this, run the following commands from the postgresql directory:

patch -p1 < path_to_pg_query_state_folder/patches/runtime_explain_(PG_VERSION).patchpatch -p1 < path_to_pg_query_state_folder/patches/custom_signals_(PG_VERSION).patch

Then execute this in the module's directory:

make install USE_PGXS=1

To execute the command correctly, make sure you have the PATH or PG_CONFIG variable set.

export PATH=path_to_your_bin_folder:$PATH# orexport PG_CONFIG=path_to_your_bin_folder/pg_config

Add module name to theshared_preload_libraries parameter inpostgresql.conf:

shared_preload_libraries = 'pg_query_state'

It is essential to restart the PostgreSQL instance. After that, execute the following query in psql:

CREATE EXTENSION pg_query_state;

Done!

Tests

Test using parallel sessions with Python 3+ compatible script:

python3 tests/pg_qs_test_runner.py [OPTION]...

prerequisite packages:

psycopg2 version 2.6 or later
PyYAML version 3.11 or later
progressbar2 for stress test progress reporting

options:

- -host --- postgres server host, default value islocalhost
- -port --- postgres server port, default value is5432
- -database --- database name, default value ispostgres
- -user --- user name, default value ispostgres
- -password --- user's password, default value is empty
- -tpc-ds-setup --- setup database to run TPC-DS benchmark
- -tpc-ds-run --- runs only stress tests on TPC-DS benchmark

Or run all tests inDocker using:

export LEVEL=hardcoreexport USE_TPCDS=1export PG_VERSION=12./mk_dockerfile.shdocker-compose builddocker-compose run tests

There are different test levels:hardcore,nightmare (runs tests undervalgrind) andstress (runs tests underTPC-DS load).

Function pg_query_state

pg_query_state(integer     pid,        verboseboolean DEFAULT FALSE,        costsboolean DEFAULT FALSE,        timingboolean DEFAULT FALSE,        buffersboolean DEFAULT FALSE,        triggersboolean DEFAULT FALSE,        formattext    DEFAULT'text') returns TABLE (    pidinteger,    frame_numberinteger,    query_texttext,    plantext,    leader_pidinteger)

extracts the current query state from backend with specifiedpid. Since parallel query can spawn multiple workers and function call causes nested subqueries so that state of execution may be viewed as stack of running queries, return value ofpg_query_state has typeTABLE (pid integer, frame_number integer, query_text text, plan text, leader_pid integer). It represents tree structure consisting of leader process and its spawned workers identified bypid. Each worker refers to leader throughleader_pid column. For leader process the value of this column is null. The state of each process is represented as stack of function calls. Each frame of that stack is specified as correspondence betweenframe_number starting from zero,query_text andplan with online statistics columns.

Thus, user can see the states of main query and queries generated from function calls for leader process and all workers spawned from it.

In process of execution some nodes of plan tree can take loops of full execution. Therefore statistics for each node consists of two parts: average statistics for previous loops just like in EXPLAIN ANALYZE output and statistics for current loop if node have not finished.

Optional arguments:

verbose --- use EXPLAIN VERBOSE for plan printing;
costs --- add costs for each node;
timing --- print timing data for each node, if collecting of timing statistics is turned off on called side resulting output will contain WARNING messagetiming statistics disabled;
buffers --- print buffers usage, if collecting of buffers statistics is turned off on called side resulting output will contain WARNING messagebuffers statistics disabled;
triggers --- include triggers statistics in result plan trees;
format --- EXPLAIN format to be used for plans printing, possible values: {text,xml,json,yaml}.

If callable backend is not executing any query the function prints INFO message about backend's state taken frompg_stat_activity view if it exists there.

Warning: Calling role have to be superuser or member of the role whose backend is being called. Otherwise function prints ERROR messagepermission denied.

Configuration settings

There are several user-accessibleGUC variables designed to toggle the whole module and the collecting of specific statistic parameters while query is running:

pg_query_state.enable --- disable (or enable)pg_query_state completely, default value istrue
pg_query_state.enable_timing --- collect timing data for each node, default value isfalse
pg_query_state.enable_buffers --- collect buffers usage, default value isfalse

This parameters is set on called side before running any queries whose states are attempted to extract.Warning: ifpg_query_state.enable_timing is turned off the calling side cannot get time statistics, similarly forpg_query_state.enable_buffers parameter.

Examples

Set maximum number of parallel workers ongather node equals2:

postgres=# set max_parallel_workers_per_gather = 2;

Assume one backend with pid = 49265 performs a simple query:

postgres=# select pg_backend_pid(); pg_backend_pid----------------49265(1 row)postgres=# select count(*) from foo join bar on foo.c1=bar.c1;

Other backend can extract intermediate state of execution that query:

postgres=# \xpostgres=# select * from pg_query_state(49265);-[ RECORD1 ]+-------------------------------------------------------------------------------------------------------------------------pid          |49265frame_number |0query_text   |selectcount(*)from foojoin baronfoo.c1=bar.c1;plan         | Finalize Aggregate (Current loop: actual rows=0, loopnumber=1)+             |->  Gather (Current loop: actual rows=0, loopnumber=1)+             |         Workers Planned:2+             |         Workers Launched:2+             |->  Partial Aggregate (Current loop: actual rows=0, loopnumber=1)+             |->  Nested Loop (Current loop: actual rows=12, loopnumber=1)+             |Join Filter: (foo.c1=bar.c1)+             |                     Rows Removed byJoin Filter:5673232+             |->  Parallel Seq Scanon foo (Current loop: actual rows=12, loopnumber=1)+             |->  Seq Scanon bar (actual rows=500000 loops=11) (Current loop: actual rows=173244, loopnumber=12)leader_pid   | (null)-[ RECORD2 ]+-------------------------------------------------------------------------------------------------------------------------pid          |49324frame_number |0query_text   |<parallel query>plan         | Partial Aggregate (Current loop: actual rows=0, loopnumber=1)+             |->  Nested Loop (Current loop: actual rows=10, loopnumber=1)+             |Join Filter: (foo.c1=bar.c1)+             |         Rows Removed byJoin Filter:4896779+             |->  Parallel Seq Scanon foo (Current loop: actual rows=10, loopnumber=1)+             |->  Seq Scanon bar (actual rows=500000 loops=9) (Current loop: actual rows=396789, loopnumber=10)leader_pid   |49265-[ RECORD3 ]+-------------------------------------------------------------------------------------------------------------------------pid          |49323frame_number |0query_text   |<parallel query>plan         | Partial Aggregate (Current loop: actual rows=0, loopnumber=1)+             |->  Nested Loop (Current loop: actual rows=11, loopnumber=1)+             |Join Filter: (foo.c1=bar.c1)+             |         Rows Removed byJoin Filter:5268783+             |->  Parallel Seq Scanon foo (Current loop: actual rows=11, loopnumber=1)+             |->  Seq Scanon bar (actual rows=500000 loops=10) (Current loop: actual rows=268794, loopnumber=11)leader_pid   |49265

In example above working backend spawns two parallel workers with pids49324 and49323. Theirleader_pid column's values clarify that these workers belong to the main backend.Seq Scan node has statistics on passed loops (average number of rows delivered toNested Loop and number of passed loops are shown) and statistics on current loop. Other nodes has statistics only for current loop as this loop is first (loop number = 1).

Assume first backend executes some function:

postgres=# select n_join_foo_bar();

Other backend can get the follow output:

postgres=# select * from pg_query_state(49265);-[ RECORD1 ]+------------------------------------------------------------------------------------------------------------------pid          |49265frame_number |0query_text   |select n_join_foo_bar();plan         | Result (Current loop: actual rows=0, loopnumber=1)leader_pid   | (null)-[ RECORD2 ]+------------------------------------------------------------------------------------------------------------------pid          |49265frame_number |1query_text   |SELECT (selectcount(*)from foojoin baronfoo.c1=bar.c1)plan         | Result (Current loop: actual rows=0, loopnumber=1)+             |   InitPlan1 (returns $0)+             |->  Aggregate (Current loop: actual rows=0, loopnumber=1)+             |->  Nested Loop (Current loop: actual rows=51, loopnumber=1)+             |Join Filter: (foo.c1=bar.c1)+             |                 Rows Removed byJoin Filter:51636304+             |->  Seq Scanon bar (Current loop: actual rows=52, loopnumber=1)+             |->  Materialize (actual rows=1000000 loops=51) (Current loop: actual rows=636355, loopnumber=52)+             |->  Seq Scanon foo (Current loop: actual rows=1000000, loopnumber=1)leader_pid   | (null)

First row corresponds to function call, second - to query which is in the body of that function.

We can get result plans in different format (e.g.json):

postgres=# select * from pg_query_state(pid := 49265, format := 'json');-[ RECORD1 ]+------------------------------------------------------------pid          |49265frame_number |0query_text   |select*from n_join_foo_bar();plan         | {+             |"Plan": {+             |"Node Type":"Function Scan",+             |"Parallel Aware": false,+             |"Function Name":"n_join_foo_bar",+             |"Alias":"n_join_foo_bar",+             |"Current loop": {+             |"Actual Loop Number":1,+             |"Actual Rows":0+             |     }+             |   }+             | }leader_pid   | (null)-[ RECORD2 ]+------------------------------------------------------------pid          |49265frame_number |1query_text   |SELECT (selectcount(*)from foojoin baronfoo.c1=bar.c1)plan         | {+             |"Plan": {+             |"Node Type":"Result",+             |"Parallel Aware": false,+             |"Current loop": {+             |"Actual Loop Number":1,+             |"Actual Rows":0+             |     },+             |"Plans": [+             |       {+             |"Node Type":"Aggregate",+             |"Strategy":"Plain",+             |"Partial Mode":"Simple",+             |"Parent Relationship":"InitPlan",+             |"Subplan Name":"InitPlan 1 (returns $0)",+             |"Parallel Aware": false,+             |"Current loop": {+             |"Actual Loop Number":1,+             |"Actual Rows":0+             |         },+             |"Plans": [+             |           {+             |"Node Type":"Nested Loop",+             |"Parent Relationship":"Outer",+             |"Parallel Aware": false,+             |"Join Type":"Inner",+             |"Current loop": {+             |"Actual Loop Number":1,+             |"Actual Rows":610+             |             },+             |"Join Filter":"(foo.c1 = bar.c1)",+             |"Rows Removed by Join Filter":610072944,+             |"Plans": [+             |               {+             |"Node Type":"Seq Scan",+             |"Parent Relationship":"Outer",+             |"Parallel Aware": false,+             |"Relation Name":"bar",+             |"Alias":"bar",+             |"Current loop": {+             |"Actual Loop Number":1,+             |"Actual Rows":611+             |                 }+             |               },+             |               {+             |"Node Type":"Materialize",+             |"Parent Relationship":"Inner",+             |"Parallel Aware": false,+             |"Actual Rows":1000000,+             |"Actual Loops":610,+             |"Current loop": {+             |"Actual Loop Number":611,+             |"Actual Rows":73554+             |                 },+             |"Plans": [+             |                   {+             |"Node Type":"Seq Scan",+             |"Parent Relationship":"Outer",+             |"Parallel Aware": false,+             |"Relation Name":"foo",+             |"Alias":"foo",+             |"Current loop": {+             |"Actual Loop Number":1,+             |"Actual Rows":1000000+             |                     }+             |                   }+             |                 ]+             |               }+             |             ]+             |           }+             |         ]+             |       }+             |     ]+             |   }+             | }leader_pid   | (null)

Feedback

Do not hesitate to post your issues, questions and new ideas at theissues page.

Authors

Maksim Milyutin
Alexey Kondratova.kondratov@postgrespro.ru Postgres Professional Ltd., Russia

Function progress_bar

pg_progress_bar(integer     pid) returns FLOAT

extracts the current query state from backend with specified 'pid'. Then gets the numerical values of the actual rows and total rows and count progress for the whole query tree. Function returns numeric value from 0 to 1 describing the measure of query fulfillment. If there is no information about current state of the query, or the impossibility of counting, the corresponding messages will be displayed.

Function progress_bar_visual

pg_progress_bar_visual(integer     pid,integer     delay) returns VOID

cyclically extracts and print the current query state in numeric value from backend with specified 'pid' every period specified by 'delay' in seconds. This is the looping version of the progress_bar function that returns void value.

Warning: Calling role have to be superuser or member of the role whose backend is being called. Otherwise function prints ERROR messagepermission denied.

Examples

Assume first backend executes some function:

postgres=# insert into table_name select generate_series(1,10000000);

Other backend can get the follow output:

postgres=# SELECT pid FROM pg_stat_activity where query like 'insert%';  pid-------23877(1 row)postgres=# SELECT pg_progress_bar(23877); pg_progress_bar-----------------0.6087927(1 row)

Or continuous version:

postgres=# SELECT pg_progress_bar_visual(23877, 1);Progress=0.043510Progress=0.085242Progress=0.124921Progress=0.168168Progress=0.213803Progress=0.250362Progress=0.292632Progress=0.331454Progress=0.367509Progress=0.407450Progress=0.448646Progress=0.488171Progress=0.530559Progress=0.565558Progress=0.608039Progress=0.645778Progress=0.654842Progress=0.699006Progress=0.735760Progress=0.787641Progress=0.832160Progress=0.871077Progress=0.911858Progress=0.956362Progress=0.995097Progress=1.000000 pg_progress_bar_visual------------------------1(1 row)

Also uncountable queries exist. Assume first backend executes some function:

DELETEfrom table_name;

Other backend can get the follow output:

postgres=# SELECT pid FROM pg_stat_activity where query like 'delete%';  pid-------23877(1 row)postgres=# SELECT pg_progress_bar(23877);INFO:  could not get query execution progress pg_progress_bar------------------1(1 row)postgres=# SELECT pg_progress_bar_visual(23877, 5);INFO:  could not get query execution progress pg_progress_bar_visual-------------------------1(1 row)

Reinstallation

If you already have a module 'pg_query_state' without progress bar functions installed, execute this in the module's directory:

make install USE_PGXS=1

It is essential to restart the PostgreSQL instance. After that, execute the following queries in psql:

DROP EXTENSION IF EXISTS pg_query_state;CREATE EXTENSION pg_query_state;

Authors

Ekaterina Sokolovae.sokolova@postgrespro.ru Postgres Professional Ltd., RussiaVyacheslav Makarovv.makarov@postgrespro.ru Postgres Professional Ltd., Russia

About

Tool for query progress monitoring in PostgreSQL

Releases2

Release 1.1.1 Latest

Nov 28, 2019

+ 1 release

Packages

No packages published

Contributors17

+ 3 contributors

Movatterモバイル変換

License

postgrespro/pg_query_state

Folders and files

Latest commit

History

Repository files navigation

pg_query_state

Overview

Use cases

Installation

Tests

Function pg_query_state

Configuration settings

Examples

Feedback

Authors

Function progress_bar

Function progress_bar_visual

Examples

Reinstallation

Authors

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases2

Packages0

Uh oh!

Contributors17

Uh oh!

Languages

Packages