Commit 433f116 — Day 56

1 parent ed1cde3

2 files changed: +105 −0

0056_how_to_imitate_production_planner.md

Lines changed: 104 additions & 0 deletions
Originally from: [tweet](https://twitter.com/samokhvalov/status/1726887992340738324), [LinkedIn post]().

---

# How to make the non-production Postgres planner behave like in production

> I post a new PostgreSQL "howto" article every day. Join me in this
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!
For query optimization goals – running `EXPLAIN (ANALYZE, BUFFERS)` and verifying various optimization ideas such as a
new index – it is crucial to ensure that the Postgres planner behaves exactly like, or very similarly to, the planner
working with the production database. Fortunately, this is possible regardless of the resources (CPU, RAM, disk IO)
available on the non-production machine.

The technique described here is used in DBLab Engine
[@Database_Lab](https://twitter.com/Database_Lab)
to support fast database cloning/branching for query optimization and database testing in CI/CD.

To achieve prod/non-prod parity of planner behavior, two components are needed:

1) Matching database settings
2) The same or very similar statistics (the content of `pg_statistic`)
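
For context, this is the kind of command whose plan and buffer numbers we want to match between environments (a
trivial illustration; the table and column here are hypothetical):

```sql
explain (analyze, buffers)
select *
from orders                                   -- hypothetical table
where created_at >= now() - interval '1 day'  -- hypothetical filter
order by created_at desc
limit 100;
```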

## Matching database settings

Use this query on the production database to obtain the list of non-default settings that influence the planner's
behavior:
```sql
select
  format(e'%s = %s', name, setting) as configs
from
  pg_settings
where
  source <> 'default'
  and (
    name ~ '(work_mem$|^enable_|_cost$|scan_size$|effective_cache_size|^jit)'
    or name ~ '(^geqo|default_statistics_target|constraint_exclusion|cursor_tuple_fraction)'
    or name ~ '(collapse_limit$|parallel|plan_cache_mode)'
  );
```

Notes:

- The planner's behavior doesn't depend on the actual resources available, such as CPU or RAM. Nor does it depend on
  the OS, the filesystem, or their settings.
- The value of `shared_buffers` doesn't matter(!) – it only affects the executor's behavior and the buffer pool's
  hit/read ratio. What does matter for the planner is `effective_cache_size`, and you can set it to a value that
  significantly exceeds the actual RAM available, "fooling" the planner in a good sense to match the production
  planner's behavior. So you can have, say, 1 TiB of RAM and `shared_buffers = '250GB'` in production with
  `effective_cache_size = '750GB'`, and still be able to effectively analyze and optimize queries on a small 8-GiB
  machine with `shared_buffers = '2GB'` and `effective_cache_size = '750GB'`. The planner will assume you have a lot
  of RAM when choosing the optimal plans.
- Similarly, you can have fast SSDs in production and use `random_page_cost = 1.1` there, and very slow, cheap
  magnetic disks in your testing environments. With `random_page_cost = 1.1` used there as well, the planner will
  consider random access not costly. The execution will still be slow, but the plan chosen and the data volumes
  (actual rows, buffer numbers) will be the same as or very close to production.
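
One possible way to transfer the captured settings to a clone is to generate `ALTER SYSTEM` commands directly on
production and replay them on the non-production instance. A minimal sketch, assuming the same regex as in the query
above; raw values from `pg_settings.setting` are expressed in each parameter's base unit, so they can be replayed
as-is:

```sql
-- Run on production: generate ALTER SYSTEM commands for the clone.
select format('alter system set %I = %L;', name, setting)
from pg_settings
where source <> 'default'
  and (
    name ~ '(work_mem$|^enable_|_cost$|scan_size$|effective_cache_size|^jit)'
    or name ~ '(^geqo|default_statistics_target|constraint_exclusion|cursor_tuple_fraction)'
    or name ~ '(collapse_limit$|parallel|plan_cache_mode)'
  );

-- On the clone, execute the generated statements (superuser required),
-- then reload the configuration; planner-related settings take effect
-- without a restart:
-- select pg_reload_conf();
```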

## Matching statistics

There is a quite interesting and exotic idea: dump the content of `pg_statistic` on production and restore it in much
smaller testing environments that don't hold the actual data. No popular solution implements this idea, but possibly
[pg_dbms_stats](https://github.com/ossc-db/pg_dbms_stats/blob/master/doc/pg_dbms_stats-en.md) can help with it.

> 🎯 **TODO:** Test and see if it works.
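
If you explore this route, the standard human-readable `pg_stats` view is handy for comparing what the planner sees in
each environment (a sketch; the table name is hypothetical):

```sql
-- Run on both production and the clone, then diff the output.
select attname, null_frac, n_distinct, correlation
from pg_stats
where schemaname = 'public'
  and tablename = 'orders';  -- hypothetical table
```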

A more straightforward idea is to have a database of the same size. Two options are possible here:

* **Option 1:** physical copy. It can be done using `pg_basebackup`, by copying `PGDATA` with rsync or an
  alternative, or by restoring from physical backups such as `WAL-G`'s or `pgBackRest`'s.
* **Option 2:** logical copy. It can be done using dump/restore or by creating a logical replica.
Notes:

- The physical method of provisioning gives the best results: not only will "actual rows" match, but so will the
  buffer numbers reported by `EXPLAIN (ANALYZE, BUFFERS)`, because the number of pages (`relpages`) is the same, the
  bloat is preserved, and so on. This is the best method for optimizing and fine-tuning queries.
- The logical method gives matching row counts, but the size of tables and indexes is going to be different –
  `relpages` is smaller in a freshly provisioned node, the bloat is not preserved, and tuples, generally, are stored
  in a different order (we can call production's bloat "good" since we want to have it in testing environments to
  match the production state). This method still enables a quite efficient query optimization workflow, and it adds
  one more reason to keep bloat low in production.
- After dump/restore, you must explicitly run `ANALYZE` (or `vacuumdb --analyze -j <number of workers>`) to initially
  gather statistics in `pg_statistic`, because `pg_restore` (or `psql`) won't run it for you.
- If the database content needs to be changed to remove sensitive data, this most certainly is going to affect planner
  behavior. For some queries, the impact may be quite low, but for others it can be critical, making query
  optimization virtually impossible. These negative effects are generally greater than those caused by dump/restore
  losing bloat, because:
    - dump/restore affects `relpages` (bloat is lost) and the order of tuples, but not the content of `pg_statistic`;
    - removal of sensitive data can not only reorder tuples and produce irrelevant bloat ("bad bloat"), but also lose
      important parts of production's `pg_statistic`.
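
To check how close a clone is to production, you can compare page and row counts on both sides (a sanity-check
sketch; run the same query in both environments and diff the results):

```sql
-- Largest relations by page count; with physical provisioning,
-- relpages should match production, with logical it will be smaller.
select relname, relkind, relpages, reltuples::bigint
from pg_class
where relkind in ('r', 'i')
order by relpages desc
limit 20;
```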

## Summary

To make the non-production Postgres planner behave like in production, perform these two steps:

1) Tune certain non-prod Postgres settings (planner-related ones and `work_mem`) to match production.

2) Copy the database from production if possible, preferring physical provisioning to logical, and – if it is
   possible, of course – avoiding data modifications. For ultra-fast delivery of clones, consider using the DBLab
   Engine [@Database_Lab](https://twitter.com/Database_Lab).

README.md

Lines changed: 1 addition & 0 deletions

@@ -82,6 +82,7 @@ As an example, first 2 rows:
  - 0053 [Index maintenance](./0053_index_maintenance.md)
  - 0054 [How to check btree indexes for corruption (pg_amcheck)](./0054_how_to_check_btree_indexes_for_corruption.md)
  - 0055 [How to drop a column](./0055_how_to_drop_a_column.md)
+ - 0056 [How to make the non-production Postgres planner behave like in production](./0056_how_to_imitate_production_planner.md)
  - ...

 ## Contributors
