Commit6c45a97

msdousti

authored and

NikolayS

committed

Days 63-64

1 parent8c6ab30 commit6c45a97Copy full SHA for 6c45a97

File tree

4 files changed

+267

-0

lines changed

4 files changed

+267

-0

lines changed

`‎0060_how_to_add_a_column.md`

Lines changed: 29 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -168,3 +168,32 @@ and we still need to backfill. This has to be done in batches, to avoid long-las`
`168`	`168`	```sql
`169`	`169`	`dropindex concurrently i_t1_id_new;`
`170`	`170`	```
	`171`	`+`
	`172`	`+-----`
	`173`	`+`
	`174`	`+##Correction regarding the internals of how DEFAULT values are stored`
	`175`	`+`
	`176`	+-`pg_attrdef` stores all current defaults. When we change`DEFAULT` for an existing column, this catalog is updated to
	`177`	`+ store the new value:`
	`178`	`+`
	`179`	+```sql
	`180`	`+ nik=# alter table t1 alter column c2 set default -30;`
	`181`	`+ALTERTABLE`
	`182`	`+`
	`183`	`+ nik=# select pg_get_expr(adbin, 't1'::regclass::oid) from pg_attrdef;`
	`184`	`+ pg_get_expr`
	`185`	`+----------------`
	`186`	`+'-30'::integer`
	`187`	`+ (1 row)`
	`188`	+```
	`189`	`+`
	`190`	+ And the value stored in`pg_attribute` in`attmissingval` is that one that is used for the rows that existed before
	`191`	`+ column was created:`
	`192`	`+`
	`193`	+```sql
	`194`	`+ nik=# select attmissingval from pg_attribute where attrelid = 't1'::regclass::oid and attname = 'c2';`
	`195`	`+ attmissingval`
	`196`	`+---------------`
	`197`	`+ {-10}`
	`198`	`+ (1 row)`
	`199`	+```

`‎0063_how_to_help_others.md`

Lines changed: 86 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,86 @@`
	`1`	`+Originally from:[tweet](https://twitter.com/samokhvalov/status/1729321940496924977),[LinkedIn post]().`
	`2`	`+`
	`3`	`+---`
	`4`	`+`
	`5`	`+#How to help others`
	`6`	`+`
	`7`	`+>I post a new PostgreSQL "howto" article every day. Join me in this`
	`8`	`+>journey –[subscribe](https://twitter.com/samokhvalov/), provide feedback, share!`
	`9`	`+`
	`10`	`+Two key principles of helping others (it can be internal help to your teammates, or external consulting, doesn't`
	`11`	`+matter):`
	`12`	`+`
	`13`	`+1. Rely on good sources.`
	`14`	`+2. Test everything.`
	`15`	`+`
	`16`	`+##Principle 1. Good sources`
	`17`	`+`
	`18`	`+Rely on high-quality sources that you trust. For example, my point of view, ranking the quality/trustworthiness level,`
	`19`	`+roughly:`
	`20`	`+`
	`21`	`+- PG docs – 9/10`
	`22`	`+- PG source code – 10/10 (source of truth!)`
	`23`	`+- StackOverflow random answers – 5/10 or lower (excluding`
	`24`	`+[Erwin Brandstetter](https://stackoverflow.com/users/939860/erwin-brandstetter),[Laurenz Albe](https://stackoverflow.com/users/6464308/laurenz-albe),[Peter Eisentraut](https://stackoverflow.com/users/98530/peter-eisentraut) –`
	`25`	`+ these`
	`26`	`+ guys rock when they answer there, 8/10 or higher)`
	`27`	`+- the same for random blog posts`
	`28`	`+- both "Internals" books by[Suzuki](https://www.interdb.jp/pg/)`
	`29`	`+ and[Rogov](https://postgrespro.com/blog/pgsql/5969637) – 8/10 or higher`
	`30`	`+- etc.`
	`31`	`+`
	`32`	`+And (important!) always provide the link to your sources; you get two benefits from this:`
	`33`	`+`
	`34`	`+- advertise good source and pay them back;`
	`35`	`+- share the responsibility to some extent (very helpful if you are not very experienced yet; everyone might make a`
	`36`	`+ mistake).`
	`37`	`+`
	`38`	`+##Principle 2. Verification – database experiments`
	`39`	`+`
	`40`	`+Always doubt everything, don't trust language models, regardless of their nature.`
	`41`	`+`
	`42`	`+If someone (including you or me) says something without verification via an experiment (test), it needs to be fixed –`
	`43`	`+using an experiment.`
	`44`	`+`
	`45`	`+All decisions should be made based on data – the reliable data is gathered via testing.`
	`46`	`+`
	`47`	`+Most database-related ideas are to be verified using database experiments.`
	`48`	`+`
	`49`	`+Two types of database experiments:`
	`50`	`+`
	`51`	+1. Multi-session experiments – full-fledged benchmarks like those conducted using`pgbench`, JMeter,`sysbench`,
	`52`	+`pgreplay-go`, etc.
	`53`	`+`
	`54`	`+ They aim to study the behavior of Postgres as a whole, all its components, and must be performed on dedicated`
	`55`	`+ resources where nobody else is doing any work. Environment should match production (VM and disk type, PG version,`
	`56`	`+ settings).`
	`57`	`+`
	`58`	`+ Examples of such experiments include load testing, stress testing, performance regression testing. Main tools are`
	`59`	+ those that aim to study macro-level query analysis:`pgss`, wait event analysis (aka active session history or
	`60`	+ performance/query insights),`auto_explain`,`pgBadger`, etc.
	`61`	`+`
	`62`	`+ More about this type of experiments:[Day 13: How to benchmark](0013_how_to_benchmark.md).`
	`63`	`+`
	`64`	`+2. Single-session experiments – testing one or a sequence of SQL queries using a single session (sometimes, two), to`
	`65`	`+ check query syntax, study individual query behavior, optimize particular query, etc.`
	`66`	`+`
	`67`	`+ These experiments can be conducted in shared environments, on weaker machines. However, to study query performance,`
	`68`	`+ you need to have the same PG version, same or similar database, and matching planner settings (how to do it:`
	`69`	`+[Day 56: How to make the non-production Postgres planner behave like in production](0056_how_to_imitate_production_planner.md)).`
	`70`	`+ In this case, it should be kept in mind that`
	`71`	`+ timing metrics might be significantly off compared to production, and the main attention in the query optimization`
	`72`	`+ process should be paid to execution plans and data volumes (actual rows, buffer operation counts provided by the`
	`73`	+ option`BUFFERS` in`EXPLAIN`).
	`74`	`+`
	`75`	`+ Examples of such experiments: checking syntax and logical behavior of a sequence of SQL queries, query performance`
	`76`	+ analysis`EXPLAIN (ANALYZE, BUFFERS)` and testing optimization ideas, schema change testing, etc. It is very helpful
	`77`	`+ to be able to quickly clone large databases not paying extra money for storage (Neon, Amazon Aurora). And it's even`
	`78`	`+ better if no extra money is paid for both storage and compute, this truly unlocks testing activities including`
	`79`	`+ automated tests in CI/CD (DBLab Engine[@Database_Lab](https://twitter.com/Database_Lab)).`
	`80`	`+`
	`81`	`+##Summary`
	`82`	`+`
	`83`	`+As you can see, the principles are extremely simple:`
	`84`	`+`
	`85`	`+- read good papers, and`
	`86`	`+- don't blindly trust – test everything.`

`‎0064_how_to_use_uuid.md`

Lines changed: 150 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,150 @@`
	`1`	`+Originally from:[tweet](https://twitter.com/samokhvalov/status/1729900720189759801),[LinkedIn post]().`
	`2`	`+`
	`3`	`+---`
	`4`	`+`
	`5`	`+#How to use UUID`
	`6`	`+`
	`7`	`+>I post a new PostgreSQL "howto" article every day. Join me in this`
	`8`	`+>journey –[subscribe](https://twitter.com/samokhvalov/), provide feedback, share!`
	`9`	`+`
	`10`	`+As of now (PG16, 2023), Postgres implements UUID versions from 1 to 5, based`
	`11`	`+on[RFC 4122](https://datatracker.ietf.org/doc/html/rfc4122).`
	`12`	`+`
	`13`	`+- Docs:[UUID Data Type](https://postgresql.org/docs/current/datatype-uuid.html)`
	`14`	`+- Additional module[uuid-ossp](https://postgresql.org/docs/current/uuid-ossp.html)`
	`15`	`+`
	`16`	+A UUID value can be generated using`get_random_uuid()`, it generates UUID version 4
	`17`	`+([source code for PG16](https://github.com/postgres/postgres/blob/03749325d16c4215ecd6d6a6fe117d93931d84aa/src/backend/utils/adt/uuid.c#L405-L423)):`
	`18`	`+`
	`19`	+```sql
	`20`	`+nik=# select gen_random_uuid();`
	`21`	`+ gen_random_uuid`
	`22`	`+--------------------------------------`
	`23`	`+ c027497b-c510-413b-9092-8e6c99cf9596`
	`24`	`+(1 row)`
	`25`	`+`
	`26`	`+nik=# select gen_random_uuid();`
	`27`	`+ gen_random_uuid`
	`28`	`+--------------------------------------`
	`29`	`+ 08e63fed-f883-45d8-9896-8f087074bff5`
	`30`	`+(1 row)`
	`31`	+```
	`32`	`+`
	`33`	`+In standard UUIDs, the version can be understood looking at the first character after the 2nd hyphen:`
	`34`	`+`
	`35`	`+<pre>`
	`36`	`+08e63fed-f883-<ustyle="color:darkmagenta;font-weight:bolder">4</u> ... 👈 this means v4`
	`37`	`+</pre>`
	`38`	`+`
	`39`	`+The values are coming in a "pseudorandom" order. This has certain negative impact on performance: in a B-tree index,`
	`40`	`+inserts happen in various locations, which, in general, affects write performance, as well as performance of Top-N`
	`41`	`+reads (selecting N latest rows).`
	`42`	`+`
	`43`	`+There is a proposal to implement newer versions of UUID both in RFC and Postgres – v7 provides a time-based UUID that`
	`44`	`+includes a millisecond-precision timestamp, sequence number, and additional entropy in the form of random or fixed bits.`
	`45`	`+This kind of UUID not only ensures global uniqueness but also preserves the temporal aspect, which can be very`
	`46`	`+beneficial for performance.`
	`47`	`+`
	`48`	`+-[Commitfest: UUID v7](https://commitfest.postgresql.org/45/4388/)`
	`49`	`+-[rfc4122bis proposal](https://datatracker.ietf.org/doc/draft-ietf-uuidrev-rfc4122bis/)`
	`50`	`+`
	`51`	+UUID values are 16-byte – the same as`timestamptz` or`timestamp` values.
	`52`	`+`
	`53`	`+Good materials explaining performance aspects:`
	`54`	`+`
	`55`	`+-[The effect of Random UUID on database performance](https://twitter.com/hnasr/status/1695270411481796868) (video,~19`
	`56`	`+ min) by[@hnasr](https://twitter.com/hnasr)`
	`57`	`+`
	`58`	`+-[Identity Crisis: Sequence v. UUID as Primary Key](https://brandur.org/nanoglyphs/026-ids#ulids) by`
	`59`	`+[@brandur](https://twitter.com/brandur)`
	`60`	`+`
	`61`	`+Since Postgres doesn't support UUID v7 natively yet, there are two options to use them`
	`62`	`+`
	`63`	`+- generate on client side`
	`64`	`+- implement a helper function in Postgres.`
	`65`	`+`
	`66`	`+For the latter approach, here is[SQL function](https://gist.github.com/kjmph/5bd772b2c2df145aa645b837da7eca74)`
	`67`	`+(thanks[@DanielVerite](https://twitter.com/DanielVerite)):`
	`68`	`+`
	`69`	+```sql
	`70`	`+create or replacefunctionuuid_generate_v7() returns uuid`
	`71`	`+as $$`
	`72`	`+-- use random v4 uuid as starting point (which has the same variant we need)`
	`73`	`+-- then overlay timestamp`
	`74`	`+-- then set version 7 by flipping the 2 and 1 bit in the version 4 string`
	`75`	`+select encode(`
	`76`	`+ set_bit(`
	`77`	`+ set_bit(`
	`78`	`+ overlay(`
	`79`	`+ uuid_send(gen_random_uuid())`
	`80`	`+ placingsubstring(int8send(floor(extract(epochfrom clock_timestamp())*1000)::bigint)from3)`
	`81`	`+from1 for6`
	`82`	`+ ),`
	`83`	`+52,1`
	`84`	`+ ),`
	`85`	`+53,1`
	`86`	`+ ),`
	`87`	`+'hex')::uuid;`
	`88`	`+$$ language SQL volatile;`
	`89`	+```
	`90`	`+`
	`91`	`+Examples:`
	`92`	`+`
	`93`	+```sql
	`94`	`+nik=# select uuid_generate_v7();`
	`95`	`+ uuid_generate_v7`
	`96`	`+--------------------------------------`
	`97`	`+ 018c1be3-e485-7252-b80f-76a71843466a`
	`98`	`+(1 row)`
	`99`	`+`
	`100`	`+nik=# select uuid_generate_v7();`
	`101`	`+ uuid_generate_v7`
	`102`	`+--------------------------------------`
	`103`	`+ 018c1be3-e767-76b9-93dc-23c0c48be6c7`
	`104`	`+(1 row)`
	`105`	`+`
	`106`	`+nik=# select uuid_generate_v7();`
	`107`	`+ uuid_generate_v7`
	`108`	`+--------------------------------------`
	`109`	`+ 018c1be3-e973-7704-82ad-5967b79cf5c4`
	`110`	`+(1 row)`
	`111`	+```
	`112`	`+`
	`113`	`+After a few minutes:`
	`114`	`+`
	`115`	+```sql
	`116`	`+nik=# select uuid_generate_v7();`
	`117`	`+ uuid_generate_v7`
	`118`	`+--------------------------------------`
	`119`	`+ 018c1be8-5002-70ab-96c0-c96ad5afa151`
	`120`	`+(1 row)`
	`121`	+```
	`122`	`+`
	`123`	`+A few notes:`
	`124`	`+`
	`125`	+1) If you use these value in the`ORDER BY` clause, the chronological order will persist.
	`126`	`+`
	`127`	+2) For the first 3 values (that we generated during a few seconds) there is a common prefix,`018c1be3-e`, and with the
	`128`	+ last value that was generated slightly later, there is common prefix`018c1be`.
	`129`	`+`
	`130`	+3) Note`7` after the second hyphen in all values:
	`131`	`+ <pre>`
	`132`	`+ 018c1be3-e973-<ustyle="color:darkmagenta;font-weight:bolder">7</u>... 👈 this means v7`
	`133`	`+ </pre>`
	`134`	`+`
	`135`	`+4) The function returns a value of the UUID type, so it's still 16-byte (while text representation of it would take 36`
	`136`	+ characters including hyphens, meaning 40 bytes total with`VARLENA` header):
	`137`	`+`
	`138`	+```sql
	`139`	`+ nik=# select pg_column_size(gen_random_uuid());`
	`140`	`+ pg_column_size`
	`141`	`+----------------`
	`142`	`+16`
	`143`	`+ (1 row)`
	`144`	`+`
	`145`	`+ nik=# select pg_column_size(uuid_generate_v7());`
	`146`	`+ pg_column_size`
	`147`	`+----------------`
	`148`	`+16`
	`149`	`+ (1 row)`
	`150`	+```

`‎README.md`

Lines changed: 2 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -89,6 +89,8 @@ As an example, first 2 rows:`
`89`	`89`	`- 0060[How to add a column](./0060_how_to_add_a_column.md)`
`90`	`90`	`- 0061[How to create an index, part 1](./0061_how_to_create_an_index_part_1.md)`
`91`	`91`	`- 0062[How to create an index, part 2](./0062_how_to_create_an_index_part_2.md)`
	`92`	`+- 0063[How to help others](./0063_how_to_help_others.md)`
	`93`	`+- 0064[How to use UUID](./0064_how_to_use_uuid.md)`
`92`	`94`	`- ...`
`93`	`95`
`94`	`96`	`##Contributors`

0 commit comments

Comments

(0)

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Commit6c45a97

File tree

4 files changed

4 files changed

`‎0060_how_to_add_a_column.md`

`‎0063_how_to_help_others.md`

`‎0064_how_to_use_uuid.md`

`‎README.md`

0 commit comments