Commit e6dc04d

Fix more race conditions in the newly-added pg_rewind test.

pg_rewind looks at the control file to check what timeline a server is on. But promotion doesn't immediately write a checkpoint, it merely writes an end-of-recovery WAL record. If pg_rewind runs immediately after promotion, before the checkpoint has completed, it will think that the server is still on the earlier timeline. We ran into this issue a long time ago already, see commit 484a848.

It's a bit bogus that pg_rewind doesn't determine the timeline correctly until the end-of-recovery checkpoint has completed. We probably should fix that. But for now work around it by waiting for the checkpoint to complete before running pg_rewind, like we did in commit 484a848.

In passing, tidy up the new test a little bit. Reorder the INSERTs so that the comments make more sense, remove a spurious CHECKPOINT call after pg_rewind has already run, and add the --debug option, so that if this fails again, we'll have more data.

Per buildfarm failure at https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=rorqual&dt=2020-12-06%2018%3A32%3A19&stg=pg_rewind-check.

Backpatch to all supported versions.

Discussion: https://www.postgresql.org/message-id/1713707e-e318-761c-d287-5b6a4aa807e8@iki.fi
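The workaround described above amounts to "promote, then force a checkpoint before anything runs pg_rewind against this node". A minimal sketch of that pattern follows. The helper name `promote_and_checkpoint` and the `StubNode` class are hypothetical and not part of this commit. The stub only exists so the sketch runs outside the PostgreSQL source tree. In the real test the node objects come from the TAP framework and already provide `promote` and `safe_psql`.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the commit): promote a node and then issue an
# explicit CHECKPOINT, so that the control file reflects the new timeline
# before pg_rewind inspects it.
# Assumption: $node provides promote() and safe_psql(), as the node objects
# in the test do.
sub promote_and_checkpoint
{
	my ($node) = @_;
	$node->promote;
	$node->safe_psql('postgres', 'checkpoint');
	return;
}

# Stand-in node class, purely illustrative: it records the calls it
# receives instead of talking to a server.
package StubNode;

sub new
{
	my ($class) = @_;
	return bless({ calls => [] }, $class);
}

sub promote
{
	my ($self) = @_;
	push @{ $self->{calls} }, 'promote';
	return;
}

sub safe_psql
{
	my ($self, $dbname, $sql) = @_;
	push @{ $self->{calls} }, "psql:$dbname:$sql";
	return '';
}

package main;

# Show the order in which the helper drives the node.
my $node = StubNode->new;
promote_and_checkpoint($node);
print join("\n", @{ $node->{calls} }), "\n";
```

The commit itself does not introduce a helper. It adds the `checkpoint` call inline after each `promote`, which keeps the explanatory comment next to the first occurrence.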

1 parent 7d43b76 commit e6dc04d

`‎src/bin/pg_rewind/t/008_min_recovery_point.pl`

Lines changed: 15 additions & 7 deletions

Original file line number	Diff line number	Diff line change
`@@ -75,6 +75,13 @@`
`75`	`75`	`#`
`76`	`76`	`$node_1->stop('fast');`
`77`	`77`	`$node_3->promote;`
	`78`	`+# Force a checkpoint after the promotion. pg_rewind looks at the control`
	`79`	`+# file to determine what timeline the server is on, and that isn't updated`
	`80`	`+# immediately at promotion, but only at the next checkpoint. When running`
	`81`	`+# pg_rewind in remote mode, it's possible that we complete the test steps`
	`82`	`+# after promotion so quickly that when pg_rewind runs, the standby has not`
	`83`	`+# performed a checkpoint after promotion yet.`
	`84`	`+$node_3->safe_psql('postgres',"checkpoint");`
`78`	`85`
`79`	`86`	`# reconfigure node_1 as a standby following node_3`
`80`	`87`	`my $node_3_connstr = $node_3->connstr;`
`@@ -99,13 +106,18 @@`
`99`	`106`	`$node_3->wait_for_catchup('node_1','replay',$lsn);`
`100`	`107`
`101`	`108`	`$node_1->promote;`
	`109`	`+# Force a checkpoint after promotion, like earlier.`
	`110`	`+$node_1->safe_psql('postgres',"checkpoint");`
`102`	`111`
`103`	`112`	`#`
`104`	`113`	`# We now have a split-brain with two primaries. Insert a row on both to`
`105`	`114`	`# demonstratively create a split brain. After the rewind, we should only`
`106`	`115`	`# see the insert on 1, as the insert on node 3 is rewound away.`
`107`	`116`	`#`
`108`	`117`	`$node_1->safe_psql('postgres',"INSERT INTO public.foo (t) VALUES ('keep this')");`
	`118`	`+# 'bar' is unmodified in node 1, so it won't be overwritten by replaying the`
	`119`	`+# WAL from node 1.`
	`120`	`+$node_3->safe_psql('postgres',"INSERT INTO public.bar (t) VALUES ('rewind this')");`
`109`	`121`
`110`	`122`	`# Insert more rows in node 1, to bump up the XID counter. Otherwise, if`
`111`	`123`	`# rewind doesn't correctly rewind the changes made on the other node,`
`@@ -114,10 +126,6 @@`
`114`	`126`	`$node_1->safe_psql('postgres',"INSERT INTO public.foo (t) VALUES ('and this')");`
`115`	`127`	`$node_1->safe_psql('postgres',"INSERT INTO public.foo (t) VALUES ('and this too')");`
`116`	`128`
`117`		`-# Also insert a row in 'bar' on node 3. It is unmodified in node 1, so it won't get`
`118`		`-# overwritten by replaying the WAL from node 1.`
`119`		`-$node_3->safe_psql('postgres',"INSERT INTO public.bar (t) VALUES ('rewind this')");`
`120`		`-`
`121`	`129`	`# Wait for node 2 to catch up`
`122`	`130`	`$node_2->poll_query_until('postgres',`
`123`	`131`	`q|SELECT COUNT(*) > 1 FROM public.bar|,'t');`
`@@ -139,9 +147,10 @@`
`139`	`147`	`[`
`140`	`148`	`'pg_rewind',`
`141`	`149`	`"--source-server=$node_1_connstr",`
`142`		`-"--target-pgdata=$node_2_pgdata"`
	`150`	`+"--target-pgdata=$node_2_pgdata",`
	`151`	`+"--debug"`
`143`	`152`	`],`
`144`		`-'pg_rewind detects rewind needed');`
	`153`	`+'run pg_rewind');`
`145`	`154`
`146`	`155`	`# Now move back postgresql.conf with old settings`
`147`	`156`	`move(`
`@@ -153,7 +162,6 @@`
`153`	`162`	`# Check contents of the test tables after rewind. The rows inserted in node 3`
`154`	`163`	`# before rewind should've been overwritten with the data from node 1.`
`155`	`164`	`my $result;`
`156`		`-$result = $node_2->safe_psql('postgres','checkpoint');`
`157`	`165`	`$result = $node_2->safe_psql('postgres','SELECT * FROM public.foo');`
`158`	`166`	`is($result, qq(keep this`
`159`	`167`	`and this`
