forked from postgres/postgres
Commit f192e1b
Fix ordering of XIDs in ProcArrayApplyRecoveryInfo
Commit 8431e29 reworked ProcArrayApplyRecoveryInfo to sort XIDs
before adding them to KnownAssignedXids. But the XIDs are sorted using
xidComparator, which compares the XIDs simply as uint32 values, not
logically. KnownAssignedXidsAdd() however expects XIDs in logical order,
and calls TransactionIdFollowsOrEquals() to enforce that. If there are
XIDs for which the two orderings disagree, an error is raised and the
recovery fails/restarts.

Hitting this issue is fairly easy - you just need two transactions, one
started before the 4B limit (e.g. XID 4294967290), the other sometime
after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
compared using xidComparator we try to add them in the opposite order.
Which makes KnownAssignedXidsAdd() fail with an error like this:

  ERROR: out-of-order XID insertion in KnownAssignedXids

This only happens during replica startup, while processing RUNNING_XACTS
records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we
skip these records. So this does not affect already running replicas,
but if you restart (or create) a replica while there are transactions
with XIDs for which the two orderings disagree, you may hit this.

Long-running transactions and frequent replica restarts increase the
likelihood of hitting this issue. Once the replica gets into this state,
it can't be started (even if the old transactions are terminated).

Fixed by sorting the XIDs logically - this is fine because we're dealing
with normal XIDs (because it's XIDs assigned to backends) and from the
same wraparound epoch (otherwise the backends could not be running at
the same time on the primary node). So there are no problems with the
triangle inequality, which is why xidComparator compares raw values.

Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me.

This issue is present in all releases since 9.4, however releases up to
9.6 are EOL already so backpatch to 10 only.

Reviewed-by: Abhijit Menon-Sen
Reviewed-by: Alvaro Herrera
Backpatch-through: 10
Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com

1 parent c9cfc86 commit f192e1b
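The disagreement between the two orderings can be reproduced outside the server. The following is a minimal standalone sketch, not the code from this patch: raw_xid_cmp, logical_xid_cmp and is_logically_ordered are hypothetical stand-ins for xidComparator, the logical comparison introduced by the fix, and the ordering check done in KnownAssignedXidsAdd(). It assumes, as in PostgreSQL, that XIDs are 32-bit unsigned values and that values below 3 are reserved special XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Assumption: XIDs below 3 are reserved special values, as in PostgreSQL. */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Raw ordering: compares XIDs as plain unsigned 32-bit integers. */
static int
raw_xid_cmp(const void *a, const void *b)
{
    TransactionId x = *(const TransactionId *) a;
    TransactionId y = *(const TransactionId *) b;

    if (x > y)
        return 1;
    if (x < y)
        return -1;
    return 0;
}

/*
 * Logical ordering: modulo-2^32 comparison. Only a valid sort order when
 * all inputs are normal XIDs lying within 2^31 of each other, which is the
 * precondition the commit message relies on.
 */
static int
logical_xid_cmp(const void *a, const void *b)
{
    TransactionId x = *(const TransactionId *) a;
    TransactionId y = *(const TransactionId *) b;
    int32_t     diff;

    assert(x >= FIRST_NORMAL_XID && y >= FIRST_NORMAL_XID);

    /* The sign of the wrapped difference tells which XID is logically older. */
    diff = (int32_t) (uint32_t) (x - y);
    if (diff > 0)
        return 1;
    if (diff < 0)
        return -1;
    return 0;
}

/* True if every element logically follows or equals its predecessor. */
static bool
is_logically_ordered(const TransactionId *xids, size_t n)
{
    for (size_t i = 1; i < n; i++)
    {
        if ((int32_t) (uint32_t) (xids[i] - xids[i - 1]) < 0)
            return false;
    }
    return true;
}
```

Sorting the pair {1000, 4294967290} from the commit message with qsort() and raw_xid_cmp leaves 1000 first, which is_logically_ordered rejects; sorting with logical_xid_cmp puts 4294967290 first, which passes.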
File tree
3 files changed, +33 -1 lines changed
- src/backend/storage/ipc: 6 additions & 1 deletion (new lines 1167-1171 and 1173 added, old line 1168 removed)
- src/backend/utils/adt: 26 additions & 0 deletions (new lines 148-173 added)
- src/include/utils: 1 addition & 0 deletions (new line 90 added)