Commit 70b4f82
Prevent hard failures of standbys caused by recycled WAL segments
When a standby's WAL receiver stops reading WAL from a WAL stream, it writes data to the current WAL segment without having first zeroed the page currently written to. This can cause the WAL reader to read junk data from a past recycled segment and then try to get a record from it. While the sanity checks in place provide most of the protection needed, in some rare circumstances, with chances increasing when a record header crosses a page boundary, the startup process could fail violently on an allocation failure, as follows:

FATAL: invalid memory alloc request size XXX

This is confusing for the user and also unhelpful, as in the worst case it requires a manual restart of the instance, potentially impacting the availability of the cluster. It also makes WAL data look like it is in a corrupted state.

The chances of seeing failures are higher if the connection between the standby and its root node is unstable, causing WAL pages to be written in the middle. A couple of approaches have been discussed, like zeroing new WAL pages within the WAL receiver itself, but this has the disadvantage of impacting the performance of any existing instances, as it breaks the sequential writes done by the WAL receiver. This commit deals with the problem with a simpler approach, which has no performance impact and does not reduce the detection of the problem: if a record is found with a length higher than 1GB for backends, then do not try any allocation and report a soft failure, which will force the standby to retry reading WAL. It is possible that the allocation call passes and that an unnecessary amount of memory is allocated; however, follow-up checks on records would just fail, making this allocation short-lived anyway.

This patch owes a great deal to Tsunakawa Takayuki for reporting the failure first, and then discussing a couple of potential approaches to the problem.

Backpatch down to 9.5, which is where palloc_extended has been introduced.

Reported-by: Tsunakawa Takayuki
Reviewed-by: Tsunakawa Takayuki
Author: Michael Paquier
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8B57AD@G01JPEXMBYT05

1 parent 9b53d96, commit 70b4f82
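The approach described in the message, refusing implausible record lengths before attempting any allocation and reporting a soft failure instead, can be sketched roughly as follows. This is a standalone illustration and not the code from the commit: `RecordBuffer`, `record_buffer_reserve`, and `MAX_ALLOC_SIZE` are hypothetical names, the cap is assumed to sit just under 1GB, and plain `realloc` stands in for the backend allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cap on a single allocation, assumed to be just under 1GB. */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/* Hypothetical buffer a WAL reader would use to assemble one record. */
typedef struct RecordBuffer
{
    char   *data;   /* current storage, or NULL if none yet */
    size_t  size;   /* number of bytes available in data */
} RecordBuffer;

/*
 * Make sure the buffer can hold a record of reclength bytes.
 *
 * Returns false (a soft failure) rather than aborting when the length is
 * implausible or memory is short, so that the caller can retry reading.
 * On failure the existing buffer is left untouched.
 */
bool
record_buffer_reserve(RecordBuffer *buf, uint32_t reclength)
{
    char *newdata;

    assert(buf != NULL);

    /* Already large enough: nothing to do. */
    if ((size_t) reclength <= buf->size)
        return true;

    /*
     * A length above the cap is most likely junk read from a recycled
     * segment.  Refuse it before attempting any allocation.
     */
    if ((size_t) reclength > MAX_ALLOC_SIZE)
        return false;

    /* An out-of-memory condition is also reported as a soft failure. */
    newdata = realloc(buf->data, reclength);
    if (newdata == NULL)
        return false;

    buf->data = newdata;
    buf->size = reclength;
    return true;
}
```

A caller would treat a `false` return like any other invalid-record condition and go back to waiting for WAL. A junk length that happens to fall under the cap still gets allocated, which matches the message's remark that later record checks would reject it and keep that allocation short-lived.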
1 file changed: 23 additions & 0 deletions
Original file line number | Diff line number | Diff line change |
---|---|---|
25-27 | 25-27 | (context) |
 | 28-31 | + |
28-30 | 32-34 | (context) |
160-162 | 164-166 | (context) |
 | 167-185 | + |
163-165 | 186-188 | (context) |