Forked from postgres/postgres
Commit ee32782
Fix postmaster state machine to handle dead_end child crashes better.

A report from Alvaro Herrera shows that if we're in PM_STARTUP state, and we spawn a dead_end child to reject some incoming connection request, and that child dies with an unexpected exit code, the postmaster does not respond well. We correctly send SIGQUIT to the startup process, but then:

* if the startup process exits with nonzero exit code, as expected, we thought that that indicated a crash and aborted startup.

* if the startup process exits with zero exit code, which is possible due to the inherent race condition, we'd advance to PM_RUN state which is fine --- but the code forgot that AbortStartTime would be nonzero in this situation. We'd either die on the Asserts saying that it was zero, or perhaps misbehave later on. (A quick look suggests that the only misbehavior might be busy-waiting due to DetermineSleepTime doing the wrong thing.)

To fix the first point, adjust the state-machine logic to recognize that a nonzero exit code is expected after sending SIGQUIT, and have it transition to a state where we can restart the startup process. To fix the second point, change the Asserts to clear the variable rather than just claiming it should be clear already.

Perhaps we could improve this further by not treating a crash of a dead_end child as a reason for panic'ing the database. However, since those child processes are connected to shared memory, that seems a bit risky. There are few good reasons for a dead_end child to report failure anyway (the cause of this in Alvaro's report is quite unclear). On balance, therefore, a minimal fix seems best.

This is an oversight in commit 45811be. While that was back-patched, I'm hesitant to back-patch this change. The lack of reasons for a dead_end child to fail suggests that the case should be very rare in the field, which squares with the lack of reports; so it seems like this might not be worth the risk of introducing new issues. In any case we can let it bake awhile in HEAD before considering a back-patch.

Discussion: https://postgr.es/m/20190615160950.GA31378@alvherre.pgsql

1 parent: 348778d; commit: ee32782
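
The two fixes described in the message can be illustrated with a toy model of the exit handling. This is only a sketch under stated assumptions, not the code from the patch: PM_STARTUP, PM_RUN and AbortStartTime are taken from the message, while ToyPostmaster, PM_RESTART_PENDING, PM_ABORTED, startup_signaled and the toy_* functions are hypothetical names invented here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Toy model only.  PM_RESTART_PENDING and PM_ABORTED are hypothetical
 * stand-ins for "a state where we can restart the startup process" and
 * "startup aborted"; they are not the real postmaster state names.
 */
typedef enum
{
    PM_STARTUP,
    PM_RESTART_PENDING,
    PM_RUN,
    PM_ABORTED
} ToyPMState;

typedef struct
{
    ToyPMState state;
    bool       startup_signaled;  /* have we sent SIGQUIT to the startup process? */
    time_t     AbortStartTime;    /* nonzero while a crash abort is in progress */
} ToyPostmaster;

/* A dead_end child died unexpectedly: we SIGQUIT the startup process. */
void
toy_dead_end_child_crashed(ToyPostmaster *pm, time_t now)
{
    pm->startup_signaled = true;
    pm->AbortStartTime = now;
}

/* The startup process exited with the given exit code. */
void
toy_startup_exited(ToyPostmaster *pm, int exit_code)
{
    if (exit_code != 0)
    {
        if (pm->startup_signaled)
        {
            /* First fix: a nonzero exit is expected after SIGQUIT. */
            pm->startup_signaled = false;
            pm->state = PM_RESTART_PENDING;
        }
        else
            pm->state = PM_ABORTED;   /* genuine startup crash */
        return;
    }

    /*
     * Zero exit: startup finished before the SIGQUIT took effect.
     * Second fix: clear AbortStartTime instead of asserting it is zero.
     */
    pm->startup_signaled = false;
    pm->AbortStartTime = 0;
    pm->state = PM_RUN;
}
```

In this model, a nonzero exit after SIGQUIT leads to the restartable state instead of an abort, and a zero exit reaches PM_RUN with AbortStartTime reset, which is the invariant the old Asserts only assumed.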
1 file changed: 19 additions & 4 deletions