Commit 0606691

Fix waitpid() emulation on Windows.

Our waitpid() emulation didn't prevent a PID from being recycled by the
OS before the call to waitpid().  The postmaster could finish up
tracking more than one child process with the same PID, and confuse
them.

Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(),
so that resources are released synchronously.  The process and PID
continue to exist until we close the process handle, which only happens
once we're ready to adjust our book-keeping of running children.

This seems to explain a couple of failures on CI.  It had never been
reported before, despite the code being as old as the Windows port.
Perhaps Windows started recycling PIDs more rapidly, or perhaps timing
changes due to commit 7389aad made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking
down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de

1 parent a67c75f, commit 0606691
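
The race described in the commit message can be modelled without any Windows API. What follows is a toy simulation in portable C, not PostgreSQL code: `toy_state`, `toy_spawn` and `max_entries_for_one_pid` are hypothetical names invented for illustration, and the "kernel" is just an array. The only behaviour it assumes is the one the message states, namely that a PID stays reserved while a process handle is open and may be reused once that handle is closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_PIDS 4

/* Hypothetical toy model, not PostgreSQL or Windows code. */
typedef struct
{
    bool handle_open[MAX_PIDS]; /* "kernel" side: PID is reserved while true */
    int  tracked[MAX_PIDS];     /* parent side: bookkeeping entries per PID */
} toy_state;

/* Start a child: take the lowest PID with no open handle and track it. */
static int
toy_spawn(toy_state *s)
{
    for (int pid = 0; pid < MAX_PIDS; pid++)
    {
        if (!s->handle_open[pid])
        {
            s->handle_open[pid] = true;
            s->tracked[pid]++;
            return pid;
        }
    }
    return -1;                  /* PID pool exhausted */
}

/*
 * Replay one interleaving: a child exits, the parent starts another child,
 * and only then does the parent get around to calling its waitpid().
 * Returns how many bookkeeping entries share the second child's PID at
 * that moment.
 */
static int
max_entries_for_one_pid(bool close_handle_in_callback)
{
    toy_state s;
    int first, second, worst;

    memset(&s, 0, sizeof(s));
    first = toy_spawn(&s);
    assert(first >= 0);

    /* Old design: the thread-pool callback closes the handle right away. */
    if (close_handle_in_callback)
        s.handle_open[first] = false;

    /* The parent starts another child before it has reaped the first. */
    second = toy_spawn(&s);
    assert(second >= 0);
    worst = s.tracked[second];

    /* New design: the handle is closed only here, inside waitpid(). */
    if (!close_handle_in_callback)
        s.handle_open[first] = false;
    s.tracked[first]--;

    return worst;
}
```

Calling `max_entries_for_one_pid(true)` models the old callback and reports two entries for one PID, which is the confusion the commit describes. Calling it with `false` models the fix: the second child is forced onto a different PID because the first handle is still open.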

1 file changed: +40 -30 lines

src/backend/postmaster/postmaster.c (40 additions, 30 deletions)
@@ -4861,7 +4861,7 @@ internal_forkexec(int argc, char *argv[], Port *port)
                 (errmsg_internal("could not register process for wait: error code %lu",
                                  GetLastError())));

-    /* Don't close pi.hProcess here - the wait thread needs access to it */
+    /* Don't close pi.hProcess here - waitpid() needs access to it */

     CloseHandle(pi.hThread);

@@ -6481,36 +6481,21 @@ ShmemBackendArrayRemove(Backend *bn)
 static pid_t
 waitpid(pid_t pid, int *exitstatus, int options)
 {
+    win32_deadchild_waitinfo *childinfo;
+    DWORD       exitcode;
     DWORD       dwd;
     ULONG_PTR   key;
     OVERLAPPED *ovl;

-    /*
-     * Check if there are any dead children. If there are, return the pid of
-     * the first one that died.
-     */
-    if (GetQueuedCompletionStatus(win32ChildQueue, &dwd, &key, &ovl, 0))
+    /* Try to consume one win32_deadchild_waitinfo from the queue. */
+    if (!GetQueuedCompletionStatus(win32ChildQueue, &dwd, &key, &ovl, 0))
     {
-        *exitstatus = (int) key;
-        return dwd;
+        errno = EAGAIN;
+        return -1;
     }

-    return -1;
-}
-
-/*
- * Note! Code below executes on a thread pool! All operations must
- * be thread safe! Note that elog() and friends must *not* be used.
- */
-static void WINAPI
-pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired)
-{
-    win32_deadchild_waitinfo *childinfo = (win32_deadchild_waitinfo *) lpParameter;
-    DWORD       exitcode;
-
-    if (TimerOrWaitFired)
-        return;                 /* timeout. Should never happen, since we use
-                                 * INFINITE as timeout value. */
+    childinfo = (win32_deadchild_waitinfo *) key;
+    pid = childinfo->procId;

     /*
      * Remove handle from wait - required even though it's set to wait only
@@ -6526,13 +6511,11 @@ pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired)
         write_stderr("could not read exit code for process\n");
         exitcode = 255;
     }
-
-    if (!PostQueuedCompletionStatus(win32ChildQueue, childinfo->procId, (ULONG_PTR) exitcode, NULL))
-        write_stderr("could not post child completion status\n");
+    *exitstatus = exitcode;

     /*
-     * Handle is per-process, so we close it here instead of in the
-     * originating thread
+     * Close the process handle.  Only after this point can the PID can be
+     * recycled by the kernel.
      */
     CloseHandle(childinfo->procHandle);

@@ -6542,7 +6525,34 @@ pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired)
      */
     free(childinfo);

-    /* Queue SIGCHLD signal */
+    return pid;
+}
+
+/*
+ * Note! Code below executes on a thread pool! All operations must
+ * be thread safe! Note that elog() and friends must *not* be used.
+ */
+static void WINAPI
+pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired)
+{
+    /* Should never happen, since we use INFINITE as timeout value. */
+    if (TimerOrWaitFired)
+        return;
+
+    /*
+     * Post the win32_deadchild_waitinfo object for waitpid() to deal with. If
+     * that fails, we leak the object, but we also leak a whole process and
+     * get into an unrecoverable state, so there's not much point in worrying
+     * about that.  We'd like to panic, but we can't use that infrastructure
+     * from this thread.
+     */
+    if (!PostQueuedCompletionStatus(win32ChildQueue,
+                                    0,
+                                    (ULONG_PTR) lpParameter,
+                                    NULL))
+        write_stderr("could not post child completion status\n");
+
+    /* Queue SIGCHLD signal. */
     pg_queue_signal(SIGCHLD);
 }
 #endif                          /* WIN32 */
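
The reworked hand-off in this diff (the callback posts the bookkeeping pointer as the completion key, and waitpid() casts it back and releases everything itself) can be sketched in portable C. This is a simplified stand-in, not the PostgreSQL implementation: `dead_child_info`, `toy_post`, `toy_get`, `toy_register_child`, `toy_deadchild_callback` and `toy_waitpid` are hypothetical names, the queue is a single-threaded ring buffer rather than an I/O completion port, and the exit code is stored in the record instead of being read from a process handle.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for win32_deadchild_waitinfo. */
typedef struct
{
    int proc_id;
    int exit_code;      /* assumption: the real code reads this from the handle */
} dead_child_info;

/* Toy single-threaded queue standing in for the completion port. */
#define QUEUE_CAP 8
static uintptr_t queue[QUEUE_CAP];
static size_t q_head, q_tail;

static int
toy_post(uintptr_t key)
{
    if (q_tail - q_head == QUEUE_CAP)
        return 0;               /* queue full */
    queue[q_tail++ % QUEUE_CAP] = key;
    return 1;
}

static int
toy_get(uintptr_t *key)
{
    if (q_head == q_tail)
        return 0;               /* nothing queued */
    *key = queue[q_head++ % QUEUE_CAP];
    return 1;
}

/* Allocate the per-child record when the child is started. */
static dead_child_info *
toy_register_child(int pid, int exit_code)
{
    dead_child_info *info = malloc(sizeof(*info));

    if (info != NULL)
    {
        info->proc_id = pid;
        info->exit_code = exit_code;
    }
    return info;
}

/* Callback side: only hand the pointer over; release nothing here. */
static void
toy_deadchild_callback(void *param)
{
    (void) toy_post((uintptr_t) param);
}

/* Consumer side: take ownership of the record and release it synchronously. */
static int
toy_waitpid(int *exitstatus)
{
    uintptr_t        key;
    dead_child_info *info;
    int              pid;

    if (!toy_get(&key))
    {
        errno = EAGAIN;         /* no dead child is waiting */
        return -1;
    }

    info = (dead_child_info *) key;
    assert(info != NULL);
    pid = info->proc_id;
    *exitstatus = info->exit_code;
    free(info);                 /* the real code also closes the process handle here */
    return pid;
}
```

A caller would register a child with `toy_register_child(42, 3)`, pass the returned pointer to `toy_deadchild_callback()`, and then get `42` back from `toy_waitpid()` with the status set to `3`. A second call returns `-1` with `errno` set to `EAGAIN`, mirroring the empty-queue branch in the diff.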
