Commit ff78bf7

Block signals while allocating DSM memory.
On Linux, we call posix_fallocate() on shm_open()'d memory to avoid
later potential SIGBUS (see commit 899bd78).

Based on field reports of systems stuck in an EINTR retry loop there,
we made it possible to break out of that loop via slightly odd
coding where the CHECK_FOR_INTERRUPTS() call was somewhat removed from
the loop (see commit 422952e).

On further reflection, that was not a great choice for at least two
reasons:

1. If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing
and the EINTR error would be surfaced to the user.

2. If EINTR was reported but neither QueryCancelPending nor
ProcDiePending was set, then we'd dutifully retry, but with a bit more
understanding of how posix_fallocate() works, it's now clear that you
can get into a loop that never terminates. posix_fallocate() is not a
function that can do some of the job and tell you about progress if it's
interrupted, it has to undo what it's done so far and report EINTR, and
if signals keep arriving faster than it can complete (cf recovery
conflict signals), you're stuck.

Therefore, for now, we'll simply block most signals to guarantee
progress. SIGQUIT is not blocked (see InitPostmasterChild()), because
its expected handler doesn't return, and unblockable signals like
SIGCONT are not expected to arrive at a high rate. For good measure,
we'll include the ftruncate() call in the blocked region, and add a
retry loop.

Back-patch to all supported releases.

Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Nicola Contu <nicola.contu@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql
1 parent 4f88dba commit ff78bf7
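
The approach described in the commit message can be sketched outside PostgreSQL with plain POSIX calls. This is a rough illustration only, not the committed code: resize_with_signals_blocked() and resize_demo() are hypothetical names, the signal mask built here (everything except SIGQUIT) is a simplified stand-in for PostgreSQL's BlockSig, and sigprocmask() is used directly where the commit uses PG_SETMASK(). Unlike the committed function, the sketch returns -1 on failure rather than the raw error number.

```c
#define _POSIX_C_SOURCE 200809L /* posix_fallocate(), mkstemp(), sigprocmask() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for dsm_impl_posix_resize(): resize fd to "size"
 * bytes while most signals are blocked.  Returns 0 on success, or -1 with
 * errno set on failure.
 */
static int
resize_with_signals_blocked(int fd, off_t size)
{
    sigset_t    block;
    sigset_t    saved;
    int         rc;
    int         save_errno;

    assert(fd >= 0 && size > 0);

    /* Block every blockable signal except SIGQUIT (simplified mask). */
    sigfillset(&block);
    sigdelset(&block, SIGQUIT);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    /* Truncate (or extend) the file, retrying if interrupted. */
    do
    {
        rc = ftruncate(fd, size);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0)
    {
        /* Unblockable signals such as SIGCONT can still cause EINTR. */
        do
        {
            rc = posix_fallocate(fd, 0, size);
        } while (rc == EINTR);

        /* posix_fallocate() returns an error number and leaves errno alone. */
        if (rc != 0)
        {
            errno = rc;
            rc = -1;
        }
    }

    /* Restore the previous mask without losing the errno of a failed call. */
    save_errno = errno;
    sigprocmask(SIG_SETMASK, &saved, NULL);
    errno = save_errno;

    return rc;
}

/* Usage sketch: resize a scratch file and report its resulting size. */
static long
resize_demo(off_t size)
{
    char        path[] = "/tmp/resize_demo_XXXXXX";
    struct stat st;
    long        result = -1;
    int         fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* the open descriptor keeps the file alive */
    if (resize_with_signals_blocked(fd, size) == 0 && fstat(fd, &st) == 0)
        result = (long) st.st_size;
    close(fd);
    return result;
}
```

Calling resize_demo(8192) should report a size of 8192 bytes on a filesystem with enough free space. The scratch file lives under /tmp only to keep the sketch independent of shm_open(); the commit itself concerns tmpfs-backed shared memory segments.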

1 file changed: +23 -14 lines changed

src/backend/storage/ipc/dsm_impl.c

Lines changed: 23 additions & 14 deletions
@@ -61,9 +61,10 @@
 #ifdef HAVE_SYS_SHM_H
 #include <sys/shm.h>
 #endif
+
 #include "common/file_perm.h"
+#include "libpq/pqsignal.h"    /* for PG_SETMASK macro */
 #include "pgstat.h"
-
 #include "portability/mem.h"
 #include "storage/dsm_impl.h"
 #include "storage/fd.h"
@@ -298,14 +299,6 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
         shm_unlink(name);
         errno = save_errno;
 
-        /*
-         * If we received a query cancel or termination signal, we will have
-         * EINTR set here. If the caller said that errors are OK here, check
-         * for interrupts immediately.
-         */
-        if (errno == EINTR && elevel >= ERROR)
-            CHECK_FOR_INTERRUPTS();
-
         ereport(elevel,
                 (errcode_for_dynamic_shared_memory(),
                  errmsg("could not resize shared memory segment \"%s\" to %zu bytes: %m",
@@ -351,9 +344,21 @@ static int
 dsm_impl_posix_resize(int fd, off_t size)
 {
     int         rc;
+    int         save_errno;
+
+    /*
+     * Block all blockable signals, except SIGQUIT. posix_fallocate() can run
+     * for quite a long time, and is an all-or-nothing operation. If we
+     * allowed SIGUSR1 to interrupt us repeatedly (for example, due to recovery
+     * conflicts), the retry loop might never succeed.
+     */
+    PG_SETMASK(&BlockSig);
 
     /* Truncate (or extend) the file to the requested size. */
-    rc = ftruncate(fd, size);
+    do
+    {
+        rc = ftruncate(fd, size);
+    } while (rc < 0 && errno == EINTR);
 
     /*
      * On Linux, a shm_open fd is backed by a tmpfs file. After resizing with
@@ -367,14 +372,14 @@ dsm_impl_posix_resize(int fd, off_t size)
     if (rc == 0)
     {
         /*
-         * We may get interrupted. If so, just retry unless there is an
-         * interrupt pending. This avoids the possibility of looping forever
-         * if another backend is repeatedly trying to interrupt us.
+         * We still use a traditional EINTR retry loop to handle SIGCONT.
+         * posix_fallocate() doesn't restart automatically, and we don't want
+         * this to fail if you attach a debugger.
          */
         do
         {
             rc = posix_fallocate(fd, 0, size);
-        } while (rc == EINTR && !(ProcDiePending || QueryCancelPending));
+        } while (rc == EINTR);
 
         /*
          * The caller expects errno to be set, but posix_fallocate() doesn't
@@ -385,6 +390,10 @@ dsm_impl_posix_resize(int fd, off_t size)
     }
 #endif                          /* HAVE_POSIX_FALLOCATE && __linux__ */
 
+    save_errno = errno;
+    PG_SETMASK(&UnBlockSig);
+    errno = save_errno;
+
     return rc;
 }
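
The last hunk saves errno before unblocking signals and restores it afterwards. The commit does not spell out why. One plausible reading is that signals left pending while blocked are delivered as soon as the mask is lifted, so any handler that touches errno could overwrite the value the caller is about to report. The sketch below illustrates that effect under stated assumptions: careless_handler() and errno_after_unblock() are hypothetical names, and the handler clobbers errno on purpose, which a well-behaved handler would avoid.

```c
#define _POSIX_C_SOURCE 200809L /* sigaction(), sigprocmask() */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_ran = 0;

/* Deliberately careless handler (illustration only): it overwrites errno. */
static void
careless_handler(int signo)
{
    (void) signo;
    handler_ran = 1;
    errno = EINTR;
}

/*
 * Simulate a failed call made with SIGUSR1 blocked and pending, then unblock
 * the signal.  Returns the errno value the caller would see afterwards.  If
 * "preserve" is nonzero, errno is saved and restored around the unblock.
 */
static int
errno_after_unblock(int preserve)
{
    struct sigaction sa;
    struct sigaction old_sa;
    sigset_t    block;
    int         save_errno;
    int         seen;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = careless_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, &old_sa);

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, NULL);

    handler_ran = 0;
    raise(SIGUSR1);             /* stays pending while SIGUSR1 is blocked */
    assert(!handler_ran);

    errno = ENOSPC;             /* pretend the resize just failed */
    save_errno = errno;
    sigprocmask(SIG_UNBLOCK, &block, NULL); /* pending signal delivered here */
    if (preserve)
        errno = save_errno;
    seen = errno;

    sigaction(SIGUSR1, &old_sa, NULL);      /* put the old handler back */
    return seen;
}
```

With preserve set, the caller still sees ENOSPC; without it, the handler's EINTR leaks through. The sketch unblocks SIGUSR1 with SIG_UNBLOCK rather than restoring a saved mask so that delivery does not depend on the mask the process started with.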
