Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitcea17ba

Browse files
committed
Be more predictable about reporting "lock timeout" vs "statement timeout".
If both timeout indicators are set when we arrive at ProcessInterrupts,we've historically just reported "lock timeout". However, some buildfarmmembers have been observed to fail isolationtester's timeouts test byreporting "lock timeout" when the statement timeout was expected to firefirst. The cause seems to be that the process is allowed to sleep longerthan expected (probably due to heavy machine load) so that the locktimeout happens before we reach the point of reporting the error, andthen this arbitrary tiebreak rule does the wrong thing. We can improvematters by comparing the scheduled timeout times to decide which errorto report.I had originally proposed greatly reducing the 1-second window betweenthe two timeouts in the test cases. On reflection that is a bad idea,at least for the case where the lock timeout is expected to fire first,because that would assume that it takes negligible time to get fromstatement start to the beginning of the lock wait. Thus, this patchdoesn't completely remove the risk of test failures on slow machines.Empirically, however, the case this handles is the one we are seeingin the buildfarm. The explanation may be that the other case requiresthe scheduler to take the CPU away from a busy process, whereas thecase fixed here only requires the scheduler to not give the CPU backright away to a process that has been woken from a multi-second sleep(and, perhaps, has been swapped out meanwhile).Back-patch to 9.3 where the isolationtester timeouts test was added.Discussion: <8693.1464314819@sss.pgh.pa.us>
1 parent47e5969 commitcea17ba

File tree

3 files changed

+35
-5
lines changed

3 files changed

+35
-5
lines changed

‎src/backend/tcop/postgres.c

Lines changed: 19 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2920,6 +2920,9 @@ ProcessInterrupts(void)
29202920

29212921
if (QueryCancelPending)
29222922
{
2923+
boollock_timeout_occurred;
2924+
boolstmt_timeout_occurred;
2925+
29232926
/*
29242927
* Don't allow query cancel interrupts while reading input from the
29252928
* client, because we might lose sync in the FE/BE protocol. (Die
@@ -2940,17 +2943,29 @@ ProcessInterrupts(void)
29402943

29412944
/*
29422945
* If LOCK_TIMEOUT and STATEMENT_TIMEOUT indicators are both set, we
2943-
*prefer toreport the former; but be sure to clear both.
2946+
*need toclear both, so always fetch both.
29442947
*/
2945-
if (get_timeout_indicator(LOCK_TIMEOUT, true))
2948+
lock_timeout_occurred=get_timeout_indicator(LOCK_TIMEOUT, true);
2949+
stmt_timeout_occurred=get_timeout_indicator(STATEMENT_TIMEOUT, true);
2950+
2951+
/*
2952+
* If both were set, we want to report whichever timeout completed
2953+
* earlier; this ensures consistent behavior if the machine is slow
2954+
* enough that the second timeout triggers before we get here. A tie
2955+
* is arbitrarily broken in favor of reporting a lock timeout.
2956+
*/
2957+
if (lock_timeout_occurred&&stmt_timeout_occurred&&
2958+
get_timeout_finish_time(STATEMENT_TIMEOUT)<get_timeout_finish_time(LOCK_TIMEOUT))
2959+
lock_timeout_occurred= false;/* report stmt timeout */
2960+
2961+
if (lock_timeout_occurred)
29462962
{
2947-
(void)get_timeout_indicator(STATEMENT_TIMEOUT, true);
29482963
LockErrorCleanup();
29492964
ereport(ERROR,
29502965
(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
29512966
errmsg("canceling statement due to lock timeout")));
29522967
}
2953-
if (get_timeout_indicator(STATEMENT_TIMEOUT, true))
2968+
if (stmt_timeout_occurred)
29542969
{
29552970
LockErrorCleanup();
29562971
ereport(ERROR,

‎src/backend/utils/misc/timeout.c

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -34,7 +34,7 @@ typedef struct timeout_params
3434
timeout_handler_proctimeout_handler;
3535

3636
TimestampTzstart_time;/* time that timeout was last activated */
37-
TimestampTzfin_time;/*if active,time it is due to fire */
37+
TimestampTzfin_time;/* time it is, or was last, due to fire */
3838
}timeout_params;
3939

4040
/*
@@ -654,3 +654,17 @@ get_timeout_start_time(TimeoutId id)
654654
{
655655
returnall_timeouts[id].start_time;
656656
}
657+
658+
/*
659+
* Return the time when the timeout is, or most recently was, due to fire
660+
*
661+
* Note: will return 0 if timeout has never been activated in this process.
662+
* However, we do *not* reset the fin_time when a timeout occurs, so as
663+
* not to create a race condition if SIGALRM fires just as some code is
664+
* about to fetch the value.
665+
*/
666+
TimestampTz
667+
get_timeout_finish_time(TimeoutIdid)
668+
{
669+
returnall_timeouts[id].fin_time;
670+
}

‎src/include/utils/timeout.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -80,5 +80,6 @@ extern void disable_all_timeouts(bool keep_indicators);
8080
/* accessors */
8181
externboolget_timeout_indicator(TimeoutIdid,boolreset_indicator);
8282
externTimestampTzget_timeout_start_time(TimeoutIdid);
83+
externTimestampTzget_timeout_finish_time(TimeoutIdid);
8384

8485
#endif/* TIMEOUT_H */

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp