Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit40c333c

Browse files
committed
Fix a performance problem in pg_dump's dump order selection logic.
findDependencyLoops() was not bright about cases where there are multipledependency paths between the same two dumpable objects. In most scenariosthis did not hurt us too badly; but since the introduction of sectionboundary pseudo-objects in commita1ef01f,it was possible for this code to take unreasonable amounts of time (tensof seconds on a database with a couple thousand objects), as reported inbug #11033 from Joe Van Dyk. Joe's particular problem scenario involved"pg_dump -a" mode with long chains of foreign key constraints, but I thinkthat similar problems could arise with other situations as long as therewere enough objects. To fix, add a flag array that lets us notice when wearrive at the same object again while searching from a given start object.This simple change seems to be enough to eliminate the performance problem.Back-patch to 9.1, like the patch that introduced section boundary objects.
1 parent3fa0c78 commit40c333c

File tree

1 file changed

+46
-10
lines changed

1 file changed

+46
-10
lines changed

‎src/bin/pg_dump/pg_dump_sort.c

Lines changed: 46 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -130,6 +130,7 @@ static void findDependencyLoops(DumpableObject **objs, int nObjs, int totObjs);
130130
staticintfindLoop(DumpableObject*obj,
131131
DumpIdstartPoint,
132132
bool*processed,
133+
DumpId*searchFailed,
133134
DumpableObject**workspace,
134135
intdepth);
135136
staticvoidrepairDependencyLoop(DumpableObject**loop,
@@ -532,23 +533,37 @@ static void
532533
findDependencyLoops(DumpableObject**objs,intnObjs,inttotObjs)
533534
{
534535
/*
535-
* We use two data structures here. One is a bool array processed[],
536-
* which is indexed by dump ID and marks the objects already processed
537-
* during this invocation of findDependencyLoops(). The other is a
538-
* workspace[] array of DumpableObject pointers, in which we try to build
539-
* lists of objects constituting loops. We make workspace[] large enough
540-
* to hold all the objects, which is huge overkill in most cases but could
541-
* theoretically be necessary if there is a single dependency chain
542-
* linking all the objects.
536+
* We use three data structures here:
537+
*
538+
* processed[] is a bool array indexed by dump ID, marking the objects
539+
* already processed during this invocation of findDependencyLoops().
540+
*
541+
* searchFailed[] is another array indexed by dump ID. searchFailed[j] is
542+
* set to dump ID k if we have proven that there is no dependency path
543+
* leading from object j back to start point k. This allows us to skip
544+
* useless searching when there are multiple dependency paths from k to j,
545+
* which is a common situation. We could use a simple bool array for
546+
* this, but then we'd need to re-zero it for each start point, resulting
547+
* in O(N^2) zeroing work. Using the start point's dump ID as the "true"
548+
* value lets us skip clearing the array before we consider the next start
549+
* point.
550+
*
551+
* workspace[] is an array of DumpableObject pointers, in which we try to
552+
* build lists of objects constituting loops. We make workspace[] large
553+
* enough to hold all the objects in TopoSort's output, which is huge
554+
* overkill in most cases but could theoretically be necessary if there is
555+
* a single dependency chain linking all the objects.
543556
*/
544557
bool*processed;
558+
DumpId*searchFailed;
545559
DumpableObject**workspace;
546560
boolfixedloop;
547561
inti;
548562

549563
processed= (bool*)calloc(getMaxDumpId()+1,sizeof(bool));
564+
searchFailed= (DumpId*)calloc(getMaxDumpId()+1,sizeof(DumpId));
550565
workspace= (DumpableObject**)malloc(totObjs*sizeof(DumpableObject*));
551-
if (processed==NULL||workspace==NULL)
566+
if (processed==NULL||searchFailed==NULL||workspace==NULL)
552567
exit_horribly(NULL,modulename,"out of memory\n");
553568
fixedloop= false;
554569

@@ -558,7 +573,12 @@ findDependencyLoops(DumpableObject **objs, int nObjs, int totObjs)
558573
intlooplen;
559574
intj;
560575

561-
looplen=findLoop(obj,obj->dumpId,processed,workspace,0);
576+
looplen=findLoop(obj,
577+
obj->dumpId,
578+
processed,
579+
searchFailed,
580+
workspace,
581+
0);
562582

563583
if (looplen>0)
564584
{
@@ -586,6 +606,7 @@ findDependencyLoops(DumpableObject **objs, int nObjs, int totObjs)
586606
exit_horribly(NULL,modulename,"could not identify dependency loop\n");
587607

588608
free(workspace);
609+
free(searchFailed);
589610
free(processed);
590611
}
591612

@@ -596,6 +617,7 @@ findDependencyLoops(DumpableObject **objs, int nObjs, int totObjs)
596617
*obj: object we are examining now
597618
*startPoint: dumpId of starting object for the hoped-for circular loop
598619
*processed[]: flag array marking already-processed objects
620+
*searchFailed[]: flag array marking already-unsuccessfully-visited objects
599621
*workspace[]: work array in which we are building list of loop members
600622
*depth: number of valid entries in workspace[] at call
601623
*
@@ -609,6 +631,7 @@ static int
609631
findLoop(DumpableObject*obj,
610632
DumpIdstartPoint,
611633
bool*processed,
634+
DumpId*searchFailed,
612635
DumpableObject**workspace,
613636
intdepth)
614637
{
@@ -621,6 +644,13 @@ findLoop(DumpableObject *obj,
621644
if (processed[obj->dumpId])
622645
return0;
623646

647+
/*
648+
* If we've already proven there is no path from this object back to the
649+
* startPoint, forget it.
650+
*/
651+
if (searchFailed[obj->dumpId]==startPoint)
652+
return0;
653+
624654
/*
625655
* Reject if obj is already present in workspace. This test prevents us
626656
* from going into infinite recursion if we are given a startPoint object
@@ -660,12 +690,18 @@ findLoop(DumpableObject *obj,
660690
newDepth=findLoop(nextobj,
661691
startPoint,
662692
processed,
693+
searchFailed,
663694
workspace,
664695
depth);
665696
if (newDepth>0)
666697
returnnewDepth;
667698
}
668699

700+
/*
701+
* Remember there is no path from here back to startPoint
702+
*/
703+
searchFailed[obj->dumpId]=startPoint;
704+
669705
return0;
670706
}
671707

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp