Commit 1804284

Add parallel-aware hash joins.
Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash.  While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.

After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory.  If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.

The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:

 * avoids wasting memory on duplicated hash tables
 * avoids wasting disk space on duplicated batch files
 * divides the work of building the hash table over the CPUs

One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables.  This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes.  Another is that
outer batch 0 must be written to disk if multiple batches are required.

A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.

A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.

Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
Discussion: https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
1 parent f94eec4  commit 1804284
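
For orientation, the plan shape described above looks roughly like the following. This is a minimal sketch using hypothetical tables (big_fact, big_dim) and an illustrative worker count; only the node names (Parallel Hash Join, Parallel Hash) and the enable_parallel_hash GUC come from this commit.

    -- Hypothetical tables; the planner only picks a partial inner subplan
    -- for scans it already considers large enough to parallelize.
    SET enable_parallel_hash = on;   -- default after this commit
    EXPLAIN (COSTS OFF)
    SELECT count(*)
    FROM big_fact f
    JOIN big_dim d ON d.id = f.dim_id;

    -- Expected shape (abridged, worker count illustrative):
    --  Finalize Aggregate
    --    ->  Gather
    --          Workers Planned: 2
    --          ->  Partial Aggregate
    --                ->  Parallel Hash Join
    --                      Hash Cond: (f.dim_id = d.id)
    --                      ->  Parallel Seq Scan on big_fact f
    --                      ->  Parallel Hash
    --                            ->  Parallel Seq Scan on big_dim d

In the parallel-oblivious case the inner side would instead be a plain Hash over a non-partial scan, built separately by every worker; with Parallel Hash the workers cooperate on one shared hash table.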

30 files changed: +3091 -116 lines changed

doc/src/sgml/config.sgml

Lines changed: 15 additions & 0 deletions
@@ -3647,6 +3647,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-enable-parallel-hash" xreflabel="enable_parallel_hash">
+      <term><varname>enable_parallel_hash</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_parallel_hash</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's use of hash-join plan
+        types with parallel hash. Has no effect if hash-join plans are not
+        also enabled. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-partition-wise-join" xreflabel="enable_partition_wise_join">
       <term><varname>enable_partition_wise_join</varname> (<type>boolean</type>)
       <indexterm>
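
Like the other enable_* planner parameters, the new GUC can be toggled per session, which is handy for comparing plans. A small illustrative example using standard SET/SHOW/RESET commands (not part of the commit):

    SET enable_parallel_hash = off;   -- planner stops considering Parallel Hash
    SHOW enable_parallel_hash;        -- off
    RESET enable_parallel_hash;       -- back to the default, on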

doc/src/sgml/monitoring.sgml

Lines changed: 61 additions & 1 deletion
@@ -1263,7 +1263,7 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
      <entry>Waiting in an extension.</entry>
     </row>
     <row>
-     <entry morerows="17"><literal>IPC</literal></entry>
+     <entry morerows="32"><literal>IPC</literal></entry>
      <entry><literal>BgWorkerShutdown</literal></entry>
      <entry>Waiting for background worker to shut down.</entry>
     </row>
@@ -1279,6 +1279,66 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
      <entry><literal>ExecuteGather</literal></entry>
      <entry>Waiting for activity from child process when executing <literal>Gather</literal> node.</entry>
     </row>
+    <row>
+     <entry><literal>Hash/Batch/Allocating</literal></entry>
+     <entry>Waiting for an elected Parallel Hash participant to allocate a hash table.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/Batch/Electing</literal></entry>
+     <entry>Electing a Parallel Hash participant to allocate a hash table.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/Batch/Loading</literal></entry>
+     <entry>Waiting for other Parallel Hash participants to finish loading a hash table.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/Build/Allocating</literal></entry>
+     <entry>Waiting for an elected Parallel Hash participant to allocate the initial hash table.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/Build/Electing</literal></entry>
+     <entry>Electing a Parallel Hash participant to allocate the initial hash table.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/Build/HashingInner</literal></entry>
+     <entry>Waiting for other Parallel Hash participants to finish hashing the inner relation.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/Build/HashingOuter</literal></entry>
+     <entry>Waiting for other Parallel Hash participants to finish partitioning the outer relation.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBatches/Allocating</literal></entry>
+     <entry>Waiting for an elected Parallel Hash participant to allocate more batches.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBatches/Deciding</literal></entry>
+     <entry>Electing a Parallel Hash participant to decide on future batch growth.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBatches/Electing</literal></entry>
+     <entry>Electing a Parallel Hash participant to allocate more batches.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBatches/Finishing</literal></entry>
+     <entry>Waiting for an elected Parallel Hash participant to decide on future batch growth.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBatches/Repartitioning</literal></entry>
+     <entry>Waiting for other Parallel Hash participants to finish repartitioning.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBuckets/Allocating</literal></entry>
+     <entry>Waiting for an elected Parallel Hash participant to finish allocating more buckets.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBuckets/Electing</literal></entry>
+     <entry>Electing a Parallel Hash participant to allocate more buckets.</entry>
+    </row>
+    <row>
+     <entry><literal>Hash/GrowBuckets/Reinserting</literal></entry>
+     <entry>Waiting for other Parallel Hash participants to finish inserting tuples into new buckets.</entry>
+    </row>
     <row>
      <entry><literal>LogicalSyncData</literal></entry>
      <entry>Waiting for logical replication remote server to send data for initial table synchronization.</entry>
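
The new wait events surface through pg_stat_activity like any other IPC wait. A hedged example query for spotting backends currently waiting inside Parallel Hash, using only standard pg_stat_activity columns:

    SELECT pid, wait_event, query
    FROM pg_stat_activity
    WHERE wait_event_type = 'IPC'
      AND wait_event LIKE 'Hash/%';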

src/backend/executor/execParallel.c

Lines changed: 21 additions & 0 deletions
@@ -31,6 +31,7 @@
 #include "executor/nodeCustom.h"
 #include "executor/nodeForeignscan.h"
 #include "executor/nodeHash.h"
+#include "executor/nodeHashjoin.h"
 #include "executor/nodeIndexscan.h"
 #include "executor/nodeIndexonlyscan.h"
 #include "executor/nodeSeqscan.h"
@@ -266,6 +267,11 @@ ExecParallelEstimate(PlanState *planstate, ExecParallelEstimateContext *e)
 			ExecBitmapHeapEstimate((BitmapHeapScanState *) planstate,
 								   e->pcxt);
 			break;
+		case T_HashJoinState:
+			if (planstate->plan->parallel_aware)
+				ExecHashJoinEstimate((HashJoinState *) planstate,
+									 e->pcxt);
+			break;
 		case T_HashState:
 			/* even when not parallel-aware, for EXPLAIN ANALYZE */
 			ExecHashEstimate((HashState *) planstate, e->pcxt);
@@ -474,6 +480,11 @@ ExecParallelInitializeDSM(PlanState *planstate,
 			ExecBitmapHeapInitializeDSM((BitmapHeapScanState *) planstate,
 										d->pcxt);
 			break;
+		case T_HashJoinState:
+			if (planstate->plan->parallel_aware)
+				ExecHashJoinInitializeDSM((HashJoinState *) planstate,
+										  d->pcxt);
+			break;
 		case T_HashState:
 			/* even when not parallel-aware, for EXPLAIN ANALYZE */
 			ExecHashInitializeDSM((HashState *) planstate, d->pcxt);
@@ -898,6 +909,11 @@ ExecParallelReInitializeDSM(PlanState *planstate,
 			ExecBitmapHeapReInitializeDSM((BitmapHeapScanState *) planstate,
 										  pcxt);
 			break;
+		case T_HashJoinState:
+			if (planstate->plan->parallel_aware)
+				ExecHashJoinReInitializeDSM((HashJoinState *) planstate,
+											pcxt);
+			break;
 		case T_HashState:
 		case T_SortState:
 			/* these nodes have DSM state, but no reinitialization is required */
@@ -1196,6 +1212,11 @@ ExecParallelInitializeWorker(PlanState *planstate, ParallelWorkerContext *pwcxt)
 			ExecBitmapHeapInitializeWorker((BitmapHeapScanState *) planstate,
 										   pwcxt);
 			break;
+		case T_HashJoinState:
+			if (planstate->plan->parallel_aware)
+				ExecHashJoinInitializeWorker((HashJoinState *) planstate,
+											 pwcxt);
+			break;
 		case T_HashState:
 			/* even when not parallel-aware, for EXPLAIN ANALYZE */
 			ExecHashInitializeWorker((HashState *) planstate, pwcxt);

src/backend/executor/execProcnode.c

Lines changed: 3 additions & 0 deletions
@@ -770,6 +770,9 @@ ExecShutdownNode(PlanState *node)
 		case T_HashState:
 			ExecShutdownHash((HashState *) node);
 			break;
+		case T_HashJoinState:
+			ExecShutdownHashJoin((HashJoinState *) node);
+			break;
 		default:
 			break;
 	}
