Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit0ede57a

Browse files
committed
Corrections and improvements to generic parallel query documentation.
David Rowley, reviewed by Brad DeJong, Amit Kapila, and me.Discussion:http://postgr.es/m/CAKJS1f81fob-M6RJyTVv3SCasxMuQpj37ReNOJ=tprhwd7hAVg@mail.gmail.com
1 parentf10637e commit0ede57a

File tree

1 file changed

+29
-31
lines changed

1 file changed

+29
-31
lines changed

‎doc/src/sgml/parallel.sgml

Lines changed: 29 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -284,44 +284,41 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
284284

285285
<para>
286286
The driving table may be joined to one or more other tables using nested
287-
loops or hash joins. Theouter side of the join may be any kind of
287+
loops or hash joins. Theinner side of the join may be any kind of
288288
non-parallel plan that is otherwise supported by the planner provided that
289289
it is safe to run within a parallel worker. For example, it may be an
290-
index scan which looks up a value based on a column taken from the inner
291-
table. Each worker will execute the outer side of the plan in full, which
292-
is why merge joins are not supported here. The outer side of a merge join
293-
will often involve sorting the entire inner table; even if it involves an
294-
index, it is unlikely to be productive to have multiple processes each
295-
conduct a full index scan of the inner table.
290+
index scan which looks up a value taken from the outer side of the join.
291+
Each worker will execute the inner side of the join in full, which for
292+
hash join means that an identical hash table is built in each worker
293+
process.
296294
</para>
297295
</sect2>
298296

299297
<sect2 id="parallel-aggregation">
300298
<title>Parallel Aggregation</title>
301299
<para>
302-
It is not possible to perform the aggregation portion of a query entirely
303-
in parallel. For example, if a query involves selecting
304-
<literal>COUNT(*)</>, each worker could compute a total, but those totals
305-
would need to combined in order to produce a final answer. If the query
306-
involved a <literal>GROUP BY</> clause, a separate total would need to
307-
be computed for each group. Even though aggregation can't be done entirely
308-
in parallel, queries involving aggregation are often excellent candidates
309-
for parallel query, because they typically read many rows but return only
310-
a few rows to the client. Queries that return many rows to the client
311-
are often limited by the speed at which the client can read the data,
312-
in which case parallel query cannot help very much.
313-
</para>
314-
315-
<para>
316-
<productname>PostgreSQL</> supports parallel aggregation by aggregating
317-
twice. First, each process participating in the parallel portion of the
318-
query performs an aggregation step, producing a partial result for each
319-
group of which that process is aware. This is reflected in the plan as
320-
a <literal>PartialAggregate</> node. Second, the partial results are
300+
<productname>PostgreSQL</> supports parallel aggregation by aggregating in
301+
two stages. First, each process participating in the parallel portion of
302+
the query performs an aggregation step, producing a partial result for
303+
each group of which that process is aware. This is reflected in the plan
304+
as a <literal>Partial Aggregate</> node. Second, the partial results are
321305
transferred to the leader via the <literal>Gather</> node. Finally, the
322306
leader re-aggregates the results across all workers in order to produce
323307
the final result. This is reflected in the plan as a
324-
<literal>FinalizeAggregate</> node.
308+
<literal>Finalize Aggregate</> node.
309+
</para>
310+
311+
<para>
312+
Because the <literal>Finalize Aggregate</> node runs on the leader
313+
process, queries which produce a relatively large number of groups in
314+
comparison to the number of input rows will appear less favorable to the
315+
query planner. For example, in the worst-case scenario the number of
316+
groups seen by the <literal>Finalize Aggregate</> node could be as many as
317+
the number of input rows which were seen by all worker processes in the
318+
<literal>Partial Aggregate</> stage. For such cases, there is clearly
319+
going to be no performance benefit to using parallel aggregation. The
320+
query planner takes this into account during the planning process and is
321+
unlikely to choose parallel aggregate in this scenario.
325322
</para>
326323

327324
<para>
@@ -330,10 +327,11 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
330327
have a combine function. If the aggregate has a transition state of type
331328
<literal>internal</>, it must have serialization and deserialization
332329
functions. See <xref linkend="sql-createaggregate"> for more details.
333-
Parallel aggregation is not supported for ordered set aggregates or when
334-
the query involves <literal>GROUPING SETS</>. It can only be used when
335-
all joins involved in the query are also part of the parallel portion
336-
of the plan.
330+
Parallel aggregation is not supported if any aggregate function call
331+
contains <literal>DISTINCT</> or <literal>ORDER BY</> clause and is also
332+
not supported for ordered set aggregates or when the query involves
333+
<literal>GROUPING SETS</>. It can only be used when all joins involved in
334+
the query are also part of the parallel portion of the plan.
337335
</para>
338336

339337
</sect2>

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp