Commit 3e51859

Corrections and improvements to generic parallel query documentation.
David Rowley, reviewed by Brad DeJong, Amit Kapila, and me.

Discussion: http://postgr.es/m/CAKJS1f81fob-M6RJyTVv3SCasxMuQpj37ReNOJ=tprhwd7hAVg@mail.gmail.com
1 parent 4d43d5d, commit 3e51859

1 file changed (+29, -31)

doc/src/sgml/parallel.sgml: 29 additions & 31 deletions
@@ -275,44 +275,41 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
 
   <para>
    The driving table may be joined to one or more other tables using nested
-   loops or hash joins. The outer side of the join may be any kind of
+   loops or hash joins. The inner side of the join may be any kind of
    non-parallel plan that is otherwise supported by the planner provided that
    it is safe to run within a parallel worker. For example, it may be an
-   index scan which looks up a value based on a column taken from the inner
-   table. Each worker will execute the outer side of the plan in full, which
-   is why merge joins are not supported here. The outer side of a merge join
-   will often involve sorting the entire inner table; even if it involves an
-   index, it is unlikely to be productive to have multiple processes each
-   conduct a full index scan of the inner table.
+   index scan which looks up a value taken from the outer side of the join.
+   Each worker will execute the inner side of the join in full, which for
+   hash join means that an identical hash table is built in each worker
+   process.
  </para>
 </sect2>
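To make the revised join wording concrete, here is a hedged sketch against the pgbench schema already used in this file's examples. Whether the planner actually chooses a parallel plan depends on table sizes and on settings such as max_parallel_workers_per_gather; the plan shape below is illustrative, with costs and row estimates omitted.

-- Illustrative query; the planner may or may not parallelize it on a given system.
EXPLAIN
SELECT *
FROM pgbench_accounts a
JOIN pgbench_branches b ON a.bid = b.bid
WHERE a.filler LIKE '%x%';

A parallel hash join of roughly the following shape matches the new text: the outer side (pgbench_accounts) is scanned cooperatively by the workers, while each worker runs the inner side in full and builds its own copy of the hash table.

 Gather
   Workers Planned: 2
   ->  Hash Join
         Hash Cond: (a.bid = b.bid)
         ->  Parallel Seq Scan on pgbench_accounts a
               Filter: (filler ~~ '%x%'::text)
         ->  Hash
               ->  Seq Scan on pgbench_branches b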
 
 <sect2 id="parallel-aggregation">
  <title>Parallel Aggregation</title>
  <para>
-   It is not possible to perform the aggregation portion of a query entirely
-   in parallel. For example, if a query involves selecting
-   <literal>COUNT(*)</>, each worker could compute a total, but those totals
-   would need to combined in order to produce a final answer. If the query
-   involved a <literal>GROUP BY</> clause, a separate total would need to
-   be computed for each group. Even though aggregation can't be done entirely
-   in parallel, queries involving aggregation are often excellent candidates
-   for parallel query, because they typically read many rows but return only
-   a few rows to the client. Queries that return many rows to the client
-   are often limited by the speed at which the client can read the data,
-   in which case parallel query cannot help very much.
-  </para>
-
-  <para>
-   <productname>PostgreSQL</> supports parallel aggregation by aggregating
-   twice. First, each process participating in the parallel portion of the
-   query performs an aggregation step, producing a partial result for each
-   group of which that process is aware. This is reflected in the plan as
-   a <literal>PartialAggregate</> node. Second, the partial results are
+   <productname>PostgreSQL</> supports parallel aggregation by aggregating in
+   two stages. First, each process participating in the parallel portion of
+   the query performs an aggregation step, producing a partial result for
+   each group of which that process is aware. This is reflected in the plan
+   as a <literal>Partial Aggregate</> node. Second, the partial results are
    transferred to the leader via the <literal>Gather</> node. Finally, the
    leader re-aggregates the results across all workers in order to produce
    the final result. This is reflected in the plan as a
-   <literal>FinalizeAggregate</> node.
+   <literal>Finalize Aggregate</> node.
+  </para>
+
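The two-stage aggregation described in the rewritten paragraph is easiest to see in a simple ungrouped aggregate. A hedged sketch, again assuming the pgbench tables and that the planner decides a parallel plan is worthwhile (costs omitted):

EXPLAIN SELECT count(*) FROM pgbench_accounts;

 Finalize Aggregate
   ->  Gather
         Workers Planned: 2
         ->  Partial Aggregate
               ->  Parallel Seq Scan on pgbench_accounts

Each worker computes one partial count over the rows it scanned, the Gather node collects those partial results in the leader, and the Finalize Aggregate step combines them into the final count.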
+  <para>
+   Because the <literal>Finalize Aggregate</> node runs on the leader
+   process, queries which produce a relatively large number of groups in
+   comparison to the number of input rows will appear less favorable to the
+   query planner. For example, in the worst-case scenario the number of
+   groups seen by the <literal>Finalize Aggregate</> node could be as many as
+   the number of input rows which were seen by all worker processes in the
+   <literal>Partial Aggregate</> stage. For such cases, there is clearly
+   going to be no performance benefit to using parallel aggregation. The
+   query planner takes this into account during the planning process and is
+   unlikely to choose parallel aggregate in this scenario.
   </para>
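The new caveat about group counts can be illustrated by contrasting a low-cardinality grouping column with a nearly unique one. The queries are illustrative only; the point at which the planner stops choosing parallel aggregation depends on statistics and cost settings.

-- bid has few distinct values, so the leader's Finalize Aggregate sees only
-- a handful of groups and parallel aggregation remains attractive:
EXPLAIN SELECT bid, count(*) FROM pgbench_accounts GROUP BY bid;

-- aid is effectively unique, so each worker's Partial Aggregate would emit
-- close to one group per input row and the leader would have to redo almost
-- all of the work; the planner is unlikely to pick parallel aggregation here:
EXPLAIN SELECT aid, count(*) FROM pgbench_accounts GROUP BY aid;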
 
   <para>
@@ -321,10 +318,11 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
    have a combine function. If the aggregate has a transition state of type
    <literal>internal</>, it must have serialization and deserialization
    functions. See <xref linkend="sql-createaggregate"> for more details.
-   Parallel aggregation is not supported for ordered set aggregates or when
-   the query involves <literal>GROUPING SETS</>. It can only be used when
-   all joins involved in the query are also part of the parallel portion
-   of the plan.
+   Parallel aggregation is not supported if any aggregate function call
+   contains <literal>DISTINCT</> or <literal>ORDER BY</> clause and is also
+   not supported for ordered set aggregates or when the query involves
+   <literal>GROUPING SETS</>. It can only be used when all joins involved in
+   the query are also part of the parallel portion of the plan.
  </para>
 
 </sect2>
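The expanded restrictions in the second hunk can be read against a few concrete query shapes. These statements (illustrative, not taken from the patch) still run normally; per the new wording they are simply not eligible for parallel aggregation.

-- An aggregate call containing DISTINCT or ORDER BY:
SELECT count(DISTINCT bid) FROM pgbench_accounts;
SELECT string_agg(aid::text, ',' ORDER BY aid) FROM pgbench_accounts;

-- An ordered-set aggregate:
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY abalance) FROM pgbench_accounts;

-- A query involving GROUPING SETS:
SELECT bid, count(*) FROM pgbench_accounts GROUP BY GROUPING SETS ((bid), ());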
