
Commit fc4b3de


Use narrower representative tuples in the hash-agg hashtable.

So far the hashtable stored representative tuples in the form of its input slot, with all columns in the hashtable that are not needed (i.e. not grouped upon or functionally dependent) set to NULL.

That's good for saving memory, but it turns out that having tuples full of NULL isn't free. slot_deform_tuple is faster if there's no NULL bitmap even if no NULLs are encountered, and skipping over leading NULLs isn't free.

So compute a separate tuple descriptor that only contains the needed columns. As columns have already been moved in/out of the slot for the hashtable, that does not imply additional per-row overhead.

Author: Andres Freund
Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/20161103110721.h5i5t5saxfk5eeik@alap3.anarazel.de
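The pack/unpack idea in the commit message can be illustrated with a small standalone sketch. This is not PostgreSQL code: `DemoSlot`, `demo_pack`, and `demo_unpack` are hypothetical names, and the slot is reduced to parallel value and null-flag arrays. It only shows how a 1-based column map (playing the role of `hashGrpColIdxInput`) moves the needed columns into a narrow tuple and later expands them back to the input layout, with every other column set to NULL.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_ATTS 8

/* Hypothetical stand-in for a tuple slot: parallel value / null-flag arrays. */
typedef struct DemoSlot
{
    int  natts;
    long values[DEMO_MAX_ATTS];
    bool isnull[DEMO_MAX_ATTS];
} DemoSlot;

/*
 * Copy only the needed columns from the wide input slot into the narrow
 * hash slot. col_idx_input[i] is the 1-based input column that is stored
 * at position i of the narrow slot.
 */
void
demo_pack(const DemoSlot *input, DemoSlot *hash,
          const int *col_idx_input, int ncols)
{
    assert(ncols <= DEMO_MAX_ATTS);
    hash->natts = ncols;
    for (int i = 0; i < ncols; i++)
    {
        int var = col_idx_input[i] - 1;

        assert(var >= 0 && var < input->natts);
        hash->values[i] = input->values[var];
        hash->isnull[i] = input->isnull[var];
    }
}

/*
 * Expand a narrow slot back into the input layout. Columns that were not
 * stored in the narrow slot are marked NULL.
 */
void
demo_unpack(const DemoSlot *hash, DemoSlot *out,
            int input_natts, const int *col_idx_input)
{
    assert(input_natts <= DEMO_MAX_ATTS);
    out->natts = input_natts;
    for (int i = 0; i < input_natts; i++)
    {
        out->values[i] = 0;
        out->isnull[i] = true;
    }
    for (int i = 0; i < hash->natts; i++)
    {
        int var = col_idx_input[i] - 1;

        assert(var >= 0 && var < input_natts);
        out->values[var] = hash->values[i];
        out->isnull[var] = hash->isnull[i];
    }
}
```

With a five-column input and the map `{4, 2}`, the narrow slot holds two columns and no NULL padding; unpacking restores columns 2 and 4 in place and leaves columns 1, 3, and 5 NULL.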

1 parent 8ed3f11, commit fc4b3de

File tree

2 files changed: +112 -57 lines changed

src
- backend/executor
  - nodeAgg.c
- include/nodes
  - execnodes.h


`src/backend/executor/nodeAgg.c`

Lines changed: 108 additions & 56 deletions

Original file line number	Diff line number	Diff line change
`@@ -1717,7 +1717,7 @@ build_hash_table(AggState *aggstate)`
`1717`	`1717`	`additionalsize=aggstate->numaggs*sizeof(AggStatePerGroupData);`
`1718`	`1718`
`1719`	`1719`	`aggstate->hashtable=BuildTupleHashTable(node->numCols,`
`1720`		`-node->grpColIdx,`
	`1720`	`+aggstate->hashGrpColIdxHash,`
`1721`	`1721`	`aggstate->phase->eqfunctions,`
`1722`	`1722`	`aggstate->hashfunctions,`
`1723`	`1723`	`node->numGroups,`
`@@ -1727,47 +1727,88 @@ build_hash_table(AggState *aggstate)`
`1727`	`1727`	`}`
`1728`	`1728`
`1729`	`1729`	`/*`
`1730`		`- * Create a list of the tuple columns that actually need to be stored in`
`1731`		`- * hashtable entries. The incoming tuples from the child plan node will`
`1732`		`- * contain grouping columns, other columns referenced in our targetlist and`
`1733`		`- * qual, columns used to compute the aggregate functions, and perhaps just`
`1734`		`- * junk columns we don't use at all. Only columns of the first two types`
`1735`		`- * need to be stored in the hashtable, and getting rid of the others can`
`1736`		`- * make the table entries significantly smaller. To avoid messing up Var`
`1737`		`- * numbering, we keep the same tuple descriptor for hashtable entries as the`
`1738`		`- * incoming tuples have, but set unwanted columns to NULL in the tuples that`
`1739`		`- * go into the table.`
`1740`		`- *`
`1741`		`- * To eliminate duplicates, we build a bitmapset of the needed columns, then`
`1742`		`- * convert it to an integer list (cheaper to scan at runtime). The list is`
`1743`		`- * in decreasing order so that the first entry is the largest;`
`1744`		`- * lookup_hash_entry depends on this to use slot_getsomeattrs correctly.`
`1745`		`- * Note that the list is preserved over ExecReScanAgg, so we allocate it in`
`1746`		`- * the per-query context (unlike the hash table itself).`
`1747`		`- *`
`1748`		`- * Note: at present, searching the tlist/qual is not really necessary since`
`1749`		`- * the parser should disallow any unaggregated references to ungrouped`
`1750`		`- * columns. However, the search will be needed when we add support for`
`1751`		`- * SQL99 semantics that allow use of "functionally dependent" columns that`
`1752`		`- * haven't been explicitly grouped by.`
	`1730`	`+ * Compute columns that actually need to be stored in hashtable entries. The`
	`1731`	`+ * incoming tuples from the child plan node will contain grouping columns,`
	`1732`	`+ * other columns referenced in our targetlist and qual, columns used to`
	`1733`	`+ * compute the aggregate functions, and perhaps just junk columns we don't use`
	`1734`	`+ * at all. Only columns of the first two types need to be stored in the`
	`1735`	`+ * hashtable, and getting rid of the others can make the table entries`
	`1736`	`+ * significantly smaller. The hashtable only contains the relevant columns,`
	`1737`	`+ * and is packed/unpacked in lookup_hash_entry() / agg_retrieve_hash_table()`
	`1738`	`+ * into the format of the normal input descriptor.`
	`1739`	`+ *`
	`1740`	`+ * Additional columns, in addition to the columns grouped by, come from two`
	`1741`	`+ * sources: Firstly functionally dependent columns that we don't need to group`
	`1742`	`+ * by themselves, and secondly ctids for row-marks.`
	`1743`	`+ *`
	`1744`	`+ * To eliminate duplicates, we build a bitmapset of the needed columns, and`
	`1745`	`+ * then build an array of the columns included in the hashtable. Note that`
	`1746`	`+ * the array is preserved over ExecReScanAgg, so we allocate it in the`
	`1747`	`+ * per-query context (unlike the hash table itself).`
`1753`	`1748`	`*/`
`1754`	`1749`	`static List *`
`1755`	`1750`	`find_hash_columns(AggState *aggstate)`
`1756`	`1751`	`{`
`1757`	`1752`	`Agg *node = (Agg *) aggstate->ss.ps.plan;`
`1758`	`1753`	`Bitmapset *colnos;`
`1759`	`1754`	`List *collist;`
	`1755`	`+TupleDesc hashDesc;`
	`1756`	`+List *outerTlist = outerPlanState(aggstate)->plan->targetlist;`
	`1757`	`+List *hashTlist = NIL;`
`1760`	`1758`	`int i;`
`1761`	`1759`
	`1760`	`+aggstate->largestGrpColIdx=0;`
	`1761`	`+`
`1762`	`1762`	`/* Find Vars that will be needed in tlist and qual */`
`1763`	`1763`	`colnos=find_unaggregated_cols(aggstate);`
`1764`	`1764`	`/* Add in all the grouping columns */`
`1765`	`1765`	`for (i=0;i<node->numCols;i++)`
`1766`	`1766`	`colnos=bms_add_member(colnos,node->grpColIdx[i]);`
`1767`	`1767`	`/* Convert to list, using lcons so largest element ends up first */`
`1768`	`1768`	`collist=NIL;`
	`1769`	`+`
	`1770`	`+aggstate->hashGrpColIdxInput=`
	`1771`	`+palloc(bms_num_members(colnos)*sizeof(AttrNumber));`
	`1772`	`+aggstate->hashGrpColIdxHash=`
	`1773`	`+palloc(node->numCols*sizeof(AttrNumber));`
	`1774`	`+`
	`1775`	`+/*`
	`1776`	`+ * First build mapping for columns directly hashed. These are the first,`
	`1777`	`+ * because they'll be accessed when computing hash values and comparing`
	`1778`	`+ * tuples for exact matches. We also build simple mapping for`
	`1779`	`+ * execGrouping, so it knows where to find the to-be-hashed / compared`
	`1780`	`+ * columns in the input.`
	`1781`	`+ */`
	`1782`	`+for (i=0;i<node->numCols;i++)`
	`1783`	`+{`
	`1784`	`+aggstate->hashGrpColIdxInput[i]=node->grpColIdx[i];`
	`1785`	`+aggstate->hashGrpColIdxHash[i]=i+1;`
	`1786`	`+aggstate->numhashGrpCols++;`
	`1787`	`+/* delete already mapped columns */`
	`1788`	`+bms_del_member(colnos,node->grpColIdx[i]);`
	`1789`	`+}`
	`1790`	`+`
	`1791`	`+/* and add the remaining columns */`
`1769`	`1792`	`while ((i=bms_first_member(colnos)) >=0)`
`1770`		`-collist=lcons_int(i,collist);`
	`1793`	`+{`
	`1794`	`+aggstate->hashGrpColIdxInput[aggstate->numhashGrpCols]=i;`
	`1795`	`+aggstate->numhashGrpCols++;`
	`1796`	`+}`
	`1797`	`+`
	`1798`	`+/* and build a tuple descriptor for the hashtable */`
	`1799`	`+for (i=0;i<aggstate->numhashGrpCols;i++)`
	`1800`	`+{`
	`1801`	`+int varNumber = aggstate->hashGrpColIdxInput[i] - 1;`
	`1802`	`+`
	`1803`	`+hashTlist=lappend(hashTlist,list_nth(outerTlist,varNumber));`
	`1804`	`+aggstate->largestGrpColIdx=`
	`1805`	`+Max(varNumber+1,aggstate->largestGrpColIdx);`
	`1806`	`+}`
	`1807`	`+`
	`1808`	`+hashDesc=ExecTypeFromTL(hashTlist, false);`
	`1809`	`+ExecSetSlotDescriptor(aggstate->hashslot,hashDesc);`
	`1810`	`+`
	`1811`	`+list_free(hashTlist);`
`1771`	`1812`	`bms_free(colnos);`
`1772`	`1813`
`1773`	`1814`	`return collist;`
`@@ -1804,27 +1845,22 @@ static TupleHashEntryData *`
`1804`	`1845`	`lookup_hash_entry(AggState *aggstate, TupleTableSlot *inputslot)`
`1805`	`1846`	`{`
`1806`	`1847`	`TupleTableSlot *hashslot = aggstate->hashslot;`
`1807`		`-ListCell *l;`
`1808`	`1848`	`TupleHashEntryData *entry;`
`1809`	`1849`	`bool isnew;`
`1810`		`-`
`1811`		`-/* if first time through, initialize hashslot by cloning input slot */`
`1812`		`-if (hashslot->tts_tupleDescriptor==NULL)`
`1813`		`-{`
`1814`		`-ExecSetSlotDescriptor(hashslot,inputslot->tts_tupleDescriptor);`
`1815`		`-/* Make sure all unused columns are NULLs */`
`1816`		`-ExecStoreAllNullTuple(hashslot);`
`1817`		`-}`
	`1850`	`+int i;`
`1818`	`1851`
`1819`	`1852`	`/* transfer just the needed columns into hashslot */`
`1820`		`-slot_getsomeattrs(inputslot,linitial_int(aggstate->hash_needed));`
`1821`		`-foreach(l,aggstate->hash_needed)`
	`1853`	`+slot_getsomeattrs(inputslot,aggstate->largestGrpColIdx);`
	`1854`	`+ExecClearTuple(hashslot);`
	`1855`	`+`
	`1856`	`+for (i=0;i<aggstate->numhashGrpCols;i++)`
`1822`	`1857`	`{`
`1823`		`-int varNumber = lfirst_int(l) - 1;`
	`1858`	`+int varNumber = aggstate->hashGrpColIdxInput[i] - 1;`
`1824`	`1859`
`1825`		`-hashslot->tts_values[varNumber]=inputslot->tts_values[varNumber];`
`1826`		`-hashslot->tts_isnull[varNumber]=inputslot->tts_isnull[varNumber];`
	`1860`	`+hashslot->tts_values[i]=inputslot->tts_values[varNumber];`
	`1861`	`+hashslot->tts_isnull[i]=inputslot->tts_isnull[varNumber];`
`1827`	`1862`	`}`
	`1863`	`+ExecStoreVirtualTuple(hashslot);`
`1828`	`1864`
`1829`	`1865`	`/* find or create the hashtable entry using the filtered tuple */`
`1830`	`1866`	`entry=LookupTupleHashEntry(aggstate->hashtable,hashslot,&isnew);`
`@@ -2286,6 +2322,7 @@ agg_retrieve_hash_table(AggState *aggstate)`
`2286`	`2322`	`TupleHashEntryData*entry;`
`2287`	`2323`	`TupleTableSlot*firstSlot;`
`2288`	`2324`	`TupleTableSlot*result;`
	`2325`	`+TupleTableSlot*hashslot;`
`2289`	`2326`
`2290`	`2327`	`/*`
`2291`	`2328`	`* get state info from node`
`@@ -2294,13 +2331,17 @@ agg_retrieve_hash_table(AggState *aggstate)`
`2294`	`2331`	`econtext=aggstate->ss.ps.ps_ExprContext;`
`2295`	`2332`	`peragg=aggstate->peragg;`
`2296`	`2333`	`firstSlot=aggstate->ss.ss_ScanTupleSlot;`
	`2334`	`+hashslot=aggstate->hashslot;`
	`2335`	`+`
`2297`	`2336`
`2298`	`2337`	`/*`
`2299`	`2338`	`* We loop retrieving groups until we find one satisfying`
`2300`	`2339`	`* aggstate->ss.ps.qual`
`2301`	`2340`	`*/`
`2302`	`2341`	`while (!aggstate->agg_done)`
`2303`	`2342`	`{`
	`2343`	`+int i;`
	`2344`	`+`
`2304`	`2345`	`/*`
`2305`	`2346`	`* Find the next entry in the hash table`
`2306`	`2347`	`*/`
`@@ -2322,12 +2363,24 @@ agg_retrieve_hash_table(AggState *aggstate)`
`2322`	`2363`	`ResetExprContext(econtext);`
`2323`	`2364`
`2324`	`2365`	`/*`
`2325`		`- * Store the copied first input tuple in the tuple table slot reserved`
`2326`		`- * for it, so that it can be used in ExecProject.`
	`2366`	`+ * Transform representative tuple back into one with the right`
	`2367`	`+ * columns.`
`2327`	`2368`	`*/`
`2328`		`-ExecStoreMinimalTuple(entry->firstTuple,`
`2329`		`-firstSlot,`
`2330`		`- false);`
	`2369`	`+ExecStoreMinimalTuple(entry->firstTuple,hashslot, false);`
	`2370`	`+slot_getallattrs(hashslot);`
	`2371`	`+`
	`2372`	`+ExecClearTuple(firstSlot);`
	`2373`	`+memset(firstSlot->tts_isnull, true,`
	`2374`	`+firstSlot->tts_tupleDescriptor->natts*sizeof(bool));`
	`2375`	`+`
	`2376`	`+for (i=0;i<aggstate->numhashGrpCols;i++)`
	`2377`	`+{`
	`2378`	`+int varNumber = aggstate->hashGrpColIdxInput[i] - 1;`
	`2379`	`+`
	`2380`	`+firstSlot->tts_values[varNumber]=hashslot->tts_values[i];`
	`2381`	`+firstSlot->tts_isnull[varNumber]=hashslot->tts_isnull[i];`
	`2382`	`+}`
	`2383`	`+ExecStoreVirtualTuple(firstSlot);`
`2331`	`2384`
`2332`	`2385`	`pergroup= (AggStatePerGroup)entry->additional;`
`2333`	`2386`
`@@ -2604,16 +2657,6 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)`
`2604`	`2657`	`while ((i=bms_next_member(all_grouped_cols,i)) >=0)`
`2605`	`2658`	`aggstate->all_grouped_cols=lcons_int(i,aggstate->all_grouped_cols);`
`2606`	`2659`
`2607`		`-/*`
`2608`		`- * Hashing can only appear in the initial phase.`
`2609`		`- */`
`2610`		`-`
`2611`		`-if (node->aggstrategy==AGG_HASHED)`
`2612`		`-execTuplesHashPrepare(node->numCols,`
`2613`		`-node->grpOperators,`
`2614`		`-&aggstate->phases[0].eqfunctions,`
`2615`		`-&aggstate->hashfunctions);`
`2616`		`-`
`2617`	`2660`	`/*`
`2618`	`2661`	`* Initialize current phase-dependent values to initial phase`
`2619`	`2662`	`*/`
`@@ -2635,12 +2678,21 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)`
`2635`	`2678`	`aggstate->peragg=peraggs;`
`2636`	`2679`	`aggstate->pertrans=pertransstates;`
`2637`	`2680`
	`2681`	`+`
	`2682`	`+/*`
	`2683`	`+ * Hashing can only appear in the initial phase.`
	`2684`	`+ */`
`2638`	`2685`	`if (node->aggstrategy==AGG_HASHED)`
`2639`	`2686`	`{`
	`2687`	`+find_hash_columns(aggstate);`
	`2688`	`+`
	`2689`	`+execTuplesHashPrepare(node->numCols,`
	`2690`	`+node->grpOperators,`
	`2691`	`+&aggstate->phases[0].eqfunctions,`
	`2692`	`+&aggstate->hashfunctions);`
	`2693`	`+`
`2640`	`2694`	`build_hash_table(aggstate);`
`2641`	`2695`	`aggstate->table_filled= false;`
`2642`		`-/* Compute the columns we actually need to hash on */`
`2643`		`-aggstate->hash_needed=find_hash_columns(aggstate);`
`2644`	`2696`	`}`
`2645`	`2697`	`else`
`2646`	`2698`	`{`
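The column ordering that the new `find_hash_columns()` establishes (grouping columns first, then the remaining needed columns, with the largest input column index tracked for `slot_getsomeattrs`) can be sketched outside the executor as follows. This is a simplified illustration, not the committed code: `DemoHashCols` and `demo_find_hash_columns` are hypothetical names, a plain `bool` array stands in for the `Bitmapset`, and the grouping columns are assumed to be distinct.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_COLS 32

/* Hypothetical counterpart of the new AggState fields. */
typedef struct DemoHashCols
{
    int num_cols;                  /* cf. numhashGrpCols */
    int largest_idx;               /* cf. largestGrpColIdx */
    int idx_input[DEMO_MAX_COLS];  /* cf. hashGrpColIdxInput, 1-based */
    int idx_hash[DEMO_MAX_COLS];   /* cf. hashGrpColIdxHash, 1-based */
} DemoHashCols;

/*
 * grp_cols: 1-based input columns that are grouped on (assumed distinct).
 * needed:   needed[c] is true if 0-based input column c is referenced
 *           outside of aggregates (stand-in for find_unaggregated_cols).
 */
void
demo_find_hash_columns(DemoHashCols *out,
                       const int *grp_cols, int ngrp,
                       const bool *needed, int natts)
{
    bool remaining[DEMO_MAX_COLS];
    int  i;

    assert(natts <= DEMO_MAX_COLS && ngrp <= natts);

    for (i = 0; i < natts; i++)
        remaining[i] = needed[i];

    out->num_cols = 0;
    out->largest_idx = 0;

    /* Grouping columns come first: they are the ones hashed and compared. */
    for (i = 0; i < ngrp; i++)
    {
        assert(grp_cols[i] >= 1 && grp_cols[i] <= natts);
        out->idx_input[i] = grp_cols[i];
        out->idx_hash[i] = i + 1;
        out->num_cols++;
        remaining[grp_cols[i] - 1] = false;   /* already mapped */
    }

    /* Then append the remaining needed columns in ascending order. */
    for (i = 0; i < natts; i++)
    {
        if (remaining[i])
            out->idx_input[out->num_cols++] = i + 1;
    }

    /* Track the largest input column that must be deformed. */
    for (i = 0; i < out->num_cols; i++)
    {
        if (out->idx_input[i] > out->largest_idx)
            out->largest_idx = out->idx_input[i];
    }
}
```

For grouping columns `{3, 1}` and additional needed columns 1 and 4, the resulting input map is `{3, 1, 4}`, the hash-side indices for the grouping columns are `{1, 2}`, and the largest input column is 4.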

`src/include/nodes/execnodes.h`

Lines changed: 4 additions & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -1860,7 +1860,10 @@ typedef struct AggState`
`1860`	`1860`	`/* these fields are used in AGG_HASHED mode: */`
`1861`	`1861`	`TupleHashTable hashtable; /* hash table with one entry per group */`
`1862`	`1862`	`TupleTableSlot *hashslot; /* slot for loading hash table */`
`1863`		`-List *hash_needed; /* list of columns needed in hash table */`
	`1863`	`+int numhashGrpCols; /* number of columns in hash table */`
	`1864`	`+int largestGrpColIdx; /* largest column required for hashing */`
	`1865`	`+AttrNumber *hashGrpColIdxInput; /* and their indices in input slot */`
	`1866`	`+AttrNumber *hashGrpColIdxHash; /* indices for execGrouping in hashtbl */`
`1864`	`1867`	`bool table_filled; /* hash table filled yet? */`
`1865`	`1868`	`TupleHashIterator hashiter; /* for iterating through hash table */`
`1866`	`1869`	`/* support for evaluation of agg inputs */`
