Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitc9a1cc6

Browse files
committed
Change hash index creation so that rather than always establishing exactly
two buckets at the start, we create a number of buckets appropriate for theestimated size of the table. This avoids a lot of expensive bucket-splitactions during initial index build on an already-populated table.This is one of the two core ideas of Tom Raney and Shreya Bhargava's patchto reduce hash index build time. I'm committing it separately to make iteasier for people to test the effects of this separately from the effectsof their other core idea (pre-sorting the index entries by bucket number).
1 parent4873c96 commitc9a1cc6

File tree

6 files changed

+70
-29
lines changed

6 files changed

+70
-29
lines changed

‎src/backend/access/hash/README

Lines changed: 9 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
$PostgreSQL: pgsql/src/backend/access/hash/README,v 1.6 2007/04/19 20:24:04 tgl Exp $
1+
$PostgreSQL: pgsql/src/backend/access/hash/README,v 1.7 2008/03/15 20:46:31 tgl Exp $
22

33
This directory contains an implementation of hash indexing for Postgres. Most
44
of the core ideas are taken from Margo Seltzer and Ozan Yigit, A New Hashing
@@ -65,6 +65,11 @@ hashm_spares[N] <= hashm_spares[N+1], since the latter count includes the
6565
former. The difference between the two represents the number of overflow
6666
pages appearing between the bucket page groups of splitpoints N and N+1.
6767

68+
(Note: the above describes what happens when filling an initially minimally
69+
sized hash index. In practice, we try to estimate the required index size
70+
and allocate a suitable number of splitpoints immediately, to avoid
71+
expensive re-splitting during initial index build.)
72+
6873
When S splitpoints exist altogether, the array entries hashm_spares[0]
6974
through hashm_spares[S] are valid; hashm_spares[S] records the current
7075
total number of overflow pages. New overflow pages are created as needed
@@ -101,9 +106,9 @@ includes the bitmap pages, which is the reason for saying that bitmap
101106
pages are a subset of the overflow pages. It turns out in fact that each
102107
bitmap page's first bit represents itself --- this is not an essential
103108
property, but falls out of the fact that we only allocate another bitmap
104-
page when we really need one. Bit number zero always corresponds toblock
105-
number 3, which is thefirst bitmap page and is allocated during index
106-
creation.
109+
page when we really need one. Bit number zero always corresponds tothe
110+
first bitmap page, which is allocated during index creation just after all
111+
the initially created buckets.
107112

108113

109114
Lock definitions

‎src/backend/access/hash/hash.c

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/hash/hash.c,v 1.98 2008/01/01 19:45:46 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/hash/hash.c,v 1.99 2008/03/15 20:46:31 tgl Exp $
1212
*
1313
* NOTES
1414
* This file contains only the public interface routines.
@@ -22,6 +22,7 @@
2222
#include"access/hash.h"
2323
#include"catalog/index.h"
2424
#include"commands/vacuum.h"
25+
#include"optimizer/plancat.h"
2526

2627

2728
/* Working state for hashbuild and its callback */
@@ -48,6 +49,7 @@ hashbuild(PG_FUNCTION_ARGS)
4849
Relationindex= (Relation)PG_GETARG_POINTER(1);
4950
IndexInfo*indexInfo= (IndexInfo*)PG_GETARG_POINTER(2);
5051
IndexBuildResult*result;
52+
BlockNumberrelpages;
5153
doublereltuples;
5254
HashBuildStatebuildstate;
5355

@@ -59,8 +61,11 @@ hashbuild(PG_FUNCTION_ARGS)
5961
elog(ERROR,"index \"%s\" already contains data",
6062
RelationGetRelationName(index));
6163

62-
/* initialize the hash index metadata page */
63-
_hash_metapinit(index);
64+
/* estimate the number of rows currently present in the table */
65+
estimate_rel_size(heap,NULL,&relpages,&reltuples);
66+
67+
/* initialize the hash index metadata page and initial buckets */
68+
_hash_metapinit(index,reltuples);
6469

6570
/* build the index */
6671
buildstate.indtuples=0;

‎src/backend/access/hash/hashpage.c

Lines changed: 44 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/access/hash/hashpage.c,v 1.72 2008/01/01 19:45:46 momjian Exp $
11+
* $PostgreSQL: pgsql/src/backend/access/hash/hashpage.c,v 1.73 2008/03/15 20:46:31 tgl Exp $
1212
*
1313
* NOTES
1414
* Postgres hash pages look like ordinary relation pages. The opaque
@@ -312,15 +312,17 @@ _hash_chgbufaccess(Relation rel,
312312

313313
/*
314314
*_hash_metapinit() -- Initialize the metadata page of a hash index,
315-
*the two buckets that we begin with and the initial
316-
*bitmap page.
315+
*the initial buckets, and the initial bitmap page.
316+
*
317+
* The initial number of buckets is dependent on num_tuples, an estimate
318+
* of the number of tuples to be loaded into the index initially.
317319
*
318320
* We are fairly cavalier about locking here, since we know that no one else
319321
* could be accessing this index. In particular the rule about not holding
320322
* multiple buffer locks is ignored.
321323
*/
322324
void
323-
_hash_metapinit(Relationrel)
325+
_hash_metapinit(Relationrel,doublenum_tuples)
324326
{
325327
HashMetaPagemetap;
326328
HashPageOpaquepageopaque;
@@ -330,7 +332,10 @@ _hash_metapinit(Relation rel)
330332
int32data_width;
331333
int32item_width;
332334
int32ffactor;
333-
uint16i;
335+
doublednumbuckets;
336+
uint32num_buckets;
337+
uint32log2_num_buckets;
338+
uint32i;
334339

335340
/* safety check */
336341
if (RelationGetNumberOfBlocks(rel)!=0)
@@ -354,7 +359,26 @@ _hash_metapinit(Relation rel)
354359
ffactor=10;
355360

356361
/*
357-
* We initialize the metapage, the first two bucket pages, and the first
362+
* Choose the number of initial bucket pages to match the fill factor
363+
* given the estimated number of tuples. We round up the result to the
364+
* next power of 2, however, and always force at least 2 bucket pages.
365+
* The upper limit is determined by considerations explained in
366+
* _hash_expandtable().
367+
*/
368+
dnumbuckets=num_tuples /ffactor;
369+
if (dnumbuckets <=2.0)
370+
num_buckets=2;
371+
elseif (dnumbuckets >= (double)0x40000000)
372+
num_buckets=0x40000000;
373+
else
374+
num_buckets= ((uint32)1) <<_hash_log2((uint32)dnumbuckets);
375+
376+
log2_num_buckets=_hash_log2(num_buckets);
377+
Assert(num_buckets== (((uint32)1) <<log2_num_buckets));
378+
Assert(log2_num_buckets<HASH_MAX_SPLITPOINTS);
379+
380+
/*
381+
* We initialize the metapage, the first N bucket pages, and the first
358382
* bitmap page in sequence, using _hash_getnewbuf to cause smgrextend()
359383
* calls to occur.This ensures that the smgr level has the right idea of
360384
* the physical index length.
@@ -398,23 +422,25 @@ _hash_metapinit(Relation rel)
398422
metap->hashm_procid=index_getprocid(rel,1,HASHPROC);
399423

400424
/*
401-
* We initialize the index with two buckets, 0 and 1, occupying physical
402-
* blocks 1 and 2.The first freespace bitmap page is in block 3.
425+
* We initialize the index with N buckets, 0 .. N-1, occupying physical
426+
* blocks 1 to N. The first freespace bitmap page is in block N+1.
427+
* Since N is a power of 2, we can set the masks this way:
403428
*/
404-
metap->hashm_maxbucket=metap->hashm_lowmask=1;/* nbuckets- 1 */
405-
metap->hashm_highmask=3;/* (nbuckets << 1) - 1 */
429+
metap->hashm_maxbucket=metap->hashm_lowmask=num_buckets-1;
430+
metap->hashm_highmask=(num_buckets <<1)-1;
406431

407432
MemSet(metap->hashm_spares,0,sizeof(metap->hashm_spares));
408433
MemSet(metap->hashm_mapp,0,sizeof(metap->hashm_mapp));
409434

410-
metap->hashm_spares[1]=1;/* the first bitmap page is only spare */
411-
metap->hashm_ovflpoint=1;
435+
/* Set up mapping for one spare page after the initial splitpoints */
436+
metap->hashm_spares[log2_num_buckets]=1;
437+
metap->hashm_ovflpoint=log2_num_buckets;
412438
metap->hashm_firstfree=0;
413439

414440
/*
415-
* Initialize the firsttwo buckets
441+
* Initialize the firstN buckets
416442
*/
417-
for (i=0;i <=1;i++)
443+
for (i=0;i<num_buckets;i++)
418444
{
419445
buf=_hash_getnewbuf(rel,BUCKET_TO_BLKNO(metap,i));
420446
pg=BufferGetPage(buf);
@@ -430,7 +456,7 @@ _hash_metapinit(Relation rel)
430456
/*
431457
* Initialize first bitmap page
432458
*/
433-
_hash_initbitmap(rel,metap,3);
459+
_hash_initbitmap(rel,metap,num_buckets+1);
434460

435461
/* all done */
436462
_hash_wrtbuf(rel,metabuf);
@@ -511,6 +537,9 @@ _hash_expandtable(Relation rel, Buffer metabuf)
511537
* index with 2^32 buckets would certainly overflow BlockNumber and hence
512538
* _hash_alloc_buckets() would fail, but if we supported buckets smaller
513539
* than a disk block then this would be an independent constraint.
540+
*
541+
* If you change this, see also the maximum initial number of buckets
542+
* in _hash_metapinit().
514543
*/
515544
if (metap->hashm_maxbucket >= (uint32)0x7FFFFFFE)
516545
gotofail;

‎src/backend/optimizer/util/plancat.c

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@
99
*
1010
*
1111
* IDENTIFICATION
12-
* $PostgreSQL: pgsql/src/backend/optimizer/util/plancat.c,v 1.140 2008/01/12 00:11:39 tgl Exp $
12+
* $PostgreSQL: pgsql/src/backend/optimizer/util/plancat.c,v 1.141 2008/03/15 20:46:31 tgl Exp $
1313
*
1414
*-------------------------------------------------------------------------
1515
*/
@@ -45,8 +45,6 @@ boolconstraint_exclusion = false;
4545
get_relation_info_hook_typeget_relation_info_hook=NULL;
4646

4747

48-
staticvoidestimate_rel_size(Relationrel,int32*attr_widths,
49-
BlockNumber*pages,double*tuples);
5048
staticList*get_relation_constraints(OidrelationObjectId,RelOptInfo*rel,
5149
boolinclude_notnull);
5250

@@ -319,7 +317,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
319317
* relation's attr_width[] cache; we fill this in if we have need to compute
320318
* the attribute widths for estimation purposes.
321319
*/
322-
staticvoid
320+
void
323321
estimate_rel_size(Relationrel,int32*attr_widths,
324322
BlockNumber*pages,double*tuples)
325323
{

‎src/include/access/hash.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
* Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
10-
* $PostgreSQL: pgsql/src/include/access/hash.h,v 1.84 2008/01/01 19:45:56 momjian Exp $
10+
* $PostgreSQL: pgsql/src/include/access/hash.h,v 1.85 2008/03/15 20:46:31 tgl Exp $
1111
*
1212
* NOTES
1313
*modeled after Margo Seltzer's hash implementation for unix.
@@ -298,7 +298,7 @@ extern void _hash_dropbuf(Relation rel, Buffer buf);
298298
externvoid_hash_wrtbuf(Relationrel,Bufferbuf);
299299
externvoid_hash_chgbufaccess(Relationrel,Bufferbuf,intfrom_access,
300300
intto_access);
301-
externvoid_hash_metapinit(Relationrel);
301+
externvoid_hash_metapinit(Relationrel,doublenum_tuples);
302302
externvoid_hash_pageinit(Pagepage,Sizesize);
303303
externvoid_hash_expandtable(Relationrel,Buffermetabuf);
304304

‎src/include/optimizer/plancat.h

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,14 +7,15 @@
77
* Portions Copyright (c) 1996-2008, PostgreSQL Global Development Group
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
10-
* $PostgreSQL: pgsql/src/include/optimizer/plancat.h,v 1.47 2008/01/01 19:45:58 momjian Exp $
10+
* $PostgreSQL: pgsql/src/include/optimizer/plancat.h,v 1.48 2008/03/15 20:46:31 tgl Exp $
1111
*
1212
*-------------------------------------------------------------------------
1313
*/
1414
#ifndefPLANCAT_H
1515
#definePLANCAT_H
1616

1717
#include"nodes/relation.h"
18+
#include"utils/rel.h"
1819

1920
/* Hook for plugins to get control in get_relation_info() */
2021
typedefvoid (*get_relation_info_hook_type) (PlannerInfo*root,
@@ -27,6 +28,9 @@ extern PGDLLIMPORT get_relation_info_hook_type get_relation_info_hook;
2728
externvoidget_relation_info(PlannerInfo*root,OidrelationObjectId,
2829
boolinhparent,RelOptInfo*rel);
2930

31+
externvoidestimate_rel_size(Relationrel,int32*attr_widths,
32+
BlockNumber*pages,double*tuples);
33+
3034
externboolrelation_excluded_by_constraints(RelOptInfo*rel,
3135
RangeTblEntry*rte);
3236

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp