
Commit e09d7a1

Improve speed of hash index build.

In the initial data sort, if the bucket numbers are the same then next sort on the hash value. Because index pages are kept in hash value order, this gains a little speed by allowing the eventual tuple insertions to be done sequentially, avoiding repeated data movement within PageAddItem. This seems to be good for overall speedup of 5%-9%, depending on the incoming data.

Simon Riggs, reviewed by Amit Kapila

Discussion: https://postgr.es/m/CANbhV-FG-1ZNMBuwhUF7AxxJz3u5137dYL-o6hchK1V_dMw86g@mail.gmail.com

1 parent 70a437a, commit e09d7a1

2 files changed: 21 additions, 5 deletions

src/backend
- access/hash
  - hashsort.c
- utils/sort
  - tuplesortvariants.c


`src/backend/access/hash/hashsort.c`

Lines changed: 4 additions & 3 deletions

Original file line number	Diff line number	Diff line change
`@@ -42,9 +42,10 @@ struct HSpool`
`42`	`42`	`Relation index;`
`43`	`43`
`44`	`44`	`/*`
`45`		`- * We sort the hash keys based on the buckets they belong to. Below masks`
`46`		`- * are used in _hash_hashkey2bucket to determine the bucket of given hash`
`47`		`- * key.`
	`45`	`+ * We sort the hash keys based on the buckets they belong to, then by the`
	`46`	`+ * hash values themselves, to optimize insertions onto hash pages. The`
	`47`	`+ * masks below are used in _hash_hashkey2bucket to determine the bucket of`
	`48`	`+ * a given hash key.`
`48`	`49`	` */`
`49`	`50`	`uint32 high_mask;`
`50`	`51`	`uint32 low_mask;`

`src/backend/utils/sort/tuplesortvariants.c`

Lines changed: 17 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -1387,14 +1387,17 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,`
`1387`	`1387`	`{`
`1388`	`1388`	`Bucket bucket1;`
`1389`	`1389`	`Bucket bucket2;`
	`1390`	`+ uint32 hash1;`
	`1391`	`+ uint32 hash2;`
`1390`	`1392`	`IndexTuple tuple1;`
`1391`	`1393`	`IndexTuple tuple2;`
`1392`	`1394`	`TuplesortPublic *base = TuplesortstateGetPublic(state);`
`1393`	`1395`	`TuplesortIndexHashArg *arg = (TuplesortIndexHashArg *) base->arg;`
`1394`	`1396`
`1395`	`1397`	`/*`
`1396`		`- * Fetch hash keys and mask off bits we don't want to sort by. We know`
`1397`		`- * that the first column of the index tuple is the hash key.`
	`1398`	`+ * Fetch hash keys and mask off bits we don't want to sort by, so that the`
	`1399`	`+ * initial sort is just on the bucket number. We know that the first`
	`1400`	`+ * column of the index tuple is the hash key.`
`1398`	`1401`	` */`
`1399`	`1402`	`Assert(!a->isnull1);`
`1400`	`1403`	`bucket1 = _hash_hashkey2bucket(DatumGetUInt32(a->datum1),`
`@@ -1409,6 +1412,18 @@ comparetup_index_hash(const SortTuple *a, const SortTuple *b,`
`1409`	`1412`	`else if (bucket1 < bucket2)`
`1410`	`1413`	`return -1;`
`1411`	`1414`
	`1415`	`+/*`
	`1416`	`+ * If bucket values are equal, sort by hash values. This allows us to`
	`1417`	`+ * insert directly onto bucket/overflow pages, where the index tuples are`
	`1418`	`+ * stored in hash order to allow fast binary search within each page.`
	`1419`	`+ */`
	`1420`	`+ hash1 = DatumGetUInt32(a->datum1);`
	`1421`	`+ hash2 = DatumGetUInt32(b->datum1);`
	`1422`	`+ if (hash1 > hash2)`
	`1423`	`+ return 1;`
	`1424`	`+ else if (hash1 < hash2)`
	`1425`	`+ return -1;`
	`1426`	`+`
	`1426`	`+`
`1412`	`1427`	`/*`
`1413`	`1428`	` * If hash values are equal, we sort on ItemPointer. This does not affect`
`1414`	`1429`	` * validity of the finished index, but it may be useful to have index`
