Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Reduce number of collisions in the ptrack map #5

Closed
Assignees
ololobus
Labels
enhancementNew feature or request
@ololobus

Description

@ololobus

Previously we thought that 1 MB can track changes page-to-page in the 1 GB of data files. However, recently it became evident that our ptrack map or basic hash table behaves more like a Bloom filter with a number of hash functionsk = 1. See more here:https://en.wikipedia.org/wiki/Bloom_filter#Probability_of_false_positives. Such filter has naturally more collisions.

For example, for database of size 22 GB and ptrack.map_size = 32 MB previously we thought that we will get almost zero collisions. However, theory says that we should track an extra 640 MB by tracking a 1 GB of data:

importmathdefcol_proba(n,m,k=1):return (1-math.exp(-k*n/m))**kmap_size=1024*1024# 1 MB in bytesmap_size=map_size*32# ptrack.map_size in bytesmap_pages=map_size//8# 64-bit uint or 8 bytes per slotchange_size=1000*1024# size of changes in data files in KBchange_pages=change_size//8# number of changed pages in DBdb_pages=22*1024*1024//8# 22 GB in pagesproba=col_proba(change_pages,map_pages,1)print("Collision probability: ",proba)print("Extra data (MB): ",proba*(db_pages-change_pages)*8/1024)Collisionprobability:0.030056617868655988Extradata (MB):647.0588694764261

Unfortunately this seems to be true and experiment confirms that we mark 1.6 GB of pages instead of 1 GB. Same experiment with 16 MB of map, 24 GB database and 1 GB insert: theory says there should be 1375 MB of extra blocks, in reality it appears to be ~1275 MB, which is very close.

Theoretically we can substantially reduce collisions rate by storing each page LSN into two cells (i.e.k = 2):

Collision probability:  0.003505804615154038Extra data (MB):  75.47296175503612

Let's try to do that.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions


    [8]ページ先頭

    ©2009-2025 Movatter.jp