Resolve issues #5 and #1: reduce number of collisions in the ptrack map #6


Merged
ololobus merged 7 commits into master from double_slot
May 16, 2021

Conversation

@ololobus (Contributor) commented Apr 22, 2021 (edited)

Resolve #5
Resolve #1

…slots.

Previously we thought that 1 MB can track changes page-to-page in 1 GB of data files. However, recently it became evident that our ptrack map, or basic hash table, behaves more like a Bloom filter with a number of hash functions k = 1. See more here: https://en.wikipedia.org/wiki/Bloom_filter#Probability_of_false_positives. Such a filter naturally has more collisions.

By storing the update_lsn of each block in an additional slot we perform as a Bloom filter with k = 2, which significantly reduces the collision rate.
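The k = 1 versus k = 2 comparison can be checked with the standard Bloom filter estimate: with m slots, n tracked pages and k hash functions, an unchanged page collides with probability (1 - (1 - 1/m)^(k*n))^k. The sketch below only evaluates that formula. The function names and the sample sizes in the note after it are illustrative assumptions, not values taken from ptrack.

```c
#include <assert.h>
#include <stdint.h>

/* base^exp by repeated squaring; keeps the sketch free of libm. */
static double
pow_u64(double base, uint64_t exp)
{
	double		result = 1.0;

	while (exp > 0)
	{
		if (exp & 1)
			result *= base;
		base *= base;
		exp >>= 1;
	}
	return result;
}

/*
 * Estimated false-positive rate of a Bloom filter with m slots after n
 * insertions using k hash functions: (1 - (1 - 1/m)^(k*n))^k.
 */
static double
bloom_false_positive(uint64_t m, uint64_t n, unsigned k)
{
	double		still_empty;

	assert(m > 0 && k > 0);

	/* Probability that one given slot was never written. */
	still_empty = pow_u64(1.0 - 1.0 / (double) m, (uint64_t) k * n);

	/* A false positive needs all k probed slots to be written. */
	return pow_u64(1.0 - still_empty, k);
}
```

For an assumed map of 131072 slots with 1311 changed pages (about 1% load), this gives roughly 1.0% for k = 1 and roughly 0.04% for k = 2. The gain depends on the map being sparsely filled: at n = m the same formula gives about 63% for k = 1 and about 75% for k = 2.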
@ololobus force-pushed the double_slot branch 3 times, most recently from 4c32e9e to 38aa439 (April 22, 2021 21:33)
@ololobus changed the title from "Resolve issue #5: reduce number of collisions in the ptrack map" to "Resolve issue #5 and #1: reduce number of collisions in the ptrack map" (Apr 22, 2021)
@ololobus changed the title from "Resolve issue #5 and #1: reduce number of collisions in the ptrack map" to "Resolve issues #5 and #1: reduce number of collisions in the ptrack map" (Apr 22, 2021)
ptrack.c Outdated
Comment on lines 537 to 538
update_lsn1 = pg_atomic_read_u64(&ptrack_map->entries[slot1]);
update_lsn2 = pg_atomic_read_u64(&ptrack_map->entries[slot2]);
Contributor

It is better to fetch and check slot1 first, and to fetch and check slot2 only if that check passes.
This way you save the TLB and cache misses on slot2 for most pages.
Note that the compiler cannot optimize or reorder atomic instructions.
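The suggested ordering can be sketched as follows. This is a standalone illustration, not the ptrack code: C11 `atomic_load` stands in for `pg_atomic_read_u64`, the function name `block_maybe_changed` is hypothetical, and the `>= start_lsn` comparison is an assumption about what "check passed" means.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: a block counts as changed only if both of its slots
 * hold an LSN at or after start_lsn.  slot2 is loaded only when slot1 has
 * already passed, so a block rejected by slot1 costs a single memory access.
 */
static bool
block_maybe_changed(_Atomic uint64_t *entries, size_t slot1, size_t slot2,
					uint64_t start_lsn)
{
	uint64_t	update_lsn1;
	uint64_t	update_lsn2;

	assert(entries != NULL);

	update_lsn1 = atomic_load(&entries[slot1]);
	if (update_lsn1 < start_lsn)
		return false;			/* slot2 is never read on this path */

	update_lsn2 = atomic_load(&entries[slot2]);
	return update_lsn2 >= start_lsn;
}
```

Reading both slots up front, as in the two lines under review, always pays for the second access even when the first slot already rules the block out.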

Contributor, Author

OK, I hope that I did it.

FROM
(SELECT count(path) AS changed_files,
sum(
length(replace(right((pagemap)::text, -1)::varbit::text, '0', ''))
Contributor

If a table is 8 TB, this line will require allocating 1 GB of memory for the ::varbit::text conversion.
Accordingly, a 16 TB table would need 2 GB of memory, and Postgres itself simply will not allow that.

Contributor

It is very sad that varbit has no countbits function.

ololobus reacted with thumbs up emoji
@funny-falcon (Contributor) commented May 13, 2021 (edited)

In any case, for ptrack_get_change_stat and ptrack_get_change_file_stat it seems we need to create ptrack_get_pagecount (or some other name).
Or even simply implement ptrack_get_change_file_stat entirely in C.
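Counting set bits directly over the bitmap bytes would avoid the text conversion altogether. A minimal sketch of such a counter is below. The name `pagemap_count_bits` is hypothetical, and the bitmap is assumed to be a plain byte array with one bit per page; this is not the function that was eventually added to ptrack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in a byte-array bitmap (one bit per page). */
static uint64_t
pagemap_count_bits(const unsigned char *bitmap, size_t len)
{
	uint64_t	count = 0;
	size_t		i;

	assert(bitmap != NULL || len == 0);

	for (i = 0; i < len; i++)
	{
		unsigned char byte = bitmap[i];

		/* Kernighan's trick: each iteration clears the lowest set bit. */
		while (byte != 0)
		{
			byte &= (unsigned char) (byte - 1);
			count++;
		}
	}
	return count;
}
```

The loop needs no memory beyond the bitmap itself, whereas the SQL expression materializes one character per bit before counting.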

Contributor, Author

But tables are split into 1 GB segments by default, and ptrack_get_pagemapset() returns bitmaps per file/segment to begin with, which means each conversion will need, at most, 1000 times less memory. Isn't that so?
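The sizes in this exchange follow from simple arithmetic, since ::varbit::text yields one character per page bit. The sketch below assumes PostgreSQL's default 8 KB block size; `varbit_text_bytes` is a hypothetical helper used only to make the numbers concrete.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE UINT64_C(8192)	/* assumed default block size */

/*
 * Bytes of text needed to render a one-bit-per-page map as '0'/'1'
 * characters: one page -> one bit -> one character.
 */
static uint64_t
varbit_text_bytes(uint64_t data_bytes)
{
	assert(data_bytes % BLOCK_SIZE == 0);
	return data_bytes / BLOCK_SIZE;
}
```

Under that assumption an 8 TB relation yields 2^30 characters (1 GB), a 16 TB relation yields 2 GB, and a single 1 GB segment yields 131072 characters (128 KB).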

Contributor

Ah, OK. I haven't looked at ptrack_get_pagemapset() yet.

Contributor

Listen, I would still change ptrack_get_pagemapset by adding a count field to its output.
pg_probackup will not break, because it names the fields it wants.

Contributor, Author

Done.


/* Delete and try again */
durable_unlink(ptrack_path, LOG);
is_new_map = true;
Contributor

I can't find where the unmap is done in this case.
Meanwhile, right after the ptrack_map_reinit label, durable_unlink(ptrack_mmap_path) is called.
As a result, the file is left hanging invisibly in the file system, and its mmap is left hanging in the process's address space.

It probably makes sense to call ptrackCleanFilesAndMap here?
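The leak described here is a general POSIX behaviour: unlinking a file that is still mapped removes only its name, while the mapping and the underlying storage stay alive until munmap. A minimal sketch of the safe order is below. Both helper names are hypothetical, and this is plain POSIX rather than what ptrackCleanFilesAndMap actually does.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or open) path, size it to len bytes and map it read-write. */
static void *
create_and_map(const char *path, size_t len)
{
	void	   *addr;
	int			fd = open(path, O_RDWR | O_CREAT, 0600);

	if (fd < 0)
		return NULL;
	if (ftruncate(fd, (off_t) len) != 0)
	{
		close(fd);
		return NULL;
	}
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);					/* the mapping survives closing the fd */
	return addr == MAP_FAILED ? NULL : addr;
}

/*
 * Release the mapping before removing the file, so that neither the
 * address range nor the unlinked file contents are left behind.
 * Returns 0 on success, -1 on failure.
 */
static int
unmap_and_unlink(const char *path, void *addr, size_t len)
{
	assert(path != NULL);

	if (addr != NULL && munmap(addr, len) != 0)
		return -1;
	return unlink(path);
}
```

Calling unlink alone would return success as well, which is why the missing unmap is easy to overlook in review.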

Contributor, Author

Yes, it looks that way. I had doubts about this spot, but then I forgot about it and never got to the bottom of it.

@ololobus (Contributor, Author)

Everything seems to be working, so I'm merging this one. If internal QA finds anything, we will fix it in master or with another PR.

@ololobus merged commit 708c8e2 into master on May 16, 2021
Reviewers

@funny-falcon left review comments

Development

Successfully merging this pull request may close these issues.

Reduce number of collisions in the ptrack map
Human-readable changeset
2 participants
@ololobus, @funny-falcon
