Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-117755: Fix mimalloc for huge allocation on s390x#117809

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
vstinner merged 2 commits intopython:mainfromvstinner:mimalloc_s390x
Apr 16, 2024

Conversation

vstinner
Copy link
Member

@vstinnervstinner commentedApr 12, 2024
edited by bedevere-appbot
Loading

Fix mimalloc allocator for huge memory allocation (around 8,589,934,592 GiB) on s390x.

  • Abort allocation early in mimalloc if the number of slices doesn't fit into uint32_t, to prevent a integer overflow (cast 64-bit size_t to uint32_t).
  • Add test_large_alloc() to test_bigaddrspace.
  • Reenable test_maxcontext_exact_arith() of test_decimal on s390x.

@vstinner
Copy link
MemberAuthor

cc@brandtbucher@colesbury@corona10

See#117755 (comment) for the rationale for this fix.

The currenthttps://github.com/microsoft/mimalloc upstream doesn't seem to have the concept of "slices" inmi_page_t anymore. I don't know if it's still affected by issuegh-117755 bug, but this change includes a regression test in test_bigaddrspace so we will catch such bug quicker. I cannot report the issue to mimalloc upstream since the code changes: there are no "slices" anymore (upstream is not affected by this exact bug anymore).

@colesbury
Copy link
Contributor

The currenthttps://github.com/microsoft/mimalloc upstream doesn't seem to have the concept of "slices" in mi_page_t anymore.

They are still there. The branches are a bit confusing.dev-slice is the development branch for mimalloc 2.x.

Copy link
Contributor

@colesburycolesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We should report and fix theuint32_t overflow, but I don't think we should be adding more tests that try to allocate huge amounts of memory. Especially with Linux's OOM killer, we are only setting ourselves up for more problems

@colesbury
Copy link
Contributor

I filed an issue upstream:microsoft/mimalloc#876

corona10 and vstinner reacted with thumbs up emoji

Fix mimalloc allocator for huge memory allocation (around8,589,934,592 GiB) on s390x.* Abort allocation early in mimalloc if the number of slices doesn't  fit into uint32_t, to prevent a integer overflow (cast 64-bit  size_t to uint32_t).* Add test_large_alloc() to test_bigaddrspace (test skipped on 32-bit  platforms).* Reenable test_maxcontext_exact_arith() of test_decimal on s390x.* Reenable test_constructor() tests of test_io on s390x.
@vstinner
Copy link
MemberAuthor

For now, I skipped the 3 test_io tests on s390x impacted by the bug:#117801 (merged). I rebased this PR on top of it to reenable these 3 test_io tests.

@vstinner
Copy link
MemberAuthor

I filed an issue upstream:microsoft/mimalloc#876

I proposed a fix upstream:microsoft/mimalloc#877

colesbury reacted with heart emoji

@vstinner
Copy link
MemberAuthor

@colesbury: Would it be acceptable to have this downstream change for now?

I suppose the mimalloc upstream cares less about Linux s390x than Python (which has Linux s390x CIs: buildbots), and it may take a few weeks to get the fix merged upstream. At least, I already proposed the fix upstream.

Copy link
Contributor

@colesburycolesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The mimalloc changes look good to me.

I'm not thrilled with thetest_large_alloc test. If mimalloc fixes the bug by making the slice count 64-bits, then we'll run into problems. It strikes me as another problematic test that makes too many assumptions and won't provide enough value in terms of catching actual bugs.

I feel the same way about thetest_constructor tests. We are skipping these tests on sanitizers, 32-bit builds, and previously on s390x. That's a sign that they are unreliable tests. We should get rid of them! Unreliable tests are worse than no test.

@vstinner
Copy link
MemberAuthor

vstinner commentedApr 16, 2024
edited
Loading

I'm not thrilled with the test_large_alloc test. If mimalloc fixes the bug by making the slice count 64-bits, then we'll run into problems.

Let me try with uint64_t... :-) In short, the Linux s390x kernel doesn't kill the process allocating an insane amount of memory. I didn't check if the process is killed later with Out Of Memory (OOM) since I tested on a shared machine.

The kernel is fine with a process allocating 822368422110644 kB of data. The strange part is "VmPTE: 26876 kB": I expected way more. I don't know which page size is used for the large allocation.

Command:

gdb -args ./python -m test test_bigaddrspace -u all -v

gdb:

(gdb) py-btTraceback (most recent call first):  File "/root/vstinner/cpython/Lib/test/test_bigaddrspace.py", line 99, in allocate    return b'x' * length  File "/root/vstinner/cpython/Lib/test/test_bigaddrspace.py", line 113, in test_large_alloc    allocate(size)(...)(gdb) where#0  0x000003fffdaa38f0 in __memset_mvcle () from /lib64/libc.so.6#1  0x00000000010ce708 in _PyBytes_Repeat (    dest=0x5bf70010030 'x' <repeats 200 times>...,     len_dest=842105263157894695, src=0x5bf31174330 "x", len_src=1)    at Objects/bytesobject.c:3640#2  0x00000000010c9e3a in bytes_repeat (a=0x5bf31174300, n=842105263157894695)    at Objects/bytesobject.c:1484(...)(gdb) frame 1(gdb) p len_dest$1 = 842105263157894695(gdb) p/x len_dest$2 = 0xbafc24672035e27

Process:

# grep ^Vm /proc/2070842/statusVmPeak:822368422334080 kBVmSize:822368422334080 kBVmLck:       0 kBVmPin:       0 kBVmHWM: 7319628 kBVmRSS: 7086188 kBVmData:822368422110644 kBVmStk:     388 kBVmExe:    6576 kBVmLib:    3624 kBVmPTE:   26876 kBVmSwap: 6529488 kB# free -m              total        used        free      shared  buff/cache   availableMem:           7782        7355         102         183         324         129Swap:          8167        6723        1444

The quantity of Swap is growing slowly... I stopped the process before most Swap was used...

vstinner added a commit to vstinner/cpython that referenced this pull requestApr 16, 2024
Remove unreliable tests on huge memory allocations:* Remove test_maxcontext_exact_arith() of test_decimal.  Stefan Krah, test author, agreed on removing the test:python#114331 (comment)* Remove test_constructor() tests of test_io.  Sam Gross suggests remove them:python#117809 (review)On Linux, depending how overcommit is configured, especially on Linuxs390x, a huge memory allocation (half or more of the full addressspace) can succeed, but then the process will eat the full systemswap and make the system slower and slower until the whole systembecomes unusable.Moreover, these tests had to be skipped when Python is built withsanitizers.
@vstinner
Copy link
MemberAuthor

I'm fine with removing these tests: I created PRgh-117938 for that.

@vstinner
Copy link
MemberAuthor

I modified this PR to restrict it to theObjects/mimalloc/segment.c change and leave tests unchanged.

Copy link
Contributor

@colesburycolesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Great! I've also added the PR to my tracking issue in#113141

vstinner reacted with thumbs up emoji
@vstinnervstinnerenabled auto-merge (squash)April 16, 2024 16:40
vstinner added a commit that referenced this pull requestApr 16, 2024
Remove unreliable tests on huge memory allocations:* Remove test_maxcontext_exact_arith() of test_decimal.  Stefan Krah, test author, agreed on removing the test:#114331 (comment)* Remove test_constructor() tests of test_io.  Sam Gross suggests remove them:#117809 (review)On Linux, depending how overcommit is configured, especially on Linuxs390x, a huge memory allocation (half or more of the full addressspace) can succeed, but then the process will eat the full systemswap and make the system slower and slower until the whole systembecomes unusable.Moreover, these tests had to be skipped when Python is built withsanitizers.
@vstinnervstinner merged commit3fe03cc intopython:mainApr 16, 2024
@vstinnervstinner deleted the mimalloc_s390x branchApril 16, 2024 20:34
diegorusso pushed a commit to diegorusso/cpython that referenced this pull requestApr 17, 2024
Remove unreliable tests on huge memory allocations:* Remove test_maxcontext_exact_arith() of test_decimal.  Stefan Krah, test author, agreed on removing the test:python#114331 (comment)* Remove test_constructor() tests of test_io.  Sam Gross suggests remove them:python#117809 (review)On Linux, depending how overcommit is configured, especially on Linuxs390x, a huge memory allocation (half or more of the full addressspace) can succeed, but then the process will eat the full systemswap and make the system slower and slower until the whole systembecomes unusable.Moreover, these tests had to be skipped when Python is built withsanitizers.
diegorusso pushed a commit to diegorusso/cpython that referenced this pull requestApr 17, 2024
…7809)Fix mimalloc allocator for huge memory allocation (around8,589,934,592 GiB) on s390x.Abort allocation early in mimalloc if the number of slices doesn'tfit into uint32_t, to prevent a integer overflow (cast 64-bitsize_t to uint32_t).
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@colesburycolesburycolesbury approved these changes

@rhettingerrhettingerAwaiting requested review from rhettinger

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

2 participants
@vstinner@colesbury

[8]ページ先頭

©2009-2025 Movatter.jp