
This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.
Created on2019-04-12 10:16 byvstinner, last changed2022-04-11 14:59 byadmin. This issue is nowclosed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 12795 | closed | vstinner,2019-04-12 10:36 | |
| PR 12796 | merged | vstinner,2019-04-12 10:39 | |
| PR 12797 | merged | vstinner,2019-04-12 10:42 | |
| PR 12801 | merged | miss-islington,2019-04-12 13:15 | |
| Messages (10) | |||
|---|---|---|---|
| msg340019 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 10:16 | |
When PYTHONMALLOC=debug environment variable or -X dev command line option is used, Python installs debug hooks on memory allocators which add 2 size_t before and 2 size_t after each memory block: it adds 32 bytes to every memory allocation.I'm debugging crashes and memory leaks in CPython for 10 years, and I simply never had to use "serialno". So I simply propose attached pull request to remove it to reduce the memory footprint: I measured a reduction around -5% (ex: 1.2 MiB on 33.0 MiB when running test_asyncio). A smaller memory footprint allows to use this feature on devices with small memory, like embedded devices.The change also fix race condition in debug memory allocators:bpo-31473, "Debug hooks on memory allocators are not thread safe (serialno variable)".Using tracemalloc, it is already possible (since Python 3.6) to find where a memory block has been allocated, and so decide where to put a breakpoint when debugging.If someone cares about the "serialno" field, maybe we can keep code using a compilation flag, like a C #define."serialno" is documented as: "an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out." But again, I never used it...--Some examples of the *peak* memory usage without => with the change:* -c pass: 2321.8 kB => 2437.1 kB (-115.3 kiB, -5%)* -m test test_os test_sys: 14252.3 kB => 13598.6 kB (-653.7 kiB, -5%)* -m test test_asyncio: 34194.2 kB => 32963.1 kB (-1231.1 kiB, -4%)Command used to measure the memory consumption:$ ./python -i -X tracemalloc -c pass>>> import tracemalloc; print("%.1f kB" % (tracemalloc.get_traced_memory()[1] / 1024.))With the patch:diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.cindexc5d5671032..e010c2ef84 100644--- a/Modules/_tracemalloc.c+++ b/Modules/_tracemalloc.c@@ -582,6 +582,8 @@ tracemalloc_add_trace(unsigned int domain, uintptr_t ptr, _Py_hashtable_entry_t* entry; int res; + size += 4 * sizeof(size_t);+ assert(_Py_tracemalloc_config.tracing); traceback = traceback_new();Replace 4 with 3 to measure memory used with the change.--Since Python 3.6, when the debug memory allocator detects a bug (ex: buffer overflow), it now also displays the Python traceback where the memory block has been allocated if tracemalloc is tracing Python memory allocations.Example with buffer_overflow.py:---import _testcapidef func(): _testcapi.pymem_buffer_overflow()def main(): func()if __name__ == "__main__": main()---Output:---$ ./python -X tracemalloc=10 -X dev bug.pyDebug memory block at address p=0x7f45e85c3270: API 'm' 16 bytes originally requested The 7 pad bytes at p-7 are FORBIDDENBYTE, as expected. The 8 pad bytes at tail=0x7f45e85c3280 are not all FORBIDDENBYTE (0xfd): at tail+0: 0x78 *** OUCH at tail+1: 0xfd at tail+2: 0xfd at tail+3: 0xfd at tail+4: 0xfd at tail+5: 0xfd at tail+6: 0xfd at tail+7: 0xfd Data at p: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cdMemory block allocated at (most recent call first): File "bug.py", line 4 File "bug.py", line 7 File "bug.py", line 10Fatal Python error: bad trailing pad byteCurrent thread 0x00007f45f5660740 (most recent call first): File "bug.py", line 4 in func File "bug.py", line 7 in main File "bug.py", line 10 in <module>Aborted (core dumped)---The interesting part is "Memory block allocated at (most recent call first):". Traceback reconstructed manually:---Memory block allocated at (most recent call first): File "bug.py", line 4 _testcapi.pymem_buffer_overflow() File "bug.py", line 7 func() File "bug.py", line 10 main()---You can see exactly where the memory block has been allocated.Note: Internally, the _PyTraceMalloc_GetTraceback() function is used to get the traceback where a memory block has been allocated.--Extract of _PyMem_DebugRawAlloc() inObjects/obmalloc.c:/* Let S = sizeof(size_t). The debug malloc asks for 4*S extra bytes and fills them with useful stuff, here calling the underlying malloc's result p:p[0: S] Number of bytes originally asked for. This is a size_t, big-endian (easier to read in a memory dump).p[S] API ID. SeePEP 445. This is a character, but seems undocumented.p[S+1: 2*S] Copies of FORBIDDENBYTE. Used to catch under- writes and reads.p[2*S: 2*S+n] The requested memory, filled with copies of CLEANBYTE. Used to catch reference to uninitialized memory. &p[2*S] is returned. Note that this is 8-byte aligned if pymalloc handled the request itself.p[2*S+n: 2*S+n+S] Copies of FORBIDDENBYTE. Used to catch over- writes and reads.p[2*S+n+S: 2*S+n+2*S] A serial number, incremented by 1 on each call to _PyMem_DebugMalloc and _PyMem_DebugRealloc. This is a big-endian size_t. If "bad memory" is detected later, the serial number gives an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out.*//* Layout: [SSSS IFFF CCCC...CCCC FFFF NNNN] * ^--- p ^--- data ^--- tail S: nbytes stored as size_t I: API identifier (1 byte) F: Forbidden bytes (size_t - 1 bytes before, size_t bytes after) C: Clean bytes used later to store actual data N: Serial number stored as size_t */The last size_t written at the end of each memory block is "serialno". It is documented as: "an excellent way to set a breakpoint on the next run, to capture the instant at which this block was passed out." | |||
| msg340022 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 10:40 | |
I wrote 2 pull requests for the two options:*PR 12795 removes serialno field*PR 12796 adds PYMEM_DEBUG_SERIALNO define which is unset by default | |||
| msg340023 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 10:43 | |
While testing my changes, I found a bug in test_sys:./python -X tracemalloc -m test test_sys -v -m test_getallocatedblocks======================================================================ERROR: test_getallocatedblocks (test.test_sys.SysModuleTest)----------------------------------------------------------------------Traceback (most recent call last): File "/home/vstinner/prog/python/master/Lib/test/test_sys.py", line 770, in test_getallocatedblocks alloc_name = _testcapi.pymem_getallocatorsname()RuntimeError: cannot get allocators nameAttachedPR 12797 fix it. | |||
| msg340025 -(view) | Author: Inada Naoki (methane)*![]() | Date: 2019-04-12 10:52 | |
I never used the serialno too. | |||
| msg340026 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 10:59 | |
This issue is related to the following thread on python-dev which discuss disabling Py_TRACE_REFS by default,bpo-36465, to reduce the memory footprint in debug mode:https://mail.python.org/pipermail/python-dev/2019-April/157015.html | |||
| msg340027 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 11:01 | |
The serialno has been added at the same time than the whole debug hooks on Python memory allocated by Tim Peters in 2002, 17 years ago:commitddea208be9e2a8fa281e25ebbc890378dd2aa286Author: Tim Peters <tim.peters@gmail.com>Date: Sat Mar 23 10:03:50 2002 +0000 Give Python a debug-mode pymalloc, much as sketched on Python-Dev. When WITH_PYMALLOC is defined, define PYMALLOC_DEBUG to enable the debug allocator. This can be done independent of build type (release or debug). A debug build automatically defines PYMALLOC_DEBUG when pymalloc is enabled. It's a detected error to define PYMALLOC_DEBUG when pymalloc isn't enabled. Two debugging entry points defined only under PYMALLOC_DEBUG: + _PyMalloc_DebugCheckAddress(const void *p) can be used (e.g., from gdb) to sanity-check a memory block obtained from pymalloc. It sprays info to stderr (see next) and dies via Py_FatalError if the block is detectably damaged. + _PyMalloc_DebugDumpAddress(const void *p) can be used to spray info about a debug memory block to stderr. A tiny start at implementing "API family" checks isn't good for anything yet. _PyMalloc_DebugRealloc() has been optimized to do little when the new size is <= old size. However, if the new size is larger, it really can't call the underlying realloc() routine without either violating its contract, or knowing something non-trivial about how the underlying realloc() works. A memcpy is always done in this case. This was a disaster for (and only) one of the std tests: test_bufio creates single text file lines up to a million characters long. On Windows, fileobject.c's get_line() uses the horridly funky getline_via_fgets(), which keeps growing and growing a string object hoping to find a newline. It grew the string object 1000 bytes each time, so for a million-character string it took approximately forever (I gave up after a few minutes). So, also: fileobject.c, getline_via_fgets(): When a single line is outrageously long, grow the string object at a mildly exponential rate, instead of just 1000 bytes at a time. That's enough so that a debug-build test_bufio finishes in about 5 seconds on my Win98SE box. I'm curious to try this on Win2K, because it has very different memory behavior than Win9X, and test_bufio always took a factor of 10 longer to complete on Win2K. It *could* be that the endless reallocs were simply killing it on Win2K even in the release build. | |||
| msg340037 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 13:15 | |
New changeset9b8314cfe29ca532fc335277f6c36b72e6132922 by Victor Stinner in branch 'master':bpo-36611: Fix test_sys.test_getallocatedblocks() (GH-12797)https://github.com/python/cpython/commit/9b8314cfe29ca532fc335277f6c36b72e6132922 | |||
| msg340041 -(view) | Author: miss-islington (miss-islington) | Date: 2019-04-12 13:33 | |
New changeset7182e653fb5c6f78f05892b6ed302fc8db8978d3 by Miss Islington (bot) in branch '3.7':bpo-36611: Fix test_sys.test_getallocatedblocks() (GH-12797)https://github.com/python/cpython/commit/7182e653fb5c6f78f05892b6ed302fc8db8978d3 | |||
| msg340103 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 19:54 | |
New changesete8f9acf03484c6c3f163f04a76321419369c28aa by Victor Stinner in branch 'master':bpo-36611: Disable serialno field of debug memory allocators (#12796)https://github.com/python/cpython/commit/e8f9acf03484c6c3f163f04a76321419369c28aa | |||
| msg340105 -(view) | Author: STINNER Victor (vstinner)*![]() | Date: 2019-04-12 20:01 | |
We decided to only disable the code by default, but the code stays until we are sure that nobody uses it. | |||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:13 | admin | set | github: 80792 |
| 2019-04-12 20:01:03 | vstinner | set | status: open -> closed resolution: fixed messages: +msg340105 stage: patch review -> resolved |
| 2019-04-12 19:54:10 | vstinner | set | messages: +msg340103 |
| 2019-04-12 13:33:34 | miss-islington | set | nosy: +miss-islington messages: +msg340041 |
| 2019-04-12 13:15:18 | miss-islington | set | pull_requests: +pull_request12729 |
| 2019-04-12 13:15:09 | vstinner | set | messages: +msg340037 |
| 2019-04-12 11:01:10 | vstinner | set | nosy: +tim.peters messages: +msg340027 |
| 2019-04-12 10:59:06 | vstinner | set | messages: +msg340026 |
| 2019-04-12 10:52:01 | methane | set | nosy: +methane messages: +msg340025 |
| 2019-04-12 10:43:49 | vstinner | set | messages: +msg340023 |
| 2019-04-12 10:42:52 | vstinner | set | pull_requests: +pull_request12724 |
| 2019-04-12 10:40:24 | vstinner | set | messages: +msg340022 |
| 2019-04-12 10:39:15 | vstinner | set | pull_requests: +pull_request12723 |
| 2019-04-12 10:36:33 | vstinner | set | keywords: +patch stage: patch review pull_requests: +pull_request12722 |
| 2019-04-12 10:16:34 | vstinner | create | |