
GH-109369: Add machinery for deoptimizing tier2 executors, both individually and globally. #110384


Merged

markshannon merged 11 commits into python:main from faster-cpython:tier2-deopt-part-2

Oct 23, 2023


Conversation


Member

markshannon commented Oct 5, 2023

This PR just provides the machinery; we still need to add support for exiting executors when their valid flag is falsified.

The implementation uses a Bloom filter.
The advantage of a bloom filter is that it requires no coupling between the executors and the objects they depend on, plus it is simpler to implement and uses less memory than a precise mapping.

I've chosen k = 6 (bits set per entry) and m = 256 (bits in the filter).
This should give a low enough false positive rate for most cases. We want to keep the false positive rate very low, as spurious de-optimizations could be expensive.
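
For readers unfamiliar with the technique, here is a minimal Python sketch of a Bloom filter with the parameters above (k = 6, m = 256). It is only a model: the `hash64` helper built on `hashlib.blake2b`, the `key_bits` function and the `BloomFilter` class are illustrative names and assumptions, not the code in this PR, which is written in C.

```python
import hashlib

K = 6      # bits set per key (k in the description)
M = 256    # filter size in bits (m in the description), i.e. 32 bytes


def hash64(key: bytes) -> int:
    """Stand-in 64-bit hash (an assumption, not the hash the PR uses)."""
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "little")


def key_bits(key: bytes) -> int:
    """Return an M-bit mask with up to K bits set for this key."""
    h = hash64(key)
    mask = 0
    for _ in range(K):
        mask |= 1 << (h & (M - 1))   # low 8 bits pick one of 256 positions
        h >>= 8                      # move on to the next byte of the hash
    return mask


class BloomFilter:
    """Fixed-size set summary: no false negatives, rare false positives."""

    def __init__(self) -> None:
        self.bits = 0

    def add(self, key: bytes) -> None:
        self.bits |= key_bits(key)

    def may_contain(self, key: bytes) -> bool:
        mask = key_bits(key)
        return (self.bits & mask) == mask


deps = BloomFilter()
deps.add(b"code object A")
print(deps.may_contain(b"code object A"))
```

A lookup can report a key that was never added, but it can never miss a key that was added. That is why spurious de-optimizations are possible while a missed invalidation is not.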

Issue: Executors might ignore instrumentation. #109369

markshannon added 8 commits

September 28, 2023 16:27

Add valid flag to executor objects

dd606ae

Add ability to invalidate executors as a result of changes to other objects.

45c30d3

Merge branch 'main' into tier2-deopt

30844bb

Remove unrelated change

79bda49

Tidier logic and add explanatory comment

76dad94

Merge branch 'main' into tier2-deopt

02201f2

Tidy up code and add test

60306d8

Remove _JUMP_IF_INVALID. Save for next PR.

0436c1a

bedevere-appbot added the awaiting review label

Oct 5, 2023

bedevere-appbot mentioned this pull request

Oct 5, 2023

Executors might ignore instrumentation. #109369

Closed

markshannon mentioned this pull request

Oct 5, 2023

GH-109369: Add machinery for deoptimizing tier2 executors, both individually and globally. #110358

Closed

markshannon added the skip news label

Oct 5, 2023


Member Author

markshannon commented Oct 5, 2023

Skipping news, as this is an implementation detail.

markshannon requested a review from gvanrossum

October 13, 2023 13:44


Member

gvanrossum commented Oct 18, 2023 • edited

Okay, let me summarize what's here so we know I understand. Then I will start a code review.

I've heard of Bloom filters and I understand their properties qualitatively (but I couldn't reproduce the math to evaluate which parameters we need).
When a code object's bytecode is modified to add instrumentation, we need to invalidate the executors that could be affected.
Because we can trace through functions, the list of executors in the code object is not enough to find all (potentially) affected executors. (Also because in theory an executor could be replaced by another while it's still running.)
We could solve this problem using a data structure where each code object has a list of references to executors.
(In fact, code objects can already have an array of references to executors, when they contain ENTER_EXECUTOR instructions. Though reusing this feels wrong.)
Because executors may live shorter than the code objects they depend on, we'd have to manipulate the list when an executor is finalized, which means there would have to be a list of links from the executor back to its dependent code objects.
A collection of weak references (presumably a weak set) from code objects to executors would do nicely. (We'd have to enable weak refs to executors, but that's straightforward.)
However, that's a fairly heavy-weight data structure.
Instead, we use a Bloom filter, which only occupies a fixed amount of space per executor (32 bytes, for our chosen size of 256 bits), and no space per code object. Presumably there are more code objects than executors, because only hot code gets an executor. (Are we collecting data on this?)
The results are only probabilistic, but at worst we invalidate a few executors too many, so correctness is preserved.
Now, OTOH, we have to have a data structure that lets us walk all executors. The current PR uses a doubly-linked list (so deleting an executor is quick), with a list head in the interpreter data structure (code objects and executors can't cross interpreters). There's a comment suggesting we could use something better performing (a kind of tree?) in the future. The nice thing about the doubly-linked list is that each executor is its own list node, so the cost is just two pointers per executor.
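
The math behind the parameter choice can be approximated with the standard Bloom filter estimate p ≈ (1 - e^(-kn/m))^k, where n is the number of objects added to one executor's filter. The sketch below evaluates it for k = 6 and m = 256. The function name and the sample values of n are assumptions chosen for illustration, not measurements from CPython.

```python
import math


def false_positive_rate(n: int, k: int = 6, m: int = 256) -> float:
    """Approximate false positive rate after n keys were added.

    Uses the textbook estimate (1 - exp(-k*n/m)) ** k, which assumes
    the k bit positions behave like independent uniform choices.
    """
    return (1.0 - math.exp(-k * n / m)) ** k


# Hypothetical dependency counts per executor (not measured data).
for n in (1, 5, 10, 20, 50):
    print(f"n={n:3d}  p~{false_positive_rate(n):.2e}")
```

Under this estimate the rate stays very small while an executor depends on only a handful of objects and climbs quickly once the filter fills up, so how well the chosen parameters work depends on how many dependencies a typical executor records.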

All in all I think this is a fine plan.

gvanrossum reviewed

Oct 18, 2023


Member

gvanrossum left a comment


This is cool, a rare excursion to classic data structures! I'm not sure, but I worry about a bug in the linked list code when unlinking the head node promotes another node to being the head; see comment in unlink_executor().
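
To make the head-node case concrete, here is a small Python model of an intrusive doubly-linked list whose head lives on the interpreter. Apart from unlink_executor(), every name here (`Executor`, `Interpreter`, `link_executor`, `walk`, `invalidate_all`) is hypothetical, and this is a sketch of the general technique rather than the code in Python/optimizer.c.

```python
from typing import Iterator, Optional


class Executor:
    """Each executor doubles as its own list node (two pointers)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.valid = True
        self.linked = False
        self.prev: Optional["Executor"] = None
        self.next: Optional["Executor"] = None


class Interpreter:
    """Holds the list head; executors never cross interpreters."""

    def __init__(self) -> None:
        self.executor_list_head: Optional[Executor] = None


def link_executor(interp: Interpreter, ex: Executor) -> None:
    """Push ex onto the front of the interpreter's list."""
    head = interp.executor_list_head
    ex.prev, ex.next = None, head
    if head is not None:
        head.prev = ex
    interp.executor_list_head = ex
    ex.linked = True


def unlink_executor(interp: Interpreter, ex: Executor) -> None:
    """Remove ex in O(1); promote its successor if ex was the head."""
    if not ex.linked:
        return
    if ex.next is not None:
        ex.next.prev = ex.prev
    if ex.prev is not None:
        ex.prev.next = ex.next
    else:
        # ex was the head, so the interpreter must point at the new head.
        interp.executor_list_head = ex.next
    ex.prev = ex.next = None
    ex.linked = False


def walk(interp: Interpreter) -> Iterator[Executor]:
    """Yield every linked executor, tolerating removal of the current one."""
    ex = interp.executor_list_head
    while ex is not None:
        nxt = ex.next          # read before the caller may unlink ex
        yield ex
        ex = nxt


def invalidate_all(interp: Interpreter) -> None:
    """Global invalidation: clear every valid flag and empty the list."""
    for ex in walk(interp):
        ex.valid = False
        unlink_executor(interp, ex)
```

The branch worth checking is the `else`: if the head is unlinked without updating `executor_list_head`, the interpreter keeps pointing at a node that is no longer in the list.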

Lib/test/test_capi/test_misc.py (outdated, resolved)

Modules/_testinternalcapi.c (resolved)

Objects/codeobject.c (resolved)

Modules/_testinternalcapi.c (resolved)

Python/optimizer.c (resolved)

Python/optimizer.c (outdated, resolved)

gvanrossum reviewed

Oct 19, 2023


Python/optimizer.c (outdated, resolved)

markshannon added 3 commits

October 23, 2023 11:55

Address some review comments

77ef658

Use a single 64 bit hash function and more clearly document how the bloom filter works.

41f54b8

Fix unlinking of executors

11985ea

markshannon merged commit 52e902c into python:main

Oct 23, 2023

bedevere-appbot removed the awaiting review label

Oct 23, 2023


Member

brandtbucher commented Oct 24, 2023 • edited

It looks like this PR introduced refleaks and assertion failures when running test_embed with tier two enabled:

======================================================================
FAIL: test_forced_io_encoding (test.test_embed.EmbeddingTests.test_forced_io_encoding)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 219, in test_forced_io_encoding
    out, err = self.run_embedded_interpreter("test_forced_io_encoding", env=env)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 113, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: -6 != 0 : bad returncode -6, stderr is "_testembed: Objects/dictobject.c:938: unicodekeys_lookup_unicode: Assertion `PyUnicode_CheckExact(ep->me_key)' failed.\n"

======================================================================
FAIL: test_run_main_loop (test.test_embed.EmbeddingTests.test_run_main_loop)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 326, in test_run_main_loop
    out, err = self.run_embedded_interpreter("test_run_main_loop")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 113, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: -6 != 0 : bad returncode -6, stderr is "_testembed: Objects/dictobject.c:938: unicodekeys_lookup_unicode: Assertion `PyUnicode_CheckExact(ep->me_key)' failed.\n"

======================================================================
FAIL: test_init_is_python_build_with_home (test.test_embed.InitConfigTests.test_init_is_python_build_with_home)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 1375, in test_init_is_python_build_with_home
    self.check_all_configs("test_init_is_python_build", config,
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 794, in check_all_configs
    out, err = self.run_embedded_interpreter(testname,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 113, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: -6 != 0 : bad returncode -6, stderr is "_testembed: Objects/dictobject.c:938: unicodekeys_lookup_unicode: Assertion `PyUnicode_CheckExact(ep->me_key)' failed.\n"

======================================================================
FAIL: test_no_memleak (test.test_embed.MiscTests.test_no_memleak) (frozen_modules='off', stmt='pass')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 1820, in test_no_memleak
    self.assertEqual(refs, 0, out)
AssertionError: 11 != 0 : [11 refs, 11 blocks]

======================================================================
FAIL: test_no_memleak (test.test_embed.MiscTests.test_no_memleak) (frozen_modules='on', stmt='pass')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 1820, in test_no_memleak
    self.assertEqual(refs, 0, out)
AssertionError: 11 != 0 : [11 refs, 11 blocks]

======================================================================
FAIL: test_no_memleak (test.test_embed.MiscTests.test_no_memleak) (frozen_modules='off', stmt='import __hello__')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 1820, in test_no_memleak
    self.assertEqual(refs, 0, out)
AssertionError: 11 != 0 : [11 refs, 11 blocks]

======================================================================
FAIL: test_no_memleak (test.test_embed.MiscTests.test_no_memleak) (frozen_modules='on', stmt='import __hello__')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/brandtbucher/cpython/Lib/test/test_embed.py", line 1820, in test_no_memleak
    self.assertEqual(refs, 0, out)
AssertionError: 11 != 0 : [11 refs, 11 blocks]

----------------------------------------------------------------------

I'm looking into it more now, but is there anything obvious to either of you that may be causing this?


Member

gvanrossum commented Oct 24, 2023

Possibly the refleaks might be cured by increasing the warmup count (changing the -R parameter and the expected output correspondingly). This has happened a few times before.

No idea about the assertion failure.

brandtbucher reviewed

Oct 24, 2023


Python/optimizer.c

void
_Py_Executor_DependsOn(_PyExecutorObject *executor, void *obj)
{
    assert(executor->vm_data.valid = true);


Member

brandtbucher Oct 24, 2023


Not the issue, but something I noticed while combing over this:

Suggested change

-    assert(executor->vm_data.valid = true);
+    assert(executor->vm_data.valid == true);

gvanrossum mentioned this pull request

Oct 25, 2023

Crashes and errors in test_embed with PYTHONUOPS=1 #111339

Closed


Member

gvanrossum commented Oct 25, 2023

See #111339 for the test_embed crashes etc.

aisk pushed a commit to aisk/cpython that referenced this pull request

Feb 11, 2024

pythonGH-109369: Add machinery for deoptimizing tier2 executors, both individually and globally. (pythonGH-110384)

e3b4314

markshannon deleted the tier2-deopt-part-2 branch

August 6, 2024 10:17

Glyphack pushed a commit to Glyphack/cpython that referenced this pull request

Sep 2, 2024

pythonGH-109369: Add machinery for deoptimizing tier2 executors, both individually and globally. (pythonGH-110384)

d1bb94d

Labels

skip news


3 participants
