Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-96143: Allow Linux perf profiler to see Python calls#96123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
miss-islington merged 50 commits intopython:mainfrompablogsal:perf
Aug 30, 2022

Conversation

pablogsal
Copy link
Member

@pablogsalpablogsal commentedAug 19, 2022
edited
Loading

Automerge-Triggered-By: GH:pablogsal

Jongy reacted with thumbs up emojitiran, adamchainz, kumaraditya303, jjerphan, erlend-aasland, dgiger42, holmanb, maxbrunet, dalehamel, GalaxySnail, and 2 more reacted with hooray emoji
Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I have no idea if that assembly code does what you say, but I've reviewed the rest.

Definitely an intriguing idea.

@pablogsalpablogsalforce-pushed theperf branch 2 times, most recently from9b009f2 to439ef28CompareAugust 19, 2022 16:51
@tiran
Copy link
Member

You can use preprocesor macros if you name the fileObjects/asm_trampoline.S or.sx instead ofObjects/asm_trampoline.s. The.sx form needs a makefile rule.

@pablogsal
Copy link
MemberAuthor

You can use preprocesor macros if you name the fileObjects/asm_trampoline.S or.sx instead ofObjects/asm_trampoline.s. The.sx form needs a makefile rule.

Hummm, not sure I follow, could you maybe show me an example of what we can achieve with this?

@tiran
Copy link
Member

You can have multiple implementations in the same file:

    .text    .globl_Py_trampoline_func_start_Py_trampoline_func_start:#ifdef __x86_64__    push   %rbp    mov    %rsp,%rbp    mov    %rdi,%rax    mov    %rsi,%rdi    mov    %rdx,%rsi    mov    %ecx,%edx    call   *%rax    pop    %rbp    ret#endif // __x86_64__#ifdef __aarch64__    TODO#endif    .globl_Py_trampoline_func_end_Py_trampoline_func_end:
pablogsal and erlend-aasland reacted with thumbs up emoji

@gpsheadgpshead added type-featureA feature request or enhancement interpreter-core(Objects, Python, Grammar, and Parser dirs) labelsAug 20, 2022
@pablogsalpablogsal marked this pull request as ready for reviewAugust 20, 2022 17:10
@pablogsalpablogsal requested a review froma team as acode ownerAugust 20, 2022 17:10
@pablogsalpablogsalforce-pushed theperf branch 3 times, most recently fromdf2a40d todda9e04CompareAugust 20, 2022 17:26
@pablogsalpablogsal changed the titleAllow Linux perf profiler to see Python callsgh-96143: Allow Linux perf profiler to see Python callsAug 20, 2022
@pablogsalpablogsalforce-pushed theperf branch 3 times, most recently fromf38dfc2 to22fc892CompareAugust 20, 2022 17:46
@pablogsalpablogsalforce-pushed theperf branch 2 times, most recently froma2f4182 to1b35ed3CompareAugust 20, 2022 18:31
@tiran
Copy link
Member

Why did you add a Windows build file? How about we do not define_PyPerfTrampoline_Init unless HAVE_PERF_TRAMPOLINE is defined?

@pablogsal
Copy link
MemberAuthor

Why did you add a Windows build file? How about we do not define_PyPerfTrampoline_Init unless HAVE_PERF_TRAMPOLINE is defined?

Then we need to add more ifdef every place is used but that works as well for sure

they now match the current code.
Copy link
Member

@gpsheadgpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Two thoughts have been running around in my mind:

(1) We've got the command line flag and sys API, but what about just enabling this system wide or at least task/job wide within a container? An environment variable that turns it on would make that easy without plumbing options through.PYTHONPERFSUPPORT=1 as another equivalent to-Xperf? That's usually done in initconfig.c alongside the -X processing I believe.

Second: multiprocessing spawn start method. If the-Xperf flag was passed to the parent process I assume the "spawn" method should also pass that to its children.

@pablogsal
Copy link
MemberAuthor

pablogsal commentedAug 29, 2022
edited
Loading

Two thoughts have been running around in my mind:

(1) We've got the command line flag and sys API, but what about just enabling this system wide or at least task/job wide within a container? An environment variable that turns it on would make that easy without plumbing options through.PYTHONPERFSUPPORT=1 as another equivalent to-Xperf? That's usually done in initconfig.c alongside the -X processing I believe.

Second: multiprocessing spawn start method. If the-Xperf flag was passed to the parent process I assume the "spawn" method should also pass that to its children.

I think (1) makes sense and is indeed coherent with how we handle some of the other options liketracemalloc so I will add an environment variable alongside the -X option.

Regarding (2) I am not that sure. We don't do this for the rest of the flags that we pass around and you can achieve the same with the environment variable if you want. In any case, as that would require some tests I would prefer to do that in a separate PR as this is already gigantic

gpshead reacted with thumbs up emoji

@pablogsal
Copy link
MemberAuthor

pablogsal commentedAug 29, 2022
edited
Loading

I implemented the environment variable and solved a bunch of conflicts. Please check it out when you have time.

I also had to solve a bunch of conflicts.@gpshead if you are ok with the current status I would like us to land if everything looks good as the size of the PR is already attracting a bunch of merge conflicts in the build system, clinic and other files.

@pablogsal
Copy link
MemberAuthor

Ah wait, I need to document the environment variable. Will push a commit for that soon.

@pablogsal
Copy link
MemberAuthor

Ah wait, I need to document the environment variable. Will push a commit for that soon.

Done 👍

@pablogsal
Copy link
MemberAuthor

@erlend-aasland has mentioned that he was a bunch of docs improvements that he will do in a separate PR.

gpshead reacted with thumbs up emoji

Copy link
Member

@gpsheadgpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

My comments are easy to address things that don't need further review - minor edits or added note/todo comments to our future selves.

Agreed that the docs can use some polishing up but all the important bits are there to seed that future work, thanks for writing them!

Thanks for taking on adding this feature! I expect we'll see how a backport fares internally.

@pablogsal
Copy link
MemberAuthor

Damn,@miss-islington has landed the PR without waiting for the CI to build the last commit I pushed so I created#96433.

@pablogsal
Copy link
MemberAuthor

Thanks, everyone for the fantastic reviews and for helping to get this feature ready ❤️

You all rock 🤘

kumaraditya303, jjerphan, pradyunsg, and erlend-aasland reacted with heart emoji

@pablogsal
Copy link
MemberAuthor

@erlend-aasland You can make the doc improvement PR after#96433 lands.

@jjerphan
Copy link
Contributor

Thanks for all for pushing for better interoperability withperf(1). I highly appreciate this contribution. 🙏

@vstinner
Copy link
Member

Nice feature!

There is a typo in sys.deactivate_stack_trampoline() docstring: "Dectivate the perf profiler trampoline": Deactivate.


In the doc,python -m sysconfig | grep HAVE_PERF_TRAMPOLINE can be replaced withpython -c "import sysconfig; print(bool(sysconfig.get_config_var('PY_HAVE_PERF_TRAMPOLINE')))" to not rely on the externalgrep command. At least, I suggest to add PY_ to fix the variable name:

python -m sysconfig | grep PY_HAVE_PERF_TRAMPOLINE

The second command can be replaced withpython -c "import sysconfig; print('no-omit-frame-pointer' in sysconfig.get_config_var('PY_CORE_CFLAGS'))".

@erlend-aasland
Copy link
Contributor

There is a typo in sys.deactivate_stack_trampoline() docstring: "Dectivate the perf profiler trampoline": Deactivate.

[...]

See#96445 :)

@rajveerb
Copy link

@pablogsal

Is it possible to port these changes, from python 3.12, without breaking to previous python versions, say 3.11, 3.10, 3.9?

Why?
Some libraries use specific versions of python and it will take time until they catch up to 3.12's release. This will defer the insights that can be gained till later on.

@pablogsal
Copy link
MemberAuthor

Is it possible to port these changes, from python 3.12, without breaking to previous python versions, say 3.11, 3.10, 3.9?

Is possible, but as these versions are only accepting security fixes or bugfixes we cannot backport new features. This means that if you want you can backport them yourself but you need to maintain a fork of the interpreter.

rajveerb reacted with thumbs up emoji

@art049art049 mentioned this pull requestApr 20, 2023
facebook-github-bot pushed a commit to facebookincubator/cinder that referenced this pull requestMay 19, 2023
Summary: Backport the perf-trampoline introduced inpython/cpython#96123.  The perf trampoline doesn't work properly with the JIT, so we have submitted a PR to have a C-API to unify writing to the perf-map filespython/cpython#103546.Reviewed By: czardozDifferential Revision: D45419843fbshipit-source-id: 16bd13d7981e48c9eb7bc0e5eef1c1f4748965f6
@ZheaoliZheaoli mentioned this pull requestOct 9, 2023
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@gpsheadgpsheadgpshead approved these changes

@tirantirantiran approved these changes

@zoobazoobazooba left review comments

@tomytsai1tomytsai1tomytsai1 left review comments

@chalgggchalgggchalggg left review comments

@kumaraditya303kumaraditya303kumaraditya303 left review comments

@markshannonmarkshannonAwaiting requested review from markshannon

Assignees
No one assigned
Labels
interpreter-core(Objects, Python, Grammar, and Parser dirs)type-featureA feature request or enhancement
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

14 participants
@pablogsal@tiran@bedevere-bot@gpshead@jjerphan@vstinner@erlend-aasland@rajveerb@zooba@markshannon@tomytsai1@chalggg@kumaraditya303@miss-islington

[8]ページ先頭

©2009-2025 Movatter.jp