JIT: mapping bytecode instructions and assembly


Hello,

In the past few weeks I’ve been looking at the JIT implementation a little more closely. As part of this investigation I wanted to “see” which assembly code was generated for each bytecode instruction.

Using capstone, I patched the _PyJIT_Compile function to print the mapping to stdout at runtime. The output looks something like this:

...
============================================
_LOAD_FAST_2 {
        ffff937ac218    ldr     x8, [x0, #0x58]
        ffff937ac21c    ldr     w9, [x8]
        ffff937ac220    adds    w9, w9, #1
        ffff937ac224    b.hs    #0xffff937ac22c
        ffff937ac228    str     w9, [x8]
        ffff937ac22c    str     x8, [x1], #8
}
============================================
_SET_IP {
        ffff937ac230    ldr     x8, #0xffff937ac6a8
        ffff937ac234    nop
        ffff937ac238    str     x8, [x0, #0x38]
        ffff937ac23c    b       #0xffff937ac240
}
============================================
_GUARD_BOTH_INT {
        ffff937ac240    ldr     x8, #0xffff937ac6b0
        ffff937ac244    nop
        ffff937ac248    ldur    x9, [x1, #-0x10]
        ffff937ac24c    ldr     x9, [x9, #8]
        ffff937ac250    cmp     x9, x8
        ffff937ac254    b.ne    #0xffff937ac268
        ffff937ac258    ldur    x9, [x1, #-8]
        ffff937ac25c    ldr     x9, [x9, #8]
        ffff937ac260    cmp     x9, x8
        ffff937ac264    b.eq    #0xffff937ac26c
        ffff937ac268    b       #0xffff937ac5c0
        ffff937ac26c    b       #0xffff937ac270
}
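For anyone who wants to post-process a dump like this (for example after redirecting stdout to a file), here is a minimal sketch of a parser. It assumes exactly the layout shown above: a separator line of `=` characters, a `NAME {` header, one `address mnemonic operands` line per instruction, and a closing `}`. The names `Instruction` and `parse_dump` are made up for illustration and are not part of CPython or capstone.

```python
import re
from typing import NamedTuple


class Instruction(NamedTuple):
    address: int
    mnemonic: str
    operands: str


# "NAME {" opens a block; instruction lines start with a hex address.
HEADER = re.compile(r"^(\w+)\s*\{$")
INSN = re.compile(r"^([0-9a-f]+)\s+(\S+)\s*(.*)$")


def parse_dump(text: str) -> list[tuple[str, list[Instruction]]]:
    """Return (name, instructions) pairs in the order they appear.

    A list of pairs is used rather than a dict because the same
    instruction name can occur more than once in a single trace.
    """
    blocks: list[tuple[str, list[Instruction]]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, "=====" separators and the "..." elision marker.
        if not line or set(line) == {"="} or line == "...":
            continue
        header = HEADER.match(line)
        if header:
            current = []
            blocks.append((header.group(1), current))
        elif line == "}":
            current = None
        elif current is not None:
            insn = INSN.match(line)
            if insn:
                address, mnemonic, operands = insn.groups()
                current.append(Instruction(int(address, 16), mnemonic, operands))
    return blocks


SAMPLE = """\
============================================
_SET_IP {
        ffff937ac230    ldr     x8, #0xffff937ac6a8
        ffff937ac234    nop
        ffff937ac238    str     x8, [x0, #0x38]
        ffff937ac23c    b       #0xffff937ac240
}
"""

if __name__ == "__main__":
    for name, instructions in parse_dump(SAMPLE):
        print(name, len(instructions))
```

With structured data like this it becomes easy to count instructions per name or to compare two dumps.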

This is a quirky way to do it, but I was wondering whether such functionality would be useful at all.
If we were to implement it, one option would be to model it on the dis module (or even extend the dis module).

Thoughts?

Tagging @brandtbucher for awareness.

Thanks


Very cool! It would definitely be useful for debugging the JIT itself, I think. Your implementation seems very similar to our use of the PYTHON_LLTRACE environment variable for debugging the tier one interpreter and tier two optimization pipeline.

This also reminds me of a PR (that I forgot about!) by @tonybaloney allowing one to obtain byte strings of the compiled code from pure Python (at least, for those who know where to look, using secret APIs designed for tests). It looks like capstone has a Python API that users could simply pass these strings to in order to obtain objects representing the assembly code… super cool! So perhaps that could be a way forward here, without introducing a hard dependency on capstone in CPython itself.
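As a rough sketch of what that could look like: the snippet below feeds a byte string to capstone's Python bindings and formats each decoded instruction. It assumes capstone is installed (`pip install capstone`) and that the bytes are AArch64 code. How the bytes are obtained from the executor is deliberately left out, since that API is not described here; `disassemble_arm64` is a hypothetical helper name, and the sample bytes are just hand-written `nop` and `ret` encodings, not real JIT output.

```python
def disassemble_arm64(code: bytes, address: int = 0) -> list[str]:
    """Disassemble AArch64 machine code into 'address mnemonic operands' lines."""
    # Imported lazily so that capstone stays an optional, third-party dependency.
    from capstone import CS_ARCH_ARM64, CS_MODE_ARM, Cs

    md = Cs(CS_ARCH_ARM64, CS_MODE_ARM)
    # `address` is only used to label the output; it should be the address
    # the code was loaded at so that branch targets read correctly.
    return [
        f"{insn.address:x}    {insn.mnemonic}    {insn.op_str}".rstrip()
        for insn in md.disasm(code, address)
    ]


if __name__ == "__main__":
    # Little-endian encodings of `nop` (0xd503201f) and `ret` (0xd65f03c0).
    sample = bytes.fromhex("1f2003d5" "c0035fd6")
    try:
        for line in disassemble_arm64(sample, 0x1000):
            print(line)
    except ImportError:
        print("capstone is not installed")
```

Keeping the disassembler in user code like this is what avoids the hard dependency: CPython only needs to hand over bytes and a load address.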

Separately: feel free to reach out if you have any questions or comments about the JIT.


Take a look at the dissy app as well. I used distorm3 as the disassembler.

github.com/tonybaloney/dissy/blob/master/dissy/disassemblers/x86_64.py

try:
    import distorm3
except ImportError:
    raise ImportError(
        "distorm3 is not installed. Please install it with 'pip install distorm3'"
    )

from rich.text import Text

from dissy.disassemblers.types import DisassembledImage, NativeType

TOKEN_COLORS = {
    "TOKEN_INSTRUCTION": "green",
    "TOKEN_NAME": "blue",
    "TOKEN_NAME_HIGHLIGHT": "cyan2",
    "TOKEN_NUMBER": "magenta",
    "TOKEN_DELIMITER": "white",
}


def disassemble(file, position=0) -> DisassembledImage:

This file has been truncated.

If you write the JIT dumps to disk you can load them from here. Distorm3 doesn’t support ARM64 though.


Thanks for the pointers, very useful indeed! If there is interest then I’ll try to put together a draft PR at some point and I’ll ping you here for review/feedback.

Thanks

Anthony Shaw:

Distorm3 doesn’t support ARM64 though.

Hm, this might be a dealbreaker for me personally. I’m developing for all of the JIT’s platforms, so I’m not really eager to use different libraries on each one.

Diego Russo:

If there is interest then I’ll try to put together a draft PR at some point and I’ll ping you here for review/feedback.

I think at this point probably the only action item is for @tonybaloney to open his PR against main so JIT maintainers can use Python libraries like capstone or dissy for debugging at runtime.

I’m hesitant to start adding a bunch of other infrastructure and utilities for this, since it’s an experimental feature (and I can easily count on one hand the number of people who would actually use it).

PR opened: gh-117958: Expose JIT code via access method in experimental UOpExecutor (python/cpython Pull Request #117959).
