tools/mpy-tool.py: Allow dumping MPY segments into their own files. #17306
Open
agatti wants to merge 1 commit into micropython:master from agatti:mpy-tool-dump-segments
This commit lets "tools/mpy-tool.py" extract MPY segments into their own files, one file per segment.

A pair of new command line arguments were added, namely "-e"/"--extract" that takes a filename prefix to use as a base for the generated files' name, and "--extract-only" that - combined with "--extract" - allows selecting which kinds of segment should be dumped to the filesystem.

So, for example, assuming there's a file called "module.mpy", running "./mpy-tool.py --extract segments module.mpy" would yield a series of files with names like "segments_0_module.py_QSTR_module.py.bin", "segments_1_module.py_META__module_.bin", "segments_2_module.py_QSTR_function.bin", etc. In short the file name format is "<base>_<count>_<sourcefile>_<segmentkind>_<segmentname>.bin", with <segmentkind> being META, QSTR, OBJ, or CODE. Source file names and segment names will only contain characters in the range "a-zA-Z0-9_-." to avoid having output file names with unexpected characters.

The "--extract-only" option can accept one or more kinds, separated by commas and treated as case insensitive strings. The supported kinds match what is currently handled by the "MPYSegment" class in "tools/mpy-tool.py": "META", "QSTR", "OBJ", and "CODE". The absence of this command line option implies dumping every segment found.

If "--extract" is passed along with "--merge", dumping is performed after the merge process takes place, in order to dump all possible segments that match the requested segment kinds.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
Coverage Diff, master vs. #17306:
Coverage: 98.54% → 98.54%
Files: 169 → 169
Lines: 21897 → 21897
Hits: 21579 → 21579
Misses: 318 → 318
Summary
This PR lets `tools/mpy-tool.py` extract MPY segments into their own files, one file per segment.

This is something I wrote some time ago but I guess it cannot hurt to be upstreamed. When debugging issues related to compiled code generated by `@micropython.viper` or `@micropython.native`, it is a great help to be able to get hold of the generated code segments and pass them to objdump or ghidra/idapro/cutter/etc., without having to dump memory from gdb or write custom file/hex dumpers.

A pair of new command line arguments was added: "-e"/"--extract", which takes a filename prefix to use as a base for the generated files' names, and "--extract-only", which, combined with "--extract", allows selecting which kinds of segments should be dumped to the filesystem.
So, for example, assuming there's a file called "module.mpy", running "./mpy-tool.py --extract segments module.mpy" would yield a series of files with names like "segments_0_module.py_QSTR_module.py.bin", "segments_1_module.py_META__module_.bin", "segments_2_module.py_QSTR_function.bin", etc. In short, the file name format is `<base>_<count>_<sourcefile>_<segmentkind>_<segmentname>.bin`, with `<segmentkind>` being META, QSTR, OBJ, or CODE. Source file names and segment names will only contain characters in the range "a-zA-Z0-9_-." to avoid having output file names with unexpected characters.

The "--extract-only" option accepts one or more kinds, separated by commas and treated as case-insensitive strings. The supported kinds match what is currently handled by the "MPYSegment" class in "tools/mpy-tool.py": "META", "QSTR", "OBJ", and "CODE". If this command line option is absent, every segment found is dumped.
If "--extract" is passed along with "--merge", dumping is performed after the merge process takes place, in order to dump all possible segments that match the requested segment kinds.
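As a rough illustration of the naming and filtering rules described above, here is a minimal sketch. It is not the code in this PR: the helper names (`parse_extract_only`, `sanitise`, `segment_filename`) are hypothetical, and replacing disallowed characters with "_" is an assumption inferred from the `<module>` to `_module_` example.

```python
import re

# Segment kinds named in the PR description.
SEGMENT_KINDS = ("META", "QSTR", "OBJ", "CODE")

# Matches anything outside a-zA-Z0-9_-. ; using "_" as the replacement
# character is an assumption based on the "_module_" example file name.
_UNSAFE = re.compile(r"[^a-zA-Z0-9_.-]")


def parse_extract_only(value):
    """Turn a comma-separated, case-insensitive kind list into a set.

    None (option not given) selects every kind.
    """
    if value is None:
        return set(SEGMENT_KINDS)
    kinds = {item.strip().upper() for item in value.split(",") if item.strip()}
    unknown = kinds - set(SEGMENT_KINDS)
    if unknown:
        raise ValueError("unknown segment kind(s): " + ", ".join(sorted(unknown)))
    return kinds


def sanitise(name):
    """Replace every character outside the allowed range with '_'."""
    return _UNSAFE.sub("_", name)


def segment_filename(base, count, source_file, kind, segment_name):
    """Build <base>_<count>_<sourcefile>_<segmentkind>_<segmentname>.bin."""
    return "{}_{}_{}_{}_{}.bin".format(
        base, count, sanitise(source_file), kind, sanitise(segment_name)
    )


if __name__ == "__main__":
    print(segment_filename("segments", 1, "module.py", "META", "<module>"))
    print(sorted(parse_extract_only("code,qstr")))
```

With these assumptions, the second example file name from the description falls out of `segment_filename("segments", 1, "module.py", "META", "<module>")`.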
Testing
Besides my own usage, I've attached a zipfile containing the compiled version of `tests/micropython/native_try_deep.py` for x64 and its dumped output. To reproduce those files the commands to run are:

To check that the `CODE` segments actually contain executable code, running `objdump -b binary -M x86-64 -m i386:x86-64 --adjust-vma=0x1000 -z --start-address=0x1008 -D native_try_deep_7_native_try_deep.py_CODE_f.bin` should dump valid x64 code to STDOUT, as generated by `mpy-cross` (it skips the first two header words).

native_try_deep.zip
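For anyone scripting this check over several `CODE` files, the sketch below builds the same objdump argument list as the command quoted above. The function name `objdump_args` and its parameters are hypothetical; the 8-byte header skip is simply the difference between the `--start-address` and `--adjust-vma` values in that command, and is assumed rather than derived from the MPY format.

```python
def objdump_args(path, load_address=0x1000, header_bytes=8):
    """Return an objdump argv that disassembles a raw x64 CODE segment.

    load_address is the fake base address given to --adjust-vma and
    header_bytes is how much to skip before the first instruction
    (assumed to be 8, matching 0x1008 - 0x1000 in the PR text).
    """
    start = load_address + header_bytes
    return [
        "objdump",
        "-b", "binary",            # input is a raw binary blob
        "-M", "x86-64",
        "-m", "i386:x86-64",
        "--adjust-vma={:#x}".format(load_address),
        "-z",                      # do not skip blocks of zeroes
        "--start-address={:#x}".format(start),
        "-D",                      # disassemble everything
        path,
    ]


if __name__ == "__main__":
    print(" ".join(objdump_args("native_try_deep_7_native_try_deep.py_CODE_f.bin")))
```

The resulting list can be handed to `subprocess.run` on a machine where objdump is installed.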
Trade-offs and Alternatives
Given that this bit of code isn't executed unless explicitly requested, and only for a niche scenario, its only downsides are a tiny increase in overall code complexity and potential security issues if the output file prefix is used in a malicious way.
As far as alternatives go, I used to run `mpy-tool.py -x -d <mpyfile>` to figure out the binary code start offset by looking at the hex pairs on screen (and good luck if somebody remapped their terminal colour scheme :) no idea if the output is colourblind safe though). After a while I wrote my own cut-down `mpy-tool.py` equivalent to run as a ghidra plugin, but then it would require keeping up with MPY format changes and whatnot, and I wasn't sure it would work in all possible cases.

Having `mpy-tool.py` dump the segments itself is probably the best compromise for the time being: it is tool-agnostic and doesn't require anything special to get it working.