Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

py/emitnative: Further Viper integer-indexed load/store improvements.#17342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
agatti wants to merge5 commits intomicropython:master
base:master
Choose a base branch
Loading
fromagatti:oh-no-more-viper-improvements

Conversation

agatti
Copy link
Contributor

@agattiagatti commentedMay 22, 2025
edited
Loading

Summary

This PR introduces improvements in the code that gets generated for integer-indexed load and store operations for various architectures.

The Viper integer-indexed generator bits are now able to utilise a specialised operation implementation if the platform emitter exposes one. This means that macros namedASM_{LOAD,STORE}{8,16,32}_REG_REG_OFFSET are used if they are present in the platform backend.

ASM_LOAD16_REG_REG_OFFSET,ASM_LOAD32_REG_REG_OFFSET, andASM_STORE32_REG_REG_OFFSET were already present since they are used by the native emitter, so their implementations are used also by the Viper emitter now. This may or may not reduce the code size, but it streamlines the current behaviour by having one single optimised implementation on each platform.

However, those three implementations are now exposed to scenarios where accessing larger than usual offsets is a somewhat probable occurrence - requiring some changes to make them work in all possible cases on Arm, Thumb, and Xtensa. The full set of integer-indexed load/store is not implemented for all Viper operations on all platforms, as the focus of this PR is ARMv7-M improvements. The rest of the changes is there to adapt the other platform backends to the infrastructure changes needed for ARMv7-M. The remaining load/store operations' code will come later in a different PR if the code size increase can be kept to a minimum.

The Arm emitter is now able to generateLDR andSTR integer-indexed opcodes whose offsets can span the full 32 bits address space. The original behaviour would be undefined if the offset was larger than 12 bits, as it would corrupt the upper bits of the generated opcode itself.

The Thumb emitter is now able to generate ARMv7-M opcodes for all data sizes if both ARMv7-M opcodes are allowed and if the opcode offset spans between 6 and 12 bits. In that case a single 32 bits opcode will be generated (LDR{B,H}.W andSTR{B,H}.W), falling back to the existing generic code sequence generator if the offset is too big to be embedded in a single opcode.

The Xtensa emitter is now able to generate opcode sequences spanning the full 32 bits address space for 32 bits loads and stores, and for 16 bit loads. The existing implementations would raise aRuntimeError if the offset could not be embedded in a single opcode.

Testing

The native and viper tests intests/micropython were ran successfully on QEMUMPS2-AN385,SABRELITE, andVIRT_RV32 boards, on a NodeMCU v3 8266 board, on a generic ESP32 D0WD board, and on x64 via the Unix port.

Trade-offs and Alternatives

I've tried to keep the extra code footprint as limited as possible without having to make the code too complex to follow (leaving some potential ARMv7-M only optimisations out as well, taking up a bit more space).

This is mostly evident for armv6/armv7 targets, where the overhead should be ~200 bytes. This was offset a bit by moving out some code from the native emitter to the platform backends, where a simpler implementation could suffice.

However there are some benefits in more optimised code being generated, and in a slight decrease of complexity in the Viper emitter as less platform-specific code is present there.

This commit extends the generic ASM API by adding the rest of theASM_{LOAD,STORE}[size]_REG_REG_OFFSET macros whenever applicable.The Viper int-indexed load/store code generator was changed to use thoseAPI functions if they are available, falling back to backend-specificimplementations if possible and ultimately to a generic implementation.Right now all backends except for x64 implement load16, load32, andstore32 operations (x64 only implements load16).Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
@codecovCodecov
Copy link

codecovbot commentedMay 22, 2025
edited
Loading

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.54%. Comparing base(49f81d5) to head(59f8f83).

Additional details and impacted files
@@            Coverage Diff             @@##           master   #17342      +/-   ##==========================================- Coverage   98.54%   98.54%   -0.01%==========================================  Files         169      169                Lines       21898    21893       -5     ==========================================- Hits        21579    21574       -5  Misses        319      319

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report?Share it here.

🚀 New features to boost your workflow:
  • ❄️Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actionsGitHub Actions
Copy link

github-actionsbot commentedMay 22, 2025
edited
Loading

Code size report:

   bare-arm:    +0 +0.000% minimal x86:    +0 +0.000%    unix x64:   -80 -0.009% standard      stm32:  +184 +0.047% PYBV10     mimxrt:  +176 +0.047% TEENSY40        rp2:  -152 -0.017% RPI_PICO_W       samd:  +188 +0.070% ADAFRUIT_ITSYBITSY_M4_EXPRESS  qemu rv32:   -58 -0.013% VIRT_RV32

This commit lets the Thumb native code generator backend emit ARMv7-Mspecific opcodes for indexed load/store operations if possible.Now T3 opcode encodings are used if the generator backend is configuredto allow emitting ARMv7-M opcodes and if the (unsigned) scaled indexfits in 12 bits.  Or, in other words, LDR{B,H}.W and STR{B,H}.W opcodesare now emitted if possible.Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit updates the existing specialised implementations forint-indexed 32-bit load and store operations, and adds a specialisedimplementation for int-indexed 16-bit load.The 32-bit operations relied on the fact that their applicability waslimited to a specific range, falling back on a generic implementationotherwise.  Introducing a single entry point for each int-indexedload/store operation size would break that assumption.  Now those twooperations contain fallback code to generate working code by themselvesinstead of raising an exception.The 16-bit operation instead simply did not have any range check, but itwas not exposed directly to the Viper emitter.  When a 16-bitint-indexed load entry point was introduced, the existing implementationwould fail when accessing memory outside its 0..255 halfwords range.  Aspecialised implementation is now present, performing fewer operationsthan the existing Viper emitter equivalent.Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit removes redundant RV32 implementations of certainint-indexed code generation operations (32-bit load/store and 16-bitload).Those operations were already available as part of the native emitterAPI but were not exposed to the Viper code generator.  As part of theintroduction of more specialised load and store API calls toint-indexed Viper load/store generator bits, the existing native emitterimplementations are reused, thus making the Viper implementationsredundant.Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit extends the range for int-indexed load/store opcodegenerators, making them emit correct code sequences for offsets thatspan more than 12 bits.This is necessary due to those generator bits being also used in theViper emitter, where it's more probable to reference offsets that cannot be embedded in the LDR/STR opcodes.Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
@agattiagattiforce-pushed theoh-no-more-viper-improvements branch fromcb22367 to59f8f83CompareMay 23, 2025 06:09
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers
No reviews
Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

1 participant
@agatti

[8]ページ先頭

©2009-2025 Movatter.jp