Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

ports/esp32: Fix crash and improve timer interrupt allocation#17265

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

DvdGiessen
Copy link
Contributor

@DvdGiessenDvdGiessen commentedMay 7, 2025
edited
Loading

Summary

esp_intr_free is not safe to call from the timer ISR because it requires thecurrent task (the one the ISR interrupted) to be pinned to the same core as the interrupt was allocated on. The ISR is of course guaranteed to interrupt a task on the same core, but it's not guaranteed that that task is also pinned to that core.

This was causing a lockup followed by ISR watchdog timeout in themachine_uart RXIDLE timer (which disables the timer from its own callback) when the ISR happened to interrupt a task that was not pinned to a specific core (for example for me if often hit the lwIP TCP/IP thread task).

The first commit in this PR fixes that by merely disabling the interrupt, whichis safe to do from the ISR since that only requires that we're currently running on the same core (which the ISR always is), regardless of the current task.

Additionally, the second commit makes repeated disabling and enabling of the interrupt (such as the UART RXIDLE does) a bit more efficient by re-enabling instead of reallocating it. That also allowed to removing some code duplication and simplify howmachine_uart uses the timer.

Testing

I've been using these fixes in combination with#17138 (which changed the PPP implementation to use that RXIDLE IRQ). Like that PR, I've been running this on three custom boards (1x original ESP32, 2x ESP32S3) for a few weeks at the moment of writing.

@DvdGiessenDvdGiessen changed the titleports/esp32: Improve timer interrupt alocports/esp32: Fix crash and improve timer interrupt allocationMay 7, 2025
@DvdGiessenDvdGiessenforce-pushed theesp32_timer_interrupt branch from2e0fb59 to2015bdaCompareMay 7, 2025 15:22
@projectgusprojectgus self-requested a reviewMay 9, 2025 06:19
Copy link
Contributor

@projectgusprojectgus left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

These changes look good to me,@DvdGiessen, and even without the use-after-free bug I agree on principle that it's better to be disabling/re-enabling the interrupt rather than freeing and re-allocating it where we can.

My only concern is that it'd be nice to have either a regression test or test coverage for this. I'm guessing that's non-trivial, yeah? i.e. It doesn't sound like the crash would be easy to reproduce in a unit test, and there's no existing unit test for machine.Timer on esp32 so "extending" test coverage would really mean "implementing" test coverage which is a much bigger task. What do you think?

@DvdGiessen
Copy link
ContributorAuthor

Thanks for reviewing! Note I don't think it's a use-after-free, it's due to the use ofesp_ipc_call_blocking that it's more of a deadlock-because-the-ISR-waits-on-IPC-which-cant-run-until-the-ISR-is-finished. While triggering the bug is nondeterministic (due to it depending on which task is currently scheduled) once triggered the behaviour is deterministic.

I agree that the ESP32 could use some machine.Timer tests, but I don't think it'll be easy to test this specific bug. Due to it being dependant on scheduling (and having a non-pinned task in the first place) it won't be easily reproduced with a unit test; best we could do is either engineer the circumstances with a modified build (which seems a bit pointless) or run a reproduction in a loop with the standard build and assert it doesn't crash within some timeout.

projectgus reacted with thumbs up emoji

@projectgus
Copy link
Contributor

Note I don't think it's a use-after-free, it's due to the use ofesp_ipc_call_blocking that it's more of a deadlock-because-the-ISR-waits-on-IPC-which-cant-run-until-the-ISR-is-finished

Oh, sorry, yes you explained the issue very clearly in your description! Which I read, then waited a week, then reviewed the code so I'd forgotten what you said about it. 🤦

I think it's probably OK to merge this without worrying about automated testing, if Damien is OK with that.

@projectgusprojectgus requested a review fromdpgeorgeMay 30, 2025 06:19
@dpgeorgedpgeorge added this to therelease-1.26.0 milestoneJun 4, 2025
esp_intr_free is not safe to call from the timer ISR because it requiresthe current task (the one the ISR interrupted) to be pinned to the samecore as the interrupt was allocated on. Merely disabling the ISR however issafe since that only requires that we're currently running on the same core(which the ISR always is), regardless of the current task.This was causing deadlocks in machine_uart when the ISR happened tointerrupt a task that was not pinned to a specific core.Signed-off-by: Daniël van de Giessen <daniel@dvdgiessen.nl>
If the interrupt is not freed but merely disabled, instead of reallocatingit every time the timer is enabled again we can instead just re-enable it.That means we're no longer setting the handler every time, and we need toensure it does not change. Doing so by adding an additional wrapperfunction does not only solve that problem, it also allows us to removesome code duplication and simplify how machine_uart uses the timer.Signed-off-by: Daniël van de Giessen <daniel@dvdgiessen.nl>
Copy link
Member

@dpgeorgedpgeorge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks, this looks good.

Regarding tests: I'm OK not adding any new tests. We already have UART IRQ tests which should test part of the changes here. And eventually we'll have bettermachine.Timer tests that cover all ports.

@dpgeorgedpgeorgeforce-pushed theesp32_timer_interrupt branch from2015bda to2c2f0b2CompareJune 5, 2025 06:46
@dpgeorgedpgeorge merged commit2c2f0b2 intomicropython:masterJun 5, 2025
8 checks passed
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@projectgusprojectgusprojectgus approved these changes

@dpgeorgedpgeorgedpgeorge approved these changes

Assignees
No one assigned
Labels
Projects
None yet
Milestone
release-1.26.0
Development

Successfully merging this pull request may close these issues.

3 participants
@DvdGiessen@projectgus@dpgeorge

[8]ページ先頭

©2009-2025 Movatter.jp