esp_intr_free is not safe to call from the timer ISR because it requires the current task (the one the ISR interrupted) to be pinned to the same core as the one the interrupt was allocated on. The ISR is of course guaranteed to interrupt a task on the same core, but it's not guaranteed that that task is also pinned to that core.

This was causing a lockup followed by an ISR watchdog timeout in the machine_uart RXIDLE timer (which disables the timer from its own callback) when the ISR happened to interrupt a task that was not pinned to a specific core (for me it often hit the lwIP TCP/IP thread task, for example).

The first commit in this PR fixes that by merely disabling the interrupt, which is safe to do from the ISR since that only requires that we're currently running on the same core (which the ISR always is), regardless of the current task.
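To make the difference between the two calls concrete, here is a minimal host-side sketch of the conditions described above. It is a simplified model under stated assumptions, not ESP-IDF code: every name in it (`model_free_path`, `model_disable_allowed`, `MODEL_NO_AFFINITY`, and so on) is hypothetical and only encodes "freeing needs a task pinned to the interrupt's core, disabling only needs to run on that core".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not ESP-IDF code: all names below are made up. */
#define MODEL_NO_AFFINITY (-1)   /* the interrupted task is not pinned to a core */

typedef enum {
    MODEL_RUNS_LOCALLY,   /* work is done directly in the calling context */
    MODEL_BLOCKS_ON_IPC   /* work is handed off and the caller waits for it */
} model_path_t;

/* Freeing: needs the current task to be pinned to the interrupt's core. */
model_path_t model_free_path(int intr_core, int current_core, int task_affinity) {
    /* A running task is either unpinned or pinned to the core it runs on. */
    assert(task_affinity == MODEL_NO_AFFINITY || task_affinity == current_core);
    if (task_affinity == MODEL_NO_AFFINITY || intr_core != current_core) {
        return MODEL_BLOCKS_ON_IPC;
    }
    return MODEL_RUNS_LOCALLY;
}

/* Disabling: only needs to be running on the interrupt's core. */
bool model_disable_allowed(int intr_core, int current_core) {
    return intr_core == current_core;
}

/* Inside the ISR the current core always equals the interrupt's core,
 * so only the affinity of the interrupted task varies. */
bool model_free_is_isr_safe(int intr_core, int interrupted_task_affinity) {
    return model_free_path(intr_core, intr_core, interrupted_task_affinity)
           == MODEL_RUNS_LOCALLY;
}
```

In this model, `model_free_is_isr_safe(0, 0)` is true while `model_free_is_isr_safe(0, MODEL_NO_AFFINITY)` is false, whereas `model_disable_allowed(core, core)` holds no matter which task was interrupted.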

Additionally, the second commit makes repeated disabling and enabling of the interrupt (such as the UART RXIDLE timer does) a bit more efficient by re-enabling the interrupt instead of reallocating it. That also allowed removing some code duplication and simplifying how machine_uart uses the timer.
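The allocate-once, re-enable-afterwards pattern can be sketched as follows. This is a host-side model under assumptions, not the code from this PR: `sim_timer_t` and all `sim_*` functions are hypothetical stand-ins, and the "interrupt" is just a flag. It shows a fixed wrapper handler dispatching to a changeable callback, and a one-shot callback that disables the timer from inside its own handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side model; none of these names come from MicroPython or ESP-IDF. */
typedef struct sim_timer sim_timer_t;
typedef void (*sim_callback_t)(sim_timer_t *timer);

struct sim_timer {
    bool allocated;          /* the interrupt has been allocated */
    bool enabled;            /* the interrupt is currently enabled */
    int alloc_count;         /* number of allocations performed */
    int fire_count;          /* number of times the handler ran */
    sim_callback_t callback; /* user callback; may change between enables */
};

/* Fixed wrapper: the only handler ever registered, so it never has to change. */
void sim_timer_isr_wrapper(sim_timer_t *timer) {
    timer->fire_count++;
    if (timer->callback != NULL) {
        timer->callback(timer);
    }
}

void sim_timer_enable(sim_timer_t *timer, sim_callback_t callback) {
    assert(timer != NULL);
    timer->callback = callback;
    if (!timer->allocated) {
        timer->allocated = true;   /* first use: allocate the interrupt */
        timer->alloc_count++;
    }
    timer->enabled = true;         /* every use: (re-)enable it */
}

/* Safe to call from the callback itself: nothing is freed. */
void sim_timer_disable(sim_timer_t *timer) {
    timer->enabled = false;
}

/* Simulates the hardware raising the interrupt. */
void sim_timer_fire(sim_timer_t *timer) {
    if (timer->allocated && timer->enabled) {
        sim_timer_isr_wrapper(timer);
    }
}

/* One-shot callback in the style of an RX-idle timer: it disables itself. */
void sim_one_shot_callback(sim_timer_t *timer) {
    sim_timer_disable(timer);
}
```

Enabling and firing such a timer repeatedly leaves `alloc_count` at 1, which is the efficiency gain described above; the wrapper is also the single place where shared handler logic can live.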

Testing

I've been using these fixes in combination with #17138 (which changed the PPP implementation to use that RXIDLE IRQ). Like that PR, I've been running this on three custom boards (1x original ESP32, 2x ESP32S3) for a few weeks at the time of writing.

DvdGiessen changed the title ~~ports/esp32: Improve timer interrupt aloc~~ ports/esp32: Fix crash and improve timer interrupt allocation

May 7, 2025

DvdGiessen force-pushed the esp32_timer_interrupt branch from 2e0fb59 to 2015bda

May 7, 2025 15:22

projectgus self-requested a review

May 9, 2025 06:19

dpgeorge added the port-esp32 label

May 20, 2025

projectgus reviewed

May 29, 2025


Contributor

projectgus left a comment


These changes look good to me, @DvdGiessen, and even without the use-after-free bug I agree on principle that it's better to be disabling/re-enabling the interrupt rather than freeing and re-allocating it where we can.

My only concern is that it'd be nice to have either a regression test or test coverage for this. I'm guessing that's non-trivial, yeah? i.e. It doesn't sound like the crash would be easy to reproduce in a unit test, and there's no existing unit test for machine.Timer on esp32 so "extending" test coverage would really mean "implementing" test coverage which is a much bigger task. What do you think?


Contributor, Author

DvdGiessen commented May 29, 2025

Thanks for reviewing! Note I don't think it's a use-after-free; because of the use of esp_ipc_call_blocking, it's more of a deadlock-because-the-ISR-waits-on-IPC-which-cant-run-until-the-ISR-is-finished. While triggering the bug is nondeterministic (it depends on which task is currently scheduled), once triggered the behaviour is deterministic.
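As a rough illustration of that wait, the sketch below models a blocking call whose request can only be serviced once the ISR has returned. It is a host-side model with hypothetical names (`sim_core_t`, `sim_service_request`, `sim_call_blocking`), not ESP-IDF code, and it bounds the wait with `max_polls` where a real blocking call would simply never return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the wait described above; not ESP-IDF code. */
typedef struct {
    bool in_isr;        /* the core is still executing the ISR */
    bool request_done;  /* set once the queued request has been serviced */
} sim_core_t;

/* The queued request is serviced by a task, which cannot run while the ISR is active. */
void sim_service_request(sim_core_t *core) {
    if (!core->in_isr) {
        core->request_done = true;
    }
}

/* Blocking call: queue a request, then poll until it has been serviced.
 * Returns true if the request completed within max_polls attempts. */
bool sim_call_blocking(sim_core_t *core, int max_polls) {
    assert(max_polls >= 0);
    core->request_done = false;
    for (int i = 0; i < max_polls; i++) {
        sim_service_request(core);   /* the servicing task gets a chance to run */
        if (core->request_done) {
            return true;
        }
    }
    return false;                    /* never serviced: the modelled deadlock */
}
```

Called with `in_isr` false the request completes on the first poll; called with `in_isr` true it never completes regardless of `max_polls`, which matches the observation that the behaviour is deterministic once the unpinned-task condition is hit.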

I agree that the ESP32 could use some machine.Timer tests, but I don't think it'll be easy to test this specific bug. Because it depends on scheduling (and on having a non-pinned task in the first place), it won't be easily reproduced with a unit test; the best we could do is either engineer the circumstances with a modified build (which seems a bit pointless) or run a reproduction in a loop with the standard build and assert it doesn't crash within some timeout.


Contributor

projectgus commented May 30, 2025

> Note I don't think it's a use-after-free, it's due to the use of esp_ipc_call_blocking that it's more of a deadlock-because-the-ISR-waits-on-IPC-which-cant-run-until-the-ISR-is-finished

Oh, sorry, yes, you explained the issue very clearly in your description! I read it, then waited a week, then reviewed the code, so I'd forgotten what you said about it. 🤦

I think it's probably OK to merge this without worrying about automated testing, if Damien is OK with that.

projectgus approved these changes

May 30, 2025


projectgus requested a review from dpgeorge

May 30, 2025 06:19

dpgeorge added this to the release-1.26.0 milestone

Jun 4, 2025

DvdGiessen added 2 commits

June 5, 2025 16:39

esp32/machine_timer: Do not free interrupt from ISR.

bf90930

esp_intr_free is not safe to call from the timer ISR because it requires the current task (the one the ISR interrupted) to be pinned to the same core as the interrupt was allocated on. Merely disabling the ISR however is safe since that only requires that we're currently running on the same core (which the ISR always is), regardless of the current task.

This was causing deadlocks in machine_uart when the ISR happened to interrupt a task that was not pinned to a specific core.

Signed-off-by: Daniël van de Giessen <daniel@dvdgiessen.nl>

esp32: Re-use allocated timer interrupts and simplify UART timer code.

2c2f0b2

If the interrupt is not freed but merely disabled, instead of reallocating it every time the timer is enabled again we can instead just re-enable it. That means we're no longer setting the handler every time, and we need to ensure it does not change. Doing so by adding an additional wrapper function does not only solve that problem, it also allows us to remove some code duplication and simplify how machine_uart uses the timer.

Signed-off-by: Daniël van de Giessen <daniel@dvdgiessen.nl>

dpgeorge approved these changes

Jun 5, 2025


Member

dpgeorge left a comment


Thanks, this looks good.

Regarding tests: I'm OK not adding any new tests. We already have UART IRQ tests which should test part of the changes here. And eventually we'll have better machine.Timer tests that cover all ports.

dpgeorge force-pushed the esp32_timer_interrupt branch from 2015bda to 2c2f0b2

June 5, 2025 06:46

dpgeorge merged commit 2c2f0b2 into micropython:master

Jun 5, 2025

8 checks passed

Labels

port-esp32

3 participants


ports/esp32: Fix crash and improve timer interrupt allocation #17265


DvdGiessen commented May 7, 2025