mpremote: Add automatic PTY device detection for QEMU #18327


Open
andrewleech wants to merge 2 commits into micropython:master from andrewleech:mpremote_pty

andrewleech (Contributor) commented Oct 25, 2025 (edited)

tools: Add automatic PTY device detection for QEMU

Summary

Both mpremote and pyboard.py can hang when connecting to PTY devices (QEMU serial output) due to PTY inWaiting() timing behavior. While the official test suite usually passes (short tests + timing workarounds mask the issue), long-running operations and heavy serial traffic expose intermittent hangs. This adds automatic PTY detection to both tools and uses blocking reads for PTY devices while maintaining non-blocking behavior for real serial devices.

Impact:

  • Eliminates intermittent hangs in QEMU-based testing workflows
  • Improves official test suite reliability (removes timing-dependent failures)
  • Enables long-running QEMU tests without manual patches
  • Consistent behavior across both serial communication tools

Problem

PTY (pseudo-terminal) devices used by QEMU don't reliably report inWaiting() status. Since commit 0d46e45a1 switched to non-blocking reads, mpremote can fail on PTY devices:

serial.serialutil.SerialException: device reports readiness to read but returned no data

Why it fails (when it does):

  1. PTY inWaiting() behavior depends on timing - when it's called relative to when data arrives
  2. mpremote waits for inWaiting() > 0 before reading
  3. If the timing is unfavorable, the connection times out spinning in the polling loop

The timing-dependent nature:

  • PTY inWaiting() behavior depends on OS scheduling and data arrival timing
  • Sometimes data arrives fast enough that the polling loop catches it (tests pass)
  • Sometimes the process scheduler is favorable (tests pass)
  • Long-running operations increase probability of hitting the timing window where inWaiting() returns 0 while data is actually buffered (tests fail)

Current workarounds:

  • Users must manually patch mpremote/pyboard.py for reliable QEMU testing
  • Timing delays (0.1s sleep, commit 0950f65) reduce but don't eliminate failures

Why Official Tests Usually Pass

The official MicroPython test suite runs against QEMU (using pyboard.py from line 369 of tests/run-tests.py) and generally passes in CI, despite this underlying PTY issue. Several factors mask the problem:

  1. Short test duration: Most tests run <1s, limiting exposure to the timing window
  2. Timing workaround: Commit 0950f65 added a 0.1s sleep after PTY detection, reducing (but not eliminating) race conditions
  3. Favorable conditions: Quick tests with minimal serial traffic often complete before hitting the timing issue
  4. Dismissed failures: Occasional CI failures likely attributed to "QEMU flaky" rather than investigated

What exposes the bug reliably:

  • Long-running operations: OpenMV's 180s test suite (81 sequential tests with image processing)
  • Heavy serial traffic: Image data transfer, continuous operations
  • Production workloads: Real-world usage patterns vs. minimal test cases

The fix eliminates the timing dependency entirely by using blocking reads (which pyserial handles correctly for PTYs) instead of polling inWaiting().

Root Cause Technical Details

PTY devices (Unix98 pseudo-terminals at /dev/pts/N) have timing-dependent I/O behavior:

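Method           | Real Serial         | PTY Device
-----------------|---------------------|-----------------------------------------------
inWaiting()      | Pending byte count  | Timing-dependent (can return 0 even with data)
read(1) blocking | Blocks until timeout | Returns immediately with data
select()         | Accurate            | Accurate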
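
The select() row of the table can be checked without QEMU. The following is a minimal sketch, assuming a POSIX system where os.openpty() is available; the helper name pty_has_data is hypothetical and is not part of mpremote or pyboard.py. It writes to the slave end of a fresh PTY pair and asks select() whether the master end is readable.

```python
import os
import select


def pty_has_data(fd, timeout=1.0):
    """Return True if `fd` becomes readable within `timeout` seconds.

    Hypothetical helper for illustration only.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)


# Create a PTY pair; the slave end stands in for QEMU's serial output.
master_fd, slave_fd = os.openpty()
try:
    # Nothing has been written yet, so the master should not be readable.
    before = pty_has_data(master_fd, timeout=0.0)

    # Simulate the target sending data.
    os.write(slave_fd, b"hello")

    # select() should now report the master as readable.
    after = pty_has_data(master_fd, timeout=1.0)
    data = os.read(master_fd, 5)
finally:
    os.close(master_fd)
    os.close(slave_fd)

print(before, after, data)
```

This only demonstrates select() on a local PTY pair; it does not reproduce the inWaiting() race itself, which depends on scheduling and on how QEMU feeds the device.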

The race condition:

Time  | QEMU/Target       | mpremote (stock)
------|-------------------|------------------
T0    | Sends data        |
T1    |                   | Calls inWaiting()
T2    | Data buffered     | Returns 0 (race!)
T3    |                   | sleep(0.01)
T4    |                   | Calls inWaiting() again
...   | (repeat)          | (spinning)

Stock mpremote code path:

while True:
    if data.endswith(ending):
        break
    elif self.serial.inWaiting() > 0:  # PTY: Timing-dependent!
        new_data = self.serial.read(1)
        # Process data
    else:
        time.sleep(0.01)  # Can spin here if race condition hits

The code checks inWaiting() before reading, which is unreliable for PTY devices due to timing.

Solution

Auto-detect PTY devices and use appropriate read strategy:

  • PTY devices: Blocking reads (timeout handled by pyserial)
  • Serial devices: Non-blocking with inWaiting() check (unchanged)

Detection method:

  1. Check device path: /dev/pts/[0-9]+ (Linux Unix98 PTY)
  2. Verify character device with major number 136
  3. Fall back to serial behavior if detection fails or errors
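
The three detection steps above could look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: the function name is_linux_pty and the constant names are hypothetical, and only the path pattern and major number 136 come from the description.

```python
import os
import re
import stat

# Path pattern and major number as described in the PR text.
_PTY_PATH = re.compile(r"/dev/pts/[0-9]+")
_UNIX98_PTY_MAJOR = 136


def is_linux_pty(device):
    """Return True only if `device` looks like a Linux Unix98 PTY.

    Hypothetical helper: any failure falls back to False, i.e. the
    caller keeps using the existing serial-device behavior.
    """
    # Step 1: the path must match /dev/pts/N exactly.
    if not isinstance(device, str) or not _PTY_PATH.fullmatch(device):
        return False
    try:
        # Step 2: it must be a character device with major number 136.
        st = os.stat(device)
        return stat.S_ISCHR(st.st_mode) and os.major(st.st_rdev) == _UNIX98_PTY_MAJOR
    except (OSError, AttributeError):
        # Step 3: stat() failed or os.major() is missing, so assume serial.
        return False


print(is_linux_pty("/dev/ttyUSB0"))
print(is_linux_pty("rfc2217://localhost:4000"))
```

Returning False on every error path keeps the check conservative: a device that cannot be positively identified as a PTY is treated exactly as before.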

Addressing the Original Concern

The original commit 0d46e45a1 introduced the inWaiting() check to solve a legitimate problem:

"If the target does not return any data then read_until() will block indefinitely."

Our fix respects this concern by relying on pyserial's timeout mechanism:

  1. interCharTimeout: 1 - Blocking reads return empty after 1 second if no data arrives
  2. read_until() timeout checks - Function returns on timeout (default 10s)
  3. No indefinite blocking - Even if target sends no data, reads will timeout

The difference is:

  • Serial devices: Still use non-blocking reads with inWaiting() check (commit 0d46e45 behavior preserved)
  • PTY devices: Use blocking reads with pyserial timeout (avoids inWaiting() timing issues)

Both approaches prevent indefinite blocking - the PTY path just uses the timeout mechanism that was already present in the code.
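
To make the two read strategies concrete, here is a simplified sketch of a read loop that takes either path and still gives up after a timeout. It is an illustration, not the patched mpremote code: this read_until signature, the is_pty flag, and the FakePort class (a stand-in for a pyserial port) are all assumptions made for the example.

```python
import time


class FakePort:
    """Stand-in for a serial port (hypothetical, for illustration only).

    With honest_in_waiting=False it mimics the worst case of the PTY race:
    inWaiting() reports 0 even though data is buffered.
    """

    def __init__(self, payload, honest_in_waiting):
        self._buf = bytearray(payload)
        self._honest = honest_in_waiting

    def inWaiting(self):
        return len(self._buf) if self._honest else 0

    def read(self, n):
        # Returns b"" when nothing is buffered, like a read that timed out.
        chunk = bytes(self._buf[:n])
        del self._buf[:n]
        return chunk


def read_until(port, ending, is_pty, timeout=10.0, poll_interval=0.01):
    """Read until `ending` is seen, or until `timeout` seconds pass with no data."""
    data = b""
    last_data = time.monotonic()
    while not data.endswith(ending):
        if is_pty:
            # PTY path: read directly and rely on the port's own read timeout.
            chunk = port.read(1)
        elif port.inWaiting() > 0:
            # Serial path: only read when the port reports pending bytes.
            chunk = port.read(1)
        else:
            chunk = b""
            time.sleep(poll_interval)
        if chunk:
            data += chunk
            last_data = time.monotonic()
        elif time.monotonic() - last_data > timeout:
            break  # no indefinite blocking on either path
    return data


# inWaiting() lies: the polling path times out, the blocking path succeeds.
print(read_until(FakePort(b"OK>", False), b">", is_pty=False, timeout=0.2))
print(read_until(FakePort(b"OK>", False), b">", is_pty=True, timeout=0.2))
```

With a port whose inWaiting() is accurate, the serial path returns the same data as the PTY path, which matches the claim that real serial devices keep their current behavior.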

Patch benefits:

  • ✅ Zero configuration required
  • ✅ No breaking changes
  • ✅ Graceful fallback on errors
  • ✅ No performance impact on serial devices
  • ✅ Preserves original fix for non-responsive targets

OpenMV Testing

The primary validation comes from OpenMV's production use case, which consistently exposed the issue:

Test environment:

  • Firmware: OpenMV vision processing firmware on QEMU MPS2-AN500
  • Test suite: 81 image processing tests
  • Duration: ~180 seconds continuous operation
  • Workload: Filesystem I/O, raw REPL, large data transfers, ML inference
  • Serial traffic: Heavy (image data transfer)

Results:

  • Without fix: Intermittent hangs requiring manual patch
  • With fix: 81/81 tests passed (100%)

OpenMV has been maintaining a manual patch for this issue: tools/mpremote-qemu-serial.patch

This PR upstreams their fix with automatic detection, allowing them to remove the downstream patch.

Why Long-Running Tests Matter

The timing-dependent nature of this issue means:

  • Short tests (<1s): Often pass due to favorable timing
  • Long tests (>60s): Expose the race condition reliably

OpenMV's 180-second continuous workload provided the consistent reproduction needed to identify and fix the issue.

Compatibility

Configuration                             | Impact               | Behavior
------------------------------------------|----------------------|-------------------------------------
Linux PTY (/dev/pts/*)                    | ✅ Fixed             | Auto-detected, uses blocking reads
Linux serial (/dev/ttyUSB*, /dev/ttyACM*) | ✅ No change         | Uses inWaiting() as before
macOS/Windows PTY                         | ✅ Graceful fallback | Path mismatch → uses serial behavior
RFC2217 network serial                    | ✅ No change         | Not a PTY path
Device errors                             | ✅ Safe fallback     | Assumes serial device

Safety guarantees:

  • Detection failure → Falls back to current serial behavior
  • os.major() missing → Falls back (Python 3.3+ has it)
  • stat() errors → Falls back
  • No code paths removed → Zero breaking changes

Performance

Aspect             | Real Serial            | PTY Device
-------------------|------------------------|-------------------------
Detection overhead | ~0.1ms (one stat call) | ~0.1ms (one stat call)
Read behavior      | Unchanged              | Blocking (no 10ms sleep)
CPU usage          | Unchanged              | Lower (no polling)
Latency            | Unchanged              | Improved (immediate)

Serial devices: Zero performance impact - code path unchanged, still uses efficient non-blocking reads with inWaiting() checks.

PTY devices: Improved performance by eliminating 10ms sleep polling.

Platform Notes

Linux-specific: Current implementation detects Linux Unix98 PTY (major 136). Other platforms fall back to serial behavior.

Background

OpenMV (https://github.com/openmv/openmv) uses QEMU for CI/CD testing of their MicroPython-based vision firmware. They've maintained a manual patch to work around this issue. This PR upstreams their fix with automatic detection.

Checklist

  • Code follows MicroPython style guidelines
  • Tested with OpenMV QEMU (real-world validation)
  • No breaking changes
  • Backward compatible (graceful fallback)
  • Performance impact analyzed (zero for serial)
  • Pre-commit hooks passed

TL;DR: mpremote and pyboard.py can hang intermittently on QEMU PTY devices due to timing-dependent inWaiting() behavior. While official tests usually pass (short duration + timing workarounds mask it), long-running operations expose the issue reliably. This adds automatic PTY detection (path + device major number) and uses blocking reads for PTY while keeping non-blocking behavior for real serial. Eliminates timing dependency. Fully backward compatible with graceful fallback. Validated with OpenMV's 180s continuous workload (81/81 tests passed).

codecov (bot) commented Oct 25, 2025 (edited)
Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.38%. Comparing base (27b7bf3) to head (510f259).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #18327   +/-   ##
=======================================
  Coverage   98.38%   98.38%
=======================================
  Files         171      171
  Lines       22297    22297
=======================================
  Hits        21936    21936
  Misses        361      361


github-actions (bot) commented Oct 25, 2025 (edited)

Code size report:

Reference:  py/objcode: Remove `mp_obj_code_t.lnotab` field from v2 preview. [e0a9b70]
Comparison: tools/pyboard: Add automatic PTY device detection for QEMU. [merge of 510f259]
  mpy-cross:    +0 +0.000%
   bare-arm:    +0 +0.000%
minimal x86:    +0 +0.000%
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

andrewleech marked this pull request as draft October 25, 2025 22:37
andrewleech changed the title from "tools: Add automatic PTY device detection for QEMU" to "mpremote: Add automatic PTY device detection for QEMU" Oct 25, 2025
agatti (Contributor) commented Oct 26, 2025 (edited)

I believe you can test the RV32/RV64 targets with fake time tests that would validate those changes. Now that #18234 has been merged, the QEMU target has the time module compiled in, so time.sleep is available.

If I'm not mistaken the test runner terminates the whole lot if there isn't any output either within 60s since the test start, or if there's no output after 60s since the characters were seen on STDOUT. If it's the latter case, then it won't take long to come up with a test that loops N times waiting 50s on each iteration, to see if your changes work :)

andrewleech (Contributor, Author) commented

Thanks @agatti for the suggestion!

Branch rebased on upstream/master.

Validated the PTY fix with a timeout stress test on QEMU MPS2-AN385 (ARM Cortex-M3).

Test setup

Built the QEMU port:

cd mpy-cross && make
cd ../ports/qemu
make submodules
make BOARD=MPS2_AN385

Started QEMU with PTY serial redirection:

make BOARD=MPS2_AN385 run
# Output: char device redirected to /dev/pts/36 (label serial0)

Ran test script via mpremote connected to the PTY:

mpremote connect /dev/pts/36 run test_pty_timeout.py

Test code (temporary, not committed)

"""One-off test to validate PTY communication with long delays.

Tests that mpremote can read output from QEMU PTY devices even
with delays approaching the 60s timeout threshold.
"""
import time

print("Starting PTY timeout test")
for i in range(3):
    print(f"Iteration {i} starting")
    time.sleep(50)  # 50s per iteration - just under 60s timeout
    print(f"Iteration {i} complete")
print("Test completed successfully")

Results

Without fix (commit 27b7bf3):

Starting PTY timeout test
Iteration 0 starting
Iteration 0 complete
Iteration 1 starting
[hangs after 90 seconds]

mpremote times out during iteration 1 because inWaiting() returns 0 on PTY devices.

With fix (commit 45aaf12):

Starting PTY timeout test
Iteration 0 starting
Iteration 0 complete
Iteration 1 starting
Iteration 1 complete
Iteration 2 starting
Iteration 2 complete
Test completed successfully

All 3 iterations (150 seconds total) complete without timeout.

PTY devices used by QEMU don't reliably report inWaiting() status. This adds automatic PTY detection (Linux /dev/pts/* with major 136) and uses blocking reads for PTYs while maintaining non-blocking behavior for real serial devices.

Fixes intermittent hangs when running tests against QEMU targets.

See PR micropython#18327 for detailed analysis and validation.

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>

andrewleech pushed a commit to andrewleech/micropython that referenced this pull request Nov 1, 2025

Applies the same PTY detection fix to pyboard.py as mpremote. The official test suite uses pyboard.py (tests/run-tests.py), so both tools need the fix.

See PR micropython#18327 for detailed analysis and validation.

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
andrewleech (Contributor, Author) commented

@iabdalkader this was one of my distractions that got in the way of updating the openmv unix port PR - I'd appreciate a review in the hope it'll fix the issue you currently patch for in https://github.com/openmv/openmv/blob/master/tools/mpremote-qemu-serial.patch?

iabdalkader (Contributor) commented Nov 1, 2025 (edited)

"I'd appreciate a review"

I don't know enough about mpremote internals to properly review it, but I did test it and can confirm it fixes the issue. Normally, without my patch, tests that read files could take up to 20 or 30 seconds each (that is, if they don't fail completely). With this patch, those tests run in 2-3 seconds, similar to my patch that bypasses inWaiting(). The total time for ~80 tests is 60 seconds, vs. 7 minutes.

FWIW I had a test case bundle here that reproduces the issue: #18234 (comment)

Note, it seems to make the CI tests here hang maybe.


dpgeorge added the "tools" label (Relates to tools/ directory in source, or other tooling) Nov 3, 2025
andrewleech marked this pull request as ready for review November 4, 2025 02:53
andrewleech marked this pull request as draft November 4, 2025 02:53
andrewleech marked this pull request as ready for review November 4, 2025 03:58

5 participants: @andrewleech, @agatti, @iabdalkader, @dpgeorge, @pi-anl
