gh-106939, gh-145261: Fix ShareableList data corruption #145488

Open
jakelodwick wants to merge 1 commit into python:main from jakelodwick:fix-shareablelist-corruption

Conversation

@jakelodwick commented Mar 4, 2026 (edited by github-actions bot)

ShareableList has two data corruption bugs, both rooted in the same design flaw: C-style null-terminated storage semantics applied to length-delimited Python types.

Bug 1 - UTF-8 underallocation (#145261, #88336): Slot allocation uses len(item) (character count) instead of byte count for str items. Multi-byte UTF-8 strings overflow their allocated slot and corrupt adjacent data.
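To make the mismatch concrete, here is a minimal sketch comparing the two sizing rules. The 8-byte alignment and the round-up formula are assumptions chosen for illustration (they match the "5-byte string in an 8-byte slot" example later in this description), and the helper names are hypothetical rather than taken from the patch.

```python
ALIGNMENT = 8  # assumed slot granularity, for illustration only


def slot_size_by_chars(item: str) -> int:
    """Size the slot from the character count (the behaviour described as buggy)."""
    return ALIGNMENT * (len(item) // ALIGNMENT + 1)


def slot_size_by_bytes(item: str) -> int:
    """Size the slot from the encoded UTF-8 byte count (what this PR proposes)."""
    return ALIGNMENT * (len(item.encode("utf-8")) // ALIGNMENT + 1)


text = "\u4e2d" * 7              # seven characters, each 3 bytes in UTF-8
encoded = text.encode("utf-8")   # 21 bytes in total

print(len(text), len(encoded))
print(slot_size_by_chars(text))  # smaller than the encoded data
print(slot_size_by_bytes(text))  # large enough for the encoded data
```

For pure ASCII input the two rules agree, which is presumably why the problem only appears with multi-byte text.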

Bug 2 - Null stripping (#106939, #96779): The back-transform lambdas call rstrip(b'\x00') to remove struct padding, but this also strips legitimate trailing null bytes from user data.
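The ambiguity can be reproduced with struct alone. This is a sketch, not the module's code: the 8-byte slot and the two reader helpers are hypothetical names used only to contrast the strategies.

```python
import struct

SLOT_FORMAT = "8s"  # hypothetical 8-byte slot


def read_with_rstrip(slot: bytes) -> bytes:
    """Null-termination style read: cannot tell padding from payload."""
    return slot.rstrip(b"\x00")


def read_with_length(slot: bytes, length: int) -> bytes:
    """Length-delimited read: returns exactly the bytes that were stored."""
    return slot[:length]


payload = b"ab\x00\x00"                    # user data that ends in null bytes
slot = struct.pack(SLOT_FORMAT, payload)   # struct pads the slot with nulls

print(read_with_rstrip(slot))              # trailing nulls of the payload are lost
print(read_with_length(slot, len(payload)))
```

Once the slot is padded, the payload's own nulls and the padding are indistinguishable unless the length is recorded somewhere.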

Approach

Store actual byte lengths in the format metadata (separate from the allocated slot sizes used for struct.pack_into), and use those exact lengths during retrieval instead of relying on null-termination. This single conceptual change fixes both bugs.

Specifically:

  • __init__: compute slot allocation using len(item.encode('utf-8')) for str items
  • __init__: build _stored_formats list with actual byte lengths, write those to packing metadata
  • __setitem__: separate pack_format (allocated slot size) from new_format (actual byte length in metadata)
  • _back_transforms_mapping: remove rstrip(b'\x00')
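The separation between the packing format and the stored format can be sketched on a plain bytearray. This is an illustration of the idea under stated assumptions, not the patch itself: write_item and read_item are hypothetical helpers, and the real class keeps its formats in shared-memory metadata rather than returning them.

```python
import struct


def write_item(buf: bytearray, offset: int, data: bytes, slot_size: int) -> str:
    """Pack data into a fixed-size slot; return the format to record as metadata."""
    if len(data) > slot_size:
        raise ValueError("data does not fit in the allocated slot")
    pack_format = f"{slot_size}s"     # allocated slot size, used only for writing
    struct.pack_into(pack_format, buf, offset, data)
    return f"{len(data)}s"            # actual byte length, used for reading


def read_item(buf: bytearray, offset: int, stored_format: str) -> bytes:
    """Unpack exactly the recorded number of bytes; no rstrip needed."""
    (value,) = struct.unpack_from(stored_format, buf, offset)
    return value


buf = bytearray(16)                                   # two 8-byte slots
fmt_a = write_item(buf, 0, b"ab\x00", 8)              # bytes ending in a null
fmt_b = write_item(buf, 8, "\u00e9".encode("utf-8"), 8)  # one 2-byte character

print(fmt_a, read_item(buf, 0, fmt_a))
print(fmt_b, read_item(buf, 8, fmt_b).decode("utf-8"))
```

Because the read side trusts the recorded length, the padding written by struct.pack_into never reaches the caller.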

Cross-version compatibility

  • Old writer → new reader: The old code stores allocated-length values in format metadata (e.g. 8s for a 5-byte string in an 8-byte slot). New code reads that 8s format and returns 8 bytes including padding nulls. This is the same behavior as the current release, so there is no regression.
  • New writer → old reader: New code stores actual byte length (e.g. 5s). Old code reads 5s, gets 5 bytes, then calls rstrip(b'\x00'); this is harmless unless the data actually ends in nulls, which is the existing bug.
  • Neither direction crashes. The shared memory layout (offsets, block sizes) is unchanged.
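The byte-level mechanics of the two mixed-version cases can be simulated as follows. This is a sketch under assumptions: old_reader and new_reader are hypothetical stand-ins that model only the unpack step described in the bullets above, not the real classes.

```python
import struct


def old_reader(buf: bytearray, offset: int, fmt: str) -> bytes:
    """Models the pre-fix read path: unpack, then strip trailing nulls."""
    (raw,) = struct.unpack_from(fmt, buf, offset)
    return raw.rstrip(b"\x00")


def new_reader(buf: bytearray, offset: int, fmt: str) -> bytes:
    """Models the post-fix read path: unpack exactly what the format says."""
    (raw,) = struct.unpack_from(fmt, buf, offset)
    return raw


buf = bytearray(8)
struct.pack_into("8s", buf, 0, b"hello")   # 5 bytes of data in an 8-byte slot

# Old writer recorded the slot size ("8s"); the new reader returns the padding too.
print(new_reader(buf, 0, "8s"))

# New writer recorded the data length ("5s"); the old reader's rstrip is a no-op.
print(old_reader(buf, 0, "5s"))

# New writer, data ending in a null: the old reader still strips it.
struct.pack_into("8s", buf, 0, b"hi\x00")
print(old_reader(buf, 0, "3s"))
```

In every case the reads stay inside the same 8-byte slot, consistent with the layout being unchanged.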

Tests

  • test_shared_memory_ShareableList_trailing_nulls: bytes with trailing nulls, str with trailing nulls, all-null bytes, empty bytes, no-null bytes, cross-process read via name=
  • test_shared_memory_ShareableList_multibyte_utf8: 1-byte (ASCII), 2-byte (é), 3-byte (中), and 4-byte (𐀀) UTF-8 sequences with cross-process verification
  • Updated existing sl.format assertions to reflect actual byte lengths
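As a quick check on why those four sample characters cover every UTF-8 width, the snippet below prints the encoded size of each. The code points are written as escapes and are assumed to be the ones shown in the test description; this is not part of the test suite.

```python
# Each sample is a single character; only the encoded width differs.
samples = {
    "ASCII 'a'": "a",
    "U+00E9": "\u00e9",
    "U+4E2D": "\u4e2d",
    "U+10000": "\U00010000",
}

widths = {label: len(ch.encode("utf-8")) for label, ch in samples.items()}

for label, width in widths.items():
    print(f"{label}: 1 character, {width} byte(s)")
```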

Prior work

This PR consolidates the approaches from #144559 (@aisk, null-stripping fix) and #145266 (@zetzschest, both fixes). Both PRs are open with zero reviews. The fixes belong together because they share the same root cause and the same solution mechanism (stored byte lengths in format metadata).


📚 Documentation preview 📚: https://cpython-previews--145488.org.readthedocs.build/

Store actual byte lengths in format metadata instead of allocated slot sizes, so retrieval extracts exact data without relying on null-termination. Use byte count instead of character count for str slot allocation to prevent multi-byte UTF-8 overflow.
@python-cla-bot

The following commit authors need to sign the Contributor License Agreement:

CLA not signed


Reviewers

@gpshead: awaiting requested review (gpshead is a code owner)
