Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-96735: Fix undefined behaviour in struct unpacking functions#96739

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

mdickinson
Copy link
Member

@mdickinsonmdickinson commentedSep 10, 2022
edited
Loading

This PR fixes undefined behaviour in the struct module unpacking support functionsbu_longlong,lu_longlong,bu_int andlu_int; thanks to@kumaraditya303 for finding these.

The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op.
(Evidence from Godbolt:https://godbolt.org/z/5zvxodj64 .)

To make the conversions efficient, I've specialised the relevant functions for their output size: forbu_longlong andlu_longlong, this only entails checking that the output size is indeed8. Butbu_int andlu_int were used for format sizes2 and4 - I've split those into two separate functions each.

No tests, because all of the affected cases are already exercised by the test suite.

matthiasgoergens reacted with heart emoji
@mdickinson

This comment was marked as outdated.

@mdickinsonmdickinson changed the titlegh-96735: Fix undefined behaviour in bu_ulonglonggh-96735: Fix undefined behaviour in struct unpacking functionsSep 10, 2022
@mdickinson
Copy link
MemberAuthor

It'svery tempting to convert all of these functions to use fixed-width integer types (uint16,uint32,uint64), but that seems like too invasive a change for a bugfix PR. It's something we could possibly do for 3.12 only.

erlend-aasland and matthiasgoergens reacted with thumbs up emoji

Copy link
Contributor

@kumaraditya303kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM, I tested this and indeed it fixes the UB.

Thanks for fixing the other ones too, just noticed that those were also affected.

@kumaraditya303
Copy link
Contributor

It's very tempting to convert all of these functions to use fixed-width integer types (uint16, uint32, uint64), but that seems like too invasive a change for a bugfix PR. It's something we could possibly do for 3.12 only.

I agree, it would be better but best left for 3.12 only.

@mdickinsonmdickinson added the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelSep 13, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by@mdickinson for commit1294440 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelSep 13, 2022
@mdickinson
Copy link
MemberAuthor

@kumaraditya303 I've updated this PR to be more efficient, while still avoiding undefined behaviour and implementation-defined behaviour (specifically, the conversion from an unsigned type to the corresponding signed type when the value being converted is not representable in the signed type; cf. C99 §6.3.1.3p3).

The sign-extension is now branch free (and should compile to a no-op in the case that the C type being used has exactly the correct width), and the conversion from unsigned to signed should always compile to a no-op on any semi-reasonable compiler.

Godbolt example for the case where sign extension is needed:https://godbolt.org/z/e8roKrn6r
Godbolt example for the case where no sign extension is needed:https://godbolt.org/z/bPhbsT9Kn

kumaraditya303 reacted with thumbs up emoji

@matthiasgoergens
Copy link
Contributor

matthiasgoergens commentedSep 14, 2022
edited
Loading

We are currently compiling all non-pydebug builds with-fwrapv (at least on gcc and clang). That means the shenanigans going on in_struct.c are actually perfectly defined behaviour.

I don't think we will turn off-fwrapv for 3.11. So there's no need for a stopgap solution.

I'd say: just do the Right Thing for 3.12 inmain, and don't worry about 3.11. Especially if the Right Thing isvery tempting to do anyway.

See also#96821 and#96678 (comment)

As a minimal change for 3.11, I would suggest enabling-fwrapv even when giving--with-pydebug.

That being said, I'm not opposed to this PR. It's a great start to getting everything safe for-fstrict-overflow.

@mdickinson
Copy link
MemberAuthor

I'd say: just do the Right Thing for 3.12 inmain, and don't worry about 3.11. Especially if the Right Thing isvery tempting to do anyway.

Agreed; I think I'll merge this for 3.12 only.

matthiasgoergens reacted with rocket emoji

@mdickinsonmdickinson removed the needs backport to 3.10only security fixes labelSep 25, 2022
@mdickinsonmdickinson removed the needs backport to 3.11only security fixes labelSep 25, 2022
@mdickinson
Copy link
MemberAuthor

Some not very rigorous timings, on macOS/Intel, non-optimised non-debug build. (This PR is not primarily about performance, but it would be unfortunate if it caused a significant performance regression.)

On this branch:

mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"5000000 loops, best of 5: 92.6 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"5000000 loops, best of 5: 94.3 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"5000000 loops, best of 5: 99.1 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"5000000 loops, best of 5: 98.3 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"2000000 loops, best of 5: 99.1 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"2000000 loops, best of 5: 102 nsec per loop

On the main branch at commit6281aff:

mdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"  5000000 loops, best of 5: 94.9 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>hhh'); v=b'\x1b\xe0\xcfKSh'" "s.unpack(v)"  5000000 loops, best of 5: 98.2 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"2000000 loops, best of 5: 99.1 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>iii'); v=b'\x1b\xe0\xcfKSh'*2" "s.unpack(v)"2000000 loops, best of 5: 104 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('<qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"2000000 loops, best of 5: 101 nsec per loopmdickinson@lovelace cpython % ./python.exe -m timeit -s "import struct; s = struct.Struct('>qqq'); v=b'\x1b\xe0\xcfKSh'*4" "s.unpack(v)"2000000 loops, best of 5: 112 nsec per loop

@mdickinsonmdickinson merged commitf5f047a intopython:mainSep 25, 2022
@mdickinsonmdickinson deleted the fix-struct-unpack-undefined-behaviour branchSeptember 25, 2022 09:55
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@kumaraditya303kumaraditya303kumaraditya303 approved these changes

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@mdickinson@kumaraditya303@bedevere-bot@matthiasgoergens

[8]ページ先頭

©2009-2025 Movatter.jp