Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-126554: ctypes: Correctly handle NULL dlsym values#126555

Merged
encukou merged 54 commits intopython:mainfrom
grgalex:fix-issue-126554
Nov 15, 2024
Merged

gh-126554: ctypes: Correctly handle NULL dlsym values#126555
encukou merged 54 commits intopython:mainfrom
grgalex:fix-issue-126554

Conversation

@grgalex
Copy link
Contributor

@grgalexgrgalex commentedNov 7, 2024
edited
Loading

For dlsym(), a return value of NULL does not necessarily indicate an error [1].

Therefore, to avoid using stale (or NULL) dlerror() values, we must:

  1. clear the previous error state by calling dlerror()
  2. call dlsym()
  3. call dlerror()

If the return value of dlerror() is not NULL, an error occured.

In our case of ctypes, I understand that we subjectively treat a NULL return value from dlsym() as an error, so we must format a special string for that.

[1]https://man7.org/linux/man-pages/man3/dlsym.3.html

Sapemeg reacted with rocket emoji
@ghost
Copy link

ghost commentedNov 7, 2024
edited by ghost
Loading

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app
Copy link

Most changes to Pythonrequire a NEWS entry. Add one using theblurb_it web app or theblurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply theskip news label instead.

For dlsym(), a return value of NULL does not necessarily indicatean error [1].Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror()If the return value of dlerror() is not NULL, an error occured.In our case, we also subjectively treat a NULL return valuefrom dlsym() as an error, so we format a special string for that.[1]:https://man7.org/linux/man-pages/man3/dlsym.3.htmlSigned-off-by: Georgios Alexopoulos <grgalex42@gmail.com>
@efimov-mikhail
Copy link
Member

On success, these functions return the address associated with
symbol. On failure, they return NULL; the cause of the error can
be diagnosed usingdlerror(3).

This is direct quotation from the man page you linked.
Why you think this is correct?

dlsym(), a return value of NULL does not necessarily indicate an error [1].

@grgalex
Copy link
ContributorAuthor

grgalex commentedNov 7, 2024
edited
Loading

@efimov-mikhail

Indeed, I failed to mention that at some point the man page contradicts itself, and I believe that the part you quoted is something the maintainer forgot to remove.

[edit: added this part]

I say this because further up, in the DESCRIPTION, the man page states:

In unusual cases (see NOTES) the value of the symbol couldactually be NULL.  Therefore, a NULL return from dlsym() need notindicate an error.  The correct way to distinguish an error froma symbol whose value is NULL is to call dlerror(3) to clear anyold error conditions, then call dlsym(), and then call dlerror(3)again, saving its return value into a variable, and check whetherthis saved value is not NULL.

[end of edit]

Additionally, further down, in the NOTES segment, the man page states:

There are several scenarios when the address of a global symbolis NULL.  For example, a symbol can be placed at zero address bythe linker, via a linker script or with --defsym command-lineoption.  Undefined weak symbols also have NULL value.  Finally,the symbol value may be the result of a GNU indirect function(IFUNC) resolver function that returns NULL as the resolvedvalue.  In the latter case, dlsym() also returns NULL withouterror.  However, in the former two cases, the behavior of GNUdynamic linker is inconsistent: relocation processing succeedsand the symbol can be observed to have NULL value, but dlsym()fails and dlerror() indicates a lookup error.

@efimov-mikhail
Copy link
Member

efimov-mikhail commentedNov 7, 2024
edited
Loading

Understood. IMO, it would be better if we can make an actual reproducible test case, and put this case into regular testing procedure.

@grgalex
Copy link
ContributorAuthor

See reproducer here:#126554 (comment)

Copy link
Member

@ZeroIntensityZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It's good to see new contributors :)

Unfortunately, this change does need a test. If it's really,really difficult to add one, then I guess we'll figure things out, but in general, it's difficult for us to accept eventiny bugfixes without proper tests.

For something like this, as long as the test works, it's probably OK. You could do something as hacky as hard-coding a byte string into a test file and unpacking that into a.so at runtime, and just disable all the systems that don't work. But it would be preferable if you could find a library that actually triggers this problem in the wild.

grgalexand others added3 commitsNovember 8, 2024 15:15
…2eb.rstCo-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
@grgalex
Copy link
ContributorAuthor

@ZeroIntensity

  1. I understand that I can just click "Commit" on your proposed changes, and then we can squash before merging
  2. With some grepping I found out that we probably also need to change
    ptr=dlsym((void*)handle,name);

    using the same approach.
  3. Regarding the test case, I'm thinking something similar tohttps://github.com/python/cpython/blob/b19d12f44735583e339ee6ee560588fcf22633b4/Lib/test/test_ctypes/test_find.py, where we will:
    1. Write a fixed string (the content offoo.c from the reproducer),
    2. Compile (as does thetest_find testcase) into a shared lib
    3. Load, and access thefoo object.
efimov-mikhail and ZeroIntensity reacted with thumbs up emoji

@grgalex
Copy link
ContributorAuthor

(2) is done, pushing a commit. I also added a comment that explicitly states why we calldlerror() beforedlsym().

From a project maintaner's POV, if we have 3 instances of this phenomenon, do we add a comment disclaimer at every instance? (This is the approach I took, but I'm unsure what the best / most common practice is.)

@grgalex
Copy link
ContributorAuthor

Proceeding with the test case (3), will let you know when ready.

Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>
@ZeroIntensity
Copy link
Member

  • I understand that I can just click "Commit" on your proposed changes, and then we can squash before merging

Yup. In fact, that's the preferred workflow for us. Force pushing onto one commit is very much discouraged.

2. With some grepping I found out that we probably also need to change

That looks like a problem too, yeah. Thanks!

I'll create a separate issue if I can come up with a repro for the locale issue. (I'll CC you on it, if you want to author that PR as well.)

Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>
@grgalex
Copy link
ContributorAuthor

Added a commit with the test case undertest_ctypes/test_dlerror. PTAL.

Running on the PR branch, it passes :)

On HEAD of main, it explodes (segfault).

Do we have any way of handling this gracefully (maybe running a subprocess with the interpreter?), or is it the correct way to go?

@efimov-mikhail
Copy link
Member

Do we have any way of handling this gracefully (maybe running a subprocess with the interpreter?)

Could you please explain, what exactly are you speaking about?
Gracefully handling what?

@grgalex
Copy link
ContributorAuthor

grgalex commentedNov 8, 2024
edited
Loading

If we want to be pedantic, we must check all 3 code paths that invokedlsym().

In this case, we only check one (PyCFuncPtr_FromDll).

On one hand, since our amendment is the same across all 3 funcs, we should theoretically be OK.

@grgalex
Copy link
ContributorAuthor

grgalex commentedNov 8, 2024
edited
Loading

Do we have any way of handling this gracefully (maybe running a subprocess with the interpreter?)

Could you please explain, what exactly are you speaking about? Gracefully handling what?

@efimov-mikhail

If I run the test on the main branch, the Python interpreter gets a segmentation fault.

What I'm saying is that in a scenario where someone runs all tests, when the./python -m test process encounters this test, it will exit with a segmentation fault and terminate the testing procedure. It won't mark the test as failing. It will explode.

Maybe I have misunderstood something regarding the philosophy of testing.

@ZeroIntensity
Copy link
Member

Run the test in a subprocess and check the return code.

grgalexand others added2 commitsNovember 14, 2024 15:43
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>
@ZeroIntensity
Copy link
Member

I'm not sure if the ASan failures are related.

Copy link
Member

@picnixzpicnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

OnePEP-7 nitpick and two de-chop suggestions. I didn't write suggestions for thePyErr_SetString usages since I wasn't sure it fits on 80 chars.

grgalex reacted with thumbs up emoji
grgalexand others added2 commitsNovember 14, 2024 18:42
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>
@ZeroIntensityZeroIntensity added the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelNov 14, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by@ZeroIntensity for commitab22349 🤖

If you want to schedule another build, you need to add the🔨 test-with-buildbots label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelNov 14, 2024
@ZeroIntensity
Copy link
Member

ZeroIntensity commentedNov 14, 2024
edited
Loading

Let's try the build bots again and see if it holds up now.

@ZeroIntensity
Copy link
Member

ZeroIntensity commentedNov 14, 2024
edited
Loading

Buildbots are all good, the one failure is a problem with the DRC API that got merged the other day, which is unrelated. I'll submit a (separate) patch to fix that. I think we're good to merge!

@encukouencukou merged commit8717f79 intopython:mainNov 15, 2024
@miss-islington-app
Copy link

Thanks@grgalex for the PR, and@encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull requestNov 15, 2024
…-126555)For dlsym(), a return value of NULL does not necessarily indicatean error [1].Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror()If the return value of dlerror() is not NULL, an error occured.In ctypes we choose to treat a NULL return value from dlsym()as a "not found" error. This is the same as the fallbackmessage we use on Windows, Cygwin or when getting/formattingthe error reason fails.[1]:https://man7.org/linux/man-pages/man3/dlsym.3.html(cherry picked from commit8717f79)Co-authored-by: George Alexopoulos <giorgosalexo0@gmail.com>Signed-off-by: Georgios Alexopoulos <grgalex42@gmail.com>Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>Co-authored-by: Peter Bierma <zintensitydev@gmail.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>Co-authored-by: Petr Viktorin <encukou@gmail.com>
@miss-islington-app
Copy link

Sorry,@grgalex and@encukou, I could not cleanly backport this to3.12 due to a conflict.
Please backport usingcherry_picker on command line.

cherry_picker 8717f792f7cc343599dc60daf80630cceb66145b 3.12

@bedevere-app
Copy link

GH-126861 is a backport of this pull request to the3.13 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.13bugs and security fixes labelNov 15, 2024
@efimov-mikhail
Copy link
Member

@grgalex, you've finished it! Congrats with your first contribution to CPython!

grgalex reacted with heart emojiZeroIntensity and encukou reacted with rocket emoji

@grgalex
Copy link
ContributorAuthor

Thanks for the guidance@efimov-mikhail,@ZeroIntensity,@encukou.

Looking forward to collaborating again in the future!

ZeroIntensity and efimov-mikhail reacted with heart emoji

ambv pushed a commit that referenced this pull requestNov 17, 2024
…) (#126861)For dlsym(), a return value of NULL does not necessarily indicatean error [1].Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror()If the return value of dlerror() is not NULL, an error occured.In ctypes we choose to treat a NULL return value from dlsym()as a "not found" error. This is the same as the fallbackmessage we use on Windows, Cygwin or when getting/formattingthe error reason fails.[1]:https://man7.org/linux/man-pages/man3/dlsym.3.html(cherry picked from commit8717f79)Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>Co-authored-by: George Alexopoulos <giorgosalexo0@gmail.com>Co-authored-by: Peter Bierma <zintensitydev@gmail.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>Co-authored-by: Petr Viktorin <encukou@gmail.com>
picnixz added a commit to picnixz/cpython that referenced this pull requestDec 8, 2024
…-126555)For dlsym(), a return value of NULL does not necessarily indicatean error [1].Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror()If the return value of dlerror() is not NULL, an error occured.In ctypes we choose to treat a NULL return value from dlsym()as a "not found" error. This is the same as the fallbackmessage we use on Windows, Cygwin or when getting/formattingthe error reason fails.[1]:https://man7.org/linux/man-pages/man3/dlsym.3.htmlSigned-off-by: Georgios Alexopoulos <grgalex42@gmail.com>Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>Co-authored-by: Peter Bierma <zintensitydev@gmail.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>Co-authored-by: Petr Viktorin <encukou@gmail.com>
@picnixz
Copy link
Member

3.13 backports were done but not 3.12;@encukou Do you want the 3.12 backport in the end or can we close the issue with only 3.13 and 3.14 being affected?

@encukou
Copy link
Member

AsI said, I'm fine not backporting. If anyone disagrees, open the PR :)

@bedevere-app
Copy link

GH-127764 is a backport of this pull request to the3.12 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.12only security fixes labelDec 9, 2024
serhiy-storchaka pushed a commit that referenced this pull requestDec 17, 2024
…) (GH-127764)For dlsym(), a return value of NULL does not necessarily indicatean error [1].Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror()If the return value of dlerror() is not NULL, an error occured.In ctypes we choose to treat a NULL return value from dlsym()as a "not found" error. This is the same as the fallbackmessage we use on Windows, Cygwin or when getting/formattingthe error reason fails.[1]:https://man7.org/linux/man-pages/man3/dlsym.3.htmlSigned-off-by: Georgios Alexopoulos <grgalex42@gmail.com>Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>Co-authored-by: George Alexopoulos <giorgosalexo0@gmail.com>Co-authored-by: Peter Bierma <zintensitydev@gmail.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>Co-authored-by: Petr Viktorin <encukou@gmail.com>
ebonnal pushed a commit to ebonnal/cpython that referenced this pull requestJan 12, 2025
…-126555)For dlsym(), a return value of NULL does not necessarily indicatean error [1].Therefore, to avoid using stale (or NULL) dlerror() values, we must: 1. clear the previous error state by calling dlerror() 2. call dlsym() 3. call dlerror()If the return value of dlerror() is not NULL, an error occured.In ctypes we choose to treat a NULL return value from dlsym()as a "not found" error. This is the same as the fallbackmessage we use on Windows, Cygwin or when getting/formattingthe error reason fails.[1]:https://man7.org/linux/man-pages/man3/dlsym.3.htmlSigned-off-by: Georgios Alexopoulos <grgalex42@gmail.com>Signed-off-by: Georgios Alexopoulos <grgalex@ba.uoa.gr>Co-authored-by: Peter Bierma <zintensitydev@gmail.com>Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>Co-authored-by: Petr Viktorin <encukou@gmail.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@encukouencukouencukou approved these changes

@efimov-mikhailefimov-mikhailefimov-mikhail approved these changes

@picnixzpicnixzpicnixz approved these changes

@ZeroIntensityZeroIntensityZeroIntensity approved these changes

Assignees

@encukouencukou

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

6 participants

@grgalex@efimov-mikhail@ZeroIntensity@encukou@bedevere-bot@picnixz

Comments


[8]ページ先頭

©2009-2026 Movatter.jp