Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-99127: Allow some features of syslog to the main interpreter only#99128

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
corona10 merged 23 commits intopython:mainfromcorona10:gh-99127
Nov 29, 2022

Conversation

corona10
Copy link
Member

@corona10corona10 commentedNov 5, 2022
edited by bedevere-bot
Loading

@corona10
Copy link
MemberAuthor

(.oss) ➜  cpython git:(gh-99127) ✗ ./python.exe -mtest test_syslog -R 3:3Raised RLIMIT_NOFILE: 256 -> 10240:00:00 load avg: 3.65 Run tests sequentially0:00:00 load avg: 3.65 [1/1] test_syslogbeginning 6 repetitions123456......== Tests result: SUCCESS ==1test OK.Total duration: 704 msTests result: SUCCESS

Copy link
Contributor

@ronaldoussorenronaldoussoren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Something to consider: what happens when multiple sub interpreters use the syslog extension. The current implementation always calls openlog(3) at least once before calling syslog(3), see the check in syslog_syslog_impl.

That results in interference between sub interpreters when in one of them the python code explicitly calls syslog.openlog and the other doesn't.

IMHO the check in syslog_syslog_impl is not necessary because it results in a call to openlog(3) with parameters that match the default behaviour of the syslog library. Removing it is a fairly minor behaviour change though (syslog.syslog will currently override any calls to openlog that were done in C before the first call to syslog.syslog and will stop doing that when removing the check in syslog_syslog_impl).

erlend-aasland reacted with eyes emoji
@corona10corona10 restored the gh-99127 branchNovember 5, 2022 13:57
@corona10corona10 reopened thisNov 5, 2022
@corona10
Copy link
MemberAuthor

corona10 commentedNov 5, 2022
edited
Loading

Something to consider: what happens when multiple sub interpreters use the syslog extension. The current implementation always calls openlog(3) at least once before calling syslog(3), see the check in syslog_syslog_impl.

Thanks that's why I got a headache which makes me close the issue but thanks to suggest the solution.

IMHO the check in syslog_syslog_impl is not necessary because it results in a call to openlog(3) with parameters that match the default behaviour of the syslog library.

Yeah, IIUC we should call openlog(3) anyway while calling syslog.syslog right?
Please let me know if I understood it wrongly.

@corona10
Copy link
MemberAuthor

corona10 commentedNov 5, 2022
edited
Loading

Yeah, IIUC we should call openlog(3) anyway while calling syslog.syslog right?

Or did you intended implicit call of openlog(3) while calling syslog(3)?

@corona10corona10 added the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelNov 6, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by@corona10 for commit952ff46 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelNov 6, 2022
@corona10
Copy link
MemberAuthor

Failure tests from buildbot/PPC64LE Fedora were not related to this change.

@erlend-aaslanderlend-aasland removed their request for reviewNovember 6, 2022 19:09
@erlend-aasland
Copy link
Contributor

Unfortunately I don't have time to review this right now. Maybe@serhiy-storchaka can have a look? He recently did some work on this module.

@ronaldoussoren
Copy link
Contributor

Yeah, IIUC we should call openlog(3) anyway while calling syslog.syslog right?
Please let me know if I understood it wrongly.

At the C level calling openlog(3) before calling syslog(3) is not necessary, althoughthe spec is not entirely clear about this. Thelinux manual page does clearly mention that openlog(3) is optional, and matches what I remember from older Unix systems.

The call to openlog in syslog_syslog_impl basically resets the default system state (use the process name for the ident and log using the LOG_USER facility). Leaving that out should therefore be harmless, but this is technically a user-visible change of behaviour:

  • C code calls openlog(...) with same parameters
  • Python code calls syslog.syslog

Currently the Python code resets the values set in the first step, with my proposal it would no longer do this. I'd consider this a good change, but there's bound to be someone affected by this.

corona10 reacted with thumbs up emoji

Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The stateless syslog C API doesn't seem to be designed to be consumed by multiple interpreters. I'm not sure that making S_ident_o and S_log_open global varibles per interpreter is safe.

{
_syslog_state *state = get_syslog_state(module);
if (state->S_log_open) {
closelog();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

What happens if interpreter A calls closelog() and interpreter B (same process!) still uses the Python syslog module, so after closelog() has been called?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

POSIX syslog will transparently callopenlog() with default values.

@corona10
Copy link
MemberAuthor

This limitation seems to be new in Python 3.12: can you document this incompatible change?
Also, can you please document in the Python syslog module documentation the behavior/limitations of sub-interpreters?

Oh I missed this, I will push the documentation ASAP.

@corona10
Copy link
MemberAuthor

@vstinner Updated! PTAL

@ericsnowcurrently
Would you like to take a look at the documentation update?
I am not sure it will be natural for native speakers :)

Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM. I'm not good to review English, but the code LGTM :-)


* :func:`syslog.openlog` and :func:`syslog.closelog` are only available from the main interpreter not subinterpreter.
:func:`syslog.syslog` is only available to subinterpreters if :func:`syslog.openlog` was already called from the main interpreter.
(Contributed by Dong-hee Na in :gh:`99127`.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Hum, I think that the "Porting to Python 3.12" is a better section for these changes. The "syslog" section is more for new features.

corona10 reacted with thumbs up emoji
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Why can syslog.syslog only be used when the main interpreter has called syslog.openlog? That's a restriction that is not present in the underlying library.

Copy link
MemberAuthor

@corona10corona10Nov 18, 2022
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@ronaldoussoren cc@ericsnowcurrently
This might be the discussion part for this change, from the view of managing for open/close state, subinterpreter should not intervene in changing the state of open/close. And this is whysyslog.openlog andsyslog.closelog are not available to the subinterpreter side. so if the subinterpreter can callsyslog.syslog() without involving the main interpreter, it means that subinterpreter intervenes in the state of open/close so I think that it should be prevented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Calling openlog (and closelog) is optional in syslog(3) with sensible defaults. There is no need to call openlog at all unless a program wants to change some of the default settings, and that could be done outside of the Python world.

For a use-case like embedding Python in a web server (likemod_python it is more likely that the embedding program does the syslog setup than that this is done in Python code.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Calling openlog (and closelog) is optional in syslog(3) with sensible defaults. There is no need to call openlog at all unless a program wants to change some of the default settings, and that could be done outside of the Python world.

In summary for your suggestion might be:

  • syslog.syslog(): Allow for both the main interpreter and subinterpreters at any condition.
  • syslog.openlog()/closelog(): Allow only for the main interpreter

But if the subinterpreter callssyslog.syslog() after the main interpreter calls thesyslog.closelog(), the subinterpreter will neutralize what the main interpreter did. Who will close thesyslog.syslog() that is opened by subinterpreter?

cc@ericsnowcurrently

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

There are two factors here:

  1. most process-global resources should be managed by the app
  2. we are storing a str object in a static variable (S_ident_o), which is tricky when multiple interpreters are involved

Regarding the first one, this is perhaps something that we need not be so strict about. In Python, "app" means the__main__ module, running in the main interpreter (starting in the main thread). This implies that the global resources modifed viasyslog.openlog() andsyslog.closelog() should only be managed in the main interpreter. This would be similar to how signal handlers are supported only in the main thread of the main interpreter. However, this could be an overly strict interpretation for syslog. It makes sense to me, but "consenting adults", etc.

Regarding the second factor, we have been (and still are) working hard to get interpreters isolated from one another, especially to avoid problems that arise when they step on each others' toes. [1] (Isolation also provides new implementation opportunities.) For example, an object created in one interpreter should never be modified (even just the refcount) by another interpreter. [2] In the case of syslog, the object in the static variable (set insyslog.openlog()) should be managed only by the interpreter that owns it. Here are ways to enforce this:

  • the interpreter that callsopenlog() should be the only one allowed to callcloselog() (simplest way is to restrict to the main interpreter)
  • if another interpreter callscloselog(), then it would callPy_AddPendingCall() to have the owning interpreter decref the string

Obviously we went with the simplest form of the first one.

On top of the existing isolation concerns, there are additional ones that become more complicated under a per-interpreter GIL due to thread-safety. These are a non-issue if we restrict to a per-interpreter GIL.

[1] Such problems would be amplified by a per-interpreter GIL (see PEP 684, which hasnot been accepted yet), but they still exist even now.
[2] Except for a small set of objects managed by the process-global runtime.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Regardless, if the concern is backward compatibility, we can easily apply the restriction only if the subinterpreter was created viaPy_NewInterpreter() (or, under a per-interpreter GIL, is otherwise sharing the GIL with the main interpreter).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Calling openlog (and closelog) is optional in syslog(3) with sensible defaults. There is no need to call openlog at all unless a program wants to change some of the default settings, and that could be done outside of the Python world.

For a use-case like embedding Python in a web server (likemod_python it is more likely that the embedding program does the syslog setup than that this is done in Python code.

FWIW, both concerns are valid, but pre-date this PR.

Copy link
Member

@ericsnowcurrentlyericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I've left some recommendations for the docs.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment:I have made the requested changes; please review again.

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
@corona10
Copy link
MemberAuthor

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@ericsnowcurrently,@vstinner: please review the changes made to this pull request.

@vstinner
Copy link
Member

@ronaldoussoren:

Why can syslog.syslog only be used when the main interpreter has called syslog.openlog? That's a restriction that is not present in the underlying library.

The limitation comes from Python which stores a Python object in a global variable:

/*  only one instance, only one syslog, so globals should be ok  */static PyObject *S_ident_o = NULL;                      /*  identifier, held by openlog()  */static char S_log_open = 0;

We must hold a strong reference because we pass a pointer into the internal UTF-8 encoded flavor of the Unicode string, PyUnicode_AsUTF8():

    /* At this point, ident should be INCREF()ed.  openlog(3) does not     * make a copy, and syslog(3) later uses it.  We can't garbagecollect it.     * If NULL, just let openlog figure it out (probably using C argv[0]).     */    if (ident) {        ident_str = PyUnicode_AsUTF8(ident);        if (ident_str == NULL) {            Py_DECREF(ident);            return NULL;        }    }

I understand that openlog() doesn't copy the ident string. Maybe this assumption is wrong and the whole change is not needed. I don't know. I don't want to check the implementation of openlog() on all supported platforms, and run a real test.

S_ident_o global variable was added in 1998 by commitae94cf2. Previously, it was astatic PyObject *ident_o = NULL; in openlog() added in 1995 by commitc1822a4.

Ok, I'm curious and I looked into the GNU glibc:https://sourceware.org/git/?p=glibc.git;a=blob;f=misc/syslog.c;h=f67d4b58a448cabdf63f629f72a9df6e009286cf;hb=HEAD#l295

The ident string is not copied, only the pointer is copied (LogTag = ident;). So if the pointer becomes a dangling pointer, syslog() will just crash when reading ident (LogTag)

@vstinner
Copy link
Member

One problem with sub-interpreters is that Python objects should not travel from one interpreter to another. Butstatic PyObject *S_ident_o can become a memory copy of the UTF-8 encoded string instead, so we don't pass objects anymore.

For closelog(), I proposed earlier a counter increased by syslog.openlog() and decreased by syslog.closelog(). The C function closelog() would only be called when the counter reach zero. One problem is if two interepreters use different identifiers, the latest call to syslog.openlog() wins: it overrides the previous ident (and log level).

os.chdir() has a similar problem: it's a process-wide parameter affecting all threads and all interpreters. But it's a simpler problem. The current working directory is stored in the kernel, not in Python, and no resource have to be released explicitly at exit :-)

serhiy-storchaka reacted with thumbs up emoji

@serhiy-storchaka
Copy link
Member

Agree, keeping a copy of ident in the raw dynamically allocated array will resolve the issue.

But I do not think we need a reference counter. Just call closelog() and clear the saved ident string. If your program callssyslog.closelog() multiple times, it perhaps has a trouble. But at least this will not cause a crash.

@ericsnowcurrently
Copy link
Member

What about keeping a static buffer and copying the str object's C string into it, then giving the buffer toopenlog()? Then there's no deallocation to fuss with, we can dropS_log_open and bothsyslog.syslog() andsyslog.closelog() become simpler. I'm pretty sure the only downside is that we would introduce a hard-limit on the length of the identifier.

#defineMAX_IDENT 256staticcharS_ident[MAX_IDENT+1];...staticPyObject*syslog_openlog_impl(...){    ....constchar*ident_str=NULL;if (ident!=NULL) {Py_ssize_tsize=PyUnicode_GET_LENGTH(ident);if (size>MAX_IDENT) {PyErr_SetString(...);returnNULL;        }elseif (size>0) {strncpy(S_ident,PyUnicode_AsUTF8(ident),size);ident_str=S_ident;        }    }openlog(ident_str,logopt,facility);Py_RETURN_NONE;}staticPyObject*syslog_syslog_impl(...){if (PySys_Audit("syslog.syslog","is",priority,message)<0) {        ...    }#ifdef__APPLE__// gh-98178: On macOS, libc syslog() is not thread-safesyslog(priority,"%s",message);#elsePy_BEGIN_ALLOW_THREADS;syslog(priority,"%s",message);Py_END_ALLOW_THREADS;#endifPy_RETURN_NONE;}staticPyObject*syslog_closelog_impl(PyObject*module){    ...closelog();Py_RETURN_NONE;}

There would still be a race under per-interpreter GIL, though, if subinterpreters are allowed to callsyslog.openlog() and we don't have a process-global lock for the module.

@corona10
Copy link
MemberAuthor

@ronaldoussoren@vstinner@ericsnowcurrently

As the first step for the subinterpreter project, I would like to suggest maintaining restrictions for subinterpreter as the current PR.
If there are requests which require allowing all of the features from subinterpreter side, let's discuss the way to support it.
Eric's suggestion might be a good way to solve it but let's delay the decision for this moment.

ericsnowcurrently reacted with thumbs up emoji

Copy link
Member

@ericsnowcurrentlyericsnowcurrently left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM

This is a good baseline. I'm okay with circling back with improvements (or loosening restrictions) afterward.

corona10 reacted with heart emoji
@corona10corona10 merged commit8bb2303 intopython:mainNov 29, 2022
@corona10corona10 deleted the gh-99127 branchNovember 29, 2022 22:58
carljm added a commit to carljm/cpython that referenced this pull requestDec 1, 2022
* main: (112 commits)pythongh-99894: Ensure the local names don't collide with the test file in traceback suggestion error checking (python#99895)pythongh-99612: Fix PyUnicode_DecodeUTF8Stateful() for ASCII-only data (pythonGH-99613)  Doc: Add summary line to isolation_level & autocommit sqlite3.connect params (python#99917)pythonGH-98906 ```re``` module: ```search() vs. match()``` section should mention ```fullmatch()``` (pythonGH-98916)pythongh-89189: More compact range iterator (pythonGH-27986)  bpo-47220: Document the optional callback parameter of weakref.WeakMethod (pythonGH-25491)pythonGH-99905: Fix output of misses in summarize_stats.py execution counts (pythonGH-99906)pythongh-99845: PEP 670: Convert PyObject macros to functions (python#99850)pythongh-99845: Use size_t type in __sizeof__() methods (python#99846)pythonGH-99877)  Fix typo in exception message in `multiprocessing.pool` (python#99900)pythongh-87092: move all localsplus preparation into separate function called from assembler stage (pythonGH-99869)pythongh-99891: Fix infinite recursion in the tokenizer when showing warnings (pythonGH-99893)pythongh-99824: Document that sqlite3.connect implicitly open a transaction if autocommit=False (python#99825)pythonGH-81057: remove static state from suggestions.c (python#99411)  Improve zip64 limit error message (python#95892)pythongh-98253: Break potential reference cycles in external code worsened by typing.py lru_cache (python#98591)pythongh-99127: Allow some features of syslog to the main interpreter only (pythongh-99128)pythongh-82836: fix private network check (python#97733)  Docs: improve accuracy of socketserver reference (python#24767)  ...
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@serhiy-storchakaserhiy-storchakaserhiy-storchaka left review comments

@kumaraditya303kumaraditya303kumaraditya303 left review comments

@ericsnowcurrentlyericsnowcurrentlyericsnowcurrently approved these changes

@ronaldoussorenronaldoussorenAwaiting requested review from ronaldoussoren

@vstinnervstinnerAwaiting requested review from vstinner

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

8 participants
@corona10@bedevere-bot@erlend-aasland@ronaldoussoren@ericsnowcurrently@vstinner@serhiy-storchaka@kumaraditya303

[8]ページ先頭

©2009-2025 Movatter.jp