
This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.
Created on2017-05-23 14:35 byosantana, last changed2022-04-11 14:58 byadmin. This issue is nowclosed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| fix_os_environ_iteration_issue.diff | osantana,2017-05-23 14:35 | patch with fix and test | ||
| fix_os_environ_iter_issue_low_level_thread_lock.diff | osantana,2017-05-24 22:37 | (use this) use _thread.allocate_lock() as lock provider | ||
| fix_os_environ_iter_issue_threading_lock.diff | osantana,2017-05-24 22:46 | |||
| fix_os_environ_iter_issue_low_level_thread_lock_2.diff | osantana,2017-05-25 17:08 | |||
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 2409 | merged | osantana,2017-06-26 14:32 | |
| PR 2556 | merged | serhiy.storchaka,2017-07-04 04:22 | |
| PR 2557 | merged | serhiy.storchaka,2017-07-04 04:24 | |
| Messages (27) | |||
|---|---|---|---|
| msg294250 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-05-23 14:35 | |
We're facing ocasional RuntimeError exceptions in a multithreaded application when one of the threads creates new entries to the environment (os.environ).I'm not so sure if the attached patch fixes this issue the right way, so, feel free to propose another approach.Traceback Sample of the issue in our production environment:RuntimeError: dictionary changed size during iteration File "python3.5/runpy.py", line 170, in _run_module_as_main "__main__", mod_spec) File "python3.5/runpy.py", line 85, in _run_code exec(code, run_globals) [ internal code ] File "newrelic/api/background_task.py", line 103, in wrapper return wrapped(*args, **kwargs) File "python3.5/contextlib.py", line 30, in inner return func(*args, **kwds) File "python3.5/contextlib.py", line 77, in __exit__ self.gen.throw(type, value, traceback) File "raven/base.py", line 851, in make_decorator yield File "python3.5/contextlib.py", line 30, in inner return func(*args, **kwds) [ internal code ] File "retry/api.py", line 74, in retry_decorator logger) File "retry/api.py", line 33, in __retry_internal return f() [ internal code ] File "requests/sessions.py", line 531, in get return self.request('GET', url, **kwargs) File "requests/sessions.py", line 509, in request prep.url, proxies, stream, verify, cert File "requests/sessions.py", line 686, in merge_environment_settings env_proxies = get_environ_proxies(url, no_proxy=no_proxy) File "requests/utils.py", line 696, in get_environ_proxies return getproxies() File "urllib/request.py", line 2393, in getproxies_environment for name, value in os.environ.items(): File "_collections_abc.py", line 676, in __iter__ for key in self._mapping: File "python3.5/os.py", line 702, in __iter__ for key in self._data: | |||
| msg294257 -(view) | Author: Jelle Zijlstra (JelleZijlstra)*![]() | Date: 2017-05-23 15:47 | |
Even with the patch, I don't think it's safe to modify os.environ while it's being accessed concurrently in another thread. The other thread's modification could arrive while the dict() call in your patch is running (in CPython the GIL might protect you, but that's an implementation detail).I think the real solution is that your application uses a lock or some other concurrency mechanism to protect access to os.environ. | |||
| msg294261 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-05-23 16:29 | |
I agree with Jelle about the fix's implementation. But I disagree of suggestion to implement lock at the application side.I disagree because some external libraries may want to access os.environ in a concurrent fashion without lock acquiring. That's the case reported in my previous comment: urllib/requests.py (https://github.com/python/cpython/blob/master/Lib/urllib/request.py#L2468) iterates over os.environ with no lock control.How could I handle this issue in this case? Would it be a good idea implement the lock mechanism in current os._Environ (https://github.com/python/cpython/blob/master/Lib/os.py#L666) mapping class? | |||
| msg294352 -(view) | Author: Martin Panter (martin.panter)*![]() | Date: 2017-05-24 12:48 | |
Previous report:Issue 25641. At least in Posix, the “putenv” function is not required to be thread safe. | |||
| msg294354 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-05-24 13:25 | |
There are exceptions being raised in many applications (as reported here and inhttp://bugs.python.org/issue25641) and there are three paths to follow: 1. We handle this exception somewhere;2. We avoid raising it;3. Just leave it. I don't care about this exception.If we chose to handle this exception we need to decide where we'll do it. At os.py module? At urllib/request.py? At all http client libraries (eg. python-requests)? At our applications?If we chose to avoid this exception we, probably, need to implement some kind of lock/mutex in os.environ.I can implement any of these options. You just need to decide which one is best.If we chose the third option, we can just close this issue. | |||
| msg294398 -(view) | Author: Antoine Pitrou (pitrou)*![]() | Date: 2017-05-24 21:30 | |
The environ class could have its own lock and then the __iter__ method could be rewritten as follows: def __iter__(self): with self._lock: keys = list(self._data) for key in keys: yield self.decodekey(key) | |||
| msg294406 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-05-24 22:37 | |
This patch implements a lock protection (as suggested by Antoine) using `_thread.allocate_lock()` module (instead of `threading.Lock()`) due to the fact that CPython interpreter already imports `_thread` module during its bootstrap process. | |||
| msg294407 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-05-24 22:46 | |
This is an alternative implementation using `threading.Lock()`. The main issue with this implementation relies on the fact that we've to import the `threading` module during CPython interpreter loading (currently it's not imported by default).This is consequence of the fact that CPython interpreter uses `os` module when bootstrapping. | |||
| msg294444 -(view) | Author: Mark Dickinson (mark.dickinson)*![]() | Date: 2017-05-25 07:32 | |
@osantana: you'd need to be using the lock at least in `__setitem__` and `__delitem__` as well, else there's nothing preventing modifications to the dictionary during the `list` call. It may make sense to protect *all* accesses (read or write) to `_data`, though you'd need to either use a re-entrant lock in that case or be very careful about not acquiring the lock twice. | |||
| msg294445 -(view) | Author: Antoine Pitrou (pitrou)*![]() | Date: 2017-05-25 07:46 | |
I'm fine with a low-level lock, though as Mark says you could use _thread._RLock(). | |||
| msg294452 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-05-25 10:10 | |
Isn't list(dict) atomic? At least in CPython.I suspect that protecting *all* accesses to _data with a lock will hit the performance and invalidate the reason of using _data for caching. | |||
| msg294453 -(view) | Author: Antoine Pitrou (pitrou)*![]() | Date: 2017-05-25 10:13 | |
> Isn't list(dict) atomic? At least in CPython.I doubt it. There's no special code for list(dict), it uses regular iteration which could probably hit the GC and then release the GIL.> I suspect that protecting *all* accesses to _data with a lock will hit the performance and invalidate the reason of using _data for caching.I'm not convinced os.environ is performance-critical. Perhaps getenv() is really expensive on some systems? | |||
| msg294456 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-05-25 10:30 | |
> There's no special code for list(dict), it uses regular iteration which could probably hit the GC and then release the GIL.While there are no special code for list(dict), regular iteration over exact dict don't hit the GC. | |||
| msg294457 -(view) | Author: Antoine Pitrou (pitrou)*![]() | Date: 2017-05-25 10:33 | |
> While there are no special code for list(dict), regular iteration over exact dict don't hit the GC.Doesn't it? Creating a dict iterator creates a GC-tracked object. | |||
| msg294459 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-05-25 10:42 | |
But it is destroyed only after iterating. | |||
| msg294460 -(view) | Author: Antoine Pitrou (pitrou)*![]() | Date: 2017-05-25 10:45 | |
Every time you create a GC-tracked object, you can invoke the GC. See _PyObject_GC_Alloc. | |||
| msg294467 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-05-25 11:15 | |
Invoking GC before iterating or after iterating doesn't break the atomicity of iterating. | |||
| msg294468 -(view) | Author: Antoine Pitrou (pitrou)*![]() | Date: 2017-05-25 11:16 | |
Perhaps. But this discussion shows the matter is complicated and we shouldn't rely on fragile assumptions about implementation details. So we need a lock. | |||
| msg294498 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-05-25 17:08 | |
New version of patch:1. Use RLock instead of Lock (Mark & Antoine recomendation)2. Add lock acquire at `__setitem__` and `__delitem__` as following (Mark & Antoine recomendation)3. Add lock protection in `__repr__` implementation.I've some questions pending:1. Is it a good idea to wrap self._data access with lock in __getitem__? (I suspect it's not)2. Is it a good idea to lock protect self.copy() and self.setdefault() (I suspect it's not because both uses __iter__ that is already protected) | |||
| msg296122 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-06-15 18:23 | |
See alsoissue24484. The solution depends on the assumption that list(dict) is atomic. | |||
| msg296847 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-06-26 05:11 | |
Osvaldo, could you please create a pull request on GitHub based on your first patch? But use list() instead of dict(). | |||
| msg296904 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-06-26 14:32 | |
Pull Request#2409 (https://github.com/python/cpython/pull/2409) opened. | |||
| msg297486 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-07-01 17:34 | |
New changeset8a8d28501fc8ce25926d168f1c657656c809fd4c by Serhiy Storchaka (Osvaldo Santana Neto) in branch 'master':bpo-30441: Fix bug when modifying os.environ while iterating over it (#2409)https://github.com/python/cpython/commit/8a8d28501fc8ce25926d168f1c657656c809fd4c | |||
| msg297625 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-07-04 04:55 | |
New changesetbebd2cfa5f21811dd0ee4f3b1a1b85d379b83436 by Serhiy Storchaka in branch '3.6':[3.6]bpo-30441: Fix bug when modifying os.environ while iterating over it (GH-2409). (#2556)https://github.com/python/cpython/commit/bebd2cfa5f21811dd0ee4f3b1a1b85d379b83436 | |||
| msg297626 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-07-04 04:55 | |
New changeset1a3bc5546aa27f01426ad76618a9b2c3b698ae68 by Serhiy Storchaka in branch '3.5':[3.5]bpo-30441: Fix bug when modifying os.environ while iterating over it (GH-2409). (#2557)https://github.com/python/cpython/commit/1a3bc5546aa27f01426ad76618a9b2c3b698ae68 | |||
| msg297627 -(view) | Author: Serhiy Storchaka (serhiy.storchaka)*![]() | Date: 2017-07-04 05:19 | |
Thank you for your contribution Osvaldo. | |||
| msg297695 -(view) | Author: Osvaldo Santana Neto (osantana)* | Date: 2017-07-05 03:46 | |
Hi Serhiy Storchaka,I need to thank you for your valuable guidance. | |||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:46 | admin | set | github: 74626 |
| 2017-07-05 03:46:48 | osantana | set | messages: +msg297695 |
| 2017-07-04 05:19:33 | serhiy.storchaka | set | status: open -> closed resolution: fixed messages: +msg297627 stage: backport needed -> resolved |
| 2017-07-04 04:55:47 | serhiy.storchaka | set | messages: +msg297626 |
| 2017-07-04 04:55:34 | serhiy.storchaka | set | messages: +msg297625 |
| 2017-07-04 04:24:29 | serhiy.storchaka | set | pull_requests: +pull_request2626 |
| 2017-07-04 04:22:15 | serhiy.storchaka | set | pull_requests: +pull_request2625 |
| 2017-07-01 17:35:39 | serhiy.storchaka | set | stage: backport needed |
| 2017-07-01 17:34:47 | serhiy.storchaka | set | messages: +msg297486 |
| 2017-06-26 14:32:16 | osantana | set | messages: +msg296904 pull_requests: +pull_request2457 |
| 2017-06-26 05:11:49 | serhiy.storchaka | set | messages: +msg296847 |
| 2017-06-15 18:23:24 | serhiy.storchaka | set | messages: +msg296122 |
| 2017-05-25 17:08:24 | osantana | set | files: +fix_os_environ_iter_issue_low_level_thread_lock_2.diff messages: +msg294498 |
| 2017-05-25 11:16:17 | pitrou | set | messages: +msg294468 |
| 2017-05-25 11:15:10 | serhiy.storchaka | set | messages: +msg294467 |
| 2017-05-25 10:45:20 | pitrou | set | messages: +msg294460 |
| 2017-05-25 10:42:09 | serhiy.storchaka | set | messages: +msg294459 |
| 2017-05-25 10:33:17 | pitrou | set | messages: +msg294457 |
| 2017-05-25 10:30:51 | serhiy.storchaka | set | messages: +msg294456 |
| 2017-05-25 10:13:54 | pitrou | set | messages: +msg294453 |
| 2017-05-25 10:10:53 | serhiy.storchaka | set | nosy: +serhiy.storchaka messages: +msg294452 |
| 2017-05-25 07:46:06 | pitrou | set | messages: +msg294445 |
| 2017-05-25 07:32:25 | mark.dickinson | set | nosy: +mark.dickinson messages: +msg294444 |
| 2017-05-24 22:46:12 | osantana | set | files: +fix_os_environ_iter_issue_threading_lock.diff messages: +msg294407 |
| 2017-05-24 22:37:28 | osantana | set | files: +fix_os_environ_iter_issue_low_level_thread_lock.diff messages: +msg294406 |
| 2017-05-24 21:30:23 | pitrou | set | nosy: +pitrou messages: +msg294398 versions: + Python 3.7 |
| 2017-05-24 13:25:57 | osantana | set | messages: +msg294354 |
| 2017-05-24 12:48:08 | martin.panter | set | nosy: +martin.panter messages: +msg294352 |
| 2017-05-23 16:57:19 | Danilo Shiga | set | nosy: +Danilo Shiga |
| 2017-05-23 16:29:06 | osantana | set | messages: +msg294261 |
| 2017-05-23 15:47:53 | JelleZijlstra | set | nosy: +JelleZijlstra messages: +msg294257 |
| 2017-05-23 14:35:50 | osantana | create | |