Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork34.2k
gh-145678: Fix use-after-free in itertools.groupby _grouper_next()#145679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Changes fromall commits
File filter
Filter by extension
Conversations
Uh oh!
There was an error while loading.Please reload this page.
Jump to
Uh oh!
There was an error while loading.Please reload this page.
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -754,6 +754,40 @@ def keys(): | ||||||||||||
| next(g) | ||||||||||||
| next(g) # must pass with address sanitizer | ||||||||||||
| def test_grouper_next_reentrant_eq_does_not_crash(self): | ||||||||||||
| # regression test for gh-145678: _grouper_next() did not protect | ||||||||||||
| # gbo->currkey / igo->tgtkey before calling PyObject_RichCompareBool, | ||||||||||||
| # so a reentrant __eq__ that advanced the parent groupby could free | ||||||||||||
| # those objects while they were still being compared (use-after-free). | ||||||||||||
Comment on lines +758 to +761 Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. The reference is enough. Suggested change
| ||||||||||||
| outer_grouper = None | ||||||||||||
| class Key: | ||||||||||||
| def __init__(self, val, do_advance): | ||||||||||||
| self.val = val | ||||||||||||
| self.do_advance = do_advance | ||||||||||||
| def __eq__(self, other): | ||||||||||||
| if self.do_advance: | ||||||||||||
| self.do_advance = False | ||||||||||||
| # Advance the parent groupby iterator from inside __eq__, | ||||||||||||
| # which calls groupby_step() and frees the old currkey. | ||||||||||||
| try: | ||||||||||||
| next(outer_grouper) | ||||||||||||
| except StopIteration: | ||||||||||||
| pass | ||||||||||||
Comment on lines +772 to +777 Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. The Lose the comment. | ||||||||||||
| return NotImplemented | ||||||||||||
| return self.val == other.val | ||||||||||||
| def __hash__(self): | ||||||||||||
| return hash(self.val) | ||||||||||||
| values = [1, 1, 2] | ||||||||||||
| keys_iter = iter([Key(1, True), Key(1, False), Key(2, False)]) | ||||||||||||
| g = itertools.groupby(values, lambda _: next(keys_iter)) | ||||||||||||
| outer_grouper = g | ||||||||||||
Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. Just call it | ||||||||||||
| k, grp = next(g) | ||||||||||||
| list(grp) # must not crash with address sanitizer | ||||||||||||
| def test_filter(self): | ||||||||||||
| self.assertEqual(list(filter(isEven, range(6))), [0,2,4]) | ||||||||||||
| self.assertEqual(list(filter(None, [0,1,0,2,0])), [1,2]) | ||||||||||||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,7 @@ | ||||||||||||||||||||||
| Fix a use-after-free in:func:`itertools.groupby`: the internal | ||||||||||||||||||||||
| ``_grouper_next()`` function did not hold strong references to | ||||||||||||||||||||||
| ``gbo->currkey`` and ``igo->tgtkey`` before calling | ||||||||||||||||||||||
| :c:func:`PyObject_RichCompareBool`, so a re-entrant ``__eq__`` that advanced | ||||||||||||||||||||||
| the parent iterator (triggering ``groupby_step()`` and ``Py_XSETREF`` on | ||||||||||||||||||||||
| ``currkey``) could free those objects while they were still being compared. | ||||||||||||||||||||||
| The fix mirrors the protection added in:gh:`143543` for ``groupby_next()``. | ||||||||||||||||||||||
Comment on lines +1 to +7 Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. Suggested change
| ||||||||||||||||||||||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -678,7 +678,17 @@ _grouper_next(PyObject *op) | ||||||||||||||||||||||
| } | ||||||||||||||||||||||
| assert(gbo->currkey != NULL); | ||||||||||||||||||||||
| /* A user-defined __eq__ can re-enter groupby (via the parent iterator) | ||||||||||||||||||||||
| and call groupby_step(), which frees gbo->currkey via Py_XSETREF while | ||||||||||||||||||||||
| we are still comparing it. Take local snapshots with strong references | ||||||||||||||||||||||
| so INCREF/DECREF apply to the same objects even under re-entrancy. */ | ||||||||||||||||||||||
| PyObject *tgtkey = igo->tgtkey; | ||||||||||||||||||||||
| PyObject *currkey = gbo->currkey; | ||||||||||||||||||||||
| Py_INCREF(tgtkey); | ||||||||||||||||||||||
| Py_INCREF(currkey); | ||||||||||||||||||||||
Comment on lines +681 to +688 Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. I don't think we need to record historical issues in the source. General reasons for holding refs while calling user code are clear -- the trick is finding cases where this was forgotten. Suggested change
(cc@picnixz, reviewer of the earlier PR) Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more.
Victor sometimes asks me to do that, while Serhiy usually prefers not to, and I personally avoid but I did so in the past so I'm not that consistent either. I think I did add one such comment in OrderedDict so I don't mind. +1 for Py_NewRef, it slipped through my review my bad. Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. IMO, comments are good when it's not clear why the code does something. Here, it's relatively clear that we're calling user code. The exact way user code can clear references isn'tthat important -- and may change in the future.
| ||||||||||||||||||||||
| rcmp = PyObject_RichCompareBool(tgtkey, currkey, Py_EQ); | ||||||||||||||||||||||
| Py_DECREF(tgtkey); | ||||||||||||||||||||||
| Py_DECREF(currkey); | ||||||||||||||||||||||
| if (rcmp <= 0) | ||||||||||||||||||||||
| /* got any error or current group is end */ | ||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||
Uh oh!
There was an error while loading.Please reload this page.