Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
gh-133546: Makere.Match
a well-roundedSequence
type#133549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
@@ -1378,6 +1378,27 @@ when there is no match, you can test whether there was a match with a simple | |||
if match: | |||
process(match) | |||
Match objects are proper :class:`~collections.abc.Sequence` types. You can access |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is not true with this PR, Sequence has a number of other requirements (e.g. anindex
andcount
method).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nice catch, thanks! I addedindex
andcount
by adapting the implementation oftuple.index
andtuple.count
. I also updated the unit test to cover allSequence
mixin methods.
Modules/_sre/sre.c Outdated
if (start < 0) { | ||
start += self->groups; | ||
if (start < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You should take a quick look atPEP-7 for our style guide. We try to avoid omitting brackets. There is a bit of inconsistency in this file, but it's good to just follow PEP 7 for new code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I read throughPEP-7 already but I didn't really think about it here since I copied this directly fromtupleobject.c
. Let me clean this up!
Uh oh!
There was an error while loading.Please reload this page.
Modules/_sre/sre.c Outdated
if (index < 0 || index >= self->groups) { | ||
/* raise IndexError if we were given a bad group number */ | ||
if (!PyErr_Occurred()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We shouldn't conditionally raise the error.match_item
shouldn't ever be called with an exception set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Fixed!
PyObject* group = match_getslice_by_index(self, i, Py_None); | ||
if (group == NULL) | ||
return NULL; | ||
int cmp = PyObject_RichCompareBool(group, value, Py_EQ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm pretty sure we need to incref the value in case the__bool__
releases a reference. cc@picnixz who fixed a bunch of issues related to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Not sure if this is what you're talking about, but I think this is already handled byPyObject_RichCompareBool
out of the box. It's a wrapper aroundPyObject_RichCompare
that unwraps and decref the returned object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It's not about the returned object, it's about the inputs.PyObject_RichCompareBool
andPyObject_RichCompare
can call arbitrary Python code in general, so if such arbitrary code actually releases or changes their inputs, we may end up with what we call a "use-after-free" (and usually it leads to a SIGSEV, but in other occurrences it can be a security issue).
In order to determine whether there is a UAF or not, you need to check whethergroup
orvalue
can actually be obtained from pure Python code and be changed by the call to__eq__
. In this case,value
can be modified in place by its call, but_sre_SRE_Match_index_impl
should have already held a strong reference on it, so there shouldn't be a UAF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I see, thanks for the explanation!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Test coverage is sometimes insufficient and sometimes redundant with existing ones.
Doc/library/re.rst Outdated
@@ -1473,6 +1494,37 @@ when there is no match, you can test whether there was a match with a simple | |||
.. versionadded:: 3.6 | |||
.. versionchanged:: 3.14 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
..versionchanged::3.14 | |
..versionchanged::next |
Doc/library/re.rst Outdated
>>> len(m) | ||
3 | ||
.. versionadded:: 3.14 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
..versionadded::3.14 | |
..versionadded::next |
Doc/library/re.rst Outdated
Raises :exc:`ValueError` if the value is not present. | ||
.. versionadded:: 3.14 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
..versionadded::3.14 | |
..versionadded::next |
Doc/library/re.rst Outdated
Return the number of occurrences of the value among the matched groups. | ||
.. versionadded:: 3.14 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
..versionadded::3.14 | |
..versionadded::next |
Uh oh!
There was an error while loading.Please reload this page.
self.assertEqual(m[1:-1], ("a", "b")) | ||
self.assertEqual(m[::-1], ("c", "b", "a", "abc")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Let's check more slices.
self.assertEqual(m.count("a"), 1) | ||
self.assertEqual(m.count("b"), 1) | ||
self.assertEqual(m.count("c"), 1) | ||
self.assertEqual(m.count("123"), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Let's check with non-string objects.
Lib/test/test_re.py Outdated
case [_, "a", "b", "c"]: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Let's catch the first group (the entire match) and assert it.
Lib/test/test_re.py Outdated
self.fail() | ||
match re.match(r"(\d+)-(\d+)-(\d+)", "2025-05-07"): | ||
case [_, year, month, day]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Let's do the same here.
Lib/test/test_re.py Outdated
@@ -599,8 +603,83 @@ def test_match_getitem(self): | |||
with self.assertRaises(TypeError): | |||
m[0] = 1 | |||
# No len(). | |||
self.assertRaises(TypeError, len, m) | |||
def test_match_sequence(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Let's split this into multiple test:_index
,_count
and_iter
, as well asmatch case
. Forgetitem
, let's move them undertest_match_getitem
as well and reduce the number of duplicated checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Some additional feedback. For the test failures, let's add someassertWarns
. I don't think we should make some additional checks in C for the type of the operands (the issue is when you compare the input (which is a bytes) with the matched item (which are strings), hence a BytesWarning). So, instead, let's just expect the warning.
By the way, could you avoid using@picnixz
in your commits as I'm getting pinged? TiA.
Lib/test/test_re.py Outdated
def test_match_sequence(self): | ||
from collections.abc import Sequence | ||
# Slices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Slices can be part of a new test if you want.
Modules/_sre/sre.c Outdated
@@ -2491,7 +2491,10 @@ match_subscript(PyObject *op, PyObject* item) | |||
if (self->pattern->groupindex) { | |||
PyObject* index = PyDict_GetItemWithError(self->pattern->groupindex, item); | |||
if (index && PyLong_Check(index)) { | |||
return match_item(op, PyLong_AsSsize_t(index)); | |||
Py_ssize_t i = PyLong_AsSsize_t(index); | |||
if (i != -1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
-1 is strictly speaking a valid one, so you should usePyErr_Occurred()
Modules/_sre/sre.c Outdated
@@ -2750,12 +2741,12 @@ _sre_SRE_Match_count_impl(MatchObject *self, PyObject *value) | |||
} | |||
int cmp = PyObject_RichCompareBool(group, value, Py_EQ); | |||
Py_DECREF(group); | |||
if (cmp < 0) { | |||
return NULL; | |||
} | |||
if (cmp > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Now you can use anelse if
here
Oh, sorry for the ping. Is it okay if I just leave out the cases with bytes as input? I can't seem to trigger the warning locally, even with |
Uh oh!
There was an error while loading.Please reload this page.
re.Match
a well-roundedSequence
type #133546📚 Documentation preview 📚:https://cpython-previews--133549.org.readthedocs.build/