Description
Feature or enhancement
Proposal:
It would be nice if the following worked as expected:
```python
import re
from collections.abc import Sequence

m = re.match(r"(a)(b)(c)", "abc")
assert isinstance(m, Sequence)
assert len(m) == 4
assert list(m) == ["abc", "a", "b", "c"]

abc, a, b, c = m
assert abc == "abc" and a == "a" and b == "b" and c == "c"

match re.match(r"(\d+)-(\d+)-(\d+)", "2025-05-07"):
    case [_, year, month, day]:
        assert year == "2025" and month == "05" and day == "07"
```
If you also work with JavaScript, this will feel very familiar:
```js
let m = "abc".match(/(a)(b)(c)/)
console.log(m instanceof Array) // true
console.log(m.length) // 4
console.log(Array.from(m)) // [ 'abc', 'a', 'b', 'c' ]

let [abc, a, b, c] = m
console.log(abc) // abc
console.log(a) // a
console.log(b) // b
console.log(c) // c
```
Back in 2016, the `re.Match` object API was expanded to include `__getitem__` as a shortcut for `.group(...)`.
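For reference, here is a small check of that shortcut as it behaves today. The pattern and the group names `year` and `month` are only illustrative.

```python
import re

# Illustrative pattern with two named groups (names chosen for this example).
m = re.match(r"(?P<year>\d+)-(?P<month>\d+)", "2025-05")

# Subscripting is equivalent to calling .group() with the same argument.
assert m[0] == m.group(0) == "2025-05"       # group 0 is the whole match
assert m[1] == m.group(1) == "2025"          # numbered group
assert m["month"] == m.group("month") == "05"  # named group
```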
The goal was to improve usability and approachability by making `re.Match` objects fit a bit more seamlessly into Python's core data model. Accessing groups via subscripting is now intuitive, but because `re.Match` objects only have a `__getitem__` and no `__len__`, they can't be used as a proper `Sequence` type.
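A minimal sketch of the gap, assuming an interpreter where `re.Match` has no `__len__`. The `try`/`except` is there so the snippet also runs if a later version adds it. The tuple built at the end is one workaround available today, not part of the proposal.

```python
import re

m = re.match(r"(a)(b)(c)", "abc")

# Subscripting works...
assert m[1] == "a"

# ...but without __len__ on re.Match, len(m) raises TypeError.
try:
    length = len(m)
except TypeError:
    length = None  # no __len__ available on this interpreter

# Workaround: build the full sequence by hand, group 0 included.
groups = (m[0], *m.groups())
assert groups == ("abc", "a", "b", "c")

# The pattern knows how many capturing groups it has; adding group 0 gives 4.
assert m.re.groups + 1 == 4
```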
To me, this always felt a bit awkward. After digging up the original discussion, it seems like the reason why `__len__` didn't make it was that it was still undecided whether the returned value should take into account group `0` or not.
Almost a decade later, as a user, the way I see it is that the `__getitem__` implementation we're now used to suggests a regular `Sequence` type that also happens to transparently translate group names provided as subscript to their corresponding group index. In fact, this is actually how it works in the underlying C code.
With this in mind, we can simply define `__len__` taking into account group `0`, and we'll finally be able to enjoy coherent `re.Match` objects that behave as proper `Sequence` types.
Has this already been discussed elsewhere?
https://discuss.python.org/t/make-re-match-a-well-rounded-sequence-type/91039
Links to previous discussion of this feature:
Improve the usability of the match object named group API (605bdae)