
Python Enhancement Proposals

PEP 642 – Explicit Pattern Syntax for Structural Pattern Matching

Author:
Alyssa Coghlan <ncoghlan at gmail.com>
BDFL-Delegate:

Discussions-To:
Python-Dev list
Status:
Rejected
Type:
Standards Track
Requires:
634
Created:
26-Sep-2020
Python-Version:
3.10
Post-History:
31-Oct-2020, 08-Nov-2020, 03-Jan-2021
Resolution:
Python-Dev message


Abstract

This PEP covers an alternative syntax proposal for PEP 634’s structural pattern matching that requires explicit prefixes on all capture patterns and value constraints. It also proposes a new dedicated syntax for instance attribute patterns that aligns more closely with the proposed mapping pattern syntax.

While the result is necessarily more verbose than the proposed syntax in PEP 634, it is still significantly less verbose than the status quo.

As an example, the following match statement would extract “host” and “port” details from a 2 item sequence, a mapping with “host” and “port” keys, any object with “host” and “port” attributes, or a “host:port” string, treating the “port” as optional in the latter three cases:

    port = DEFAULT_PORT
    match expr:
        case [as host, as port]:
            pass
        case {"host" as host, "port" as port}:
            pass
        case {"host" as host}:
            pass
        case object{.host as host, .port as port}:
            pass
        case object{.host as host}:
            pass
        case str{} as addr:
            host, __, optional_port = addr.partition(":")
            if optional_port:
                port = optional_port
        case __ as m:
            raise TypeError(f"Unknown address format: {m!r:.200}")
    port = int(port)

At a high level, this PEP proposes to categorise the different available pattern types as follows:

  • wildcard pattern: __
  • group patterns: (PTRN)
  • value constraint patterns:
    • equality constraints: == EXPR
    • identity constraints: is EXPR
  • structural constraint patterns:
    • sequence constraint patterns: [PTRN, as NAME, PTRN as NAME]
    • mapping constraint patterns: {EXPR: PTRN, EXPR as NAME}
    • instance attribute constraint patterns: CLS{.NAME, .NAME: PTRN, .NAME == EXPR, .NAME as NAME}
    • class defined constraint patterns: CLS(PTRN, PTRN, **{.NAME, .NAME: PTRN, .NAME == EXPR, .NAME as NAME})
  • OR patterns: PTRN | PTRN | PTRN
  • AS patterns: PTRN as NAME (omitting the pattern implies __)

The intent of this approach is to:

  • allow an initial form of pattern matching to be developed and released without needing to decide up front on the best default options for handling bare names, attribute lookups, and literal values
  • ensure that pattern matching is defined explicitly at the Abstract Syntax Tree level, allowing the specifications of the semantics and the surface syntax for pattern matching to be clearly separated
  • define a clear and concise “ducktyping” syntax that could potentially be adopted in ordinary expressions as a way to more easily retrieve a tuple containing multiple attributes from the same object

Relative to PEP 634, the proposal also deliberately eliminates any syntax that “binds to the right” without using the as keyword (using capture patterns in PEP 634’s mapping patterns and class patterns) or binds to both the left and the right in the same pattern (using PEP 634’s capture patterns with AS patterns).

Relationship with other PEPs

This PEP both depends on and competes with PEP 634 - the PEP author agrees that match statements would be a sufficiently valuable addition to the language to be worth the additional complexity that they add to the learning process, but disagrees with the idea that “simple name vs literal or attribute lookup” really offers an adequate syntactic distinction between name binding and value lookup operations in match patterns (at least for Python).

This PEP agrees with the spirit of PEP 640 (that the chosen wildcard pattern to skip a name binding should be supported everywhere, not just in match patterns), but is now proposing a different spelling for the wildcard syntax (__ rather than ?). As such, it competes with PEP 640 as written, but would complement a proposal to deprecate the use of __ as an ordinary identifier and instead turn it into a general purpose wildcard marker that always skips making a new local variable binding.

While it has not yet been put forward as a PEP, Mark Shannon has a pre-PEP draft [8] expressing several concerns about the runtime semantics of the pattern matching proposal in PEP 634. This PEP is somewhat complementary to that one, as even though this PEP is mostly about surface syntax changes rather than major semantic changes, it does propose that the Abstract Syntax Tree definition be made more explicit to better separate the details of the surface syntax from the semantics of the code generation step. There is one specific idea in that pre-PEP draft that this PEP explicitly rejects: the idea that the different kinds of matching are mutually exclusive. It’s entirely possible for the same value to match different kinds of structural pattern, and which one takes precedence will intentionally be governed by the order of the cases in the match statement.

Motivation

The original PEP 622 (which was later split into PEP 634, PEP 635, and PEP 636) incorporated an unstated but essential assumption in its syntax design: that neither ordinary expressions nor the existing assignment target syntax provide an adequate foundation for the syntax used in match patterns.

While the PEP didn’t explicitly state this assumption, one of the PEP authors explained it clearly on python-dev [1]:

The actual problem that I see is that we have different cultures/intuitions fundamentally clashing here. In particular, so many programmers welcome pattern matching as an “extended switch statement” and find it therefore strange that names are binding and not expressions for comparison. Others argue that it is at odds with current assignment statements, say, and question why dotted names are _/not/_ binding. What all groups seem to have in common, though, is that they refer to _/their/_ understanding and interpretation of the new match statement as ‘consistent’ or ‘intuitive’ - naturally pointing out where we as PEP authors went wrong with our design.

But here is the catch: at least in the Python world, pattern matching as proposed by this PEP is an unprecedented and new way of approaching a common problem. It is not simply an extension of something already there. Even worse: while designing the PEP we found that no matter from which angle you approach it, you will run into issues of seeming ‘inconsistencies’ (which is to say that pattern matching cannot be reduced to a ‘linear’ extension of existing features in a meaningful way): there is always something that goes fundamentally beyond what is already there in Python. That’s why I argue that arguments based on what is ‘intuitive’ or ‘consistent’ just do not make sense _/in this case/_.

The first iteration of this PEP was then born out of an attempt to show that the second assertion was not accurate, and that match patterns could be treated as a variation on assignment targets without leading to inherent contradictions. (An earlier PR submitted to list this option in the “Rejected Ideas” section of the original PEP 622 had previously been declined [2].)

However, the review process for this PEP strongly suggested that not only did the contradictions that Tobias mentioned in his email exist, but they were also concerning enough to cast doubts on the syntax proposal presented in PEP 634. Accordingly, this PEP was changed to go even further than PEP 634, and largely abandon alignment between the sequence matching syntax and the existing iterable unpacking syntax (effectively answering “Not really, at least as far as the exact syntax is concerned” to the first question raised in the DLS’20 paper [9]: “Can we extend a feature like iterable unpacking to work for more general object and data layouts?”).

This resulted in a complete reversal of the goals of the PEP: rather than attempting to emphasise the similarities between assignment and pattern matching, the PEP now attempts to make sure that assignment target syntax isn’t being reused at all, reducing the likelihood of incorrect inferences being drawn about the new construct based on experience with existing ones.

Finally, before completing the 3rd iteration of the proposal (which dropped inferred patterns entirely), the PEP author spent quite a bit of time reflecting on the following entries in PEP 20:

  • Explicit is better than implicit.
  • Special cases aren’t special enough to break the rules.
  • In the face of ambiguity, refuse the temptation to guess.

If we start with an explicit syntax, we can always add syntactic shortcuts later (e.g. consider the recent proposals to add shortcuts for Union and Optional type hints only after years of experience with the original more verbose forms), while if we start out with only the abbreviated forms, then we don’t have any real way to revisit those decisions in a future release.

Specification

This PEP retains the overall match/case statement structure and semantics from PEP 634, but proposes multiple changes that mean that user intent is explicitly specified in the concrete syntax rather than needing to be inferred from the pattern matching context.

In the proposed Abstract Syntax Tree, the semantics are also always explicit,with no inference required.

The Match Statement

Surface syntax:

    match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
    subject_expr:
        | star_named_expression ',' star_named_expressions?
        | named_expression
    case_block: "case" (guarded_pattern | open_pattern) ':' block
    guarded_pattern: closed_pattern 'if' named_expression
    open_pattern:
        | as_pattern
        | or_pattern
    closed_pattern:
        | wildcard_pattern
        | group_pattern
        | structural_constraint

Abstract syntax:

    Match(expr subject, match_case* cases)
    match_case = (pattern pattern, expr? guard, stmt* body)

The rules star_named_expression, star_named_expressions, named_expression and block are part of the standard Python grammar.

Open patterns are patterns which consist of multiple tokens, and aren’t necessarily terminated by a closing delimiter (for example, __ as x, int() | bool()). To avoid ambiguity for human readers, their usage is restricted to top level patterns and to group patterns (which are patterns surrounded by parentheses).

Closed patterns are patterns which either consist of a single token (i.e. __), or else have a closing delimiter as a required part of their syntax (e.g. [as x, as y], object{.x as x, .y as y}).

As in PEP 634, the match and case keywords are soft keywords, i.e. they are not reserved words in other grammatical contexts (including at the start of a line if there is no colon where expected). This means that they are recognized as keywords when part of a match statement or case block only, and are allowed to be used in all other contexts as variable or argument names.

Unlike PEP 634, patterns are explicitly defined as a new kind of node in the abstract syntax tree - even when surface syntax is shared with existing expression nodes, a distinct abstract node is emitted by the parser.

For context, match_stmt is a new alternative for compound_statement in the surface syntax and Match is a new alternative for stmt in the abstract syntax.

Match Semantics

This PEP largely retains the overall pattern matching semantics proposed in PEP 634.

The proposed syntax for patterns changes significantly, and is discussed in detail below.

There are also some proposed changes to the semantics of class defined constraints (class patterns in PEP 634) to eliminate the need to special case any builtin types (instead, the introduction of dedicated syntax for instance attribute constraints allows the behaviour needed by those builtin types to be specified as applying to any type that sets __match_args__ to None).

Guards

This PEP retains the guard clause semantics proposed in PEP 634.

However, the syntax is changed slightly to require that when a guard clause is present, the case pattern must be a closed pattern.

This makes it clearer to the reader where the pattern ends and the guard clause begins. (This is mainly a potential problem with OR patterns, where the guard clause looks kind of like the start of a conditional expression in the final pattern. Actually doing that isn’t legal syntax, so there’s no ambiguity as far as the compiler is concerned, but the distinction may not be as clear to a human reader.)

Irrefutable case blocks

The definition of irrefutable case blocks changes slightly in this PEP relative to PEP 634, as capture patterns no longer exist as a separate concept from AS patterns.

Aside from that caveat, the handling of irrefutable cases is the same as in PEP 634:

  • wildcard patterns are irrefutable
  • AS patterns whose left-hand side is irrefutable are irrefutable
  • OR patterns containing at least one irrefutable pattern are irrefutable
  • parenthesized irrefutable patterns are irrefutable
  • a case block is considered irrefutable if it has no guard and its pattern is irrefutable
  • a match statement may have at most one irrefutable case block, and it must be last

Patterns

The top-level surface syntax for patterns is as follows:

    open_pattern: # Pattern may use multiple tokens with no closing delimiter
        | as_pattern
        | or_pattern

    as_pattern: [closed_pattern] pattern_as_clause

    or_pattern: '|'.simple_pattern+

    simple_pattern: # Subnode where "as" and "or" patterns must be parenthesised
        | closed_pattern
        | value_constraint

    closed_pattern: # Require a single token or a closing delimiter in pattern
        | wildcard_pattern
        | group_pattern
        | structural_constraint

As described above, the usage of open patterns is limited to top level case clauses and when parenthesised in a group pattern.

The abstract syntax for patterns explicitly indicates which elements are subpatterns and which elements are subexpressions or identifiers:

    pattern = MatchAlways
         | MatchValue(matchop op, expr value)
         | MatchSequence(pattern* patterns)
         | MatchMapping(expr* keys, pattern* patterns)
         | MatchAttrs(expr cls, identifier* attrs, pattern* patterns)
         | MatchClass(expr cls, pattern* patterns, identifier* extra_attrs, pattern* extra_patterns)

         | MatchRestOfSequence(identifier? target)
         -- A NULL entry in the MatchMapping key list handles capturing extra mapping keys

         | MatchAs(pattern? pattern, identifier target)
         | MatchOr(pattern* patterns)
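To make the shape of these nodes concrete, a few of them can be modelled as plain Python dataclasses. This is only an illustrative sketch: the real nodes would live in the ast module, and using strings for matchop and arbitrary Python objects in place of expr nodes is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


class Pattern:
    """Base class standing in for the abstract `pattern` node type."""


@dataclass
class MatchAlways(Pattern):
    pass


@dataclass
class MatchValue(Pattern):
    op: str      # "EqCheck" or "IdCheck" (stand-ins for the matchop nodes)
    value: Any   # stands in for an expr node


@dataclass
class MatchAs(Pattern):
    pattern: Optional[Pattern]  # None means the wildcard pattern is implied
    target: str


@dataclass
class MatchSequence(Pattern):
    patterns: List[Pattern]


@dataclass
class MatchOr(Pattern):
    patterns: List[Pattern]


# Hand-built tree for the surface syntax `[as host, as port]`
host_port = MatchSequence([MatchAs(None, "host"), MatchAs(None, "port")])

# Hand-built tree for the surface syntax `== 0 | is None`
zero_or_none = MatchOr([MatchValue("EqCheck", 0), MatchValue("IdCheck", None)])
```

The point of the explicit node types is that a consumer of the tree never has to infer from context whether a name is a binding target or a lookup.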

AS Patterns

Surface syntax:

    as_pattern: [closed_pattern] pattern_as_clause
    pattern_as_clause: 'as' pattern_capture_target
    pattern_capture_target: !"__" NAME !('.' | '(' | '=')

(Note: the name on the right may not be __.)

Abstract syntax:

    MatchAs(pattern? pattern, identifier target)

An AS pattern matches the closed pattern on the left of the as keyword against the subject. If this fails, the AS pattern fails. Otherwise, the AS pattern binds the subject to the name on the right of the as keyword and succeeds.

If no pattern to match is given, the wildcard pattern (__) is implied.

To avoid confusion with the wildcard pattern, the double underscore (__) is not permitted as a capture target (this is what !"__" expresses).

A capture pattern always succeeds. It binds the subject value to the name using the scoping rules for name binding established for named expressions in PEP 572. (Summary: the name becomes a local variable in the closest containing function scope unless there’s an applicable nonlocal or global statement.)

In a given pattern, a given name may be bound only once. This disallows for example case [as x, as x]: ... but allows case [as x] | (as x):

As an open pattern, the usage of AS patterns is limited to top level case clauses and when parenthesised in a group pattern. However, several of the structural constraints allow the use of pattern_as_clause in relevant locations to bind extracted elements of the matched subject to local variables. These are mostly represented in the abstract syntax tree as MatchAs nodes, aside from the dedicated MatchRestOfSequence node in sequence patterns.

OR Patterns

Surface syntax:

    or_pattern: '|'.simple_pattern+

    simple_pattern: # Subnode where "as" and "or" patterns must be parenthesised
        | closed_pattern
        | value_constraint

Abstract syntax:

    MatchOr(pattern* patterns)

When two or more patterns are separated by vertical bars (|), this is called an OR pattern. (A single simple pattern is just that.)

Only the final subpattern may be irrefutable.

Each subpattern must bind the same set of names.

An OR pattern matches each of its subpatterns in turn to the subject, until one succeeds. The OR pattern is then deemed to succeed. If none of the subpatterns succeed the OR pattern fails.

Subpatterns are mostly required to be closed patterns, but the parentheses may be omitted for value constraints.
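The left-to-right, first-success behaviour of OR patterns can be sketched in ordinary Python. Here each subpattern is modelled as a predicate function, which is an assumption of this sketch rather than anything the PEP defines; `match_or` is a hypothetical helper name.

```python
def match_or(subject, subpatterns):
    """Return the index of the first subpattern accepting the subject, else None."""
    for index, subpattern in enumerate(subpatterns):
        if subpattern(subject):
            return index  # later subpatterns are never tried
    return None  # every subpattern failed, so the OR pattern fails


# Roughly `== 0 | is None | == ""`, with lambdas standing in for subpatterns.
zero_none_or_empty = [
    lambda subject: subject == 0,
    lambda subject: subject is None,
    lambda subject: subject == "",
]

first_hit_for_false = match_or(False, zero_none_or_empty)  # False == 0 holds
```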

Value constraints

Surface syntax:

    value_constraint:
        | eq_constraint
        | id_constraint

    eq_constraint: '==' closed_expr
    id_constraint: 'is' closed_expr

    closed_expr: # Require a single token or a closing delimiter in expression
        | primary
        | closed_factor

    closed_factor: # "factor" is the main grammar node for these unary ops
        | '+' primary
        | '-' primary
        | '~' primary

Abstract syntax:

    MatchValue(matchop op, expr value)
    matchop = EqCheck | IdCheck

The rule primary is defined in the standard Python grammar, and only allows expressions that either consist of a single token, or else are required to end with a closing delimiter.

Value constraints replace PEP 634’s literal patterns and value patterns.

Equality constraints are written as == EXPR, while identity constraints are written as is EXPR.

An equality constraint succeeds if the subject value compares equal to the value given on the right, while an identity constraint succeeds only if they are the exact same object.
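The difference between the two kinds of constraint is the usual difference between == and is. A minimal sketch, using hypothetical helper functions in place of the proposed syntax:

```python
def eq_constraint(subject, value):
    """Models `== EXPR`: succeeds when the subject compares equal."""
    return bool(subject == value)


def id_constraint(subject, value):
    """Models `is EXPR`: succeeds only for the exact same object."""
    return subject is value


SENTINEL = object()

checks = {
    "1 == True": eq_constraint(1, True),        # equal by value
    "1 is True": id_constraint(1, True),        # but not the same object
    "1.0 == 1": eq_constraint(1.0, 1),
    "None is None": id_constraint(None, None),
    "new object is SENTINEL": id_constraint(object(), SENTINEL),
}
```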

The expressions to be compared against are largely restricted to either single tokens (e.g. names, strings, numbers, builtin constants), or else to expressions that are required to end with a closing delimiter.

The use of the high precedence unary operators is also permitted, as the risk of perceived ambiguity is low, and being able to specify negative numbers without parentheses is desirable.

When the same constraint expression occurs multiple times in the same match statement, the interpreter may cache the first value calculated and reuse it, rather than repeat the expression evaluation. (As for PEP 634 value patterns, this cache is strictly tied to a given execution of a given match statement.)

Unlike literal patterns in PEP 634, this PEP requires that complex literals be parenthesised to be accepted by the parser. See the Deferred Ideas section for discussion on that point.

If this PEP were to be adopted in preference to PEP 634, then all literal and value patterns would instead be written more explicitly as value constraints:

    # Literal patterns
    match number:
        case == 0:
            print("Nothing")
        case == 1:
            print("Just one")
        case == 2:
            print("A couple")
        case == -1:
            print("One less than nothing")
        case == (1-1j):
            print("Good luck with that...")

    # Additional literal patterns
    match value:
        case == True:
            print("True or 1")
        case == False:
            print("False or 0")
        case == None:
            print("None")
        case == "Hello":
            print("Text 'Hello'")
        case == b"World!":
            print("Binary 'World!'")

    # Matching by identity rather than equality
    SENTINEL = object()
    match value:
        case is True:
            print("True, not 1")
        case is False:
            print("False, not 0")
        case is None:
            print("None, following PEP 8 comparison guidelines")
        case is ...:
            print("May be useful when writing __getitem__ methods?")
        case is SENTINEL:
            print("Matches the sentinel by identity, not just value")

    # Matching against variables and attributes
    from enum import Enum
    class Sides(str, Enum):
        SPAM = "Spam"
        EGGS = "eggs"
        ...

    preferred_side = Sides.EGGS
    match entree[-1]:
        case == Sides.SPAM:  # Compares entree[-1] == Sides.SPAM.
            response = "Have you got anything without Spam?"
        case == preferred_side:  # Compares entree[-1] == preferred_side
            response = f"Oh, I love {preferred_side}!"
        case as side:  # Assigns side = entree[-1].
            response = f"Well, could I have their Spam instead of the {side} then?"

Note the == preferred_side example: using an explicit prefix marker on constraint expressions removes the restriction to only working with attributes or literals for value lookups.

The == (1-1j) example illustrates the use of parentheses to turn any subexpression into a closed one.

Wildcard Pattern

Surface syntax:

    wildcard_pattern: "__"

Abstract syntax:

    MatchAlways

A wildcard pattern always succeeds. As in PEP 634, it binds no name.

Where PEP 634 chooses the single underscore as its wildcard pattern for consistency with other languages, this PEP chooses the double underscore as that has a clearer path towards potentially being made consistent across the entire language, whereas that path is blocked for "_" by i18n related use cases.

Example usage:

    match sequence:
        case [__]:                     # any sequence with a single element
            return True
        case [as start, *__, as end]:  # a sequence with at least two elements
            return start == end
        case __:                       # anything
            return False

Group Patterns

Surface syntax:

    group_pattern: '(' open_pattern ')'

For the syntax of open_pattern, see Patterns above.

A parenthesized pattern has no additional syntax and is not represented in the abstract syntax tree. It allows users to add parentheses around patterns to emphasize the intended grouping, and to allow nesting of open patterns when the grammar requires a closed pattern.

Unlike PEP 634, there is no potential ambiguity with sequence patterns, as this PEP requires that all sequence patterns be written with square brackets.

Structural constraints

Surface syntax:

    structural_constraint:
        | sequence_constraint
        | mapping_constraint
        | attrs_constraint
        | class_constraint

Note: the separate “structural constraint” subcategory isn’t used in the abstract syntax tree, it’s merely used as a convenient grouping node in the surface syntax definition.

Structural constraints are patterns used to both make assertions about complex objects and to extract values from them.

These patterns may all bind multiple values, either through the use of nested AS patterns, or else through the use of pattern_as_clause elements included in the definition of the pattern.

Sequence constraints

Surface syntax:

    sequence_constraint: '[' [sequence_constraint_elements] ']'
    sequence_constraint_elements: ','.sequence_constraint_element+ ','?
    sequence_constraint_element:
        | star_pattern
        | simple_pattern
        | pattern_as_clause
    star_pattern: '*' (pattern_as_clause | wildcard_pattern)

    simple_pattern: # Subnode where "as" and "or" patterns must be parenthesised
        | closed_pattern
        | value_constraint

    pattern_as_clause: 'as' pattern_capture_target

Abstract syntax:

    MatchSequence(pattern* patterns)

    MatchRestOfSequence(identifier? target)

Sequence constraints allow items within a sequence to be checked andoptionally extracted.

A sequence pattern fails if the subject value is not an instance of collections.abc.Sequence. It also fails if the subject value is an instance of str, bytes or bytearray (see Deferred Ideas for a discussion on potentially removing the need for this special casing).

A sequence pattern may contain at most one star subpattern. The star subpattern may occur in any position and is represented in the AST using the MatchRestOfSequence node.

If no star subpattern is present, the sequence pattern is a fixed-length sequence pattern; otherwise it is a variable-length sequence pattern.

A fixed-length sequence pattern fails if the length of the subject sequence is not equal to the number of subpatterns.

A variable-length sequence pattern fails if the length of the subject sequence is less than the number of non-star subpatterns.

The length of the subject sequence is obtained using the builtin len() function (i.e., via the __len__ protocol). However, the interpreter may cache this value in a similar manner as described for value constraint expressions.

A fixed-length sequence pattern matches the subpatterns to corresponding items of the subject sequence, from left to right. Matching stops (with a failure) as soon as a subpattern fails. If all subpatterns succeed in matching their corresponding item, the sequence pattern succeeds.

A variable-length sequence pattern first matches the leading non-star subpatterns to the corresponding items of the subject sequence, as for a fixed-length sequence. If this succeeds, the star subpattern matches a list formed of the remaining subject items, with items removed from the end corresponding to the non-star subpatterns following the star subpattern. The remaining non-star subpatterns are then matched to the corresponding subject items, as for a fixed-length sequence.
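The sequence matching rules above can be sketched as an ordinary function. This is not the reference implementation: subpatterns are modelled as predicate functions, `STAR` is a hypothetical marker for the star subpattern, and the `(matched, rest)` return shape is an assumption chosen for the example.

```python
from collections.abc import Sequence

STAR = object()  # hypothetical marker standing in for the star subpattern


def match_sequence(subject, subpatterns):
    """Return (matched, rest), where rest is the list taken by the star subpattern."""
    if not isinstance(subject, Sequence) or isinstance(subject, (str, bytes, bytearray)):
        return False, []
    if subpatterns.count(STAR) > 1:
        raise ValueError("at most one star subpattern is allowed")
    length = len(subject)
    has_star = STAR in subpatterns
    if has_star:
        star_at = subpatterns.index(STAR)
        trailing = len(subpatterns) - star_at - 1
        if length < len(subpatterns) - 1:  # fewer items than non-star subpatterns
            return False, []
    else:
        star_at, trailing = len(subpatterns), 0
        if length != len(subpatterns):     # fixed-length pattern needs exact length
            return False, []
    # Leading subpatterns, left to right; stop at the first failure.
    for index in range(star_at):
        if not subpatterns[index](subject[index]):
            return False, []
    # The star subpattern takes whatever lies between leading and trailing items.
    rest = list(subject[star_at:length - trailing]) if has_star else []
    # Trailing subpatterns are matched against the end of the subject.
    for offset in range(trailing):
        item = subject[length - trailing + offset]
        if not subpatterns[star_at + 1 + offset](item):
            return False, []
    return True, rest


anything = lambda item: True
result = match_sequence([1, 2, 3, 4], [lambda x: x == 1, STAR, lambda x: x == 4])
```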

Subpatterns are mostly required to be closed patterns, but the parentheses may be omitted for value constraints. Sequence elements may also be captured unconditionally without parentheses.

Note: where PEP 634 allows all the same syntactic flexibility as iterable unpacking in assignment statements, this PEP restricts sequence patterns specifically to the square bracket form. Given that the open and parenthesised forms are far more popular than square brackets for iterable unpacking, this helps emphasise that iterable unpacking and sequence matching are not the same operation. It also avoids the parenthesised form’s ambiguity problem between single element sequence patterns and group patterns.

Mapping constraints

Surface syntax:

    mapping_constraint: '{' [mapping_constraint_elements] '}'
    mapping_constraint_elements: ','.key_value_constraint+ ','?
    key_value_constraint:
        | closed_expr pattern_as_clause
        | closed_expr ':' simple_pattern
        | double_star_capture
    double_star_capture: '**' pattern_as_clause

(Note that **__ is deliberately disallowed by this syntax, as additional mapping entries are ignored by default.)

closed_expr is defined above, under value constraints.

Abstract syntax:

    MatchMapping(expr* keys, pattern* patterns)

Mapping constraints allow keys and values within a mapping to be checked and values to optionally be extracted.

A mapping pattern fails if the subject value is not an instance of collections.abc.Mapping.

A mapping pattern succeeds if every key given in the mapping pattern is present in the subject mapping, and the pattern for each key matches the corresponding item of the subject mapping.

The presence of keys is checked using the two argument form of the get method and a unique sentinel value, which offers the following benefits:

  • no exceptions need to be created in the lookup process
  • mappings that implement __missing__ (such as collections.defaultdict) only match on keys that they already contain, they don’t implicitly add keys
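The effect of the sentinel-based lookup on a mapping that implements __missing__ can be seen with collections.defaultdict. The `has_key` helper and `_MISSING` sentinel are hypothetical names used only in this sketch.

```python
from collections import defaultdict

_MISSING = object()  # unique sentinel, never equal to a real value


def has_key(mapping, key):
    """Check key presence with the two argument form of get()."""
    return mapping.get(key, _MISSING) is not _MISSING


counts = defaultdict(int, {"host": 1})

found_port = has_key(counts, "port")   # get() does not trigger __missing__
keys_after_lookup = sorted(counts)

counts["port"]                         # plain indexing does add the key
keys_after_indexing = sorted(counts)
```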

A mapping pattern may not contain duplicate key values. If duplicate keys are detected when checking the mapping pattern, the pattern is considered invalid, and a ValueError is raised. While it would theoretically be possible to check for duplicated constant keys at compile time, no such check is currently defined or implemented.

(Note: This semantic description is derived from the PEP 634 reference implementation, which differs from the PEP 634 specification text at time of writing. The implementation seems reasonable, so amending the PEP text seems like the best way to resolve the discrepancy.)

If a '**' as NAME double star pattern is present, that name is bound to a dict containing any remaining key-value pairs from the subject mapping (the dict will be empty if there are no additional key-value pairs).

A mapping pattern may contain at most one double star pattern, and it must be last.
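Putting the mapping rules together, a simplified model might look like the following. It is a sketch under stated assumptions: value subpatterns are predicate functions, keys are assumed hashable, and `match_mapping` with its `(matched, rest)` return value is a hypothetical interface, not part of the proposal.

```python
from collections.abc import Mapping

_MISSING = object()


def match_mapping(subject, key_patterns, capture_rest=False):
    """key_patterns is a list of (key, predicate) pairs.

    Returns (matched, rest); rest is a dict only when capture_rest is true
    and the match succeeds, modelling a trailing `** as NAME` clause.
    """
    if not isinstance(subject, Mapping):
        return False, None
    seen = set()
    for key, _ in key_patterns:
        if key in seen:
            raise ValueError(f"mapping pattern checks duplicate key ({key!r})")
        seen.add(key)
    for key, predicate in key_patterns:
        value = subject.get(key, _MISSING)
        if value is _MISSING or not predicate(value):
            return False, None
    rest = None
    if capture_rest:
        # Remaining key-value pairs; empty when nothing is left over.
        rest = {k: v for k, v in subject.items() if k not in seen}
    return True, rest


anything = lambda value: True
config = {"host": "example.org", "port": 80, "tls": True}
matched, extras = match_mapping(config, [("host", anything)], capture_rest=True)
```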

Value subpatterns are mostly required to be closed patterns, but the parentheses may be omitted for value constraints (the : key/value separator is still required to ensure the entry doesn’t look like an ordinary comparison operation).

Mapping values may also be captured unconditionally using the KEY as NAME form, without either parentheses or the : key/value separator.

Instance attribute constraints

Surface syntax:

    attrs_constraint:
        | name_or_attr '{' [attrs_constraint_elements] '}'
    attrs_constraint_elements: ','.attr_value_pattern+ ','?
    attr_value_pattern:
        | '.' NAME pattern_as_clause
        | '.' NAME value_constraint
        | '.' NAME ':' simple_pattern
        | '.' NAME

Abstract syntax:

    MatchAttrs(expr cls, identifier* attrs, pattern* patterns)

Instance attribute constraints allow an instance’s type to be checked and attributes to optionally be extracted.

An instance attribute constraint may not repeat the same attribute name multiple times. Attempting to do so will result in a syntax error.

An instance attribute pattern fails if the subject is not an instance of name_or_attr. This is tested using isinstance().

If name_or_attr is not an instance of the builtin type, TypeError is raised.

If no attribute subpatterns are present, the constraint succeeds if the isinstance() check succeeds. Otherwise:

  • Each given attribute name is looked up as an attribute on the subject.
    • If this raises an exception other than AttributeError, the exception bubbles up.
    • If this raises AttributeError the constraint fails.
    • Otherwise, the subpattern associated with the keyword is matched against the attribute value. If no subpattern is specified, the wildcard pattern is assumed. If this fails, the constraint fails. If it succeeds, the match proceeds to the next attribute.
  • If all attribute subpatterns succeed, the constraint as a whole succeeds.
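The steps above can be sketched as a plain function. As an assumption of this sketch, attribute subpatterns are given as a dict mapping attribute names to predicate functions (or None for the implied wildcard), and the hypothetical `match_attrs` returns the extracted values or None on failure.

```python
from types import SimpleNamespace


def match_attrs(subject, cls, attr_patterns):
    """Return a dict of extracted attribute values, or None if the constraint fails."""
    if not isinstance(cls, type):
        raise TypeError("instance attribute constraint requires a class")
    if not isinstance(subject, cls):
        return None
    extracted = {}
    for name, predicate in attr_patterns.items():
        try:
            value = getattr(subject, name)  # other exceptions bubble up
        except AttributeError:
            return None
        if predicate is not None and not predicate(value):
            return None
        extracted[name] = value
    return extracted


addr = SimpleNamespace(host="example.org", port=8080)

# Roughly `object{.host as host, .port as port}`
both = match_attrs(addr, object, {"host": None, "port": None})

# Roughly `object{.host, .port == 80}`
wrong_port = match_attrs(addr, object, {"host": None, "port": lambda p: p == 80})
```

Using a dict for the attribute subpatterns mirrors the rule that an attribute name may not be repeated within one constraint.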

Instance attribute constraints allow ducktyping checks to be implemented by using object as the required instance type (e.g. case object{.host as host, .port as port}:).

The syntax being proposed here could potentially also be used as the basis for a new syntax for retrieving multiple attributes from an object instance in one assignment statement (e.g. host, port = addr{.host, .port}). See the Deferred Ideas section for further discussion of this point.
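For reference, the closest existing spelling of that multiple attribute retrieval uses operator.attrgetter from the standard library. The `addr` object below is a stand-in created only for this example.

```python
from operator import attrgetter
from types import SimpleNamespace

addr = SimpleNamespace(host="example.org", port=8080)

# Today's equivalent of the hypothetical `host, port = addr{.host, .port}`
host, port = attrgetter("host", "port")(addr)
```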

Class defined constraints

Surface syntax:

    class_constraint:
        | name_or_attr '(' ')'
        | name_or_attr '(' positional_patterns ','? ')'
        | name_or_attr '(' class_constraint_attrs ')'
        | name_or_attr '(' positional_patterns ',' class_constraint_attrs ')'
    positional_patterns: ','.positional_pattern+
    positional_pattern:
        | simple_pattern
        | pattern_as_clause
    class_constraint_attrs:
        | '**' '{' [attrs_constraint_elements] '}'

Abstract syntax:

    MatchClass(expr cls, pattern* patterns, identifier* extra_attrs, pattern* extra_patterns)

Class defined constraints allow a sequence of common attributes to be specified on a class and checked positionally, rather than needing to specify the attribute names in every related match pattern.

As for instance attribute patterns:

  • a class defined pattern fails if the subject is not an instance of name_or_attr. This is tested using isinstance().
  • if name_or_attr is not an instance of the builtin type, TypeError is raised.

Regardless of whether or not any arguments are present, the subject is checked for a __match_args__ attribute using the equivalent of getattr(cls, "__match_args__", _SENTINEL).

If this raises an exception the exception bubbles up.

If the returned value is not a list, tuple, orNone, the conversion failsandTypeError is raised at runtime.

This means that only types that actually define__match_args__ will beusable in class defined patterns. Types that don’t define__match_args__will still be usable in instance attribute patterns.

If__match_args__ isNone, then only a single positional subpattern ispermitted. Attempting to specify additional attribute patterns eitherpositionally or using the double star syntax will causeTypeError to beraised at runtime.

This positional subpattern is then matched against the entire subject, allowinga type check to be combined with another match pattern (e.g. checking boththe type and contents of a container, or the type and value of a number).

If __match_args__ is a list or tuple, then the class defined constraint is converted to an instance attributes constraint as follows:

  • if only the double star attribute constraints subpattern is present, matching proceeds as if for the equivalent instance attributes constraint.
  • if there are more positional subpatterns than the length of __match_args__ (as obtained using len()), TypeError is raised.
  • Otherwise, positional pattern i is converted to an attribute pattern using __match_args__[i] as the attribute name.
  • if any element in __match_args__ is not a string, TypeError is raised.
  • once the positional patterns have been converted to attribute patterns, then they are combined with any attribute constraints given in the double star attribute constraints subpattern, and matching proceeds as if for the equivalent instance attributes constraint.
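The lookup and conversion rules above can be approximated in ordinary Python. This is only an illustrative sketch, not the reference implementation: the helper name positional_attr_names and the example classes are hypothetical, and a class with no __match_args__ at all is treated as an error because the sentinel is not a list, tuple, or None.

```python
_SENTINEL = object()


def positional_attr_names(cls, num_positional):
    """Map positional subpatterns to attribute names via __match_args__.

    Returns None when __match_args__ is None, meaning the single
    positional subpattern is matched against the entire subject.
    """
    if not isinstance(cls, type):
        raise TypeError(f"{cls!r} is not a type")
    match_args = getattr(cls, "__match_args__", _SENTINEL)
    if match_args is None:
        if num_positional > 1:
            raise TypeError("only one positional subpattern is permitted")
        return None
    if not isinstance(match_args, (list, tuple)):
        # Also covers _SENTINEL, i.e. no __match_args__ defined at all
        raise TypeError("__match_args__ must be a list, tuple, or None")
    if num_positional > len(match_args):
        raise TypeError("too many positional subpatterns")
    if not all(isinstance(name, str) for name in match_args):
        raise TypeError("__match_args__ elements must be strings")
    # Positional pattern i is checked against attribute match_args[i]
    return list(match_args[:num_positional])


class Point:  # hypothetical example class
    __match_args__ = ("x", "y")


class Whole:  # hypothetical class opting in to whole-subject matching
    __match_args__ = None


print(positional_attr_names(Point, 2))
print(positional_attr_names(Whole, 1))
```

The returned names would then be merged with any explicitly named attribute constraints before matching continues as for an instance attributes constraint.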

Note: the __match_args__ is None handling in this PEP replaces the special casing of bool, bytearray, bytes, dict, float, frozenset, int, list, set, str, and tuple in PEP 634. However, the optimised fast path for those types is retained in the implementation.

Design Discussion

Requiring explicit qualification of simple names in match patterns

The first iteration of this PEP accepted the basic premise of PEP 634 that iterable unpacking syntax would provide a good foundation for defining a new syntax for pattern matching.

During the review process, however, two major and one minor ambiguity problems were highlighted that arise directly from that core assumption:

  • most problematically, when binding simple names by default is extended to PEP 634’s proposed class pattern syntax, the ATTR=TARGET_NAME construct binds to the right without using the as keyword, and uses the normal assignment-to-the-left sigil (=) to do it!
  • when binding simple names by default is extended to PEP 634’s proposed mapping pattern syntax, the KEY: TARGET_NAME construct binds to the right without using the as keyword
  • using a PEP 634 capture pattern together with an AS pattern (TARGET_NAME_1 as TARGET_NAME_2) gives an odd “binds to both the left and right” behaviour

The third revision of this PEP accounted for this problem by abandoning the alignment with iterable unpacking syntax, and instead requiring that all uses of bare simple names for anything other than a variable lookup be qualified by a preceding sigil or keyword:

  • as NAME: local variable binding
  • .NAME: attribute lookup
  • == NAME: variable lookup
  • is NAME: variable lookup
  • any other usage: variable lookup

The key benefit of this approach is that it makes interpretation of simple names in patterns a local activity: a leading as indicates a name binding, a leading . indicates an attribute lookup, and anything else is a variable lookup (regardless of whether we’re reading a subpattern or a subexpression).

With the syntax now proposed in this PEP, the problematic cases identified above no longer read poorly:

  • .ATTR as TARGET_NAME is more obviously a binding than ATTR=TARGET_NAME
  • KEY as TARGET_NAME is more obviously a binding than KEY: TARGET_NAME
  • (as TARGET_NAME_1) as TARGET_NAME_2 is more obviously two bindings than TARGET_NAME_1 as TARGET_NAME_2

Resisting the temptation to guess

PEP 635 looks at the way pattern matching is used in other languages, and attempts to use that information to make plausible predictions about the way pattern matching will be used in Python:

  • wanting to extract values to local names will probably be more common than wanting to match against values stored in local names
  • wanting comparison by equality will probably be more common than wanting comparison by identity
  • users will probably be able to at least remember that bare names bind values and attribute references look up values, even if they can’t figure that out for themselves without reading the documentation or having someone tell them

To be clear, I think these predictions actually are plausible. However, I also don’t think we need to guess about this up front: I think we can start out with a more explicit syntax that requires users to state their intent using a prefix marker (either as, ==, or is), and then reassess the situation in a few years based on how pattern matching is actually being used in Python.

At that point, we’ll be able to choose amongst at least the following options:

  • deciding the explicit syntax is concise enough, and not changing anything
  • adding inferred identity constraints for one or more of None, ..., True and False
  • adding inferred equality constraints for other literals (potentially including complex literals)
  • adding inferred equality constraints for attribute lookups
  • adding either inferred equality constraints or inferred capture patterns for bare names

All of those ideas could be considered independently on their own merits, rather than being a potential barrier to introducing pattern matching in the first place.

If any of these syntactic shortcuts were to eventually be introduced, they’d also be straightforward to explain in terms of the underlying more explicit syntax (the leading as, ==, or is would just be getting inferred by the parser, without the user needing to provide it explicitly). At the implementation level, only the parser should need to be changed, as the existing AST nodes could be reused.

Interaction with caching of attribute lookups in local variables

One of the major changes between this PEP and PEP 634 is to use == EXPR for equality constraint lookups, rather than only offering NAME.ATTR. The original motivation for this was to avoid the semantic conflict with regular assignment targets, where NAME.ATTR is already used in assignment statements to set attributes, so if NAME.ATTR were the only syntax for symbolic value matching, then we’re pre-emptively ruling out any future attempts to allow matching against single patterns using the existing assignment statement syntax. The current motivation is more about the general desire to avoid guessing about users’ intent, and instead requiring them to state it explicitly in the syntax.

However, even within match statements themselves, the name.attr syntax for value patterns has an undesirable interaction with local variable assignment, where routine refactorings that would be semantically neutral for any other Python statement introduce a major semantic change when applied to a PEP 634 style match statement.

Consider the following code:

    while value < self.limit:
        ...  # Some code that adjusts "value"

The attribute lookup can be safely lifted out of the loop and only performed once:

    _limit = self.limit
    while value < _limit:
        ...  # Some code that adjusts "value"

With the marker prefix based syntax proposal in this PEP, value constraints would be similarly tolerant of match patterns being refactored to use a local variable instead of an attribute lookup, with the following two statements being functionally equivalent:

    match expr:
        case {"key": == self.target}:
            ...  # Handle the case where 'expr["key"] == self.target'
        case __:
            ...  # Handle the non-matching case

    _target = self.target
    match expr:
        case {"key": == _target}:
            ...  # Handle the case where 'expr["key"] == self.target'
        case __:
            ...  # Handle the non-matching case

By contrast, when using PEP 634’s value and capture pattern syntaxes that omit the marker prefix, the following two statements wouldn’t be equivalent at all:

    # PEP 634's value pattern syntax
    match expr:
        case {"key": self.target}:
            ...  # Handle the case where 'expr["key"] == self.target'
        case _:
            ...  # Handle the non-matching case

    # PEP 634's capture pattern syntax
    _target = self.target
    match expr:
        case {"key": _target}:
            ...  # Matches any mapping with "key", binding its value to _target
        case _:
            ...  # Handle the non-matching case

This PEP ensures the original semantics are retained under this style of simplistic refactoring: use == name to force interpretation of the result as a value constraint, use as name for a name binding.

PEP 634’s proposal to offer only the shorthand syntax, with no explicitly prefixed form, means that the primary answer on offer is “Well, don’t do that, then, only compare against attributes in namespaces, don’t compare against simple names”.

PEP 622’s walrus pattern syntax had another odd interaction where it might not bind the same object as the exact same walrus expression in the body of the case clause, but PEP 634 fixed that discrepancy by replacing walrus patterns with AS patterns (where the fact that the value bound to the name on the RHS might not be the same value as returned by the LHS is a standard feature common to all uses of the “as” keyword).

Using existing comparison operators as the value constraint prefix

If the benefit of a dedicated value constraint prefix is accepted, then the next question is to ask exactly what that prefix should be.

The initially published version of this PEP proposed using the previously unused ? symbol as the prefix for equality constraints, and ?is as the prefix for identity constraints. When reviewing the PEP, Steven D’Aprano presented a compelling counterproposal [5] to use the existing comparison operators (== and is) instead.

There were a few concerns with == as a prefix that kept it from being chosen as the prefix in the initial iteration of the PEP:

  • for common use cases, it’s even more visually noisy than ?, as a lot of folks with PEP 8 trained aesthetic sensibilities are going to want to put a space between it and the following expression, effectively making it a 3 character prefix instead of 1
  • when used in a mapping pattern, there needs to be a space between the : key/value separator and the == prefix, or the tokeniser will split them up incorrectly (getting := and = instead of : and ==)
  • when used in an OR pattern, there needs to be a space between the | pattern separator and the == prefix, or the tokeniser will split them up incorrectly (getting |= and = instead of | and ==)
  • if used in a PEP 634 style class pattern, there needs to be a space between the = keyword separator and the == prefix, or the tokeniser will split them up incorrectly (getting == and = instead of = and ==)
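The token splitting described in these points can be observed with the standard library tokenize module, which applies the same longest-match rule as the compiler's tokeniser. The helper name operator_tokens is an illustrative choice, and the source fragments are not valid Python statements; they are only being tokenised, not parsed.

```python
import io
import tokenize


def operator_tokens(source):
    """Return the operator tokens found in a source fragment."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.OP]


# Without a separating space, the longest operator wins
print(operator_tokens('{"key":==1}\n'))   # ":=" then "="
print(operator_tokens('x|==1\n'))         # "|=" then "="
print(operator_tokens('C(attr===1)\n'))   # "==" then "="

# With a separating space, the intended tokens are produced
print(operator_tokens('{"key": ==1}\n'))  # ":" then "=="
```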

Rather than introducing a completely new symbol, Steven’s proposed resolution to this verbosity problem was to retain the ability to omit the prefix marker in syntactically unambiguous cases.

While the idea of omitting the prefix marker was accepted for the second revision of the proposal, it was dropped again in the third revision due to ambiguity concerns. Instead, the following points apply:

  • for class patterns, other syntax changes allow equality constraints to be written as .ATTR == EXPR, and identity constraints to be written as .ATTR is EXPR, both of which are quite easy to read
  • for mapping patterns, the extra syntactic noise is just tolerated (at least for now)
  • for OR patterns, the extra syntactic noise is just tolerated (at least for now). However, membership constraints may offer a future path to reducing the need to combine OR patterns with equality constraints (instead, the values to be checked against would be collected as a set, list, or tuple).

Given that perspective, PEP 635’s arguments against using ? as part of the pattern matching syntax held for this proposal as well, and so the PEP was amended accordingly.

Using __ as the wildcard pattern marker

PEP 635 makes a solid case that introducing ? solely as a wildcard pattern marker would be a bad idea. With the syntax for value constraints changed to use existing comparison operations rather than ? and ?is, that argument holds for this PEP as well.

However, as noted by Thomas Wouters in [6], PEP 634’s choice of _ remains problematic as it would likely mean that match patterns would have a permanent difference from all other parts of Python - the use of _ in software internationalisation and at the interactive prompt means that there isn’t really a plausible path towards using it as a general purpose “skipped binding” marker.

__ is an alternative “this value is not needed” marker drawn from a Stack Overflow answer [7] (originally posted by the author of this PEP) on the various meanings of _ in existing Python code.

This PEP also proposes adopting an implementation technique that limits the scope of the associated special casing of __ to the parser: defining a new AST node type (MatchAlways) specifically for wildcard markers, rather than passing it through to the AST as a Name node.

Within the parser, __ still means either a regular name or a wildcard marker in a match pattern depending on where you were in the parse tree, but within the rest of the compiler, Name("__") is still a normal variable name, while MatchAlways() is always a wildcard marker in a match pattern.

Unlike _, the lack of other use cases for __ means that there would be a plausible path towards restoring identifier handling consistency with the rest of the language by making __ mean “skip this name binding” everywhere in Python:

  • in the interpreter itself, deprecate loading variables with the name __. This would make reading from __ emit a deprecation warning, while writing to it would initially be unchanged. To avoid slowing down all name loads, this could be handled by having the compiler emit additional code for the deprecated name, rather than using a runtime check in the standard name loading opcodes.
  • after a suitable number of releases, change the parser to emit a new SkippedBinding AST node for all uses of __ as an assignment target, and update the rest of the compiler accordingly
  • consider making __ a true hard keyword rather than a soft keyword

This deprecation path couldn’t be followed for _, as there’s no way for the interpreter to distinguish between attempts to read back _ when nominally used as a “don’t care” marker, and legitimate reads of _ as either an i18n text translation function or as the last statement result at the interactive prompt.

Names starting with double-underscores are also already reserved for use by the language, whether that is for compile time constants (i.e. __debug__), special methods, or class attribute name mangling, so using __ here would be consistent with that existing approach.

Representing patterns explicitly in the Abstract Syntax Tree

PEP 634 doesn’t explicitly discuss how match statements should be represented in the Abstract Syntax Tree, instead leaving that detail to be defined as part of the implementation.

As a result, while the reference implementation of PEP 634 definitely works (and formed the basis of the reference implementation of this PEP), it does contain a significant design flaw: despite the notes in PEP 635 that patterns should be considered as distinct from expressions, the reference implementation goes ahead and represents them in the AST as expression nodes.

The result is an AST that isn’t very abstract at all: nodes that should be compiled completely differently (because they’re patterns rather than expressions) are represented the same way, and the type system of the implementation language (e.g. C for CPython) can’t offer any assistance in keeping track of which subnodes should be ordinary expressions and which should be subpatterns.

Rather than continuing with that approach, this PEP has instead defined a new explicit “pattern” node in the AST, which allows the patterns and their permitted subnodes to be defined explicitly in the AST itself, making the code implementing the new feature clearer, and allowing the C compiler to provide more assistance in keeping track of when the code generator is dealing with patterns or expressions.

This change in implementation approach is actually orthogonal to the surface syntax changes proposed in this PEP, so it could still be adopted even if the rest of the PEP were to be rejected.

Changes to sequence patterns

This PEP makes one notable change to sequence patterns relative to PEP 634:

  • only the square bracket form of sequence pattern is supported. Neither open (no delimiters) nor tuple style (parentheses as delimiters) sequence patterns are supported.

Relative to PEP 634, sequence patterns are also significantly affected by the change to require explicit qualification of capture patterns and value constraints, as it means case [a, b, c]: must instead be written as case [as a, as b, as c]: and case [0, 1]: must instead be written as case [== 0, == 1]:.

With the syntax for sequence patterns no longer being derived directly from the syntax for iterable unpacking, it no longer made sense to keep the syntactic flexibility that had been included in the original syntax proposal purely for consistency with iterable unpacking.

Allowing open and tuple style sequence patterns didn’t increase expressivity, only ambiguity of intent (especially relative to group patterns), and encouraged readers down the path of viewing pattern matching syntax as intrinsically linked to assignment target syntax (which the PEP 634 authors have stated multiple times is not a desirable path to have readers take, and a view the author of this PEP now shares, despite disagreeing with it originally).

Changes to mapping patterns

This PEP makes two notable changes to mapping patterns relative to PEP 634:

  • value capturing is written as KEY as NAME rather than as KEY: NAME
  • a wider range of keys are permitted: any “closed expression”, rather than only literals and attribute references

As discussed above, the first change is part of ensuring that all binding operations with the target name to the right of a subexpression or pattern use the as keyword.

The second change is mostly a matter of simplifying the parser and code generator code by reusing the existing expression handling machinery. The restriction to closed expressions is designed to help reduce ambiguity as to where the key expression ends and the match pattern begins. This mostly allows a superset of what PEP 634 allows, except that complex literals must be written in parentheses (at least for now).

Adapting PEP 635’s mapping pattern examples to the syntax proposed in this PEP:

    match json_pet:
        case {"type": == "cat", "name" as name, "pattern" as pattern}:
            return Cat(name, pattern)
        case {"type": == "dog", "name" as name, "breed" as breed}:
            return Dog(name, breed)
        case __:
            raise ValueError("Not a suitable pet")

    def change_red_to_blue(json_obj):
        match json_obj:
            case {'color': (== 'red' | == '#FF0000')}:
                json_obj['color'] = 'blue'
            case {'children' as children}:
                for child in children:
                    change_red_to_blue(child)

For reference, the equivalent PEP 634 syntax:

    match json_pet:
        case {"type": "cat", "name": name, "pattern": pattern}:
            return Cat(name, pattern)
        case {"type": "dog", "name": name, "breed": breed}:
            return Dog(name, breed)
        case _:
            raise ValueError("Not a suitable pet")

    def change_red_to_blue(json_obj):
        match json_obj:
            case {'color': ('red' | '#FF0000')}:
                json_obj['color'] = 'blue'
            case {'children': children}:
                for child in children:
                    change_red_to_blue(child)

Changes to class patterns

This PEP makes several notable changes to class patterns relative to PEP 634:

  • the syntactic alignment with class instantiation is abandoned as being actively misleading and unhelpful. Instead, a new dedicated syntax for checking additional attributes is introduced that draws inspiration from mapping patterns rather than class instantiation
  • a new dedicated syntax for simple ducktyping that will work for any class is introduced
  • the special casing of various builtin and standard library types is supplemented by a general check for the existence of a __match_args__ attribute with the value of None

As discussed above, the first change has two purposes:

  • it’s part of ensuring that all binding operations with the target name to the right of a subexpression or pattern use the as keyword. Using = to assign to the right is particularly problematic.
  • it’s part of ensuring that all uses of simple names in patterns have a prefix that indicates their purpose (in this case, a leading . to indicate an attribute lookup)

The syntactic alignment with class instantiation was also judged to be unhelpful in general, as class patterns are about matching patterns against attributes, while class instantiation is about matching call arguments to parameters in class constructors, which may not bear much resemblance to the resulting instance attributes at all.

The second change is intended to make it easier to use pattern matching for the “ducktyping” style checks that are already common in Python.

The concrete syntax proposal for these patterns then arose from viewing instances as mappings of attribute names to values, and combining the attribute lookup syntax (.ATTR), with the mapping pattern syntax {KEY: PATTERN} to give cls{.ATTR: PATTERN}.

Allowing cls{.ATTR} to mean the same thing as cls{.ATTR: __} was a matter of considering the leading . sufficient to render the name usage unambiguous (it’s clearly an attribute reference, whereas matching against a variable key in a mapping pattern would be arguably ambiguous).
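The "instance as a mapping of attribute names to values" view can be sketched as a plain function. This is a hand-written approximation of what a pattern such as cls{.host as host, .port as port} would check, not the proposed implementation; the function name match_instance_attrs and its return convention (a dict of captures, or None on failure) are assumptions made for the example.

```python
from types import SimpleNamespace

_MISSING = object()


def match_instance_attrs(subject, cls, attr_names):
    """Return {attr: value} if subject matches, otherwise None."""
    if not isinstance(cls, type):
        raise TypeError(f"{cls!r} is not a type")
    if not isinstance(subject, cls):
        return None
    captured = {}
    for name in attr_names:
        # A missing attribute means the pattern fails, like a missing key
        value = getattr(subject, name, _MISSING)
        if value is _MISSING:
            return None
        captured[name] = value
    return captured


address = SimpleNamespace(host="example.org", port=8080)
print(match_instance_attrs(address, object, ["host", "port"]))
print(match_instance_attrs(address, object, ["host", "scheme"]))
```

Using object as the class gives the ducktyping behaviour described above: any instance with the named attributes matches.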

The final change just supplements a CPython-internal-only check in the PEP 634 reference implementation by making it the default behaviour that classes get if they don’t define __match_args__ (the optimised fast path for the builtin and standard library types named in PEP 634 is retained).

Adapting the class matching example linked from PEP 635 shows that for purely positional class matching, the main impact comes from the changes to value constraints and name binding, not from the class matching changes:

    match expr:
        case BinaryOp(== '+', as left, as right):
            return eval_expr(left) + eval_expr(right)
        case BinaryOp(== '-', as left, as right):
            return eval_expr(left) - eval_expr(right)
        case BinaryOp(== '*', as left, as right):
            return eval_expr(left) * eval_expr(right)
        case BinaryOp(== '/', as left, as right):
            return eval_expr(left) / eval_expr(right)
        case UnaryOp(== '+', as arg):
            return eval_expr(arg)
        case UnaryOp(== '-', as arg):
            return -eval_expr(arg)
        case VarExpr(as name):
            raise ValueError(f"Unknown value of: {name}")
        case float() | int():
            return expr
        case __:
            raise ValueError(f"Invalid expression value: {repr(expr)}")

For reference, the equivalent PEP 634 syntax:

    match expr:
        case BinaryOp('+', left, right):
            return eval_expr(left) + eval_expr(right)
        case BinaryOp('-', left, right):
            return eval_expr(left) - eval_expr(right)
        case BinaryOp('*', left, right):
            return eval_expr(left) * eval_expr(right)
        case BinaryOp('/', left, right):
            return eval_expr(left) / eval_expr(right)
        case UnaryOp('+', arg):
            return eval_expr(arg)
        case UnaryOp('-', arg):
            return -eval_expr(arg)
        case VarExpr(name):
            raise ValueError(f"Unknown value of: {name}")
        case float() | int():
            return expr
        case _:
            raise ValueError(f"Invalid expression value: {repr(expr)}")

The changes to the class pattern syntax itself are more relevant when checking for named attributes and extracting their values without relying on __match_args__:

    match expr:
        case object{.host as host, .port as port}:
            pass
        case object{.host as host}:
            pass

Compare this to the PEP 634 equivalent, where it really isn’t clear which names are referring to attributes of the match subject and which names are referring to local variables:

    match expr:
        case object(host=host, port=port):
            pass
        case object(host=host):
            pass

In this specific case, that ambiguity doesn’t matter (since the attribute and variable names are the same), but in the general case, knowing which is which will be critical to reasoning correctly about the code being read.

Deferred Ideas

Inferred value constraints

As discussed above, this PEP doesn’t rule out the possibility of adding inferred equality and identity constraints in the future.

These could be particularly valuable for literals, as it is quite likely that many “magic” strings and numbers with self-evident meanings will be written directly into match patterns, rather than being stored in named variables. (Think constants like None, or obviously special numbers like 0 and 1, or strings where their contents are as descriptive as any variable name, rather than cryptic checks against opaque numbers like 739452)

Making some required parentheses optional

The PEP currently errs heavily on the side of requiring parentheses in the face of potential ambiguity.

However, there are a number of cases where it at least arguably goes too far, mostly involving AS patterns with an explicit pattern.

In any position that requires a closed pattern, AS patterns may end up starting with doubled parentheses, as the nested pattern is also required to be a closed pattern: ((OPENPTRN) as NAME)

Due to the requirement that the subpattern be closed, it should be reasonable in many of these cases (e.g. sequence pattern subpatterns) to accept CLOSED_PTRN as NAME directly.

Further consideration of this point has been deferred, as making required parentheses optional is a backwards compatible change, and hence relaxing the restrictions later can be considered on a case-by-case basis.

Accepting complex literals as closed expressions

PEP 634’s reference implementation includes a lot of special casing of binary operations in both the parser and the rest of the compiler in order to accept complex literals without accepting arbitrary binary numeric operations on literal values.

Ideally, this problem would be dealt with at the parser layer, with the parser directly emitting a Constant AST node prepopulated with a complex number. If that was the way things worked, then complex literals could be accepted through a similar mechanism to any other literal.

This isn’t how complex literals are handled, however. Instead, they’re passed through to the AST as regular BinOp nodes, and then the constant folding pass on the AST resolves them down to Constant nodes with a complex value.
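The way complex literals reach the AST can be checked directly with the ast module. The helper name top_level_node is an illustrative choice; ast.parse is called with its defaults, so no constant folding has been applied to the tree it returns.

```python
import ast


def top_level_node(source):
    """Name of the AST node type produced for a single expression."""
    return type(ast.parse(source, mode="eval").body).__name__


print(top_level_node("2j"))     # a lone imaginary number is a Constant
print(top_level_node("1+2j"))   # a "complex literal" is really a BinOp
print(top_level_node("-1"))     # negative numbers are a UnaryOp

binop = ast.parse("1+2j", mode="eval").body
print(type(binop.left).__name__, type(binop.right).__name__)
```

Only the imaginary part is a literal in its own right; the real part and the sign arrive as separate nodes combined by an ordinary operator.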

For the parser to resolve complex literals directly, the compiler would need to be able to tell the tokenizer to generate a distinct token type for imaginary numbers (e.g. INUMBER), which would then allow the parser to handle NUMBER + INUMBER and NUMBER - INUMBER separately from other binary operations.

Alternatively, a new ComplexNumber AST node type could be defined, which would allow the parser to notify the subsequent compiler stages that a particular node should specifically be a complex literal, rather than an arbitrary binary operation. Then the parser could accept NUMBER + NUMBER and NUMBER - NUMBER for that node, while letting the AST validation for ComplexNumber take care of ensuring that the real and imaginary parts of the literal were real and imaginary numbers as expected.

For now, this PEP has postponed dealing with this question, and instead just requires that complex literals be parenthesised in order to be used in value constraints and as mapping pattern keys.

Allowing negated constraints in match patterns

With the syntax proposed in this PEP, it isn’t permitted to write != expr or is not expr as a match pattern.

Both of these forms have clear potential interpretations as a negated equality constraint (i.e. x != expr) and a negated identity constraint (i.e. x is not expr).

However, it’s far from clear either form would come up often enough to justify the dedicated syntax, so the possible extension has been deferred pending further community experience with match statements.

Allowing membership checks in match patterns

The syntax used for equality and identity constraints would be straightforward to extend to membership checks: in container.

One downside of the proposals in both this PEP and PEP 634 is that checking for multiple values in the same case doesn’t look like any existing container membership check in Python:

    # PEP 634's literal patterns
    match value:
        case 0 | 1 | 2 | 3:
            ...

    # This PEP's equality constraints
    match value:
        case == 0 | == 1 | == 2 | == 3:
            ...

Allowing inferred equality constraints under this PEP would only make it look like the PEP 634 example; it still wouldn’t look like the equivalent if statement header (if value in {0, 1, 2, 3}:).

Membership constraints would provide a more explicit, but still concise, way to check if the match subject was present in a container, and it would look the same as an ordinary containment check:

    match value:
        case in {0, 1, 2, 3}:
            ...
        case in {one, two, three, four}:
            ...
        case in range(4):  # It would accept any container, not just literal sets
            ...

Such a feature would also be readily extensible to allow all kinds of case clauses without any further syntax updates, simply by defining __contains__ appropriately on a custom class definition.
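Since membership constraints are only a deferred idea, the sketch below emulates them with ordinary containment tests: each if branch stands in for a hypothetical "case in CONTAINER:" clause. The MultipleOf class and the classify function are invented for the example, and show how a custom __contains__ would plug into such a clause.

```python
class MultipleOf:
    """Hypothetical 'container' whose members are multiples of a factor."""

    def __init__(self, factor):
        self.factor = factor

    def __contains__(self, value):
        return isinstance(value, int) and value % self.factor == 0


def classify(value):
    # Each test mirrors what a "case in ...:" clause would evaluate
    if value in {0, 1, 2, 3}:
        return "small"
    if value in range(10, 20):
        return "teens"
    if value in MultipleOf(5):
        return "multiple of five"
    return "other"


for sample in (2, 15, 25, 7):
    print(sample, classify(sample))
```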

However, while this does seem like a useful extension, and a good way to resolve this PEP’s verbosity problem when combining multiple equality checks in an OR pattern, it isn’t essential to making match statements a valuable addition to the language, so it seems more appropriate to defer it to a separate proposal, rather than including it here.

Inferring a default type for instance attribute constraints

The dedicated syntax for instance attribute constraints means that object could be omitted from object{.ATTR} to give {.ATTR} without introducing any syntactic ambiguity (if no class was given, object would be implied, just as it is for the base class list in class definitions).

However, it’s far from clear saving six characters is worth making it harder to visually distinguish mapping patterns from instance attribute patterns, so allowing this has been deferred as a topic for possible future consideration.

Avoiding special cases in sequence patterns

Sequence patterns in both this PEP and PEP 634 currently special case str, bytes, and bytearray as specifically never matching a sequence pattern.

This special casing could potentially be removed if we were to define a new collections.abc.AtomicSequence abstract base class for types like these, where they’re conceptually a single item, but still implement the sequence protocol to allow random access to their component parts.
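A minimal sketch of how such an ABC might be used follows. AtomicSequence does not exist in collections.abc; it is defined locally here purely for illustration, and the helper name accepts_sequence_pattern is likewise hypothetical.

```python
from abc import ABC
from collections.abc import Sequence


class AtomicSequence(ABC):
    """Hypothetical marker ABC for sequences treated as single items."""


# Register the three types that are currently special cased
for atomic_type in (str, bytes, bytearray):
    AtomicSequence.register(atomic_type)


def accepts_sequence_pattern(subject):
    """Would this subject be eligible to match a sequence pattern?"""
    return isinstance(subject, Sequence) and not isinstance(subject, AtomicSequence)


for sample in ([1, 2], (1, 2), "ab", b"ab", bytearray(b"ab"), {"a": 1}):
    print(type(sample).__name__, accepts_sequence_pattern(sample))
```

With a marker like this, third party string-like types could opt out of sequence matching by registering themselves, rather than relying on a hardcoded list.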

Expression syntax to retrieve multiple attributes from an instance

The instance attribute pattern syntax has been designed such that it could be used as the basis for a general purpose syntax for retrieving multiple attributes from an object in a single expression:

    host, port = obj{.host, .port}

Similar to slice syntax only being allowed inside bracket subscripts, the .attr syntax for naming attributes would only be allowed inside brace subscripts.

This idea isn’t required for pattern matching to be useful, so it isn’t part of this PEP. However, it’s mentioned as a possible path towards making pattern matching feel more integrated into the rest of the language, rather than existing forever in its own completely separated world.
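For comparison, the closest existing spelling of obj{.host, .port} is operator.attrgetter, which returns a tuple when given more than one attribute name. The obj value below is a stand-in created for the example.

```python
from operator import attrgetter
from types import SimpleNamespace

obj = SimpleNamespace(host="example.org", port=8080)

# Today's equivalent of the hypothetical "host, port = obj{.host, .port}"
host, port = attrgetter("host", "port")(obj)
print(host, port)
```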

Expression syntax to retrieve multiple items from a container

If the brace subscript syntax were to be accepted for instance attribute pattern matching, and then subsequently extended to offer general purpose extraction of multiple attributes, then it could be extended even further to allow for retrieval of multiple items from containers based on the syntax used for mapping pattern matching:

    host, port = obj{"host", "port"}
    first, last = obj{0, -1}

Again, this idea isn’t required for pattern matching to be useful, so it isn’t part of this PEP. As with retrieving multiple attributes, however, it is included as an example of the proposed pattern matching syntax inspiring ideas for making object deconstruction easier in general.
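The existing counterpart for item retrieval is operator.itemgetter, which also returns a tuple when given several keys or indices. The address and values objects below are stand-ins created for the example.

```python
from operator import itemgetter

address = {"host": "example.org", "port": 8080, "scheme": "https"}
values = [10, 20, 30, 40]

# Today's equivalents of the hypothetical brace subscript forms
host, port = itemgetter("host", "port")(address)
first, last = itemgetter(0, -1)(values)
print(host, port, first, last)
```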

Rejected Ideas

Restricting permitted expressions in value constraints and mapping pattern keys

While it’s entirely technically possible to restrict the kinds of expressions permitted in value constraints and mapping pattern keys to just attribute lookups and constant literals (as PEP 634 does), there isn’t any clear runtime value in doing so, so this PEP proposes allowing any kind of primary expression (primary expressions are an existing node type in the grammar that includes things like literals, names, attribute lookups, function calls, container subscripts, parenthesised groups, etc), as well as high precedence unary operations (+, -, ~) on primary expressions.

While PEP 635 does emphasise several times that literal patterns and value patterns are not full expressions, it doesn’t ever articulate a concrete benefit that is obtained from that restriction (just a theoretical appeal to it being useful to separate static checks from dynamic checks, which a code style tool could still enforce, even if the compiler itself is more permissive).

The last time we imposed such a restriction was for decorator expressions and the primary outcome of that was that users had to put up with years of awkward syntactic workarounds (like nesting arbitrary expressions inside function calls that just returned their argument) to express the behaviour they wanted before the language definition was finally updated to allow arbitrary expressions and let users make their own decisions about readability.

The situation inPEP 634 that bears a resemblance to the situation with decoratorexpressions is that arbitrary expressions are technically supported in valuepatterns, they just require awkward workarounds where either all the values tomatch need to be specified in a helper class that is placed before the matchstatement:

# Allowing arbitrary match targets with PEP 634's value pattern syntaxclassmt:value=func()matchexpr:case(_,mt.value):...# Handle the case where 'expr[1] == func()'

Or else they need to be written as a combination of a capture pattern and a guard expression:

    # Allowing arbitrary match targets with PEP 634's guard expressions
    match expr:
        case (_, _matched) if _matched == func():
            ... # Handle the case where 'expr[1] == func()'

This PEP proposes not requiring any such workarounds, and instead supporting arbitrary value constraints from the start:

    match expr:
        case [__, ==func()]:
            ... # Handle the case where 'expr[1] == func()'

Whether actually writing that kind of code is a good idea would be a topic for style guides and code linters, not the language compiler.

In particular, if static analysers can’t follow certain kinds of dynamic checks, then they can limit the permitted expressions at analysis time, rather than the compiler restricting them at compile time.

There are also some kinds of expressions that are almost certain to give nonsensical results (e.g. yield, yield from, await) due to the pattern caching rule, where the number of times the constraint expression actually gets evaluated will be implementation dependent. Even here, the PEP takes the view of letting users write nonsense if they really want to.

Aside from the recently updated decorator expressions, another situation where Python’s formal syntax offers full freedom of expression that is almost never used in practice is in except clauses: the exceptions to match against almost always take the form of a simple name, a dotted name, or a tuple of those, but the language grammar permits arbitrary expressions at that point. This is a good indication that Python’s user base can be trusted to take responsibility for finding readable ways to use permissive language features, by avoiding writing hard to read constructs even when they’re permitted by the compiler.
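
As a small illustration of the freedom that except clauses already allow, the sketch below uses a function call where a name or tuple would normally appear. The names retryable_errors and lookup are hypothetical and exist only for this example.

```python
def retryable_errors():
    """Hypothetical helper that computes the exception types to catch."""
    return (KeyError, IndexError)


def lookup(container, key):
    try:
        return container[key]
    # The grammar accepts an arbitrary expression here, including a call.
    # It is evaluated only when an exception reaches this clause.
    except retryable_errors():
        return None
```

Code like this is legal but rarely written, which is the point being made: the permissive grammar has not led to widespread unreadable except clauses.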

This permissiveness comes with a real concrete benefit on the implementation side: dozens of lines of match statement specific code in the compiler are replaced by simple calls to the existing code for compiling expressions (including in the AST validation pass, the AST optimization pass, the symbol table analysis pass, and the code generation pass). This implementation benefit would accrue not just to CPython, but to every other Python implementation looking to add match statement support.

Requiring the use of constraint prefix markers for mapping pattern keys

The initial (unpublished) draft of this proposal suggested requiring mapping pattern keys be value constraints, just as PEP 634 requires that they be valid literal or value patterns:

    import constants

    match config:
        case {=="route": route}:
            process_route(route)
        case {==constants.DEFAULT_PORT: sub_config, **rest}:
            process_config(sub_config, rest)

However, the extra characters were syntactically noisy and unlike its use in value constraints (where it distinguishes them from non-pattern expressions), the prefix doesn’t provide any additional information here that isn’t already conveyed by the expression’s position as a key within a mapping pattern.

Accordingly, the proposal was simplified to omit the marker prefix from mapping pattern keys.

This omission also aligns with the fact that containers may incorporate both identity and equality checks into their lookup process - they don’t purely rely on equality checks, as would be incorrectly implied by the use of the equality constraint prefix.
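
The interplay of identity and equality in container lookups can be seen with a NaN key. This is a minimal sketch of CPython's behaviour; all variable names are illustrative.

```python
nan = float("nan")
other_nan = float("nan")  # a distinct NaN object

# NaN never compares equal to anything, including itself.
equal_to_itself = (nan == nan)

lookup_table = {nan: "found"}

# The dict lookup checks identity before equality, so the very same
# object is found even though it is not equal to itself.
found_by_identity = lookup_table.get(nan)

# A different NaN object is neither identical nor equal to the key,
# so this lookup falls through to the default.
found_by_equality = lookup_table.get(other_nan)

# List containment applies the same "identity first" shortcut.
contained = nan in [nan]
```

An == prefix on a mapping key would suggest a pure equality test, which is not what the lookup of nan above performs.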

Allowing the key/value separator to be omitted for mapping value constraints

Instance attribute patterns allow the : separator to be omitted when writing attribute value constraints like case object{.attr == expr}.

Offering a similar shorthand for mapping value constraints was considered, but permitting it allows thoroughly baffling constructs like case {0 == 0}: where the compiler knows this is the key 0 with the value constraint == 0, but a human reader sees the tautological comparison operation 0 == 0. With the key/value separator included, the intent is more obvious to a human reader as well: case {0: == 0}:

Reference Implementation

A draft reference implementation for this PEP [3] has been derived from Brandt Bucher’s reference implementation for PEP 634 [4].

Relative to the text of this PEP, the draft reference implementation has not yet complemented the special casing of several builtin and standard library types in MATCH_CLASS with the more general check for __match_args__ being set to None. Class defined patterns also currently still accept classes that don’t define __match_args__.

All other modified patterns have been updated to follow this PEP rather than PEP 634.

Unparsing for match patterns has not yet been migrated to the updated v3 AST.

The AST validator for match patterns has not yet been implemented.

The AST validator in general has not yet been reviewed to ensure that it is checking that only expression nodes are being passed in where expression nodes are expected.

The examples in this PEP have not yet been converted to test cases, so could plausibly contain typos and other errors.

Several of the old PEP 634 tests are still to be converted to new SyntaxError tests.

The documentation has not yet been updated.

Acknowledgments

The PEP 622 and PEP 634/PEP 635/PEP 636 authors, as the proposal in this PEP is merely an attempt to improve the readability of an already well-constructed idea by proposing that starting with a more explicit syntax and potentially introducing syntactic shortcuts for particularly common operations later is a better option than attempting to only define the shortcut version. For areas of the specification where the two PEPs are the same (or at least very similar), the text describing the intended behaviour in this PEP is often derived directly from the PEP 634 text.

Steven D’Aprano, who made a compelling case that the key goals of this PEP could be achieved by using existing comparison tokens to provide the ability to override the compiler when our guesses as to “what most users will want most of the time” are inevitably incorrect for at least some users some of the time, and retaining some of PEP 634’s syntactic sugar (with a slightly different semantic definition) to obtain the same level of brevity as PEP 634 in most situations. (Paul Sokolovsky also independently suggested using == instead of ? as a more easily understood prefix for equality constraints).

Thomas Wouters, whose publication of PEP 640 and public review of the structured pattern matching proposals persuaded the author of this PEP to continue advocating for a wildcard pattern syntax that a future PEP could plausibly turn into a hard keyword that always skips binding a reference in any location a simple name is expected, rather than continuing indefinitely as the match pattern specific soft keyword that is proposed here.

Joao Bueno and Jim Jewett for nudging the PEP author to take a closer look at the proposed syntax for subelement capturing within class patterns and mapping patterns (particularly the problems with “capturing to the right”). This review is what prompted the significant changes between v2 and v3 of the proposal.

References

[1] Post explaining the syntactic novelties in PEP 622 (https://mail.python.org/archives/list/python-dev@python.org/message/2VRPDW4EE243QT3QNNCO7XFZYZGIY6N3/)
[2] Declined pull request proposing to list this as a Rejected Idea in PEP 622 (https://github.com/python/peps/pull/1564)
[3] In-progress reference implementation for this PEP (https://github.com/ncoghlan/cpython/tree/pep-642-constraint-patterns)
[4] PEP 634 reference implementation (https://github.com/python/cpython/pull/22917)
[5] Steven D’Aprano’s cogent criticism of the first published iteration of this PEP (https://mail.python.org/archives/list/python-dev@python.org/message/BTHFWG6MWLHALOD6CHTUFPHAR65YN6BP/)
[6] Thomas Wouters’s initial review of the structured pattern matching proposals (https://mail.python.org/archives/list/python-dev@python.org/thread/4SBR3J5IQUYE752KR7C6432HNBSYKC5X/)
[7] Stack Overflow answer regarding the use cases for _ as an identifier (https://stackoverflow.com/questions/5893163/what-is-the-purpose-of-the-single-underscore-variable-in-python/5893946#5893946)
[8] Pre-publication draft of “Precise Semantics for Pattern Matching” (https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst)
[9] Kohn et al., Dynamic Pattern Matching with Python (https://gvanrossum.github.io/docs/PyPatternMatching.pdf)

Appendix A – Full Grammar

Here is the full modified grammar for match_stmt, replacing Appendix A in PEP 634.

Notation used beyond standard EBNF is as per PEP 534:

  • 'KWD' denotes a hard keyword
  • "KWD" denotes a soft keyword
  • SEP.RULE+ is shorthand for RULE (SEP RULE)*
  • !RULE is a negative lookahead assertion

    match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
    subject_expr:
        | star_named_expression ',' [star_named_expressions]
        | named_expression
    case_block: "case" (guarded_pattern | open_pattern) ':' block

    guarded_pattern: closed_pattern 'if' named_expression
    open_pattern: # Pattern may use multiple tokens with no closing delimiter
        | as_pattern
        | or_pattern

    as_pattern: [closed_pattern] pattern_as_clause
    as_pattern_with_inferred_wildcard: pattern_as_clause
    pattern_as_clause: 'as' pattern_capture_target
    pattern_capture_target: !"__" NAME !('.' | '(' | '=')

    or_pattern: '|'.simple_pattern+

    simple_pattern: # Subnode where "as" and "or" patterns must be parenthesised
        | closed_pattern
        | value_constraint

    value_constraint:
        | eq_constraint
        | id_constraint

    eq_constraint: '==' closed_expr
    id_constraint: 'is' closed_expr

    closed_expr: # Require a single token or a closing delimiter in expression
        | primary
        | closed_factor

    closed_factor: # "factor" is the main grammar node for these unary ops
        | '+' primary
        | '-' primary
        | '~' primary

    closed_pattern: # Require a single token or a closing delimiter in pattern
        | wildcard_pattern
        | group_pattern
        | structural_constraint

    wildcard_pattern: "__"

    group_pattern: '(' open_pattern ')'

    structural_constraint:
        | sequence_constraint
        | mapping_constraint
        | attrs_constraint
        | class_constraint

    sequence_constraint: '[' [sequence_constraint_elements] ']'
    sequence_constraint_elements: ','.sequence_constraint_element+ ','?
    sequence_constraint_element:
        | star_pattern
        | simple_pattern
        | as_pattern_with_inferred_wildcard
    star_pattern: '*' (pattern_as_clause | wildcard_pattern)

    mapping_constraint: '{' [mapping_constraint_elements] '}'
    mapping_constraint_elements: ','.key_value_constraint+ ','?
    key_value_constraint:
        | closed_expr pattern_as_clause
        | closed_expr ':' simple_pattern
        | double_star_capture
    double_star_capture: '**' pattern_as_clause

    attrs_constraint:
        | name_or_attr '{' [attrs_constraint_elements] '}'
    name_or_attr: attr | NAME
    attr: name_or_attr '.' NAME
    attrs_constraint_elements: ','.attr_value_constraint+ ','?
    attr_value_constraint:
        | '.' NAME pattern_as_clause
        | '.' NAME value_constraint
        | '.' NAME ':' simple_pattern
        | '.' NAME

    class_constraint:
        | name_or_attr '(' ')'
        | name_or_attr '(' positional_patterns ','? ')'
        | name_or_attr '(' class_constraint_attrs ')'
        | name_or_attr '(' positional_patterns ',' class_constraint_attrs ')'
    positional_patterns: ','.positional_pattern+
    positional_pattern:
        | simple_pattern
        | as_pattern_with_inferred_wildcard
    class_constraint_attrs:
        | '**' '{' [attrs_constraint_elements] '}'

Appendix B: Summary of Abstract Syntax Tree changes

The following new nodes are added to the AST by this PEP:

    stmt = ...
          | ...
          | Match(expr subject, match_case* cases)
          | ...
          ...

    match_case = (pattern pattern, expr? guard, stmt* body)

    pattern = MatchAlways
         | MatchValue(matchop op, expr value)
         | MatchSequence(pattern* patterns)
         | MatchMapping(expr* keys, pattern* patterns)
         | MatchAttrs(expr cls, identifier* attrs, pattern* patterns)
         | MatchClass(expr cls, pattern* patterns, identifier* extra_attrs, pattern* extra_patterns)

         | MatchRestOfSequence(identifier? target)
         -- A NULL entry in the MatchMapping key list handles capturing extra mapping keys

         | MatchAs(pattern? pattern, identifier target)
         | MatchOr(pattern* patterns)

          attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

    matchop = EqCheck | IdCheck

Appendix C: Summary of changes relative to PEP 634

The overall match/case statement syntax and the guard expression syntax remain the same as they are in PEP 634.

Relative to PEP 634 this PEP makes the following key changes:

  • a new pattern type is defined in the AST, rather than reusing the expr type for patterns
  • the new MatchAs and MatchOr AST nodes are moved from the expr type to the pattern type
  • the wildcard pattern changes from _ (single underscore) to __ (double underscore), and gains a dedicated MatchAlways node in the AST
  • due to ambiguity of intent, value patterns and literal patterns are removed
  • a new expression category is introduced: “closed expressions”
  • closed expressions are either primary expressions, or a closed expression preceded by one of the high precedence unary operators (+, -, ~)
  • a new pattern type is introduced: “value constraint patterns”
  • value constraints have a dedicated MatchValue AST node rather than allowing a combination of Constant (literals), UnaryOp (negative numbers), BinOp (complex numbers), and Attribute (attribute lookups)
  • value constraint patterns are either equality constraints or identity constraints
  • equality constraints use == as a prefix marker on an otherwise arbitrary closed expression: == EXPR
  • identity constraints use is as a prefix marker on an otherwise arbitrary closed expression: is EXPR
  • due to ambiguity of intent, capture patterns are removed. All capture operations use the as keyword (even in sequence matching) and are represented in the AST as either MatchAs or MatchRestOfSequence nodes.
  • to reduce verbosity in AS patterns, as NAME is permitted, with the same meaning as __ as NAME
  • sequence patterns change to require the use of square brackets, rather than offering the same syntactic flexibility as assignment targets (assignment statements allow iterable unpacking to be indicated by any use of a tuple separated target, with or without surrounding parentheses or square brackets)
  • sequence patterns gain a dedicated MatchSequence AST node rather than reusing List
  • mapping patterns change to allow arbitrary closed expressions as keys
  • mapping patterns gain a dedicated MatchMapping AST node rather than reusing Dict
  • to reduce verbosity in mapping patterns, KEY: __ as NAME may be shortened to KEY as NAME
  • class patterns no longer use individual keyword argument syntax for attribute matching. Instead they use double-star syntax, along with a variant on mapping pattern syntax with a dot prefix on the attribute names
  • class patterns gain a dedicated MatchClass AST node rather than reusing Call
  • to reduce verbosity, class attribute matching allows : to be omitted when the pattern to be matched starts with ==, is, or as
  • class patterns treat any class that sets __match_args__ to None as accepting a single positional pattern that is matched against the entire object (avoiding the special casing required in PEP 634)
  • class patterns raise TypeError when used with an object that does not define __match_args__
  • dedicated syntax for ducktyping is added, such that case cls{...}: is roughly equivalent to case cls(**{...}):, but skips the check for the existence of __match_args__. This pattern also has a dedicated AST node, MatchAttrs

Note that postponing literal patterns also makes it possible to postpone the question of whether we need an “INUMBER” token in the tokeniser for imaginary literals. Without it, the parser can’t distinguish complex literals from other binary addition and subtraction operations on constants, so proposals like PEP 634 have to do work in later compilation steps to check for correct usage.

Appendix D: History of changes to this proposal

The first published iteration of this proposal mostly followed PEP 634, but suggested using ?EXPR for equality constraints and ?is EXPR for identity constraints rather than PEP 634’s value patterns and literal patterns.

The second published iteration mostly adopted a counter-proposal from Steven D’Aprano that kept the PEP 634 style inferred constraints in many situations, but also allowed the use of == EXPR for explicit equality constraints, and is EXPR for explicit identity constraints.

The third published (and current) iteration dropped inferred patterns entirely, in an attempt to resolve the concerns with the fact that the patterns case {key: NAME}: and case cls(attr=NAME): would both bind NAME despite it appearing to the right of another subexpression without using the as keyword. The revised proposal also eliminates the possibility of writing case TARGET1 as TARGET2:, which would bind to both of the given names. Of those changes, the most concerning was case cls(attr=TARGET_NAME):, since it involved the use of = with the binding target on the right, the exact opposite of what happens in assignment statements, function calls, and function signature declarations.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0642.rst

Last modified: 2025-02-01 08:59:27 GMT

