Python Enhancement Proposals

PEP 634 – Structural Pattern Matching: Specification

Author:
Brandt Bucher <brandt at python.org>, Guido van Rossum <guido at python.org>
BDFL-Delegate:

Discussions-To:
Python-Dev list
Status:
Final
Type:
Standards Track
Created:
12-Sep-2020
Python-Version:
3.10
Post-History:
22-Oct-2020, 08-Feb-2021
Replaces:
622
Resolution:
Python-Committers message

Important

This PEP is a historical document. The up-to-date, canonical documentation can now be found at The match statement.

See PEP 1 for how to propose changes.

Abstract

This PEP provides the technical specification for the match statement. It replaces PEP 622, which is hereby split in three parts:

  • PEP 634: Specification
  • PEP 635: Motivation and Rationale
  • PEP 636: Tutorial

This PEP is intentionally devoid of commentary; the motivation and all explanations of our design choices are in PEP 635. First-time readers are encouraged to start with PEP 636, which provides a gentler introduction to the concepts, syntax and semantics of patterns.

Syntax and Semantics

See Appendix A for the complete grammar.

Overview and Terminology

The pattern matching process takes as input a pattern (following case) and a subject value (following match). Phrases to describe the process include “the pattern is matched with (or against) the subject value” and “we match the pattern against (or with) the subject value”.

The primary outcome of pattern matching is success or failure. In case of success we may say “the pattern succeeds”, “the match succeeds”, or “the pattern matches the subject value”.

In many cases a pattern contains subpatterns, and success or failure is determined by the success or failure of matching those subpatterns against the value (e.g., for OR patterns) or against parts of the value (e.g., for sequence patterns). This process typically processes the subpatterns from left to right until the overall outcome is determined. E.g., an OR pattern succeeds at the first succeeding subpattern, while a sequence pattern fails at the first failing subpattern.

A secondary outcome of pattern matching may be one or more name bindings. We may say “the pattern binds a value to a name”. When subpatterns are tried until the first success, only the bindings due to the successful subpattern are valid; when trying until the first failure, the bindings are merged. Several more rules, explained below, apply to these cases.

The Match Statement

Syntax:

    match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
    subject_expr:
        | star_named_expression ',' star_named_expressions?
        | named_expression
    case_block: "case" patterns [guard] ':' block
    guard: 'if' named_expression

The rules star_named_expression, star_named_expressions, named_expression and block are part of the standard Python grammar.

The rule patterns is specified below.

For context, match_stmt is a new alternative for compound_statement:

    compound_statement:
        | if_stmt
        ...
        | match_stmt

The match and case keywords are soft keywords, i.e. they are not reserved words in other grammatical contexts (including at the start of a line if there is no colon where expected). This implies that they are recognized as keywords when part of a match statement or case block only, and are allowed to be used in all other contexts as variable or argument names.

Match Semantics

The match statement first evaluates the subject expression. If a comma is present a tuple is constructed using the standard rules.

The resulting subject value is then used to select the first case block whose patterns succeed in matching it and whose guard condition (if present) is “truthy”. If no case blocks qualify the match statement is complete; otherwise, the block of the selected case block is executed. The usual rules for executing a block nested inside a compound statement apply (e.g. an if statement).

Name bindings made during a successful pattern match outlive the executed block and can be used after the match statement.

During failed pattern matches, some subpatterns may succeed. For example, while matching the pattern (0, x, 1) with the value [0, 1, 2], the subpattern x may succeed if the list elements are matched from left to right. The implementation may choose to either make persistent bindings for those partial matches or not. User code including a match statement should not rely on the bindings being made for a failed match, but also shouldn’t assume that variables are unchanged by a failed match. This part of the behavior is left intentionally unspecified so different implementations can add optimizations, and to prevent introducing semantic restrictions that could limit the extensibility of this feature.

The precise pattern binding rules vary per pattern type and are specified below.

Guards

If a guard is present on a case block, once the pattern or patterns in the case block succeed, the expression in the guard is evaluated. If this raises an exception, the exception bubbles up. Otherwise, if the condition is “truthy” the case block is selected; if it is “falsy” the case block is not selected.

Since guards are expressions they are allowed to have side effects. Guard evaluation must proceed from the first to the last case block, one at a time, skipping case blocks whose pattern(s) don’t all succeed. (I.e., even if determining whether those patterns succeed may happen out of order, guard evaluation must happen in order.) Guard evaluation must stop once a case block is selected.

Irrefutable case blocks

A pattern is considered irrefutable if we can prove from its syntax alone that it will always succeed. In particular, capture patterns and wildcard patterns are irrefutable, and so are AS patterns whose left-hand side is irrefutable, OR patterns containing at least one irrefutable pattern, and parenthesized irrefutable patterns.

A case block is considered irrefutable if it has no guard and its pattern is irrefutable.

A match statement may have at most one irrefutable case block, and it must be last.

Patterns

The top-level syntax for patterns is as follows:

    patterns: open_sequence_pattern | pattern
    pattern: as_pattern | or_pattern
    as_pattern: or_pattern 'as' capture_pattern
    or_pattern: '|'.closed_pattern+
    closed_pattern:
        | literal_pattern
        | capture_pattern
        | wildcard_pattern
        | value_pattern
        | group_pattern
        | sequence_pattern
        | mapping_pattern
        | class_pattern

AS Patterns

Syntax:

    as_pattern: or_pattern 'as' capture_pattern

(Note: the name on the right may not be _.)

An AS pattern matches the OR pattern on the left of the as keyword against the subject. If this fails, the AS pattern fails. Otherwise, the AS pattern binds the subject to the name on the right of the as keyword and succeeds.

OR Patterns

Syntax:

    or_pattern: '|'.closed_pattern+

When two or more patterns are separated by vertical bars (|), this is called an OR pattern. (A single closed pattern is just that.)

Only the final subpattern may be irrefutable.

Each subpattern must bind the same set of names.

An OR pattern matches each of its subpatterns in turn to the subject, until one succeeds. The OR pattern is then deemed to succeed. If none of the subpatterns succeed the OR pattern fails.

Literal Patterns

Syntax:

    literal_pattern:
        | signed_number
        | signed_number '+' NUMBER
        | signed_number '-' NUMBER
        | strings
        | 'None'
        | 'True'
        | 'False'
    signed_number: NUMBER | '-' NUMBER

The rule strings and the token NUMBER are defined in the standard Python grammar.

Triple-quoted strings are supported. Raw strings and byte strings are supported. F-strings are not supported.

The forms signed_number '+' NUMBER and signed_number '-' NUMBER are only permitted to express complex numbers; they require a real number on the left and an imaginary number on the right.

A literal pattern succeeds if the subject value compares equal to the value expressed by the literal, using the following comparison rules:

  • Numbers and strings are compared using the == operator.
  • The singleton literals None, True and False are compared using the is operator.

Capture Patterns

Syntax:

capture_pattern: !"_" NAME

The single underscore (_) is not a capture pattern (this is what !"_" expresses). It is treated as a wildcard pattern.

A capture pattern always succeeds. It binds the subject value to the name using the scoping rules for name binding established for the walrus operator in PEP 572. (Summary: the name becomes a local variable in the closest containing function scope unless there’s an applicable nonlocal or global statement.)

In a given pattern, a given name may be bound only once. This disallows for example case x, x: ... but allows case [x] | x: ....

Wildcard Pattern

Syntax:

    wildcard_pattern: "_"

A wildcard pattern always succeeds. It binds no name.

Value Patterns

Syntax:

    value_pattern: attr
    attr: name_or_attr '.' NAME
    name_or_attr: attr | NAME

The dotted name in the pattern is looked up using the standard Python name resolution rules. However, when the same value pattern occurs multiple times in the same match statement, the interpreter may cache the first value found and reuse it, rather than repeat the same lookup. (To clarify, this cache is strictly tied to a given execution of a given match statement.)

The pattern succeeds if the value found thus compares equal to the subject value (using the == operator).

Group Patterns

Syntax:

    group_pattern: '(' pattern ')'

(For the syntax of pattern, see Patterns above. Note that it contains no comma: a parenthesized series of items with at least one comma is a sequence pattern, as is ().)

A parenthesized pattern has no additional syntax. It allows users to add parentheses around patterns to emphasize the intended grouping.

Sequence Patterns

Syntax:

    sequence_pattern:
        | '[' [maybe_sequence_pattern] ']'
        | '(' [open_sequence_pattern] ')'
    open_sequence_pattern: maybe_star_pattern ',' [maybe_sequence_pattern]
    maybe_sequence_pattern: ','.maybe_star_pattern+ ','?
    maybe_star_pattern: star_pattern | pattern
    star_pattern: '*' (capture_pattern | wildcard_pattern)

(Note that a single parenthesized pattern without a trailing comma is a group pattern, not a sequence pattern. However a single pattern enclosed in [...] is still a sequence pattern.)

There is no semantic difference between a sequence pattern using [...], a sequence pattern using (...), and an open sequence pattern.

A sequence pattern may contain at most one star subpattern. The star subpattern may occur in any position. If no star subpattern is present, the sequence pattern is a fixed-length sequence pattern; otherwise it is a variable-length sequence pattern.

For a sequence pattern to succeed the subject must be a sequence, where being a sequence is defined as its class being one of the following:

  • a class that inherits from collections.abc.Sequence
  • a Python class that has been registered as a collections.abc.Sequence
  • a builtin class that has its Py_TPFLAGS_SEQUENCE bit set
  • a class that inherits from any of the above (including classes defined before a parent’s Sequence registration)

The following standard library classes will have their Py_TPFLAGS_SEQUENCE bit set:

  • array.array
  • collections.deque
  • list
  • memoryview
  • range
  • tuple

Note

Although str, bytes, and bytearray are usually considered sequences, they are not included in the above list and do not match sequence patterns.

A fixed-length sequence pattern fails if the length of the subject sequence is not equal to the number of subpatterns.

A variable-length sequence pattern fails if the length of the subject sequence is less than the number of non-star subpatterns.

The length of the subject sequence is obtained using the builtin len() function (i.e., via the __len__ protocol). However, the interpreter may cache this value in a similar manner as described for value patterns.

A fixed-length sequence pattern matches the subpatterns to corresponding items of the subject sequence, from left to right. Matching stops (with a failure) as soon as a subpattern fails. If all subpatterns succeed in matching their corresponding item, the sequence pattern succeeds.

A variable-length sequence pattern first matches the leading non-star subpatterns to the corresponding items of the subject sequence, as for a fixed-length sequence. If this succeeds, the star subpattern matches a list formed of the remaining subject items, with items removed from the end corresponding to the non-star subpatterns following the star subpattern. The remaining non-star subpatterns are then matched to the corresponding subject items, as for a fixed-length sequence.

Mapping Patterns

Syntax:

    mapping_pattern: '{' [items_pattern] '}'
    items_pattern: ','.key_value_pattern+ ','?
    key_value_pattern:
        | (literal_pattern | value_pattern) ':' pattern
        | double_star_pattern
    double_star_pattern: '**' capture_pattern

(Note that **_ is disallowed by this syntax.)

A mapping pattern may contain at most one double star pattern, and it must be last.

A mapping pattern may not contain duplicate key values. (If all key patterns are literal patterns this is considered a syntax error; otherwise this is a runtime error and will raise ValueError.)

For a mapping pattern to succeed the subject must be a mapping, where being a mapping is defined as its class being one of the following:

  • a class that inherits from collections.abc.Mapping
  • a Python class that has been registered as a collections.abc.Mapping
  • a builtin class that has its Py_TPFLAGS_MAPPING bit set
  • a class that inherits from any of the above (including classes defined before a parent’s Mapping registration)

The standard library classes dict and mappingproxy will have their Py_TPFLAGS_MAPPING bit set.

A mapping pattern succeeds if every key given in the mapping pattern is present in the subject mapping, and the pattern for each key matches the corresponding item of the subject mapping. Keys are always compared with the == operator. If a '**' NAME form is present, that name is bound to a dict containing remaining key-value pairs from the subject mapping.

If duplicate keys are detected in the mapping pattern, the pattern is considered invalid, and a ValueError is raised.

Key-value pairs are matched using the two-argument form of the subject’s get() method. As a consequence, matched key-value pairs must already be present in the mapping, and not created on-the-fly by __missing__ or __getitem__. For example, collections.defaultdict instances will only be matched by patterns with keys that were already present when the match statement was entered.

Class Patterns

Syntax:

    class_pattern:
        | name_or_attr '(' [pattern_arguments ','?] ')'
    pattern_arguments:
        | positional_patterns [',' keyword_patterns]
        | keyword_patterns
    positional_patterns: ','.pattern+
    keyword_patterns: ','.keyword_pattern+
    keyword_pattern: NAME '=' pattern

A class pattern may not repeat the same keyword multiple times.

If name_or_attr is not an instance of the builtin type, TypeError is raised.

A class pattern fails if the subject is not an instance of name_or_attr. This is tested using isinstance().

If no arguments are present, the pattern succeeds if the isinstance() check succeeds. Otherwise:

  • If only keyword patterns are present, they are processed as follows, one by one:
    • The keyword is looked up as an attribute on the subject.
      • If this raises an exception other than AttributeError, the exception bubbles up.
      • If this raises AttributeError the class pattern fails.
      • Otherwise, the subpattern associated with the keyword is matched against the attribute value. If this fails, the class pattern fails. If it succeeds, the match proceeds to the next keyword.
    • If all keyword patterns succeed, the class pattern as a whole succeeds.
  • If any positional patterns are present, they are converted to keyword patterns (see below) and treated as additional keyword patterns, preceding the syntactic keyword patterns (if any).

Positional patterns are converted to keyword patterns using the __match_args__ attribute on the class designated by name_or_attr, as follows:

  • For a number of built-in types (specified below), a single positional subpattern is accepted which will match the entire subject. (Keyword patterns work as for other types here.)
  • The equivalent of getattr(cls, "__match_args__", ()) is called.
  • If this raises an exception the exception bubbles up.
  • If the returned value is not a tuple, the conversion fails and TypeError is raised.
  • If there are more positional patterns than the length of __match_args__ (as obtained using len()), TypeError is raised.
  • Otherwise, positional pattern i is converted to a keyword pattern using __match_args__[i] as the keyword, provided the latter is a string; if it is not, TypeError is raised.
  • For duplicate keywords, TypeError is raised.

Once the positional patterns have been converted to keyword patterns, the match proceeds as if there were only keyword patterns.

As mentioned above, for the following built-in types the handling of positional subpatterns is different: bool, bytearray, bytes, dict, float, frozenset, int, list, set, str, and tuple.

This behavior is roughly equivalent to the following:

    class C:
        __match_args__ = ("__match_self_prop__",)
        @property
        def __match_self_prop__(self):
            return self

Side Effects and Undefined Behavior

The only side-effect produced explicitly by the matching process is the binding of names. However, the process relies on attribute access, instance checks, len(), equality and item access on the subject and some of its components. It also evaluates value patterns and the class name of class patterns. While none of those typically create any side-effects, in theory they could. This proposal intentionally leaves out any specification of what methods are called or how many times. This behavior is therefore undefined and user code should not rely on it.

Another undefined behavior is the binding of variables by capture patterns that are followed (in the same case block) by another pattern that fails. These may happen earlier or later depending on the implementation strategy, the only constraint being that capture variables must be set before guards that use them explicitly are evaluated. If a guard consists of an and clause, evaluation of the operands may even be interspersed with pattern matching, as long as left-to-right evaluation order is maintained.

The Standard Library

To facilitate the use of pattern matching, several changes will be made to the standard library:

  • Namedtuples and dataclasses will have auto-generated __match_args__.
  • For dataclasses the order of attributes in the generated __match_args__ will be the same as the order of corresponding arguments in the generated __init__() method. This includes the situations where attributes are inherited from a superclass. Fields with init=False are excluded from __match_args__.

In addition, a systematic effort will be put into going through existing standard library classes and adding __match_args__ where it looks beneficial.

Appendix A – Full Grammar

Here is the full grammar for match_stmt. This is an additional alternative for compound_stmt. Remember that match and case are soft keywords, i.e. they are not reserved words in other grammatical contexts (including at the start of a line if there is no colon where expected). By convention, hard keywords use single quotes while soft keywords use double quotes.

Other notation used beyond standard EBNF:

  • SEP.RULE+ is shorthand for RULE (SEP RULE)*
  • !RULE is a negative lookahead assertion

    match_stmt: "match" subject_expr ':' NEWLINE INDENT case_block+ DEDENT
    subject_expr:
        | star_named_expression ',' [star_named_expressions]
        | named_expression
    case_block: "case" patterns [guard] ':' block
    guard: 'if' named_expression
    patterns: open_sequence_pattern | pattern
    pattern: as_pattern | or_pattern
    as_pattern: or_pattern 'as' capture_pattern
    or_pattern: '|'.closed_pattern+
    closed_pattern:
        | literal_pattern
        | capture_pattern
        | wildcard_pattern
        | value_pattern
        | group_pattern
        | sequence_pattern
        | mapping_pattern
        | class_pattern
    literal_pattern:
        | signed_number !('+' | '-')
        | signed_number '+' NUMBER
        | signed_number '-' NUMBER
        | strings
        | 'None'
        | 'True'
        | 'False'
    signed_number: NUMBER | '-' NUMBER
    capture_pattern: !"_" NAME !('.' | '(' | '=')
    wildcard_pattern: "_"
    value_pattern: attr !('.' | '(' | '=')
    attr: name_or_attr '.' NAME
    name_or_attr: attr | NAME
    group_pattern: '(' pattern ')'
    sequence_pattern:
        | '[' [maybe_sequence_pattern] ']'
        | '(' [open_sequence_pattern] ')'
    open_sequence_pattern: maybe_star_pattern ',' [maybe_sequence_pattern]
    maybe_sequence_pattern: ','.maybe_star_pattern+ ','?
    maybe_star_pattern: star_pattern | pattern
    star_pattern: '*' (capture_pattern | wildcard_pattern)
    mapping_pattern: '{' [items_pattern] '}'
    items_pattern: ','.key_value_pattern+ ','?
    key_value_pattern:
        | (literal_pattern | value_pattern) ':' pattern
        | double_star_pattern
    double_star_pattern: '**' capture_pattern
    class_pattern:
        | name_or_attr '(' [pattern_arguments ','?] ')'
    pattern_arguments:
        | positional_patterns [',' keyword_patterns]
        | keyword_patterns
    positional_patterns: ','.pattern+
    keyword_patterns: ','.keyword_pattern+
    keyword_pattern: NAME '=' pattern

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0634.rst

Last modified: 2023-12-11 05:40:56 GMT