Python Enhancement Proposals

PEP 681 – Data Class Transforms

Author: Erik De Bonte <erikd at microsoft.com>, Eric Traut <erictr at microsoft.com>
Sponsor: Jelle Zijlstra <jelle.zijlstra at gmail.com>
Discussions-To: Typing-SIG thread
Status: Final
Type: Standards Track
Topic: Typing
Created: 02-Dec-2021
Python-Version: 3.11
Post-History: 24-Apr-2021, 13-Dec-2021, 22-Feb-2022
Resolution: Python-Dev message

Important

This PEP is a historical document: see The dataclass_transform decorator and @typing.dataclass_transform for up-to-date specs and documentation. Canonical typing specs are maintained at the typing specs site; runtime typing behaviour is described in the CPython documentation.

See the typing specification update process for how to propose changes to the typing spec.

Abstract

PEP 557 introduced the dataclass to the Python stdlib. Several popular libraries have behaviors that are similar to dataclasses, but these behaviors cannot be described using standard type annotations. Such projects include attrs, pydantic, and object relational mapper (ORM) packages such as SQLAlchemy and Django.

Most type checkers, linters and language servers have full support for dataclasses. This proposal aims to generalize this functionality and provide a way for third-party libraries to indicate that certain decorator functions, classes, and metaclasses provide behaviors similar to dataclasses.

These behaviors include:

  • Synthesizing an __init__ method based on declared data fields.
  • Optionally synthesizing __eq__, __ne__, __lt__, __le__, __gt__ and __ge__ methods.
  • Supporting “frozen” classes, a way to enforce immutability during static type checking.
  • Supporting “field specifiers”, which describe attributes of individual fields that a static type checker must be aware of, such as whether a default value is provided for the field.

The full behavior of the stdlib dataclass is described in the Python documentation.

This proposal does not affect CPython directly except for the addition of a dataclass_transform decorator in typing.py.

Motivation

There is no existing, standard way for libraries with dataclass-like semantics to declare their behavior to type checkers. To work around this limitation, Mypy custom plugins have been developed for many of these libraries, but these plugins don’t work with other type checkers, linters or language servers. They are also costly to maintain for library authors, and they require that Python developers know about the existence of these plugins and download and configure them within their environment.

Rationale

The intent of this proposal is not to support every feature of every library with dataclass-like semantics, but rather to make it possible to use the most common features of these libraries in a way that is compatible with static type checking. If a user values these libraries and also values static type checking, they may need to avoid using certain features or make small adjustments to the way they use them. That’s already true for the Mypy custom plugins, which don’t support every feature of every dataclass-like library.

As new features are added to dataclasses in the future, we intend, when appropriate, to add support for those features on dataclass_transform as well. Keeping these two feature sets in sync will make it easier for dataclass users to understand and use dataclass_transform and will simplify the maintenance of dataclass support in type checkers.

Additionally, we will consider adding dataclass_transform support in the future for features that have been adopted by multiple third-party libraries but are not supported by dataclasses.

Specification

The dataclass_transform decorator

This specification introduces a new decorator function in the typing module named dataclass_transform. This decorator can be applied to either a function that is itself a decorator, a class, or a metaclass. The presence of dataclass_transform tells a static type checker that the decorated function, class, or metaclass performs runtime “magic” that transforms a class, endowing it with dataclass-like behaviors.

If dataclass_transform is applied to a function, using the decorated function as a decorator is assumed to apply dataclass-like semantics. If the function has overloads, the dataclass_transform decorator can be applied to the implementation of the function or any one, but not more than one, of the overloads. When applied to an overload, the dataclass_transform decorator still impacts all usage of the function.

If dataclass_transform is applied to a class, dataclass-like semantics will be assumed for any class that directly or indirectly derives from the decorated class or uses the decorated class as a metaclass. Attributes on the decorated class and its base classes are not considered to be fields.

Examples of each approach are shown in the following sections. Each example creates a CustomerModel class with dataclass-like semantics. The implementation of the decorated objects is omitted for brevity, but we assume that they modify classes in the following ways:

  • They synthesize an __init__ method using data fields declared within the class and its parent classes.
  • They synthesize __eq__ and __ne__ methods.

Type checkers supporting this PEP will recognize that the CustomerModel class can be instantiated using the synthesized __init__ method:

    # Using positional arguments
    c1 = CustomerModel(327, "John Smith")

    # Using keyword arguments
    c2 = CustomerModel(id=327, name="John Smith")

    # These calls will generate runtime errors and should be flagged as
    # errors by a static type checker.
    c3 = CustomerModel()
    c4 = CustomerModel(327, first_name="John")
    c5 = CustomerModel(327, "John Smith", 0)

Decorator function example

    _T = TypeVar("_T")

    # The ``create_model`` decorator is defined by a library.
    # This could be in a type stub or inline.
    @typing.dataclass_transform()
    def create_model(cls: Type[_T]) -> Type[_T]:
        cls.__init__ = ...
        cls.__eq__ = ...
        cls.__ne__ = ...
        return cls


    # The ``create_model`` decorator can now be used to create new model
    # classes, like this:
    @create_model
    class CustomerModel:
        id: int
        name: str

Class example

    # The ``ModelBase`` class is defined by a library. This could be in
    # a type stub or inline.
    @typing.dataclass_transform()
    class ModelBase: ...


    # The ``ModelBase`` class can now be used to create new model
    # subclasses, like this:
    class CustomerModel(ModelBase):
        id: int
        name: str

Metaclass example

    # The ``ModelMeta`` metaclass and ``ModelBase`` class are defined by
    # a library. This could be in a type stub or inline.
    @typing.dataclass_transform()
    class ModelMeta(type): ...

    class ModelBase(metaclass=ModelMeta): ...


    # The ``ModelBase`` class can now be used to create new model
    # subclasses, like this:
    class CustomerModel(ModelBase):
        id: int
        name: str

Decorator function and class/metaclass parameters

A decorator function, class, or metaclass that provides dataclass-like functionality may accept parameters that modify certain behaviors. This specification defines the following parameters that static type checkers must honor if they are used by a dataclass transform. Each of these parameters accepts a bool argument, and it must be possible for the bool value (True or False) to be statically evaluated.

  • eq, order, frozen, init and unsafe_hash are parameters supported in the stdlib dataclass, with meanings defined in PEP 557.
  • kw_only, match_args and slots are parameters supported in the stdlib dataclass, first introduced in Python 3.10.
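
Since these parameters carry their stdlib meanings, their effect can be sketched with the stdlib dataclass itself. The Version class below is invented for illustration; note that the arguments are literal True values, which is what makes them statically evaluable.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(order=True, frozen=True)
class Version:
    major: int
    minor: int


v1, v2 = Version(3, 10), Version(3, 11)
print(v1 < v2)  # order=True synthesizes __lt__ and friends

try:
    v1.major = 4  # frozen=True: assignment is rejected at runtime
except FrozenInstanceError:
    print("frozen")
```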

dataclass_transform parameters

Parameters to dataclass_transform allow for some basic customization of default behaviors:

    _T = TypeVar("_T")

    def dataclass_transform(
        *,
        eq_default: bool = True,
        order_default: bool = False,
        kw_only_default: bool = False,
        field_specifiers: tuple[type | Callable[..., Any], ...] = (),
        **kwargs: Any,
    ) -> Callable[[_T], _T]: ...

  • eq_default indicates whether the eq parameter is assumed to be True or False if it is omitted by the caller. If not specified, eq_default will default to True (the default assumption for dataclass).
  • order_default indicates whether the order parameter is assumed to be True or False if it is omitted by the caller. If not specified, order_default will default to False (the default assumption for dataclass).
  • kw_only_default indicates whether the kw_only parameter is assumed to be True or False if it is omitted by the caller. If not specified, kw_only_default will default to False (the default assumption for dataclass).
  • field_specifiers specifies a static list of supported classes that describe fields. Some libraries also supply functions to allocate instances of field specifiers, and those functions may also be specified in this tuple. If not specified, field_specifiers will default to an empty tuple (no field specifiers supported). The standard dataclass behavior supports only one type of field specifier called Field plus a helper function (field) that instantiates this class, so if we were describing the stdlib dataclass behavior, we would provide the tuple argument (dataclasses.Field, dataclasses.field).
  • kwargs allows arbitrary additional keyword args to be passed to dataclass_transform. This gives type checkers the freedom to support experimental parameters without needing to wait for changes in typing.py. Type checkers should report errors for any unrecognized parameters.

In the future, we may add additional parameters to dataclass_transform as needed to support common behaviors in user code. These additions will be made after reaching consensus on typing-sig rather than via additional PEPs.

The following sections provide additional examples showing how these parameters are used.

Decorator function example

    # Indicate that the ``create_model`` function assumes keyword-only
    # parameters for the synthesized ``__init__`` method unless it is
    # invoked with ``kw_only=False``. It always synthesizes order-related
    # methods and provides no way to override this behavior.
    @typing.dataclass_transform(kw_only_default=True, order_default=True)
    def create_model(
        *,
        frozen: bool = False,
        kw_only: bool = True,
    ) -> Callable[[Type[_T]], Type[_T]]: ...


    # Example of how this decorator would be used by code that imports
    # from this library:
    @create_model(frozen=True, kw_only=False)
    class CustomerModel:
        id: int
        name: str

Class example

    # Indicate that classes that derive from this class default to
    # synthesizing comparison methods.
    @typing.dataclass_transform(eq_default=True, order_default=True)
    class ModelBase:
        def __init_subclass__(
            cls,
            *,
            init: bool = True,
            frozen: bool = False,
            eq: bool = True,
            order: bool = True,
        ):
            ...


    # Example of how this class would be used by code that imports
    # from this library:
    class CustomerModel(
        ModelBase,
        init=False,
        frozen=True,
        eq=False,
        order=False,
    ):
        id: int
        name: str

Metaclass example

    # Indicate that classes that use this metaclass default to
    # synthesizing comparison methods.
    @typing.dataclass_transform(eq_default=True, order_default=True)
    class ModelMeta(type):
        def __new__(
            cls,
            name,
            bases,
            namespace,
            *,
            init: bool = True,
            frozen: bool = False,
            eq: bool = True,
            order: bool = True,
        ):
            ...

    class ModelBase(metaclass=ModelMeta): ...


    # Example of how this class would be used by code that imports
    # from this library:
    class CustomerModel(
        ModelBase,
        init=False,
        frozen=True,
        eq=False,
        order=False,
    ):
        id: int
        name: str

Field specifiers

Most libraries that support dataclass-like semantics provide one or more “field specifier” types that allow a class definition to provide additional metadata about each field in the class. This metadata can describe, for example, default values, or indicate whether the field should be included in the synthesized __init__ method.

Field specifiers can be omitted in cases where additional metadata is not required:

    @dataclass
    class Employee:
        # Field with no specifier
        name: str

        # Field that uses field specifier class instance
        age: Optional[int] = field(default=None, init=False)

        # Field with type annotation and simple initializer to
        # describe default value
        is_paid_hourly: bool = True

        # Not a field (but rather a class variable) because type
        # annotation is not provided.
        office_number = "unassigned"
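
The same class can be checked at runtime with the stdlib dataclasses.fields helper, which lists only the attributes treated as fields. This is a small sketch using the stdlib dataclass; the instance values are invented.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Employee:
    name: str
    age: Optional[int] = field(default=None, init=False)
    is_paid_hourly: bool = True
    office_number = "unassigned"  # no annotation, so not a field


print([f.name for f in fields(Employee)])
# ['name', 'age', 'is_paid_hourly']

e = Employee("Ann", is_paid_hourly=False)
print(e.age, e.office_number)  # None unassigned
```

Because age is declared with init=False, passing it to the constructor raises a TypeError.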

Field specifier parameters

Libraries that support dataclass-like semantics and support field specifier classes typically use common parameter names to construct these field specifiers. This specification formalizes the names and meanings of the parameters that must be understood for static type checkers. These standardized parameters must be keyword-only.

These parameters are a superset of those supported by dataclasses.field, excluding those that do not have an impact on type checking such as compare and hash.

Field specifier classes are allowed to use other parameters in their constructors, and those parameters can be positional and may use other names.

  • init is an optional bool parameter that indicates whether the field should be included in the synthesized __init__ method. If unspecified, init defaults to True. Field specifier functions can use overloads that implicitly specify the value of init using a literal bool value type (Literal[False] or Literal[True]).
  • default is an optional parameter that provides the default value for the field.
  • default_factory is an optional parameter that provides a runtime callback that returns the default value for the field. If neither default nor default_factory is specified, the field is assumed to have no default value and must be provided a value when the class is instantiated.
  • factory is an alias for default_factory. Stdlib dataclasses use the name default_factory, but attrs uses the name factory in many scenarios, so this alias is necessary for supporting attrs.
  • kw_only is an optional bool parameter that indicates whether the field should be marked as keyword-only. If true, the field will be keyword-only. If false, it will not be keyword-only. If unspecified, the value of the kw_only parameter on the object decorated with dataclass_transform will be used, or if that is unspecified, the value of kw_only_default on dataclass_transform will be used.
  • alias is an optional str parameter that provides an alternative name for the field. This alternative name is used in the synthesized __init__ method.

It is an error to specify more than one of default, default_factory and factory.

This example demonstrates the above:

    # Library code (within type stub or inline)
    # In this library, passing a resolver means that init must be False,
    # and the overload with Literal[False] enforces that.
    @overload
    def model_field(
        *,
        default: Optional[Any] = ...,
        resolver: Callable[[], Any],
        init: Literal[False] = False,
    ) -> Any: ...

    @overload
    def model_field(
        *,
        default: Optional[Any] = ...,
        resolver: None = None,
        init: bool = True,
    ) -> Any: ...

    @typing.dataclass_transform(
        kw_only_default=True,
        field_specifiers=(model_field,))
    def create_model(
        *,
        init: bool = True,
    ) -> Callable[[Type[_T]], Type[_T]]: ...


    # Code that imports this library:
    @create_model(init=False)
    class CustomerModel:
        id: int = model_field(resolver=lambda: 0)
        name: str

Runtime behavior

At runtime, the dataclass_transform decorator’s only effect is to set an attribute named __dataclass_transform__ on the decorated function or class to support introspection. The value of the attribute should be a dict mapping the names of the dataclass_transform parameters to their values.

For example:

    {
      "eq_default": True,
      "order_default": False,
      "kw_only_default": False,
      "field_specifiers": (),
      "kwargs": {}
    }

Dataclass semantics

Except where stated otherwise in this PEP, classes impacted by dataclass_transform, either by inheriting from a class that is decorated with dataclass_transform or by being decorated with a function decorated with dataclass_transform, are assumed to behave like stdlib dataclass.

This includes, but is not limited to, the following semantics:

  • Frozen dataclasses cannot inherit from non-frozen dataclasses. A class that has been decorated with dataclass_transform is considered neither frozen nor non-frozen, thus allowing frozen classes to inherit from it. Similarly, a class that directly specifies a metaclass that is decorated with dataclass_transform is considered neither frozen nor non-frozen.

    Consider these class examples:

        # ModelBase is not considered either "frozen" or "non-frozen"
        # because it is decorated with ``dataclass_transform``
        @typing.dataclass_transform()
        class ModelBase(): ...

        # Vehicle is considered non-frozen because it does not specify
        # "frozen=True".
        class Vehicle(ModelBase):
            name: str

        # Car is a frozen class that derives from Vehicle, which is a
        # non-frozen class. This is an error.
        class Car(Vehicle, frozen=True):
            wheel_count: int

    And these similar metaclass examples:

        @typing.dataclass_transform()
        class ModelMeta(type): ...

        # ModelBase is not considered either "frozen" or "non-frozen"
        # because it directly specifies ModelMeta as its metaclass.
        class ModelBase(metaclass=ModelMeta): ...

        # Vehicle is considered non-frozen because it does not specify
        # "frozen=True".
        class Vehicle(ModelBase):
            name: str

        # Car is a frozen class that derives from Vehicle, which is a
        # non-frozen class. This is an error.
        class Car(Vehicle, frozen=True):
            wheel_count: int

  • Field ordering and inheritance is assumed to follow the rules specified in PEP 557. This includes the effects of overrides (redefining a field in a child class that has already been defined in a parent class).
  • PEP 557 indicates that all fields without default values must appear before fields with default values. Although not explicitly stated in PEP 557, this rule is ignored when init=False, and this specification likewise ignores this requirement in that situation. Likewise, there is no need to enforce this ordering when keyword-only parameters are used for __init__, so the rule is not enforced if kw_only semantics are in effect.
  • As with dataclass, method synthesis is skipped if it would overwrite a method that is explicitly declared within the class. Method declarations on base classes do not cause method synthesis to be skipped.

    For example, if a class declares an __init__ method explicitly, an __init__ method will not be synthesized for that class.

  • KW_ONLY sentinel values are supported as described in the Python docs and bpo-43532.
  • ClassVar attributes are not considered dataclass fields and are ignored by dataclass mechanisms.

Undefined behavior

If multiple dataclass_transform decorators are found, either on a single function (including its overloads), a single class, or within a class hierarchy, the resulting behavior is undefined. Library authors should avoid these scenarios.

Reference Implementation

Pyright contains the reference implementation of type checker support for dataclass_transform. Pyright’s dataClasses.ts source file would be a good starting point for understanding the implementation.

The attrs and pydantic libraries are using dataclass_transform and serve as real-world examples of its usage.

Rejected Ideas

auto_attribs parameter

The attrs library supports an auto_attribs parameter that indicates whether class members decorated with PEP 526 variable annotations but with no assignment should be treated as data fields.

We considered supporting auto_attribs and a corresponding auto_attribs_default parameter, but decided against this because it is specific to attrs.

Django does not support declaring fields using type annotations only, so Django users who leverage dataclass_transform should be aware that they should always supply assigned values.

cmp parameter

The attrs library supports a bool parameter cmp that is equivalent to setting both eq and order to True. We chose not to support a cmp parameter, since it only applies to attrs. Users can emulate the cmp behaviour by using the eq and order parameter names instead.

Automatic field name aliasing

The attrs library performs automatic aliasing of field names that start with a single underscore, stripping the underscore from the name of the corresponding __init__ parameter.

This proposal omits that behavior since it is specific to attrs. Users can manually alias these fields using the alias parameter.

Alternate field ordering algorithms

The attrs library currently supports two approaches to ordering the fields within a class:

  • Dataclass order: The same ordering used by dataclasses. This is the default behavior of the older APIs (e.g. attr.s).
  • Method Resolution Order (MRO): This is the default behavior of the newer APIs (e.g. define, mutable, frozen). Older APIs (e.g. attr.s) can opt into this behavior by specifying collect_by_mro=True.

The resulting field orderings can differ in certain diamond-shaped multiple inheritance scenarios.

For simplicity, this proposal does not support any field ordering other than that used by dataclasses.

Fields redeclared in subclasses

The attrs library differs from stdlib dataclasses in how it handles inherited fields that are redeclared in subclasses. The dataclass specification preserves the original order, but attrs defines a new order based on subclasses.

For simplicity, we chose to only support the dataclass behavior. Users of attrs who rely on the attrs-specific ordering will not see the expected order of parameters in the synthesized __init__ method.

Django primary and foreign keys

Django applies additional logic for primary and foreign keys. For example, it automatically adds an id field (and __init__ parameter) if there is no field designated as a primary key.

As this is not broadly applicable to dataclass libraries, this additional logic is not accommodated with this proposal, so users of Django would need to explicitly declare the id field.

Class-wide default values

SQLAlchemy requested that we expose a way to specify that the default value of all fields in the transformed class is None. It is typical that all SQLAlchemy fields are optional, and None indicates that the field is not set.

We chose not to support this feature, since it is specific to SQLAlchemy. Users can manually set default=None on these fields instead.

Descriptor-typed field support

We considered adding a boolean parameter on dataclass_transform to enable better support for fields with descriptor types, which is common in SQLAlchemy. When enabled, the type of each parameter on the synthesized __init__ method corresponding to a descriptor-typed field would be the type of the value parameter to the descriptor’s __set__ method rather than the descriptor type itself. Similarly, when setting the field, the __set__ value type would be expected. And when getting the value of the field, its type would be expected to match the return type of __get__.

This idea was based on the belief that dataclass did not properly support descriptor-typed fields. In fact it does, but type checkers (at least mypy and pyright) did not reflect the runtime behavior which led to our misunderstanding. For more details, see the Pyright bug.
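
The runtime behavior in question can be sketched with a descriptor that converts assigned values to int. The IntConversionDescriptor and InventoryItem names are illustrative and modelled on the descriptor-typed field example in the stdlib dataclasses documentation.

```python
from dataclasses import dataclass


class IntConversionDescriptor:
    def __init__(self, *, default):
        self._default = default

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Class-level access: dataclass reads the field default here.
            return self._default
        return getattr(obj, self._name, self._default)

    def __set__(self, obj, value):
        # The synthesized __init__ assigns through the descriptor.
        setattr(obj, self._name, int(value))


@dataclass
class InventoryItem:
    quantity_on_hand: IntConversionDescriptor = IntConversionDescriptor(default=100)


item = InventoryItem()
print(item.quantity_on_hand)  # 100
item.quantity_on_hand = 2.5   # calls __set__ with 2.5
print(item.quantity_on_hand)  # 2
```

Here the __init__ parameter accepts whatever __set__ accepts, while reading the attribute yields the return type of __get__.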

converter field specifier parameter

The attrs library supports a converter field specifier parameter, which is a Callable that is called by the generated __init__ method to convert the supplied value to some other desired value. This is tricky to support since the parameter type in the synthesized __init__ method needs to accept unconverted values, but the resulting field is typed according to the output of the converter.

Some aspects of this issue are detailed in a Pyright discussion.

There may be no good way to support this because there’s not enough information to derive the type of the input parameter. One possible solution would be to add support for a converter field specifier parameter but then use the Any type for the corresponding parameter in the __init__ method.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0681.rst

Last modified: 2025-02-01 07:28:42 GMT

