email.headerregistry: Custom Header Objects
Source code: Lib/email/headerregistry.py
Added in version 3.6: [1]
Headers are represented by customized subclasses of str. The particular class used to represent a given header is determined by the header_factory of the policy in effect when the headers are created. This section documents the particular header_factory implemented by the email package for handling RFC 5322 compliant email messages, which not only provides customized header objects for various header types, but also provides an extension mechanism for applications to add their own custom header types.
When using any of the policy objects derived from EmailPolicy, all headers are produced by HeaderRegistry and have BaseHeader as their last base class. Each header class has an additional base class that is determined by the type of the header. For example, many headers have the class UnstructuredHeader as their other base class. The specialized second class for a header is determined by the name of the header, using a lookup table stored in the HeaderRegistry. All of this is managed transparently for the typical application program, but interfaces are provided for modifying the default behavior for use by more complex applications.
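As a rough illustration of this class layering, the sketch below assigns a header on an EmailMessage (which uses an EmailPolicy instance by default) and inspects the resulting object. The subject text is made up for the example.

```python
from email.message import EmailMessage
from email.headerregistry import BaseHeader, UnstructuredHeader

# EmailMessage defaults to email.policy.default, an EmailPolicy instance.
msg = EmailMessage()
msg['Subject'] = 'Quarterly report'  # illustrative value

subject = msg['Subject']
header_class = type(subject)

# The header object is a str subclass whose last base class is BaseHeader.
is_str = isinstance(subject, str)
last_base = header_class.__bases__[-1]

# The specialized base class chosen for Subject derives from UnstructuredHeader.
is_unstructured = isinstance(subject, UnstructuredHeader)

print(header_class.__mro__)
```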
The sections below first document the header base classes and their attributes, followed by the API for modifying the behavior of HeaderRegistry, and finally the support classes used to represent the data parsed from structured headers.
- class email.headerregistry.BaseHeader(name, value)
name and value are passed to BaseHeader from the header_factory call. The string value of any header object is the value fully decoded to unicode.
This base class defines the following read-only properties:
- name
The name of the header (the portion of the field before the ':'). This is exactly the value passed in the header_factory call for name; that is, case is preserved.
- defects
A tuple of HeaderDefect instances reporting any RFC compliance problems found during parsing. The email package tries to be complete about detecting compliance issues. See the errors module for a discussion of the types of defects that may be reported.
- max_count
The maximum number of headers of this type that can have the same name. A value of None means unlimited. The BaseHeader value for this attribute is None; it is expected that specialized header classes will override this value as needed.
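The three properties can be inspected on any header obtained from a message. The following is a minimal sketch using the default policy; X-Note is a hypothetical header name chosen only because it has no entry in the default mapping.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg['SUBJECT'] = 'Status update'   # unusual capitalization on purpose
msg['X-Note'] = 'first note'       # hypothetical custom header name
msg['X-Note'] = 'second note'      # allowed: no limit for this header

subject = msg['subject']           # message lookup is case-insensitive
note = msg['X-Note']

print(subject.name)        # the name exactly as it was supplied
print(subject.defects)     # tuple of defects found while parsing
print(subject.max_count)   # Subject is limited to one occurrence
print(note.max_count)      # no limit for an unregistered header

# A second Subject would exceed max_count, so the message rejects it.
try:
    msg['Subject'] = 'Another subject'
    second_subject_error = None
except ValueError as exc:
    second_subject_error = exc
```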
BaseHeader also provides the following method, which is called by the email library code and should not in general be called by application programs:
- fold(*, policy)
Return a string containing linesep characters as required to correctly fold the header according to policy. A cte_type of 8bit will be treated as if it were 7bit, since headers may not contain arbitrary binary data. If utf8 is False, non-ASCII data will be RFC 2047 encoded.
BaseHeader by itself cannot be used to create a header object. It defines a protocol that each specialized header cooperates with in order to produce the header object. Specifically, BaseHeader requires that the specialized class provide a classmethod() named parse. This method is called as follows:
    parse(string, kwds)
kwds is a dictionary containing one pre-initialized key, defects. defects is an empty list. The parse method should append any detected defects to this list. On return, the kwds dictionary must contain values for at least the keys decoded and defects. decoded should be the string value for the header (that is, the header value fully decoded to unicode). The parse method should assume that string may contain content-transfer-encoded parts, but should correctly handle all valid unicode characters as well so that it can parse un-encoded header values.
BaseHeader's __new__ then creates the header instance, and calls its init method. The specialized class only needs to provide an init method if it wishes to set additional attributes beyond those provided by BaseHeader itself. Such an init method should look like this:
    def init(self, /, *args, **kw):
        self._myattr = kw.pop('myattr')
        super().init(*args, **kw)
That is, anything extra that the specialized class puts into the kwds dictionary should be removed and handled, and the remaining contents of kw (and args) passed to the BaseHeader init method.
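The parse/init protocol can be sketched with a small specialized class. WordCountHeader, its word_count attribute, and the X-Note header name are all hypothetical. To stay on safe ground the sketch subclasses UnstructuredHeader and reuses its parse method, on the assumption that it fills in everything BaseHeader expects in kwds.

```python
from email.headerregistry import BaseHeader, HeaderRegistry, UnstructuredHeader


class WordCountHeader(UnstructuredHeader):
    """Hypothetical header type that also records a word count."""

    @classmethod
    def parse(cls, value, kwds):
        # Let the unstructured parser populate the required keys.
        super().parse(value, kwds)
        # Add an extra key for our own init method to consume.
        kwds['word_count'] = len(kwds['decoded'].split())

    def init(self, /, *args, **kw):
        # Remove and handle the extra key, then pass the rest along.
        self._word_count = kw.pop('word_count')
        super().init(*args, **kw)

    @property
    def word_count(self):
        return self._word_count


registry = HeaderRegistry()
registry.map_to_type('X-Note', WordCountHeader)

note = registry('X-Note', 'three short words')
print(note, note.word_count)
```

The extra value travels from parse to init through kwds, which is why init must pop it before delegating.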
- class email.headerregistry.UnstructuredHeader
An "unstructured" header is the default type of header in RFC 5322. Any header that does not have a specified syntax is treated as unstructured. The classic example of an unstructured header is the Subject header.
In RFC 5322, an unstructured header is a run of arbitrary text in the ASCII character set. RFC 2047, however, has an RFC 5322 compatible mechanism for encoding non-ASCII text as ASCII characters within a header value. When a value containing encoded words is passed to the constructor, the UnstructuredHeader parser converts such encoded words into unicode, following the RFC 2047 rules for unstructured text. The parser uses heuristics to attempt to decode certain non-compliant encoded words. Defects are registered in such cases, as well as defects for issues such as invalid characters within the encoded words or the non-encoded text.
This header type provides no additional attributes.
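To see the encoded-word handling in action, the sketch below builds a Subject header directly through a HeaderRegistry. The subject text is invented; the encoded word is the UTF-8, quoted-printable form of "café".

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

# "=?utf-8?q?caf=C3=A9?=" is an RFC 2047 encoded word for "café".
subject = registry('Subject', 'Menu for =?utf-8?q?caf=C3=A9?= night')

decoded = str(subject)
print(decoded)          # the string value is the decoded text
print(subject.defects)  # any problems found while parsing
```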
- class email.headerregistry.DateHeader
RFC 5322 specifies a very specific format for dates within email headers. The DateHeader parser recognizes that date format, as well as recognizing a number of variant forms that are sometimes found "in the wild".
This header type provides the following additional attributes:
- datetime
If the header value can be recognized as a valid date of one form or another, this attribute will contain a datetime instance representing that date. If the timezone of the input date is specified as -0000 (indicating it is in UTC but contains no information about the source timezone), then datetime will be a naive datetime. If a specific timezone offset is found (including +0000), then datetime will contain an aware datetime that uses datetime.timezone to record the timezone offset.
The decoded value of the header is determined by formatting the datetime according to the RFC 5322 rules; that is, it is set to:
    email.utils.format_datetime(self.datetime)
When creating a DateHeader, value may be a datetime instance. This means, for example, that the following code is valid and does what one would expect:
    msg['Date'] = datetime(2011, 7, 15, 21)
Because this is a naive datetime it will be interpreted as a UTC timestamp, and the resulting value will have a timezone of -0000. Much more useful is to use the localtime() function from the utils module:
    msg['Date'] = utils.localtime()
This example sets the date header to the current time and date using the current timezone offset.
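The naive versus aware distinction can be checked with a short sketch. The date strings are illustrative, and the headers are created directly through a HeaderRegistry rather than by parsing a full message.

```python
from datetime import datetime, timedelta
from email.headerregistry import HeaderRegistry
from email.message import EmailMessage

registry = HeaderRegistry()

# A specific offset yields an aware datetime.
aware = registry('Date', 'Fri, 15 Jul 2011 21:00:00 +0200')
# -0000 means "UTC, source timezone unknown" and yields a naive datetime.
naive = registry('Date', 'Fri, 15 Jul 2011 21:00:00 -0000')

print(aware.datetime.utcoffset())
print(naive.datetime.tzinfo)

# Assigning a naive datetime formats the header with a -0000 timezone.
msg = EmailMessage()
msg['Date'] = datetime(2011, 7, 15, 21)
print(msg['Date'])
```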
- class email.headerregistry.AddressHeader
Address headers are one of the most complex structured header types. The AddressHeader class provides a generic interface to any address header.
This header type provides the following additional attributes:
- groups
A tuple of Group objects encoding the addresses and groups found in the header value. Addresses that are not part of a group are represented in this list as single-address Groups whose display_name is None.
- addresses
A tuple of Address objects encoding all of the individual addresses from the header value. If the header value contains any groups, the individual addresses from the group are included in the list at the point where the group occurs in the value (that is, the list of addresses is "flattened" into a one-dimensional list).
The decoded value of the header will have all encoded words decoded to unicode. idna encoded domain names are also decoded to unicode. The decoded value is set by joining the str value of the elements of the groups attribute with ', '.
A list of Address and Group objects in any combination may be used to set the value of an address header. Group objects whose display_name is None will be interpreted as single addresses, which allows an address list to be copied with groups intact by using the list obtained from the groups attribute of the source header.
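A short sketch may help make groups, addresses, and the copying behavior concrete. All names and addresses below are invented, and the headers are built directly through a HeaderRegistry.

```python
from email.headerregistry import Address, HeaderRegistry
from email.message import EmailMessage

registry = HeaderRegistry()

# One named group followed by a single address outside any group.
to = registry(
    'To',
    'Team: Ann <ann@example.com>, bob@example.com;, carol@example.com',
)

for group in to.groups:
    print(group.display_name, [a.addr_spec for a in group.addresses])

# addresses flattens the groups into one sequence.
flat = [a.addr_spec for a in to.addresses]

# Copy the address list, groups intact, into another header.
cc = registry('Cc', to.groups)

# Setting a header from Address objects on a message.
msg = EmailMessage()
msg['To'] = [
    Address('Ann', 'ann', 'example.com'),
    Address(username='bob', domain='example.com'),
]
```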
- class email.headerregistry.SingleAddressHeader
A subclass of AddressHeader that adds one additional attribute:
- address
The single address encoded by the header value. If the header value actually contains more than one address (which would be a violation of the RFC under the default policy), accessing this attribute will result in a ValueError.
Many of the above classes also have a Unique variant (for example, UniqueUnstructuredHeader). The only difference is that in the Unique variant, max_count is set to 1.
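The sketch below exercises the address attribute and the Unique variant through the Sender header. The addresses are invented, and the second header is deliberately invalid to trigger the error.

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()

sender = registry('Sender', 'Ann <ann@example.com>')
print(sender.address.addr_spec)
print(sender.max_count)  # Sender uses a Unique variant

# Two addresses in a single-address header: parsing succeeds,
# but reading .address raises ValueError.
bad = registry('Sender', 'ann@example.com, bob@example.com')
try:
    bad.address
    address_error = None
except ValueError as exc:
    address_error = exc
```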
- class email.headerregistry.MIMEVersionHeader
There is really only one valid value for the MIME-Version header, and that is 1.0. For future proofing, this header class supports other valid version numbers. If a version number has a valid value per RFC 2045, then the header object will have non-None values for the following attributes:
- version
The version number as a string, with any whitespace and/or comments removed.
- major
The major version number as an integer.
- minor
The minor version number as an integer.
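For the common case, the three attributes look like this (a minimal sketch, creating the header directly through a HeaderRegistry):

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()
mime_version = registry('MIME-Version', '1.0')

print(mime_version.version)  # version as a string
print(mime_version.major)    # major part as an int
print(mime_version.minor)    # minor part as an int
```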
- class email.headerregistry.ParameterizedMIMEHeader
MIME headers all start with the prefix 'Content-'. Each specific header has a certain value, described under the class for that header. Some can also take a list of supplemental parameters, which have a common format. This class serves as a base for all the MIME headers that take parameters.
- params
A dictionary mapping parameter names to parameter values.
- class email.headerregistry.ContentTypeHeader
A ParameterizedMIMEHeader class that handles the Content-Type header.
- content_type
The content type string, in the form maintype/subtype.
- maintype
- subtype
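A minimal sketch of these attributes, using an illustrative Content-Type value and creating the header directly through a HeaderRegistry:

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()
content_type = registry('Content-Type', 'text/plain; charset="utf-8"')

print(content_type.content_type)       # maintype/subtype
print(content_type.maintype)
print(content_type.subtype)
print(content_type.params['charset'])  # parameter value, quoting removed
```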
- class email.headerregistry.ContentDispositionHeader
A ParameterizedMIMEHeader class that handles the Content-Disposition header.
- content_disposition
inline and attachment are the only valid values in common use.
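The same pattern applies to Content-Disposition. The filename below is invented for the sketch.

```python
from email.headerregistry import HeaderRegistry

registry = HeaderRegistry()
disposition = registry(
    'Content-Disposition', 'attachment; filename="report.pdf"'
)

print(disposition.content_disposition)
print(disposition.params['filename'])
```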
- class email.headerregistry.ContentTransferEncodingHeader
Handles the Content-Transfer-Encoding header.
- class email.headerregistry.HeaderRegistry(base_class=BaseHeader, default_class=UnstructuredHeader, use_default_map=True)
This is the factory used by EmailPolicy by default. HeaderRegistry builds the class used to create a header instance dynamically, using base_class and a specialized class retrieved from a registry that it holds. When a given header name does not appear in the registry, the class specified by default_class is used as the specialized class. When use_default_map is True (the default), the standard mapping of header names to classes is copied into the registry during initialization. base_class is always the last class in the generated class's __bases__ list.
The default mappings are:
- subject: UniqueUnstructuredHeader
- date: UniqueDateHeader
- resent-date: DateHeader
- orig-date: UniqueDateHeader
- sender: UniqueSingleAddressHeader
- resent-sender: SingleAddressHeader
- to: UniqueAddressHeader
- resent-to: AddressHeader
- cc: UniqueAddressHeader
- resent-cc: AddressHeader
- bcc: UniqueAddressHeader
- resent-bcc: AddressHeader
- from: UniqueAddressHeader
- resent-from: AddressHeader
- reply-to: UniqueAddressHeader
- mime-version: MIMEVersionHeader
- content-type: ContentTypeHeader
- content-disposition: ContentDispositionHeader
- content-transfer-encoding: ContentTransferEncodingHeader
- message-id: MessageIDHeader
HeaderRegistry has the following methods:
- map_to_type(self, name, cls)
name is the name of the header to be mapped. It will be converted to lower case in the registry. cls is the specialized class to be used, along with base_class, to create the class used to instantiate headers that match name.
- __getitem__(name)
Construct and return a class to handle creating a name header.
- __call__(name, value)
Retrieves the specialized header associated with name from the registry (using default_class if name does not appear in the registry) and composes it with base_class to produce a class, calls the constructed class's constructor, passing it the same argument list, and finally returns the class instance created thereby.
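The three methods can be tried together in a short sketch. X-Expires is a hypothetical header name mapped to DateHeader for illustration, and passing the registry to EmailPolicy through its header_factory argument is one possible way to put it to use.

```python
from datetime import datetime, timezone
from email.headerregistry import (
    BaseHeader, DateHeader, HeaderRegistry, UnstructuredHeader,
)
from email.message import EmailMessage
from email.policy import EmailPolicy

registry = HeaderRegistry()
registry.map_to_type('X-Expires', DateHeader)  # hypothetical header name

# __getitem__ builds a class; the lookup is case-insensitive.
expires_class = registry['x-expires']

# Use the customized registry as the header_factory of a policy.
custom_policy = EmailPolicy(header_factory=registry)
msg = EmailMessage(policy=custom_policy)
msg['X-Expires'] = datetime(2030, 1, 1, tzinfo=timezone.utc)
expires = msg['X-Expires']
print(expires, expires.datetime)

# Without the default map, every header falls back to default_class.
bare = HeaderRegistry(use_default_map=False)
bare_subject_class = bare['Subject']
```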
The following classes are the classes used to represent data parsed from structured headers and can, in general, be used by an application program to construct structured values to assign to specific headers.
- class email.headerregistry.Address(display_name='', username='', domain='', addr_spec=None)
The class used to represent an email address. The general form of an address is:
    [display_name] <username@domain>
or:
    username@domain
where each part must conform to specific syntax rules spelled out in RFC 5322.
As a convenience addr_spec can be specified instead of username and domain, in which case username and domain will be parsed from the addr_spec. An addr_spec must be a properly RFC quoted string; if it is not, Address will raise an error. Unicode characters are allowed and will be properly encoded when serialized. However, per the RFCs, unicode is not allowed in the username portion of the address.
- display_name
The display name portion of the address, if any, with all quoting removed. If the address does not have a display name, this attribute will be an empty string.
- username
The username portion of the address, with all quoting removed.
- domain
The domain portion of the address.
- addr_spec
The username@domain portion of the address, correctly quoted for use as a bare address (the second form shown above). This attribute is not mutable.
- __str__()
The str value of the object is the address quoted according to RFC 5322 rules, but with no Content Transfer Encoding of any non-ASCII characters.
To support SMTP (RFC 5321), Address handles one special case: if username and domain are both the empty string (or None), then the string value of the Address is <>.
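A brief sketch of constructing Address objects in both ways, with invented names and addresses:

```python
from email.headerregistry import Address

# Built from separate parts.
ann = Address('Ann Example', 'ann', 'example.com')
# Built from an addr_spec; username and domain are parsed out of it.
bob = Address(addr_spec='bob@example.com')
# The SMTP special case: no username and no domain.
null = Address()

print(str(ann), ann.addr_spec)
print(bob.username, bob.domain)
print(str(null))

# addr_spec is read-only.
try:
    ann.addr_spec = 'other@example.com'
    mutation_error = None
except AttributeError as exc:
    mutation_error = exc
```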
- class email.headerregistry.Group(display_name=None, addresses=None)
The class used to represent an address group. The general form of an address group is:
    display_name: [address-list];
As a convenience for processing lists of addresses that consist of a mixture of groups and single addresses, a Group may also be used to represent single addresses that are not part of a group by setting display_name to None and providing a list of the single address as addresses.
- display_name
The display_name of the group. If it is None and there is exactly one Address in addresses, then the Group represents a single address that is not in a group.
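The two roles of Group, a named group and a wrapper around a single address, can be sketched as follows. The names and addresses are invented.

```python
from email.headerregistry import Address, Group

ann = Address('Ann', 'ann', 'example.com')
bob = Address(username='bob', domain='example.com')

team = Group('Team', [ann, bob])                # a named group
solo = Group(None, [ann])                       # a single address, no group
hidden = Group('undisclosed-recipients')        # a group with no addresses

print(str(team))
print(str(solo))
print(str(hidden))
```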
Footnotes
[1] Originally added in 3.3 as a provisional module