email.policy: Policy Objects

Added in version 3.3.

Source code: Lib/email/policy.py


The email package's prime focus is the handling of email messages as described by the various email and MIME RFCs. However, the general format of email messages (a block of header fields each consisting of a name followed by a colon followed by a value, the whole block followed by a blank line and an arbitrary 'body'), is a format that has found utility outside of the realm of email. Some of these uses conform fairly closely to the main email RFCs, some do not. Even when working with email, there are times when it is desirable to break strict compliance with the RFCs, such as generating emails that interoperate with email servers that do not themselves follow the standards, or that implement extensions you want to use in ways that violate the standards.

Policy objects give the email package the flexibility to handle all these disparate use cases.

A Policy object encapsulates a set of attributes and methods that control the behavior of various components of the email package during use. Policy instances can be passed to various classes and methods in the email package to alter the default behavior. The settable values and their defaults are described below.

There is a default policy used by all classes in the email package. For all of the parser classes and the related convenience functions, and for the Message class, this is the Compat32 policy, via its corresponding pre-defined instance compat32. This policy provides for complete backward compatibility (in some cases, including bug compatibility) with the pre-Python 3.3 version of the email package.

The default value for the policy keyword to EmailMessage is the EmailPolicy policy, via its pre-defined instance default.

When a Message or EmailMessage object is created, it acquires a policy. If the message is created by a parser, a policy passed to the parser will be the policy used by the message it creates. If the message is created by the program, then the policy can be specified when it is created. When a message is passed to a generator, the generator uses the policy from the message by default, but you can also pass a specific policy to the generator that will override the one stored on the message object.

The default value for the policy keyword for the email.parser classes and the parser convenience functions will be changing in a future version of Python. Therefore you should always specify explicitly which policy you want to use when calling any of the classes and functions described in the parser module.

The first part of this documentation covers the features of Policy, an abstract base class that defines the features that are common to all policy objects, including compat32. This includes certain hook methods that are called internally by the email package, which a custom policy could override to obtain different behavior. The second part describes the concrete classes EmailPolicy and Compat32, which implement the hooks that provide the standard behavior and the backward compatible behavior and features, respectively.

Policy instances are immutable, but they can be cloned, accepting the same keyword arguments as the class constructor and returning a new Policy instance that is a copy of the original but with the specified attribute values changed.
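
A minimal sketch of what immutability and cloning look like in practice. The variable names are illustrative only; the behavior shown (an AttributeError on assignment) is what CPython's implementation is expected to do.

```python
from email import policy

base = policy.default

# Attempting to assign to an attribute of an existing policy fails.
try:
    base.linesep = '\r\n'
except AttributeError as exc:
    error = exc
else:
    error = None

# clone() returns a modified copy and leaves the original untouched.
smtp_like = base.clone(linesep='\r\n', max_line_length=100)

print(type(error).__name__)
print(repr(base.linesep), repr(smtp_like.linesep), smtp_like.max_line_length)
```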

As an example, the following code could be used to read an email message from a file on disk and pass it to the system sendmail program on a Unix system:

>>> from email import message_from_binary_file
>>> from email.generator import BytesGenerator
>>> from email import policy
>>> from subprocess import Popen, PIPE
>>> with open('mymsg.txt', 'rb') as f:
...     msg = message_from_binary_file(f, policy=policy.default)
...
>>> # Popen needs a plain string, so pass the bare address (addr_spec).
>>> p = Popen(['sendmail', msg['To'].addresses[0].addr_spec], stdin=PIPE)
>>> g = BytesGenerator(p.stdin, policy=msg.policy.clone(linesep='\r\n'))
>>> g.flatten(msg)
>>> p.stdin.close()
>>> rc = p.wait()

Here we are telling BytesGenerator to use the RFC correct line separator characters when creating the binary string to feed into sendmail's stdin, where the default policy would use \n line separators.

Some email package methods accept a policy keyword argument, allowing the policy to be overridden for that method. For example, the following code uses the as_bytes() method of the msg object from the previous example and writes the message to a file using the native line separators for the platform on which it is running:

>>> import os
>>> with open('converted.txt', 'wb') as f:
...     f.write(msg.as_bytes(policy=msg.policy.clone(linesep=os.linesep)))
17

Policy objects can also be combined using the addition operator, producing a policy object whose settings are a combination of the non-default values of the summed objects:

>>> compat_SMTP = policy.compat32.clone(linesep='\r\n')
>>> compat_strict = policy.compat32.clone(raise_on_defect=True)
>>> compat_strict_SMTP = compat_SMTP + compat_strict

This operation is not commutative; that is, the order in which the objects are added matters. To illustrate:

>>> policy100 = policy.compat32.clone(max_line_length=100)
>>> policy80 = policy.compat32.clone(max_line_length=80)
>>> apolicy = policy100 + policy80
>>> apolicy.max_line_length
80
>>> apolicy = policy80 + policy100
>>> apolicy.max_line_length
100

class email.policy.Policy(**kw)

This is the abstract base class for all policy classes. It provides default implementations for a couple of trivial methods, as well as the implementation of the immutability property, the clone() method, and the constructor semantics.

The constructor of a policy class can be passed various keyword arguments. The arguments that may be specified are any non-method properties on this class, plus any additional non-method properties on the concrete class. A value specified in the constructor will override the default value for the corresponding attribute.

This class defines the following properties, and thus values for the following may be passed in the constructor of any policy class:

max_line_length

The maximum length of any line in the serialized output, not counting the end of line character(s). Default is 78, per RFC 5322. A value of 0 or None indicates that no line wrapping should be done at all.
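
The effect of this setting can be sketched as follows. The subject text and the limit of 40 are arbitrary values chosen for illustration.

```python
from email import policy
from email.message import EmailMessage

subject = ' '.join(['word'] * 20)   # 99 characters, too long for one line

# A narrower limit than the default of 78 forces more folding.
narrow = EmailMessage(policy=policy.default.clone(max_line_length=40))
narrow['Subject'] = subject
narrow_lines = narrow.as_string().splitlines()

# max_line_length=None disables wrapping altogether.
unwrapped = EmailMessage(policy=policy.default.clone(max_line_length=None))
unwrapped['Subject'] = subject
unwrapped_lines = unwrapped.as_string().splitlines()

print(max(len(line) for line in narrow_lines))
print(unwrapped_lines[0] == 'Subject: ' + subject)
```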

linesep

The string to be used to terminate lines in serialized output. The default is \n because that's the internal end-of-line discipline used by Python, though \r\n is required by the RFCs.

cte_type

Controls the type of Content Transfer Encodings that may be or are required to be used. The possible values are:

7bit

all data must be "7 bit clean" (ASCII-only). This means that where necessary data will be encoded using either quoted-printable or base64 encoding.

8bit

data is not constrained to be 7 bit clean. Data in headers is still required to be ASCII-only and so will be encoded (see fold_binary() and utf8 below for exceptions), but body parts may use the 8bit CTE.

A cte_type value of 8bit only works with BytesGenerator, not Generator, because strings cannot contain binary data. If a Generator is operating under a policy that specifies cte_type=8bit, it will act as if cte_type is 7bit.
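
As a rough illustration, the sketch below stores the same short non-ASCII body under both settings. Which 7-bit-safe encoding the content manager picks (quoted-printable or base64) is a heuristic, so the example does not assume a particular one.

```python
from email import policy
from email.message import EmailMessage

body = 'caf\u00e9'

# Under the default policy (cte_type='8bit') a short non-ASCII body can be
# stored with the 8bit content transfer encoding.
msg8 = EmailMessage(policy=policy.default)
msg8.set_content(body)
raw8 = msg8.as_bytes()

# With cte_type='7bit' the same body has to be encoded to stay ASCII-only.
msg7 = EmailMessage(policy=policy.default.clone(cte_type='7bit'))
msg7.set_content(body)
raw7 = msg7.as_bytes()

print(msg8['Content-Transfer-Encoding'], msg7['Content-Transfer-Encoding'])
```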

raise_on_defect

If True, any defects encountered will be raised as errors. If False (the default), defects will be passed to the register_defect() method.
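
A small sketch of the difference, using a deliberately malformed message whose second line is neither a header nor the blank separator line. The sample text is made up for this illustration.

```python
from email import errors, message_from_string, policy

source = 'Subject: demo\nthis line is not a header\n\nbody\n'

# Default behaviour: the defect is recorded on the message object.
lenient = message_from_string(source, policy=policy.default)
recorded = [type(d).__name__ for d in lenient.defects]

# raise_on_defect=True: the same defect is raised as an exception.
strict_policy = policy.default.clone(raise_on_defect=True)
try:
    message_from_string(source, policy=strict_policy)
except errors.MessageDefect as exc:
    raised = type(exc).__name__
else:
    raised = None

print(recorded, raised)
```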

mangle_from_

If True, lines starting with "From " in the body are escaped by putting a > in front of them. This parameter is used when the message is being serialized by a generator. Default: False.

Added in version 3.5.
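
The following sketch passes the policy to a Generator directly, so that the generator takes its mangling behavior from the policy. The helper function serialize is hypothetical and exists only for this example.

```python
import io
from email import policy
from email.generator import Generator
from email.message import EmailMessage

msg = EmailMessage()
msg['Subject'] = 'mbox demo'
msg.set_content('From the top of the hill\nyou can see the sea.\n')

def serialize(message, pol):
    """Flatten *message* with a Generator that follows policy *pol*."""
    buffer = io.StringIO()
    Generator(buffer, policy=pol).flatten(message)
    return buffer.getvalue()

plain = serialize(msg, policy.default)   # mangle_from_ is False here
mangled = serialize(msg, policy.default.clone(mangle_from_=True))

print('\nFrom the top' in plain, '\n>From the top' in mangled)
```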

message_factory

A factory function for constructing a new empty message object. Used by the parser when building messages. Defaults to None, in which case Message is used.

Added in version 3.6.
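
A minimal sketch of supplying a custom factory. TaggedMessage is a hypothetical subclass defined only for this example.

```python
from email import message_from_string, policy
from email.message import EmailMessage

class TaggedMessage(EmailMessage):
    """Hypothetical EmailMessage subclass used only for this illustration."""

tagged_policy = policy.default.clone(message_factory=TaggedMessage)
msg = message_from_string('Subject: hello\n\nbody\n', policy=tagged_policy)

print(type(msg).__name__)
```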

verify_generated_headers

If True (the default), the generator will raise HeaderWriteError instead of writing a header that is improperly folded or delimited, such that it would be parsed as multiple headers or joined with adjacent data. Such headers can be generated by custom header classes or bugs in the email module.

As it's a security feature, this defaults to True even in the Compat32 policy. For backwards compatible, but unsafe, behavior, it must be set to False explicitly.

Added in version 3.13.

The following Policy method is intended to be called by code using the email library to create policy instances with custom settings:

clone(**kw)

Return a new Policy instance whose attributes have the same values as the current instance, except where those attributes are given new values by the keyword arguments.

The remaining Policy methods are called by the email package code, and are not intended to be called by an application using the email package. A custom policy must implement all of these methods.

handle_defect(obj, defect)

Handle a defect found on obj. When the email package calls this method, defect will always be a subclass of MessageDefect.

The default implementation checks the raise_on_defect flag. If it is True, defect is raised as an exception. If it is False (the default), obj and defect are passed to register_defect().

register_defect(obj, defect)

Register a defect on obj. In the email package, defect will always be a subclass of MessageDefect.

The default implementation calls the append method of the defects attribute of obj. When the email package calls handle_defect, obj will normally have a defects attribute that has an append method. Custom object types used with the email package (for example, custom Message objects) should also provide such an attribute, otherwise defects in parsed messages will raise unexpected errors.
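
One way a custom policy might hook into defect handling is sketched below. CollectingPolicy and the module-level SEEN list are hypothetical names; the list lives outside the policy because policy instances are immutable.

```python
from email import message_from_string
from email.policy import EmailPolicy

SEEN = []   # module-level store, since policy instances are immutable

class CollectingPolicy(EmailPolicy):
    """Hypothetical policy that notes every defect before registering it."""

    def register_defect(self, obj, defect):
        SEEN.append(type(defect).__name__)
        super().register_defect(obj, defect)

source = 'Subject: demo\nthis line is not a header\n\nbody\n'
msg = message_from_string(source, policy=CollectingPolicy())

print(SEEN, len(msg.defects))
```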

header_max_count(name)

Return the maximum allowed number of headers named name.

Called when a header is added to an EmailMessage or Message object. If the returned value is not 0 or None, and there are already a number of headers with the name name greater than or equal to the value returned, a ValueError is raised.

Because the default behavior of Message.__setitem__ is to append the value to the list of headers, it is easy to create duplicate headers without realizing it. This method allows certain headers to be limited in the number of instances of that header that may be added to a Message programmatically. (The limit is not observed by the parser, which will faithfully produce as many headers as exist in the message being parsed.)

The default implementation returns None for all header names.

header_source_parse(sourcelines)

The email package calls this method with a list of strings, each string ending with the line separation characters found in the source being parsed. The first line includes the field header name and separator. All whitespace in the source is preserved. The method should return the (name, value) tuple that is to be stored in the Message to represent the parsed header.

If an implementation wishes to retain compatibility with the existing email package policies, name should be the case preserved name (all characters up to the ':' separator), while value should be the unfolded value (all line separator characters removed, but whitespace kept intact), stripped of leading whitespace.

sourcelines may contain surrogateescaped binary data.

There is no default implementation.

header_store_parse(name, value)

The email package calls this method with the name and value provided by the application program when the application program is modifying a Message programmatically (as opposed to a Message created by a parser). The method should return the (name, value) tuple that is to be stored in the Message to represent the header.

If an implementation wishes to retain compatibility with the existing email package policies, the name and value should be strings or string subclasses that do not change the content of the passed in arguments.

There is no default implementation.

header_fetch_parse(name, value)

The email package calls this method with the name and value currently stored in the Message when that header is requested by the application program, and whatever the method returns is what is passed back to the application as the value of the header being retrieved. Note that there may be more than one header with the same name stored in the Message; the method is passed the specific name and value of the header destined to be returned to the application.

value may contain surrogateescaped binary data. There should be no surrogateescaped binary data in the value returned by the method.

There is no default implementation.

fold(name, value)

The email package calls this method with the name and value currently stored in the Message for a given header. The method should return a string that represents that header "folded" correctly (according to the policy settings) by composing the name with the value and inserting linesep characters at the appropriate places. See RFC 5322 for a discussion of the rules for folding email headers.

value may contain surrogateescaped binary data. There should be no surrogateescaped binary data in the string returned by the method.

fold_binary(name, value)

The same as fold(), except that the returned value should be a bytes object rather than a string.

value may contain surrogateescaped binary data. These could be converted back into binary data in the returned bytes object.

class email.policy.EmailPolicy(**kw)

This concrete Policy provides behavior that is intended to be fully compliant with the current email RFCs. These include (but are not limited to) RFC 5322, RFC 2047, and the current MIME RFCs.

This policy adds new header parsing and folding algorithms. Instead of simple strings, headers are str subclasses with attributes that depend on the type of the field. The parsing and folding algorithms fully implement RFC 2047 and RFC 5322.

The default value for the message_factory attribute is EmailMessage.

In addition to the settable attributes listed above that apply to all policies, this policy adds the following additional attributes:

Added in version 3.6: [1]

utf8

If False, follow RFC 5322, supporting non-ASCII characters in headers by encoding them as "encoded words". If True, follow RFC 6532 and use utf-8 encoding for headers. Messages formatted in this way may be passed to SMTP servers that support the SMTPUTF8 extension (RFC 6531).
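
The two behaviors can be compared with a sketch like the one below. The subject text is an arbitrary example.

```python
from email import policy
from email.message import EmailMessage

subject = 'Gr\u00fc\u00dfe aus K\u00f6ln'
msg = EmailMessage()
msg['Subject'] = subject

# utf8=False (the default): non-ASCII text is emitted as encoded words.
encoded = msg.as_bytes(policy=policy.default)

# utf8=True: the header is written as raw UTF-8 instead.
raw_utf8 = msg.as_bytes(policy=policy.default.clone(utf8=True))

print(encoded)
print(raw_utf8)
```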

refold_source

If the value for a header in the Message object originated from a parser (as opposed to being set by a program), this attribute indicates whether or not a generator should refold that value when transforming the message back into serialized form. The possible values are:

none

all source values use original folding

long

source values that have any line that is longer than max_line_length will be refolded

all

all values are refolded.

The default is long.
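
A sketch of the difference between none and long for a parsed header that exceeds the default line length. The helper header_lines is hypothetical and only serves this example.

```python
from email import message_from_string, policy

long_value = ' '.join(['word'] * 30)            # 149 characters
source = 'Subject: ' + long_value + '\n\nbody\n'
msg = message_from_string(source, policy=policy.default)

def header_lines(pol):
    """Return the serialized header lines of msg under policy *pol*."""
    head, _, _ = msg.as_string(policy=pol).partition('\n\n')
    return head.splitlines()

kept = header_lines(policy.default.clone(refold_source='none'))
refolded = header_lines(policy.default.clone(refold_source='long'))

print(len(kept), len(refolded))
```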

header_factory

A callable that takes two arguments, name and value, where name is a header field name and value is an unfolded header field value, and returns a string subclass that represents that header. A default header_factory (see headerregistry) is provided that supports custom parsing for the various address and date RFC 5322 header field types, and the major MIME header field types. Support for additional custom parsing will be added in the future.

content_manager

An object with at least two methods: get_content and set_content. When the get_content() or set_content() method of an EmailMessage object is called, it calls the corresponding method of this object, passing it the message object as its first argument, and any arguments or keywords that were passed to it as additional arguments. By default content_manager is set to raw_data_manager.

Added in version 3.4.
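
A brief sketch of the delegation with the default content manager. Note that the text body comes back with a trailing newline, since the manager normalizes line endings when storing text.

```python
from email import policy
from email.contentmanager import raw_data_manager
from email.message import EmailMessage

msg = EmailMessage()               # uses policy.default
msg.set_content('Hello, world')    # handled by policy.content_manager

print(policy.default.content_manager is raw_data_manager)
print(msg.get_content_type(), repr(msg.get_content()))
```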

The class provides the following concrete implementations of the abstract methods of Policy:

header_max_count(name)

Returns the value of the max_count attribute of the specialized class used to represent the header with the given name.
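
As an illustration, the sketch below queries the limit for a header that must be unique and for one that may repeat, then shows the resulting ValueError when a second Subject is added. The header names were picked for the example.

```python
from email import policy
from email.message import EmailMessage

subject_limit = policy.default.header_max_count('Subject')
received_limit = policy.default.header_max_count('Received')
compat_limit = policy.compat32.header_max_count('Subject')
print(subject_limit, received_limit, compat_limit)

msg = EmailMessage()
msg['Subject'] = 'first'
try:
    msg['Subject'] = 'second'
except ValueError as exc:
    duplicate_error = exc
else:
    duplicate_error = None

# Replacing an existing unique header means deleting it first.
del msg['Subject']
msg['Subject'] = 'second'
```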

header_source_parse(sourcelines)

The name is parsed as everything up to the ':' and returned unmodified. The value is determined by stripping leading whitespace off the remainder of the first line, joining all subsequent lines together, and stripping any trailing carriage return or linefeed characters.

header_store_parse(name, value)

The name is returned unchanged. If the input value has a name attribute and it matches name ignoring case, the value is returned unchanged. Otherwise the name and value are passed to header_factory, and the resulting header object is returned as the value. In this case a ValueError is raised if the input value contains CR or LF characters.

header_fetch_parse(name, value)

If the value has a name attribute, it is returned unmodified. Otherwise the name, and the value with any CR or LF characters removed, are passed to the header_factory, and the resulting header object is returned. Any surrogateescaped bytes get turned into the unicode unknown-character glyph.
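
These hooks are normally called by the email package itself. Calling them directly, purely for illustration, shows how a folded source header passes through them: only trailing line endings are stripped when the value is stored, and the interior line break is removed when the value is fetched.

```python
from email import policy

# A folded header as the parser would hand it over, line endings included.
sourcelines = ['Subject: a folded\r\n', ' subject line\r\n']

name, stored = policy.default.header_source_parse(sourcelines)
fetched = policy.default.header_fetch_parse(name, stored)

print(repr(name), repr(stored))
print(repr(str(fetched)), fetched.name)
```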

fold(name, value)

Header folding is controlled by the refold_source policy setting. A value is considered to be a 'source value' if and only if it does not have a name attribute (having a name attribute means it is a header object of some sort). If a source value needs to be refolded according to the policy, it is converted into a header object by passing the name and the value with any CR and LF characters removed to the header_factory. Folding of a header object is done by calling its fold method with the current policy.

Source values are split into lines using splitlines(). If the value is not to be refolded, the lines are rejoined using the linesep from the policy and returned. The exception is lines containing non-ascii binary data. In that case the value is refolded regardless of the refold_source setting, which causes the binary data to be CTE encoded using the unknown-8bit charset.

fold_binary(name, value)

The same as fold() if cte_type is 7bit, except that the returned value is bytes.

If cte_type is 8bit, non-ASCII binary data is converted back into bytes. Headers with binary data are not refolded, regardless of the refold_source setting, since there is no way to know whether the binary data consists of single byte characters or multibyte characters.
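
For illustration only (applications do not normally call these methods), a short source value is passed through both folding hooks below. Under the default refold_source of long it is not refolded, only terminated with the policy's linesep.

```python
from email import policy

# 'hello' is a plain str with no name attribute, so it is a source value.
text = policy.default.fold('Subject', 'hello')
data = policy.SMTP.fold_binary('Subject', 'hello')

print(repr(text))
print(repr(data))
```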

The following instances of EmailPolicy provide defaults suitable for specific application domains. Note that in the future the behavior of these instances (in particular the HTTP instance) may be adjusted to conform even more closely to the RFCs relevant to their domains.

email.policy.default

An instance of EmailPolicy with all defaults unchanged. This policy uses the standard Python \n line endings rather than the RFC-correct \r\n.

email.policy.SMTP

Suitable for serializing messages in conformance with the email RFCs. Like default, but with linesep set to \r\n, which is RFC compliant.

email.policy.SMTPUTF8

The same as SMTP except that utf8 is True. Useful for serializing messages to a message store without using encoded words in the headers. Should only be used for SMTP transmission if the sender or recipient addresses have non-ASCII characters (the smtplib.SMTP.send_message() method handles this automatically).

email.policy.HTTP

Suitable for serializing headers for use in HTTP traffic. Like SMTP except that max_line_length is set to None (unlimited).

email.policy.strict

Convenience instance. The same as default except that raise_on_defect is set to True. This allows any policy to be made strict by writing:

somepolicy + policy.strict
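
The differences between the predefined instances, and the effect of adding policy.strict to one of them, can be checked with a short sketch like this:

```python
from email import policy

print(repr(policy.default.linesep), repr(policy.SMTP.linesep))
print(policy.SMTPUTF8.utf8, policy.HTTP.max_line_length)

# Adding policy.strict turns on raise_on_defect and keeps the SMTP settings.
strict_smtp = policy.SMTP + policy.strict
print(strict_smtp.raise_on_defect, repr(strict_smtp.linesep))
```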

With all of these EmailPolicies, the effective API of the email package is changed from the Python 3.2 API in the following ways:

  • Setting a header on a Message results in that header being parsed and a header object created.

  • Fetching a header value from a Message results in that header being parsed and a header object created and returned.

  • Any header object, or any header that is refolded due to the policy settings, is folded using an algorithm that fully implements the RFC folding algorithms, including knowing where encoded words are required and allowed.

From the application view, this means that any header obtained through the EmailMessage is a header object with extra attributes, whose string value is the fully decoded unicode value of the header. Likewise, a header may be assigned a new value, or a new header created, using a unicode string, and the policy will take care of converting the unicode string into the correct RFC encoded form.

The header objects and their attributes are described in headerregistry.
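
A brief sketch of what this looks like from application code. The addresses and subject are made-up sample data.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg['To'] = 'Alice Example <alice@example.com>, bob@example.org'
msg['Subject'] = 'Caf\u00e9 menu'

to = msg['To']                      # a str subclass with address attributes
first = to.addresses[0]
specs = [addr.addr_spec for addr in to.addresses]
print(first.display_name, first.username, first.domain)
print(specs)

# The string value of a header is its decoded unicode text.
print(msg['Subject'] == 'Caf\u00e9 menu')
```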

class email.policy.Compat32(**kw)

This concrete Policy is the backward compatibility policy. It replicates the behavior of the email package in Python 3.2. The policy module also defines an instance of this class, compat32, that is used as the default policy. Thus the default behavior of the email package is to maintain compatibility with Python 3.2.

The following attributes have values that are different from the Policy default:

mangle_from_

The default is True.

The class provides the following concrete implementations of the abstract methods of Policy:

header_source_parse(sourcelines)

The name is parsed as everything up to the ':' and returned unmodified. The value is determined by stripping leading whitespace off the remainder of the first line, joining all subsequent lines together, and stripping any trailing carriage return or linefeed characters.

header_store_parse(name, value)

The name and value are returned unmodified.

header_fetch_parse(name, value)

If the value contains binary data, it is converted into a Header object using the unknown-8bit charset. Otherwise it is returned unmodified.

fold(name, value)

Headers are folded using the Header folding algorithm, which preserves existing line breaks in the value, and wraps each resulting line to the max_line_length. Non-ASCII binary data are CTE encoded using the unknown-8bit charset.

fold_binary(name, value)

Headers are folded using the Header folding algorithm, which preserves existing line breaks in the value, and wraps each resulting line to the max_line_length. If cte_type is 7bit, non-ascii binary data is CTE encoded using the unknown-8bit charset. Otherwise the original source header is used, with its existing line breaks and any (RFC invalid) binary data it may contain.

email.policy.compat32

An instance of Compat32, providing backward compatibility with the behavior of the email package in Python 3.2.
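
For contrast with the EmailPolicy behavior, a small sketch of the legacy Message class under compat32: header values come back as plain strings, and assigning the same header twice silently appends a duplicate.

```python
from email import policy
from email.message import Message

msg = Message()                 # the legacy class defaults to policy.compat32
msg['Subject'] = 'first'
msg['Subject'] = 'second'       # appended, not replaced or rejected

print(msg.policy is policy.compat32)
print(type(msg['Subject']).__name__, msg.get_all('Subject'))
print(policy.compat32.mangle_from_)
```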

Footnotes

[1]

Originally added in 3.3 as a provisional feature.