19.1.1. email.message: Representing an email message
Source code: Lib/email/message.py
The central class in the email package is the Message class, imported from the email.message module. It is the base class for the email object model. Message provides the core functionality for setting and querying header fields, and for accessing message bodies.
Conceptually, a Message object consists of headers and payloads. Headers are RFC 2822 style field names and values where the field name and value are separated by a colon. The colon is not part of either the field name or the field value.
Headers are stored and returned in case-preserving form but are matched case-insensitively. There may also be a single envelope header, also known as the Unix-From header or the From_ header. The payload is either a string in the case of simple message objects or a list of Message objects for MIME container documents (e.g. multipart/* and message/rfc822).
Message objects provide a mapping style interface for accessing the message headers, and an explicit interface for accessing both the headers and the payload. They also provide convenience methods for generating a flat text representation of the message object tree, for accessing commonly used header parameters, and for recursively walking over the object tree.
Here are the methods of the Message class:
class email.message.Message(policy=compat32)
If policy is specified (it must be an instance of a policy class) use the rules it specifies to update and serialize the representation of the message. If policy is not set, use the compat32 policy, which maintains backward compatibility with the Python 3.2 version of the email package. For more information see the policy documentation.
Changed in version 3.3: The policy keyword argument was added.
as_string(unixfrom=False, maxheaderlen=0, policy=None)
Return the entire message flattened as a string. When optional unixfrom is true, the envelope header is included in the returned string. unixfrom defaults to False. For backward compatibility reasons, maxheaderlen defaults to 0, so if you want a different value you must override it explicitly (the value specified for max_line_length in the policy will be ignored by this method). The policy argument may be used to override the default policy obtained from the message instance. This can be used to control some of the formatting produced by the method, since the specified policy will be passed to the Generator.
Flattening the message may trigger changes to the Message if defaults need to be filled in to complete the transformation to a string (for example, MIME boundaries may be generated or modified).
Note that this method is provided as a convenience and may not always format the message the way you want. For example, by default it does not do the mangling of lines that begin with From that is required by the unix mbox format. For more flexibility, instantiate a Generator instance and use its flatten() method directly. For example:
    from io import StringIO
    from email.generator import Generator
    fp = StringIO()
    g = Generator(fp, mangle_from_=True, maxheaderlen=60)
    g.flatten(msg)
    text = fp.getvalue()
If the message object contains binary data that is not encoded according to RFC standards, the non-compliant data will be replaced by unicode “unknown character” code points. (See also as_bytes() and BytesGenerator.)
Changed in version 3.4: the policy keyword argument was added.
__str__()
Equivalent to as_string(). Allows str(msg) to produce a string containing the formatted message.
as_bytes(unixfrom=False, policy=None)
Return the entire message flattened as a bytes object. When optional unixfrom is true, the envelope header is included in the returned bytes object. unixfrom defaults to False. The policy argument may be used to override the default policy obtained from the message instance. This can be used to control some of the formatting produced by the method, since the specified policy will be passed to the BytesGenerator.
Flattening the message may trigger changes to the Message if defaults need to be filled in to complete the transformation to bytes (for example, MIME boundaries may be generated or modified).
Note that this method is provided as a convenience and may not always format the message the way you want. For example, by default it does not do the mangling of lines that begin with From that is required by the unix mbox format. For more flexibility, instantiate a BytesGenerator instance and use its flatten() method directly. For example:
    from io import BytesIO
    from email.generator import BytesGenerator
    fp = BytesIO()
    g = BytesGenerator(fp, mangle_from_=True, maxheaderlen=60)
    g.flatten(msg)
    text = fp.getvalue()
New in version 3.4.
__bytes__()
Equivalent to as_bytes(). Allows bytes(msg) to produce a bytes object containing the formatted message.
New in version 3.4.
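The flattening methods above can be illustrated with a minimal sketch. The header value, body text and envelope line are made up for illustration, and the message uses the default compat32 policy.

```python
from email.message import Message

# Illustrative message; the subject, body and envelope line are assumptions.
msg = Message()
msg['Subject'] = 'Hello'
msg.set_payload('Body text\n')

text = msg.as_string()   # str: headers, a blank line, then the body
data = msg.as_bytes()    # the same content as a bytes object

# str() and bytes() delegate to as_string() and as_bytes().
same_text = str(msg) == text
same_data = bytes(msg) == data

# The envelope header is only written when unixfrom is true.
msg.set_unixfrom('From sender@example.com Sat Jan  1 00:00:00 2000')
with_envelope = msg.as_string(unixfrom=True)
without_envelope = msg.as_string()
```

Lines in the body that begin with From are not mangled by these convenience methods, so writing to an mbox file still calls for an explicit Generator or BytesGenerator.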
is_multipart()
Return True if the message’s payload is a list of sub-Message objects, otherwise return False. When is_multipart() returns False, the payload should be a string object. (Note that is_multipart() returning True does not necessarily mean that msg.get_content_maintype() == 'multipart' will return True. For example, is_multipart() will return True when the Message is of type message/rfc822.)
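The distinction between is_multipart() and the multipart main type can be seen by parsing a small message/rfc822 container. This is a sketch with made-up message text; it relies on email.message_from_string() from the same package.

```python
from email import message_from_string

# An outer message whose body is itself an RFC 2822 message (illustrative text).
raw = (
    "Content-Type: message/rfc822\n"
    "\n"
    "Subject: inner\n"
    "\n"
    "inner body\n"
)
outer = message_from_string(raw)

has_subparts = outer.is_multipart()        # payload is a list of Message objects
maintype = outer.get_content_maintype()    # but the main type is 'message'
inner_subject = outer.get_payload(0)['Subject']
```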
set_unixfrom(unixfrom)
Set the message’s envelope header to unixfrom, which should be a string.
get_unixfrom()
Return the message’s envelope header. Defaults to None if the envelope header was never set.
attach(payload)
Add the given payload to the current payload, which must be None or a list of Message objects before the call. After the call, the payload will always be a list of Message objects. If you want to set the payload to a scalar object (e.g. a string), use set_payload() instead.
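A short sketch of attach() turning an empty payload into a list. The content type and part text are illustrative choices, not requirements of the method.

```python
from email.message import Message

outer = Message()
outer.set_type('multipart/mixed')      # illustrative container type
before = outer.get_payload()           # no payload has been set yet

part = Message()
part.set_payload('hello')              # scalar payloads go through set_payload()

outer.attach(part)                     # payload becomes a list of Message objects
first = outer.get_payload(0)
```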
get_payload(i=None, decode=False)
Return the current payload, which will be a list of Message objects when is_multipart() is True, or a string when is_multipart() is False. If the payload is a list and you mutate the list object, you modify the message’s payload in place.
With optional argument i, get_payload() will return the i-th element of the payload, counting from zero, if is_multipart() is True. An IndexError will be raised if i is less than 0 or greater than or equal to the number of items in the payload. If the payload is a string (i.e. is_multipart() is False) and i is given, a TypeError is raised.
Optional decode is a flag indicating whether the payload should be decoded or not, according to the Content-Transfer-Encoding header. When True and the message is not a multipart, the payload will be decoded if this header’s value is quoted-printable or base64. If some other encoding is used, or the Content-Transfer-Encoding header is missing, the payload is returned as-is (undecoded). In all cases the returned value is binary data. If the message is a multipart and the decode flag is True, then None is returned. If the payload is base64 and it was not perfectly formed (missing padding, characters outside the base64 alphabet), then an appropriate defect will be added to the message’s defect property (InvalidBase64PaddingDefect or InvalidBase64CharactersDefect, respectively).
When decode is False (the default) the body is returned as a string without decoding the Content-Transfer-Encoding. However, for a Content-Transfer-Encoding of 8bit, an attempt is made to decode the original bytes using the charset specified by the Content-Type header, using the replace error handler. If no charset is specified, or if the charset given is not recognized by the email package, the body is decoded using the default ASCII charset.
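The effect of the decode flag can be sketched with a hand-built base64 body. The payload text is illustrative, and the message is assembled manually rather than parsed.

```python
import base64
from email.message import Message

msg = Message()
msg['Content-Transfer-Encoding'] = 'base64'
msg.set_payload(base64.b64encode(b'hello world').decode('ascii'))

encoded = msg.get_payload()             # str, still transfer-encoded
decoded = msg.get_payload(decode=True)  # bytes, transfer encoding removed

# Indexing a string payload is an error.
try:
    msg.get_payload(0)
    index_error = None
except TypeError as exc:
    index_error = exc

# For a multipart message, decode=True yields None.
container = Message()
container.attach(Message())
container_decoded = container.get_payload(decode=True)
```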
set_payload(payload, charset=None)
Set the entire message object’s payload to payload. It is the client’s responsibility to ensure the payload invariants. Optional charset sets the message’s default character set; see set_charset() for details.
set_charset(charset)
Set the character set of the payload to charset, which can either be a Charset instance (see email.charset), a string naming a character set, or None. If it is a string, it will be converted to a Charset instance. If charset is None, the charset parameter will be removed from the Content-Type header (the message will not be otherwise modified). Anything else will generate a TypeError.
If there is no existing MIME-Version header one will be added. If there is no existing Content-Type header, one will be added with a value of text/plain. Whether the Content-Type header already exists or not, its charset parameter will be set to charset.output_charset. If charset.input_charset and charset.output_charset differ, the payload will be re-encoded to the output_charset. If there is no existing Content-Transfer-Encoding header, then the payload will be transfer-encoded, if needed, using the specified Charset, and a header with the appropriate value will be added. If a Content-Transfer-Encoding header already exists, the payload is assumed to already be correctly encoded using that Content-Transfer-Encoding and is not modified.
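As a rough sketch of the header changes described for set_charset(), the following starts from a bare message with an ASCII body. The body text and the choice of utf-8 are illustrative, and the exact transfer encoding chosen is left unasserted because it depends on the Charset in use.

```python
from email.message import Message

msg = Message()
msg.set_payload('hello')
msg.set_charset('utf-8')   # a string is converted to a Charset instance

mime_version = msg['MIME-Version']          # added because it was missing
content_type = msg.get_content_type()       # text/plain was added as the type
charset_name = msg.get_content_charset()
has_cte = 'Content-Transfer-Encoding' in msg
body = msg.get_payload(decode=True)         # original bytes survive the encoding

# Passing None removes the charset parameter again.
msg.set_charset(None)
charset_after_reset = msg.get_content_charset()
```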
The following methods implement a mapping-like interface for accessing the message’s RFC 2822 headers. Note that there are some semantic differences between these methods and a normal mapping (i.e. dictionary) interface. For example, in a dictionary there are no duplicate keys, but here there may be duplicate message headers. Also, in dictionaries there is no guaranteed order to the keys returned by keys(), but in a Message object, headers are always returned in the order they appeared in the original message, or were added to the message later. Any header that is deleted and then re-added is always appended to the end of the header list.
These semantic differences are intentional and are biased toward maximal convenience.
Note that in all cases, any envelope header present in the message is not included in the mapping interface.
In a model generated from bytes, any header values that (in contravention of the RFCs) contain non-ASCII bytes will, when retrieved through this interface, be represented as Header objects with a charset of unknown-8bit.
__len__()
Return the total number of headers, including duplicates.
__contains__(name)
Return true if the message object has a field named name. Matching is done case-insensitively and name should not include the trailing colon. Used for the in operator, e.g.:
    if 'message-id' in myMessage:
        print('Message-ID:', myMessage['message-id'])
__getitem__(name)
Return the value of the named header field. name should not include the colon field separator. If the header is missing, None is returned; a KeyError is never raised.
Note that if the named field appears more than once in the message’s headers, exactly which of those field values will be returned is undefined. Use the get_all() method to get the values of all the extant named headers.
__setitem__(name, val)
Add a header to the message with field name name and value val. The field is appended to the end of the message’s existing fields.
Note that this does not overwrite or delete any existing header with the same name. If you want to ensure that the new header is the only one present in the message with field name name, delete the field first, e.g.:
    del msg['subject']
    msg['subject'] = 'Python roolz!'
__delitem__(name)
Delete all occurrences of the field with name name from the message’s headers. No exception is raised if the named field isn’t present in the headers.
keys()
Return a list of all the message’s header field names.
values()
Return a list of all the message’s field values.
items()
Return a list of 2-tuples containing all the message’s field headers and values.
get(name, failobj=None)
Return the value of the named header field. This is identical to __getitem__() except that optional failobj is returned if the named header is missing (defaults to None).
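The differences from a plain dictionary can be seen in a small sketch; the header names and values are illustrative.

```python
from email.message import Message

msg = Message()
msg['Subject'] = 'first'
msg['To'] = 'a@example.com'
msg['Subject'] = 'second'        # appended; the earlier Subject is kept

count = len(msg)                 # duplicates are counted
names = msg.keys()               # insertion order, duplicates included
found = 'SUBJECT' in msg         # matching is case-insensitive
missing = msg['X-Missing']       # None rather than KeyError
fallback = msg.get('X-Missing', 'n/a')

# Delete every occurrence first to end up with a single header.
del msg['subject']
msg['Subject'] = 'only one'
final = msg.items()
```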
Here are some additional useful methods:
get_all(name, failobj=None)
Return a list of all the values for the field named name. If there are no such named headers in the message, failobj is returned (defaults to None).
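A minimal sketch of get_all() on a repeated header, with made-up Received values:

```python
from email.message import Message

msg = Message()
msg['Received'] = 'from a.example'
msg['Received'] = 'from b.example'

hops = msg.get_all('received')           # every value, in header order
nothing = msg.get_all('X-Missing')       # failobj defaults to None
empty = msg.get_all('X-Missing', [])     # a caller-supplied failobj
```

Passing an empty list as failobj lets the result be iterated without a separate None check.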
add_header(_name, _value, **_params)
Extended header setting. This method is similar to __setitem__() except that additional header parameters can be provided as keyword arguments. _name is the header field to add and _value is the primary value for the header.
For each item in the keyword argument dictionary _params, the key is taken as the parameter name, with underscores converted to dashes (since dashes are illegal in Python identifiers). Normally, the parameter will be added as key="value" unless the value is None, in which case only the key will be added. If the value contains non-ASCII characters, it can be specified as a three tuple in the format (CHARSET, LANGUAGE, VALUE), where CHARSET is a string naming the charset to be used to encode the value, LANGUAGE can usually be set to None or the empty string (see RFC 2231 for other possibilities), and VALUE is the string value containing non-ASCII code points. If a three tuple is not passed and the value contains non-ASCII characters, it is automatically encoded in RFC 2231 format using a CHARSET of utf-8 and a LANGUAGE of None.
Here’s an example:
    msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')
This will add a header that looks like
    Content-Disposition: attachment; filename="bud.gif"
An example with non-ASCII characters:
    msg.add_header('Content-Disposition', 'attachment',
                   filename=('iso-8859-1', '', 'Fußballer.ppt'))
Which produces
    Content-Disposition: attachment; filename*="iso-8859-1''Fu%DFballer.ppt"
replace_header(_name, _value)
Replace a header. Replace the first header found in the message that matches _name, retaining header order and field name case. If no matching header was found, a KeyError is raised.
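A brief sketch of replace_header() keeping both position and field name case; the headers are illustrative.

```python
from email.message import Message

msg = Message()
msg['Subject'] = 'old'
msg['To'] = 'a@example.com'

# The lookup is case-insensitive; the stored name 'Subject' is retained.
msg.replace_header('subject', 'new')
after = msg.items()

try:
    msg.replace_header('X-Missing', 'value')
    error = None
except KeyError as exc:
    error = exc
```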
get_content_type()
Return the message’s content type. The returned string is coerced to lower case of the form maintype/subtype. If there was no Content-Type header in the message the default type as given by get_default_type() will be returned. Since according to RFC 2045, messages always have a default type, get_content_type() will always return a value.
RFC 2045 defines a message’s default type to be text/plain unless it appears inside a multipart/digest container, in which case it would be message/rfc822. If the Content-Type header has an invalid type specification, RFC 2045 mandates that the default type be text/plain.
get_content_maintype()
Return the message’s main content type. This is the maintype part of the string returned by get_content_type().
get_content_subtype()
Return the message’s sub-content type. This is the subtype part of the string returned by get_content_type().
get_default_type()
Return the default content type. Most messages have a default content type of text/plain, except for messages that are subparts of multipart/digest containers. Such subparts have a default content type of message/rfc822.
set_default_type(ctype)
Set the default content type. ctype should either be text/plain or message/rfc822, although this is not enforced. The default content type is not stored in the Content-Type header.
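The interplay between the default type and an explicit Content-Type header can be sketched as follows; the header values are illustrative.

```python
from email.message import Message

msg = Message()
initial = msg.get_content_type()            # no header: the default type

msg.set_default_type('message/rfc822')
after_default = msg.get_content_type()
header_present = 'Content-Type' in msg      # the default is not stored as a header

msg['Content-Type'] = 'Image/PNG; name="x.png"'
explicit = msg.get_content_type()           # lower-cased, parameters dropped
main = msg.get_content_maintype()
sub = msg.get_content_subtype()

# An invalid type specification falls back to text/plain.
bad = Message()
bad['Content-Type'] = 'garbage'
fallback = bad.get_content_type()
```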
get_params(failobj=None, header='content-type', unquote=True)
Return the message’s Content-Type parameters, as a list. The elements of the returned list are 2-tuples of key/value pairs, as split on the '=' sign. The left hand side of the '=' is the key, while the right hand side is the value. If there is no '=' sign in the parameter the value is the empty string, otherwise the value is as described in get_param() and is unquoted if optional unquote is True (the default).
Optional failobj is the object to return if there is no Content-Type header. Optional header is the header to search instead of Content-Type.
get_param(param, failobj=None, header='content-type', unquote=True)
Return the value of the Content-Type header’s parameter param as a string. If the message has no Content-Type header or if there is no such parameter, then failobj is returned (defaults to None).
Optional header if given, specifies the message header to use instead of Content-Type.
Parameter keys are always compared case insensitively. The return value can either be a string, or a 3-tuple if the parameter was RFC 2231 encoded. When it’s a 3-tuple, the elements of the value are of the form (CHARSET, LANGUAGE, VALUE). Note that both CHARSET and LANGUAGE can be None, in which case you should consider VALUE to be encoded in the us-ascii charset. You can usually ignore LANGUAGE.
If your application doesn’t care whether the parameter was encoded as in RFC 2231, you can collapse the parameter value by calling email.utils.collapse_rfc2231_value(), passing in the return value from get_param(). This will return a suitably decoded Unicode string when the value is a tuple, or the original string unquoted if it isn’t. For example:
    rawparam = msg.get_param('foo')
    param = email.utils.collapse_rfc2231_value(rawparam)
In any case, the parameter value (either the returned string, or the VALUE item in the 3-tuple) is always unquoted, unless unquote is set to False.
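The sketch below shows get_params() and get_param() on a plain header and on an RFC 2231 encoded parameter. The header values are illustrative, and the encoded filename reuses the one from the add_header() example.

```python
import email.utils
from email.message import Message

msg = Message()
msg['Content-Type'] = 'text/plain; charset="utf-8"; format=flowed'

params = msg.get_params()                 # first item is the type itself
charset = msg.get_param('CHARSET')        # keys compare case-insensitively
absent = msg.get_param('nope', 'fallback')

# An RFC 2231 encoded parameter comes back as a 3-tuple.
att = Message()
att.add_header('Content-Disposition', 'attachment',
               filename=('iso-8859-1', '', 'Fußballer.ppt'))
raw = att.get_param('filename', header='content-disposition')
name = email.utils.collapse_rfc2231_value(raw)
```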
set_param(param, value, header='Content-Type', requote=True, charset=None, language='', replace=False)
Set a parameter in the Content-Type header. If the parameter already exists in the header, its value will be replaced with value. If the Content-Type header has not yet been defined for this message, it will be set to text/plain and the new parameter value will be appended as per RFC 2045.
Optional header specifies an alternative header to Content-Type, and all parameters will be quoted as necessary unless optional requote is False (the default is True).
If optional charset is specified, the parameter will be encoded according to RFC 2231. Optional language specifies the RFC 2231 language, defaulting to the empty string. Both charset and language should be strings.
If replace is False (the default) the header is moved to the end of the list of headers. If replace is True, the header will be updated in place.
Changed in version 3.4: replace keyword was added.
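The effect of the replace flag on header order can be sketched like this. The helper make() is hypothetical and exists only to build two identical illustrative messages.

```python
from email.message import Message

def make():
    # Hypothetical helper: Content-Type first, then Subject.
    m = Message()
    m['Content-Type'] = 'text/plain'
    m['Subject'] = 'hi'
    return m

moved = make()
moved.set_param('charset', 'utf-8')                   # replace=False
moved_order = moved.keys()                            # header moved to the end

in_place = make()
in_place.set_param('charset', 'utf-8', replace=True)
in_place_order = in_place.keys()                      # original position kept

# With no Content-Type header at all, text/plain is assumed.
blank = Message()
blank.set_param('charset', 'utf-8')
blank_type = blank.get_content_type()
```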
del_param(param, header='content-type', requote=True)
Remove the given parameter completely from the Content-Type header. The header will be re-written in place without the parameter or its value. All values will be quoted as necessary unless requote is False (the default is True). Optional header specifies an alternative to Content-Type.
set_type(type, header='Content-Type', requote=True)
Set the main type and subtype for the Content-Type header. type must be a string in the form maintype/subtype, otherwise a ValueError is raised.
This method replaces the Content-Type header, keeping all the parameters in place. If requote is False, this leaves the existing header’s quoting as is, otherwise the parameters will be quoted (the default).
An alternative header can be specified in the header argument. When the Content-Type header is set a MIME-Version header is also added.
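A short sketch of set_type() preserving parameters and del_param() removing one, using an illustrative header:

```python
from email.message import Message

msg = Message()
msg['Content-Type'] = 'text/plain; charset="utf-8"'

msg.set_type('text/html')                 # parameters are carried over
new_type = msg.get_content_type()
kept_charset = msg.get_content_charset()
mime_version = msg['MIME-Version']        # added alongside Content-Type

msg.del_param('charset')
charset_after = msg.get_content_charset()

try:
    msg.set_type('nonsense')              # not in maintype/subtype form
    error = None
except ValueError as exc:
    error = exc
```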
get_filename(failobj=None)
Return the value of the filename parameter of the Content-Disposition header of the message. If the header does not have a filename parameter, this method falls back to looking for the name parameter on the Content-Type header. If neither is found, or the header is missing, then failobj is returned. The returned string will always be unquoted as per email.utils.unquote().
get_boundary(failobj=None)
Return the value of the boundary parameter of the Content-Type header of the message, or failobj if either the header is missing, or has no boundary parameter. The returned string will always be unquoted as per email.utils.unquote().
set_boundary(boundary)
Set the boundary parameter of the Content-Type header to boundary. set_boundary() will always quote boundary if necessary. A HeaderParseError is raised if the message object has no Content-Type header.
Note that using this method is subtly different than deleting the old Content-Type header and adding a new one with the new boundary via add_header(), because set_boundary() preserves the order of the Content-Type header in the list of headers. However, it does not preserve any continuation lines which may have been present in the original Content-Type header.
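The boundary accessors can be sketched on a hand-built header; the boundary strings are illustrative.

```python
from email.errors import HeaderParseError
from email.message import Message

msg = Message()
msg['Content-Type'] = 'multipart/mixed; boundary="old"'
msg['Subject'] = 'hi'

old = msg.get_boundary()           # unquoted value
msg.set_boundary('new-boundary')
new = msg.get_boundary()
order = msg.keys()                 # Content-Type stays where it was

plain = Message()                  # no Content-Type header at all
default = plain.get_boundary('none')
try:
    plain.set_boundary('x')
    error = None
except HeaderParseError as exc:
    error = exc
```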
get_content_charset(failobj=None)
Return the charset parameter of the Content-Type header, coerced to lower case. If there is no Content-Type header, or if that header has no charset parameter, failobj is returned.
Note that this method differs from get_charset() which returns the Charset instance for the default encoding of the message body.
get_charsets(failobj=None)
Return a list containing the character set names in the message. If the message is a multipart, then the list will contain one element for each subpart in the payload, otherwise, it will be a list of length 1.
Each item in the list will be a string which is the value of the charset parameter in the Content-Type header for the represented subpart. However, if the subpart has no Content-Type header, no charset parameter, or is not of the text main MIME type, then that item in the returned list will be failobj.
get_content_disposition()
Return the lowercased value (without parameters) of the message’s Content-Disposition header if it has one, or None. The possible values for this method are inline, attachment or None if the message follows RFC 2183.
New in version 3.5.
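A small sketch of get_content_charset(), get_content_disposition() and get_filename() on two hand-built parts. The header values are illustrative, and get_content_disposition() needs Python 3.5 or later.

```python
from email.message import Message

text = Message()
text['Content-Type'] = 'text/plain; charset="UTF-8"'
text.set_payload('hi')

blob = Message()
blob['Content-Type'] = 'application/octet-stream'
blob['Content-Disposition'] = 'Attachment; filename="a.bin"'

text_charset = text.get_content_charset()           # coerced to lower case
blob_charset = blob.get_content_charset('unknown')  # failobj: no charset parameter

text_disposition = text.get_content_disposition()   # no header present
blob_disposition = blob.get_content_disposition()   # lowercased, no parameters
blob_filename = blob.get_filename()
```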
walk()
The walk() method is an all-purpose generator which can be used to iterate over all the parts and subparts of a message object tree, in depth-first traversal order. You will typically use walk() as the iterator in a for loop; each iteration returns the next subpart.
Here’s an example that prints the MIME type of every part of a multipart message structure:
    >>> for part in msg.walk():
    ...     print(part.get_content_type())
    multipart/report
    text/plain
    message/delivery-status
    text/plain
    text/plain
    message/rfc822
    text/plain
walk iterates over the subparts of any part where is_multipart() returns True, even though msg.get_content_maintype() == 'multipart' may return False. We can see this in our example by making use of the _structure debug helper function:
    >>> for part in msg.walk():
    ...     print(part.get_content_maintype() == 'multipart',
    ...           part.is_multipart())
    True True
    False False
    False True
    False False
    False False
    False True
    False False
    >>> _structure(msg)
    multipart/report
        text/plain
        message/delivery-status
            text/plain
            text/plain
        message/rfc822
            text/plain
Here the message parts are not multiparts, but they do contain subparts. is_multipart() returns True and walk descends into the subparts.
Message objects can also optionally contain two instance attributes, which can be used when generating the plain text of a MIME message.
preamble
The format of a MIME document allows for some text between the blank line following the headers, and the first multipart boundary string. Normally, this text is never visible in a MIME-aware mail reader because it falls outside the standard MIME armor. However, when viewing the raw text of the message, or when viewing the message in a non-MIME aware reader, this text can become visible.
The preamble attribute contains this leading extra-armor text for MIME documents. When the Parser discovers some text after the headers but before the first boundary string, it assigns this text to the message’s preamble attribute. When the Generator is writing out the plain text representation of a MIME message, and it finds the message has a preamble attribute, it will write this text in the area between the headers and the first boundary. See email.parser and email.generator for details.
Note that if the message object has no preamble, the preamble attribute will be None.
epilogue
The epilogue attribute acts the same way as the preamble attribute, except that it contains text that appears between the last boundary and the end of the message.
You do not need to set the epilogue to the empty string in order for the Generator to print a newline at the end of the file.
defects
The defects attribute contains a list of all the problems found when parsing this message. See email.errors for a detailed description of the possible parsing defects.
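The three attributes can be observed by parsing a small multipart message. This is a sketch with made-up message text and boundary; the second message deliberately omits its boundary lines so that the parser records at least one defect.

```python
from email import message_from_string

good_raw = (
    'Content-Type: multipart/mixed; boundary="XX"\n'
    '\n'
    'This is the preamble.\n'
    '--XX\n'
    'Content-Type: text/plain\n'
    '\n'
    'part body\n'
    '--XX--\n'
    'This is the epilogue.\n'
)
good = message_from_string(good_raw)
preamble = good.preamble       # text before the first boundary
epilogue = good.epilogue       # text after the closing boundary
good_defects = good.defects

# Declares a boundary that never appears in the body.
broken_raw = (
    'Content-Type: multipart/mixed; boundary="XX"\n'
    '\n'
    'no boundaries here\n'
)
broken = message_from_string(broken_raw)
broken_defects = broken.defects

# A message that is not a MIME container has no preamble.
simple = message_from_string('Subject: hi\n\nbody\n')
simple_preamble = simple.preamble
```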
