email.header
:國際化標頭¶
This module is part of the legacy (Compat32
) email API. In the current APIencoding and decoding of headers is handled transparently by thedictionary-like API of theEmailMessage
class. Inaddition to uses in legacy code, this module can be useful in applications thatneed to completely control the character sets used when encoding headers.
The remaining text in this section is the original documentation of the module.
RFC 2822 is the base standard that describes the format of email messages.It derives from the olderRFC 822 standard which came into widespread use ata time when most email was composed of ASCII characters only.RFC 2822 is aspecification written assuming email contains only 7-bit ASCII characters.
Of course, as email has been deployed worldwide, it has becomeinternationalized, such that language specific character sets can now be used inemail messages. The base standard still requires email messages to betransferred using only 7-bit ASCII characters, so a slew of RFCs have beenwritten describing how to encode email containing non-ASCII characters intoRFC 2822-compliant format. These RFCs includeRFC 2045,RFC 2046,RFC 2047, andRFC 2231. Theemail
package supports these standardsin itsemail.header
andemail.charset
modules.
If you want to include non-ASCII characters in your email headers, say in theSubject orTo fields, you should use theHeader
class and assign the field in theMessage
object to an instance ofHeader
instead of using a string for the headervalue. Import theHeader
class from theemail.header
module.For example:
>>>fromemail.messageimportMessage>>>fromemail.headerimportHeader>>>msg=Message()>>>h=Header('p\xf6stal','iso-8859-1')>>>msg['Subject']=h>>>msg.as_string()'Subject: =?iso-8859-1?q?p=F6stal?=\n\n'
Notice here how we wanted theSubject field to contain a non-ASCIIcharacter? We did this by creating aHeader
instance and passing inthe character set that the byte string was encoded in. When the subsequentMessage
instance was flattened, theSubjectfield was properlyRFC 2047 encoded. MIME-aware mail readers would show thisheader using the embedded ISO-8859-1 character.
Here is theHeader
class description:
- classemail.header.Header(s=None,charset=None,maxlinelen=None,header_name=None,continuation_ws='',errors='strict')¶
Create a MIME-compliant header that can contain strings in different charactersets.
Optionals is the initial header value. If
None
(the default), theinitial header value is not set. You can later append to the header withappend()
method calls.s may be an instance ofbytes
orstr
, but see theappend()
documentation for semantics.Optionalcharset serves two purposes: it has the same meaning as thecharsetargument to the
append()
method. It also sets the default character setfor all subsequentappend()
calls that omit thecharset argument. Ifcharset is not provided in the constructor (the default), theus-ascii
character set is used both ass's initial charset and as the default forsubsequentappend()
calls.The maximum line length can be specified explicitly viamaxlinelen. Forsplitting the first line to a shorter value (to account for the field headerwhich isn't included ins, e.g.Subject) pass in the name of thefield inheader_name. The defaultmaxlinelen is 78, and the default valueforheader_name is
None
, meaning it is not taken into account for thefirst line of a long, split header.Optionalcontinuation_ws must beRFC 2822-compliant foldingwhitespace, and is usually either a space or a hard tab character. Thischaracter will be prepended to continuation lines.continuation_wsdefaults to a single space character.
Optionalerrors is passed straight through to the
append()
method.- append(s,charset=None,errors='strict')¶
Append the strings to the MIME header.
Optionalcharset, if given, should be a
Charset
instance (seeemail.charset
) or the name of a character set, whichwill be converted to aCharset
instance. A valueofNone
(the default) means that thecharset given in the constructoris used.s may be an instance of
bytes
orstr
. If it is aninstance ofbytes
, thencharset is the encoding of that bytestring, and aUnicodeError
will be raised if the string cannot bedecoded with that character set.Ifs is an instance of
str
, thencharset is a hint specifyingthe character set of the characters in the string.In either case, when producing anRFC 2822-compliant header usingRFC 2047 rules, the string will be encoded using the output codec ofthe charset. If the string cannot be encoded using the output codec, aUnicodeError will be raised.
Optionalerrors is passed as the errors argument to the decode callifs is a byte string.
- encode(splitchars=';,\t',maxlinelen=None,linesep='\n')¶
Encode a message header into an RFC-compliant format, possibly wrappinglong lines and encapsulating non-ASCII parts in base64 or quoted-printableencodings.
Optionalsplitchars is a string containing characters which should begiven extra weight by the splitting algorithm during normal headerwrapping. This is in very rough support ofRFC 2822's 'higher levelsyntactic breaks': split points preceded by a splitchar are preferredduring line splitting, with the characters preferred in the order inwhich they appear in the string. Space and tab may be included in thestring to indicate whether preference should be given to one over theother as a split point when other split chars do not appear in the linebeing split. Splitchars does not affectRFC 2047 encoded lines.
maxlinelen, if given, overrides the instance's value for the maximumline length.
linesep specifies the characters used to separate the lines of thefolded header. It defaults to the most useful value for Pythonapplication code (
\n
), but\r\n
can be specified in orderto produce headers with RFC-compliant line separators.在 3.2 版的變更:新增引數linesep。
The
Header
class also provides a number of methods to supportstandard operators and built-in functions.- __str__()¶
Returns an approximation of the
Header
as a string, using anunlimited line length. All pieces are converted to unicode using thespecified encoding and joined together appropriately. Any pieces with acharset of'unknown-8bit'
are decoded as ASCII using the'replace'
error handler.在 3.2 版的變更:Added handling for the
'unknown-8bit'
charset.
Theemail.header
module also provides the following convenient functions.
- email.header.decode_header(header)¶
Decode a message header value without converting the character set. The headervalue is inheader.
This function returns a list of
(decoded_string,charset)
pairs containingeach of the decoded parts of the header.charset isNone
for non-encodedparts of the header, otherwise a lower case string containing the name of thecharacter set specified in the encoded string.以下是個範例:
>>>fromemail.headerimportdecode_header>>>decode_header('=?iso-8859-1?q?p=F6stal?=')[(b'p\xf6stal', 'iso-8859-1')]
- email.header.make_header(decoded_seq,maxlinelen=None,header_name=None,continuation_ws='')¶
Create a
Header
instance from a sequence of pairs as returned bydecode_header()
.decode_header()
takes a header value string and returns a sequence ofpairs of the format(decoded_string,charset)
wherecharset is the name ofthe character set.This function takes one of those sequence of pairs and returns a
Header
instance. Optionalmaxlinelen,header_name, andcontinuation_ws are as in theHeader
constructor.