Fast parser for IMAP BODYSTRUCTURE responses


kappa/IMAP-BodyStructure


IMAP::BodyStructure - IMAP4-compatible BODYSTRUCTURE and ENVELOPE parser

SYNOPSIS

    use IMAP::BodyStructure;

    # $imap is a low-level IMAP client with an ability to fetch items
    # by message UIDs
    my $bs = IMAP::BodyStructure->new(
        $imap->imap_fetch($msg_uid, 'BODYSTRUCTURE', 1)->[0]->{BODYSTRUCTURE}
    );

    print "[UID:$msg_uid] message is in Russian. Sure.\n"
        if $bs->charset =~ /(?:koi8-r|windows-1251)/i;

    my $part = $bs->part_at('1.3');
    $part->type =~ m#^image/#
        and print "The 3rd part is an image named \""
            . $part->filename . "\"\n";

DESCRIPTION

An IMAP4-compatible IMAP server MUST include a full MIME parser which parses the messages inside IMAP mailboxes and is accessible via the BODYSTRUCTURE fetch item. This module provides a Perl interface to parse the output of the IMAP4 MIME parser. Hope no one will have problems with parsing this doc.

It is a rather straightforward m/\G.../gc-style parser and is therefore much, much faster than the venerable Mail::IMAPClient::BodyStructure, which is based on a Parse::RecDescent grammar. I believe it is also more correct when parsing nested multipart message/rfc822 parts. See the test suite if interested.

I'd also like to emphasize that this module does not contain an IMAP4 client! You will need to employ one from CPAN; there are many. A section with examples of getting to a BODYSTRUCTURE fetch item with various Perl IMAP clients available on CPAN would greatly enhance this document.

INTERFACE

METHODS

  • new($)

    The constructor does most of the work here. It initializes the hierarchical data structure representing all the message parts and their properties. It takes one argument, which should be a string returned by the IMAP server in reply to a FETCH command with a BODYSTRUCTURE item.

    All the parts on all the levels are represented by IMAP::BodyStructure objects, and that enables uniform access to them. It is a direct implementation of the Composite design pattern.

  • type()

    Returns the MIME type of the part. Expect something like text/plain or application/octet-stream.

  • encoding()

    Returns the MIME encoding of the part. This is usually one of '7bit', '8bit', 'base64' or 'quoted-printable'.

  • size()

    Returns the size of the part in octets. It is NOT the size of the data in the part: the data may be encoded as quoted-printable, which leaves us without an obvious method of calculating the exact size of the original data.

  • disp()

    Returns the content-disposition of the part. Usually one of 'inline' or 'attachment'. Defaults to inline, but you should remember that if there IS a disposition and you cannot recognize it, then act as if it's 'attachment'. And use case-insensitive comparisons.

  • charset()

    Returns the charset of the part OR the charset of the first nested part. This looks like a good heuristic really. Charset is something resembling 'UTF-8', 'US-ASCII', 'ISO-8859-13' or 'KOI8-R'. The standard does not say it should be uppercase, by the way.

    Can be undefined.

  • filename()

    Returns the filename specified as a part of the Content-Disposition header.

    Can be undefined.

  • description()

    Returns the description of the part.

  • parts(;$)

    This sub acts differently depending on whether you pass it an argument or not.

    Without any arguments it returns a list of parts in list context and the number of parts in scalar context.

    Specifying a scalar argument allows you to get an individual part with that index.

    Remember, the parts I talk about here are not actual message data, files etc. but IMAP::BodyStructure objects containing information about the message parts, which was extracted by parsing the BODYSTRUCTURE IMAP response!

  • part_at($)

    This method returns a message part by its path. A path to a part in the hierarchy is a dot-separated string of part indices. See "SYNOPSIS" for an example. A nested message/rfc822 does not add a hierarchy level UNLESS it is a single part of another message/rfc822 part (with no multipart/* levels in between). Instead, it has an additional .TEXT part which refers to the internal IMAP::BodyStructure object. Look, here is an outline of an example message structure with part paths alongside each part.

      multipart/mixed                   1
          text/plain                    1.1
          application/msword            1.2
          message/rfc822                1.3
              multipart/alternative     1.3.TEXT
                  text/plain            1.3.1
                  multipart/related     1.3.2
                      text/html         1.3.2.1
                      image/png         1.3.2.2
                      image/png         1.3.2.3

    This is a text email with two attachments, one being an MS Word document and the other being itself a message (probably a forward) which was composed in a graphical MUA and contains two alternative representations: one plain text fallback and one HTML with images (bundled as a multipart/related).

    Here is another one with several levels of message/rfc822. This one is hard to compose in a modern MUA, however.

      multipart/mixed                   1
          text/plain                    1.1
          message/rfc822                1.2
              message/rfc822            1.2.TEXT
                  text/plain            1.2.1

  • part_path()

    Returns the part path to the current part.
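
The disposition rule described under disp() above can be sketched in a few lines. This is only an illustration: normalize_disposition is a hypothetical helper, not part of IMAP::BodyStructure, and the handling of an undefined or empty value is an assumption added for safety (the module itself is documented as defaulting to inline).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of IMAP::BodyStructure.
# Maps a disposition string (such as the one returned by disp())
# to either 'inline' or 'attachment'.
sub normalize_disposition {
    my ($disp) = @_;

    # No disposition at all: fall back to the documented default.
    return 'inline' unless defined $disp && length $disp;

    # Case-insensitive comparison, as the documentation advises.
    return 'inline' if lc($disp) eq 'inline';

    # 'attachment' itself, or anything we do not recognize.
    return 'attachment';
}

print normalize_disposition($_), "\n"
    for ('INLINE', 'Attachment', 'form-data');
```

With a real object you would pass `$part->disp` instead of the literal strings.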

DATA MEMBERS

These are additional pieces of information returned by the IMAP server and parsed. They are rarely used, though (and rarely defined too, btw), so I chose not to provide access methods for them.

  • params

    This is a hashref of MIME parameters. The only interesting param is charset and there's a shortcut method for it.

  • lang

    Content language.

  • loc

    Content location.

  • cid

    Content ID.

  • md5

    Content MD5. No one seems to bother with calculating it, and it is usually undefined.

cid and md5 members exist only in singlepart parts.

  • get_envelope($)

    Parses a string into an IMAP::BodyStructure::Envelope object. See below.

IMAP::BodyStructure::Envelope CLASS

Every message on an IMAP server has an envelope. You can get it using the ENVELOPE fetch item or, and this is relevant, from the BODYSTRUCTURE response in case there are some nested messages (parts with a type of message/rfc822). So, if we have a part with such a type, then the corresponding IMAP::BodyStructure object always has an envelope data member which is, in turn, an object of IMAP::BodyStructure::Envelope.

You can of course use this satellite class on its own; this is very useful when generating meaningful message lists in IMAP folders.

METHODS

  • new($)

    The constructor creates an Envelope object from a string, which should be an IMAP server response to a fetch with the ENVELOPE item or a substring of a BODYSTRUCTURE response for a message with message/rfc822 parts inside.

DATA MEMBERS

  • date

    Date of the message as specified in the envelope. Not the IMAP INTERNALDATE, be careful!

  • subject

    Subject of the message, may be RFC2047 encoded, of course.

  • message_id

  • in_reply_to

    Message-IDs of the current message and the message in reply to which this one was composed.

  • to, from, cc, bcc, sender, reply_to

    These are the so-called address lists, or just arrays of addresses. Remember, a message may be addressed to lots of people.

    Each address is a hash of five elements:

    • name

      The informal part, "A.U.Thor" from "A.U.Thor, <a.u.thor@somewhere.com>"

    • sroute

      Source-routing information, not used. (By the way, the IMAP4rev1 spec was born after the last email address with sroute passed away.)

    • account

      The part before @.

    • host

      The part after @.

    • full

      The full address for display purposes.
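
As an illustration of how the address lists might be turned into a line for a message list, here is a small sketch. It works on a plain hash standing in for an envelope, so the sample data, the exact format of the full member, and the helpers display_address and recipients_line are all assumptions made up for this example; only the member names come from the list above.

```perl
use strict;
use warnings;

# Stand-in for an envelope: a plain hash using the documented member
# names. The values are invented sample data, not parser output.
my $env = {
    subject => 'Hello',
    to      => [
        {
            name    => 'A.U.Thor',
            account => 'a.u.thor',
            host    => 'somewhere.com',
            full    => 'A.U.Thor <a.u.thor@somewhere.com>',
        },
        { name => undef, account => 'bob', host => 'example.org' },
    ],
};

# Hypothetical helper: prefer the ready-made display form and fall
# back to account@host when it is missing.
sub display_address {
    my ($addr) = @_;
    return $addr->{full} if defined $addr->{full};
    return "$addr->{account}\@$addr->{host}";
}

# Hypothetical helper: join an address list into one display line.
# An undefined list yields an empty string.
sub recipients_line {
    my ($addresses) = @_;
    return join ', ', map { display_address($_) } @{ $addresses || [] };
}

print recipients_line($env->{to}), "\n";
```

The same helper can be applied to the from, cc, bcc, sender and reply_to members, since they are described as having the same shape.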

EXAMPLES

The usual way to determine if an email has some files attached (in order to display a cute little paper clip in the message list, e.g.) is to check whether the message is multipart or not. This method tends to give many false positives on multipart/alternative messages with HTML and plaintext parts and no files. The following sub tries to be a little smarter.

    sub _has_files {
        my $bs = shift;

        return 1 if $bs->{type} !~ m#^(?:text|multipart)/#;

        if ($bs->{type} =~ m#^multipart/#) {
            foreach my $part (@{$bs->{parts}}) {
                return 1 if _has_files($part);
            }
        }

        return 0;
    }
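
To see the difference on the false-positive case mentioned above, the sub can be exercised without an IMAP server. The sketch below repeats the sub and feeds it plain hashes with type and parts keys; these hashes are made-up stand-ins for parsed IMAP::BodyStructure objects, not real parser output.

```perl
use strict;
use warnings;

sub _has_files {
    my $bs = shift;

    # Anything that is neither text nor a multipart container counts as a file.
    return 1 if $bs->{type} !~ m#^(?:text|multipart)/#;

    # Look inside multipart containers recursively.
    if ($bs->{type} =~ m#^multipart/#) {
        foreach my $part (@{ $bs->{parts} }) {
            return 1 if _has_files($part);
        }
    }

    return 0;
}

# Made-up stand-ins for parsed structures.
my $alternative = {
    type  => 'multipart/alternative',
    parts => [ { type => 'text/plain' }, { type => 'text/html' } ],
};
my $with_doc = {
    type  => 'multipart/mixed',
    parts => [ { type => 'text/plain' }, { type => 'application/msword' } ],
};

printf "%-22s %s\n", $_->{type}, _has_files($_) ? 'has files' : 'no files'
    for ($alternative, $with_doc);
```

The multipart/alternative message is reported as having no files, while a naive "is it multipart?" check would flag both.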

This snippet selects a rendering routine for a message part.

    foreach (
        [ qr{text/plain}              => \&_render_textplain ],
        [ qr{text/html}               => \&_render_texthtml  ],
        [ qr{multipart/alternative}   => \&_render_alt       ],
        [ qr{multipart/mixed}         => \&_render_mixed     ],
        [ qr{multipart/related}       => \&_render_related   ],
        [ qr{image/}                  => \&_render_image     ],
        [ qr{message/rfc822}          => \&_render_rfc822    ],
        [ qr{multipart/parallel}      => \&_render_mixed     ],
        [ qr{multipart/report}        => \&_render_mixed     ],
        [ qr{multipart/}              => \&_render_mixed     ],
        [ qr{text/}                   => \&_render_textplain ],
        [ qr{message/delivery-status} => \&_render_textplain ],
    ) {
        $bs->type =~ $_->[0]
            and $renderer = $_->[1]
            and last;
    }
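
A self-contained variation of the same first-match-wins table is sketched below. It is not the snippet above verbatim: the patterns are anchored and case-insensitive, the table is shortened, renderer names (plain strings) stand in for the code references, and pick_renderer is a hypothetical function that returns undef when nothing matches, so the caller can decide what to do with types such as application/pdf.

```perl
use strict;
use warnings;

# Ordered rules: specific types first, catch-all patterns last.
# The names on the right are placeholders for real rendering subs.
my @dispatch = (
    [ qr{^text/plain$}i            => 'plain'       ],
    [ qr{^text/html$}i             => 'html'        ],
    [ qr{^multipart/alternative$}i => 'alternative' ],
    [ qr{^image/}i                 => 'image'       ],
    [ qr{^message/rfc822$}i        => 'rfc822'      ],
    [ qr{^multipart/}i             => 'mixed'       ],
    [ qr{^text/}i                  => 'plain'       ],
);

# Hypothetical helper: return the first matching renderer name,
# or undef when no rule applies.
sub pick_renderer {
    my ($type) = @_;
    for my $rule (@dispatch) {
        return $rule->[1] if $type =~ $rule->[0];
    }
    return;
}

for my $type ('text/html', 'multipart/report', 'application/pdf') {
    my $name = pick_renderer($type);
    print "$type: ", (defined $name ? $name : '(no renderer)'), "\n";
}
```

Because the first match wins, the order of the rules matters: moving the multipart/ catch-all above multipart/alternative would hide the more specific rule.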

BUGS

Shouldn't be any, as this is a simple parser of a standard structure.

AUTHOR

Alex Kapranoff <alex@kapranoff.ru>

ACKNOWLEDGMENTS

Jonas Liljegren contributed support for multivalued "lang" items.

COPYRIGHT AND LICENSE

This software is copyright (C) 2015 by Alex Kapranoff <alex@kapranoff.ru>.

This is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 3.

SEE ALSO

Mail::IMAPClient, Net::IMAP::Simple, RFC 3501, RFC 2045, RFC 2046.
