Fast and robust e-mail parsing library for Rust

stalwartlabs/mail-parser

mail-parser is an e-mail parsing library written in Rust that fully conforms to the Internet Message Format standard (RFC 5322), the Multipurpose Internet Mail Extensions (MIME; RFC 2045 - 2049) as well as many other internet messaging RFCs.

It also supports decoding messages in 41 different character sets including obsolete formats such as UTF-7. All Unicode (UTF-*) and single-byte character sets are handled internally by the library while support for legacy multi-byte encodings of Chinese and Japanese languages such as BIG5 or ISO-2022-JP is provided by the optional dependency encoding_rs.
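
As a rough illustration of why single-byte character sets can be handled without an external dependency, the sketch below decodes ISO-8859-1, where every byte value equals the Unicode code point of the same number. The function name `decode_iso_8859_1` is hypothetical and is not part of the mail-parser API; other single-byte sets would typically need a lookup table for the upper half of the byte range.

```rust
/// Decodes ISO-8859-1 (Latin-1) bytes into a UTF-8 `String`.
/// Hypothetical helper for illustration only; not part of mail-parser.
fn decode_iso_8859_1(bytes: &[u8]) -> String {
    // Every Latin-1 byte maps to the code point U+0000..=U+00FF with the
    // same numeric value, so a cast from u8 to char is a complete decoder.
    bytes.iter().map(|&b| b as char).collect()
}
```

Calling `decode_iso_8859_1(b"caf\xe9")` yields the string `café`, since byte 0xE9 is U+00E9.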

In general, this library abides by Postel's law, or the Robustness Principle, which states that an implementation must be conservative in its sending behavior and liberal in its receiving behavior. This means that mail-parser will make a best effort to parse non-conformant e-mail messages as long as these do not deviate too much from the standard.
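
To make "liberal in its receiving behavior" concrete, here is a minimal sketch of a tolerant header-line splitter. It accepts stray whitespace around the colon and either CRLF or bare LF line endings instead of rejecting them. The name `split_header` is hypothetical, and this is not how mail-parser itself is implemented; it only illustrates the principle.

```rust
/// Splits a single header line into (name, value), tolerating input that a
/// strict parser might reject. Hypothetical helper; not mail-parser's code.
fn split_header(line: &str) -> Option<(&str, &str)> {
    // Without a colon there is nothing that could be a header field.
    let (name, value) = line.split_once(':')?;
    // Be liberal: ignore whitespace around the name and the value,
    // including a trailing "\r\n" or a bare "\n".
    let name = name.trim();
    if name.is_empty() {
        return None;
    }
    Some((name, value.trim()))
}
```

With this approach, `"Subject : hello\r\n"` and `"Subject:hello\n"` both produce the pair `("Subject", "hello")`, while a line without a colon is reported as `None` rather than causing a panic.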

Unlike other e-mail parsing libraries that return nested representations of the different MIME parts in a message, this library conforms to RFC 8621, Section 4.1.4 and provides a more human-friendly representation of the message contents consisting of just text body parts, HTML body parts and attachments. Additionally, conversion between HTML and plain text inline body parts is done automatically when the alternative version is missing.
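
The difference between a nested and a flattened representation can be sketched as follows. This is a deliberately simplified illustration: the `Part` and `Flat` types and the `flatten` function are hypothetical, and the real algorithm in RFC 8621, Section 4.1.4 also considers details such as multipart/alternative selection and content disposition, which are ignored here.

```rust
/// A nested MIME tree, reduced to the bare minimum for this sketch.
#[derive(Debug)]
enum Part {
    Text(String),
    Html(String),
    Binary(Vec<u8>),
    Multipart(Vec<Part>),
}

/// The flattened view: three lists that borrow from the tree.
#[derive(Debug, Default)]
struct Flat<'a> {
    text_body: Vec<&'a str>,
    html_body: Vec<&'a str>,
    attachments: Vec<&'a [u8]>,
}

/// Walks the tree depth-first and sorts every leaf into one of the lists.
/// Simplification: every binary leaf is treated as an attachment.
fn flatten<'a>(part: &'a Part, out: &mut Flat<'a>) {
    match part {
        Part::Text(text) => out.text_body.push(text.as_str()),
        Part::Html(html) => out.html_body.push(html.as_str()),
        Part::Binary(data) => out.attachments.push(data.as_slice()),
        Part::Multipart(children) => {
            for child in children {
                flatten(child, out);
            }
        }
    }
}
```

A caller then reads `flat.text_body`, `flat.html_body` and `flat.attachments` directly instead of walking the tree itself.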

Performance and memory safety were two important factors while designing mail-parser:

  • Zero-copy: Practically all strings returned by this library are Cow<str> references to the input raw message.
  • High performance Base64 decoding based on Chromium's decoder (the fastest non-SIMD decoder).
  • Fast parsing of message header fields, character set names and HTML entities using perfect hashing.
  • Written in 100% safe Rust with no external dependencies.
  • Every function in the library has been fuzzed and thoroughly tested with MIRI.
  • Battle-tested with millions of real-world e-mail messages dating from 1995 until today.
  • Used in production environments worldwide by Stalwart Mail Server.
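
The zero-copy point in the list above relies on `std::borrow::Cow`: a value is borrowed from the input when no decoding is needed and only allocated when the bytes have to change. The sketch below shows the pattern on backslash-escaped quoted strings. The function `unescape_quoted` is a hypothetical example and is not taken from mail-parser.

```rust
use std::borrow::Cow;

/// Removes backslash escapes from the body of a quoted string.
/// Hypothetical helper that only illustrates the `Cow` pattern.
fn unescape_quoted(input: &str) -> Cow<'_, str> {
    if !input.contains('\\') {
        // Nothing to decode: return a reference into the original buffer.
        return Cow::Borrowed(input);
    }
    // At least one escape is present, so a new string has to be built.
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            // A backslash quotes the next character; keep only that one.
            if let Some(next) = chars.next() {
                out.push(next);
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}
```

For the common case of a display name such as `James Smythe`, the result is `Cow::Borrowed` and no allocation takes place; only inputs that contain an escape pay for a copy.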

Usage Example

use mail_parser::*;

fn main() {
    let input = br#"From: Art Vandelay <art@vandelay.com> (Vandelay Industries)
To: "Colleagues": "James Smythe" <james@vandelay.com>; Friends:
    jane@example.com, =?UTF-8?Q?John_Sm=C3=AEth?= <john@example.com>;
Date: Sat, 20 Nov 2021 14:22:01 -0800
Subject: Why not both importing AND exporting? =?utf-8?b?4pi6?=
Content-Type: multipart/mixed; boundary="festivus";

--festivus
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: base64

PGh0bWw+PHA+SSB3YXMgdGhpbmtpbmcgYWJvdXQgcXVpdHRpbmcgdGhlICZsZHF1bztleHBvcnRpbmcmcmRxdW87IHRvIGZvY3VzIGp1c3Qgb24gdGhlICZsZHF1bztpbXBvcnRpbmcmcmRxdW87LDwvcD48cD5idXQgdGhlbiBJIHRob3VnaHQsIHdoeSBub3QgZG8gYm90aD8gJiN4MjYzQTs8L3A+PC9odG1sPg==
--festivus
Content-Type: message/rfc822

From: "Cosmo Kramer" <kramer@kramerica.com>
Subject: Exporting my book about coffee tables
Content-Type: multipart/mixed; boundary="giddyup";

--giddyup
Content-Type: text/plain; charset="utf-16"
Content-Transfer-Encoding: quoted-printable

=FF=FE=0C!5=D8"=DD5=D8)=DD5=D8-=DD =005=D8*=DD5=D8"=DD =005=D8"=
=DD5=D85=DD5=D8-=DD5=D8,=DD5=D8/=DD5=D81=DD =005=D8*=DD5=D86=DD =
=005=D8=1F=DD5=D8,=DD5=D8,=DD5=D8(=DD =005=D8-=DD5=D8)=DD5=D8"=
=DD5=D8=1E=DD5=D80=DD5=D8"=DD!=00
--giddyup
Content-Type: image/gif; name*1="about "; name*0="Book ";
              name*2*=utf-8''%e2%98%95 tables.gif
Content-Transfer-Encoding: Base64
Content-Disposition: attachment

R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
--giddyup--
--festivus--
"#;

    let message = MessageParser::default().parse(input).unwrap();

    // Parses addresses (including comments), lists and groups
    assert_eq!(
        message.from().unwrap().first().unwrap(),
        &Addr::new(
            "Art Vandelay (Vandelay Industries)".into(),
            "art@vandelay.com"
        )
    );
    assert_eq!(
        message.to().unwrap().as_group().unwrap(),
        &[
            Group::new(
                "Colleagues",
                vec![Addr::new("James Smythe".into(), "james@vandelay.com")]
            ),
            Group::new(
                "Friends",
                vec![
                    Addr::new(None, "jane@example.com"),
                    Addr::new("John Smîth".into(), "john@example.com"),
                ]
            )
        ]
    );

    assert_eq!(
        message.date().unwrap().to_rfc3339(),
        "2021-11-20T14:22:01-08:00"
    );

    // RFC2047 support for encoded text in message headers
    assert_eq!(
        message.subject().unwrap(),
        "Why not both importing AND exporting? ☺"
    );

    // HTML and text body parts are returned conforming to RFC8621, Section 4.1.4
    assert_eq!(
        message.body_html(0).unwrap(),
        concat!(
            "<html><p>I was thinking about quitting the &ldquo;exporting&rdquo; to ",
            "focus just on the &ldquo;importing&rdquo;,</p><p>but then I thought,",
            " why not do both? &#x263A;</p></html>"
        )
    );

    // HTML parts are converted to plain text (and vice versa) when missing
    assert_eq!(
        message.body_text(0).unwrap(),
        concat!(
            "I was thinking about quitting the “exporting” to focus just on the",
            " “importing”,\nbut then I thought, why not do both? ☺\n"
        )
    );

    // Supports nested messages as well as multipart/digest
    let nested_message = message
        .attachment(0)
        .unwrap()
        .message()
        .unwrap();

    assert_eq!(
        nested_message.subject().unwrap(),
        "Exporting my book about coffee tables"
    );

    // Handles UTF-* as well as many legacy encodings
    assert_eq!(
        nested_message.body_text(0).unwrap(),
        "ℌ𝔢𝔩𝔭 𝔪𝔢 𝔢𝔵𝔭𝔬𝔯𝔱 𝔪𝔶 𝔟𝔬𝔬𝔨 𝔭𝔩𝔢𝔞𝔰𝔢!"
    );
    assert_eq!(
        nested_message.body_html(0).unwrap(),
        "<html><body>ℌ𝔢𝔩𝔭 𝔪𝔢 𝔢𝔵𝔭𝔬𝔯𝔱 𝔪𝔶 𝔟𝔬𝔬𝔨 𝔭𝔩𝔢𝔞𝔰𝔢!</body></html>"
    );

    let nested_attachment = nested_message.attachment(0).unwrap();

    assert_eq!(nested_attachment.len(), 42);

    // Full RFC2231 support for continuations and character sets
    assert_eq!(
        nested_attachment.attachment_name().unwrap(),
        "Book about ☕ tables.gif"
    );

    // Integrates with Serde
    println!("{}", serde_json::to_string_pretty(&message).unwrap());
}
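
The attachment name in the example arrives as three out-of-order RFC 2231 segments (`name*1`, `name*0` and the percent-encoded `name*2*`). The following sketch shows one way such segments could be reassembled. It is an independent illustration, not the library's implementation: `Segment`, `percent_decode` and `reassemble` are hypothetical names, and the charset is assumed to be UTF-8.

```rust
/// One `name*N` or `name*N*` segment of a parameter (RFC 2231).
struct Segment<'a> {
    index: u32,
    /// True for `name*N*` segments, whose value is percent-encoded.
    extended: bool,
    value: &'a str,
}

/// Returns the numeric value of an ASCII hex digit, if it is one.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Decodes `%XX` escapes into raw bytes; malformed escapes are kept as-is.
fn percent_decode(input: &str) -> Vec<u8> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                out.push((hi << 4) | lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    out
}

/// Orders the segments by index and joins them into one string.
/// Assumption of this sketch: extended segments are UTF-8.
fn reassemble(mut segments: Vec<Segment<'_>>) -> String {
    segments.sort_by_key(|s| s.index);
    let mut bytes = Vec::new();
    for seg in &segments {
        if seg.extended {
            // Drop an optional `charset'language'` prefix before decoding.
            let mut pieces = seg.value.splitn(3, '\'');
            let encoded = match (pieces.next(), pieces.next(), pieces.next()) {
                (Some(_charset), Some(_language), Some(rest)) => rest,
                _ => seg.value,
            };
            bytes.extend(percent_decode(encoded));
        } else {
            bytes.extend_from_slice(seg.value.as_bytes());
        }
    }
    String::from_utf8_lossy(&bytes).into_owned()
}
```

Feeding in the three segments from the example message, in the order they appear in the header, produces `Book about ☕ tables.gif`.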

More examples are available under the examples directory. Please note that this library does not support building e-mail messages as this functionality is provided separately by the mail-builder crate.

Testing, Fuzzing & Benchmarking

To run the test suite:

 $ cargo test --all-features

or, to run the test suite with MIRI:

 $ cargo +nightly miri test --all-features

To fuzz the library with cargo-fuzz:

 $ cargo +nightly fuzz run mail_parser

and, to run the benchmarks:

 $ cargo +nightly bench --all-features

Conformed RFCs

Supported Character Sets

  • UTF-8
  • UTF-16, UTF-16BE, UTF-16LE
  • UTF-7
  • US-ASCII
  • ISO-8859-1
  • ISO-8859-2
  • ISO-8859-3
  • ISO-8859-4
  • ISO-8859-5
  • ISO-8859-6
  • ISO-8859-7
  • ISO-8859-8
  • ISO-8859-9
  • ISO-8859-10
  • ISO-8859-13
  • ISO-8859-14
  • ISO-8859-15
  • ISO-8859-16
  • CP1250
  • CP1251
  • CP1252
  • CP1253
  • CP1254
  • CP1255
  • CP1256
  • CP1257
  • CP1258
  • KOI8-R
  • KOI8-U
  • MACINTOSH
  • IBM850
  • TIS-620
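
As an illustration of how a Unicode encoding from this list can be decoded with the standard library alone, the sketch below handles UTF-16 input and uses a leading byte order mark to choose the endianness. The function `decode_utf16_bytes` is hypothetical and not part of mail-parser. Falling back to big-endian when no byte order mark is present is an assumption of this sketch.

```rust
/// Decodes UTF-16 bytes into a `String`, honoring a leading byte order mark.
/// Hypothetical helper for illustration; not part of mail-parser.
fn decode_utf16_bytes(bytes: &[u8]) -> String {
    // FF FE marks little-endian, FE FF marks big-endian.
    // Assumption: input without a mark is treated as big-endian.
    let (little_endian, payload) = match bytes {
        [0xFF, 0xFE, rest @ ..] => (true, rest),
        [0xFE, 0xFF, rest @ ..] => (false, rest),
        _ => (false, bytes),
    };
    // Combine each pair of bytes into one 16-bit code unit.
    // A trailing odd byte is ignored by `chunks_exact`.
    let units = payload.chunks_exact(2).map(|pair| {
        let pair = [pair[0], pair[1]];
        if little_endian {
            u16::from_le_bytes(pair)
        } else {
            u16::from_be_bytes(pair)
        }
    });
    // Unpaired surrogates become U+FFFD instead of aborting the decode.
    char::decode_utf16(units)
        .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect()
}
```

The nested text part in the usage example starts with the bytes `FF FE 0C 21 35 D8 22 DD`, which this function decodes to `ℌ𝔢`: a little-endian mark, the code unit U+210C, and a surrogate pair for U+1D522.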

Supported character sets via the optional dependency encoding_rs:

  • SHIFT_JIS
  • BIG5
  • EUC-JP
  • EUC-KR
  • GB18030
  • GBK
  • ISO-2022-JP
  • WINDOWS-874
  • IBM-866

License

Licensed under either of

at your option.

Copyright

Copyright (C) 2020, Stalwart Labs LLC
