Possible performance improvement in email parsing #106628

Closed
Labels: performance, stdlib, topic-email, type-bug
@cfbolz

Description

PyPy received the following performance bug today: https://foss.heptapod.net/pypy/pypy/-/issues/3961

Somebody who was trying to process a lot of emails from an mbox file was complaining about terrible performance on PyPy. The problem turned out to be the fact that `email.feedparser.FeedParser._parsegen` compiles a new regular expression for every multipart message in the mbox file. On PyPy this is particularly bad, because those regular expressions are jitted and that costs even more time. However, even on CPython compiling these regular expressions takes a noticeable portion of the benchmark.

I fixed this problem in PyPy by simply using `str.startswith` with the multipart separator, followed by a generic regular expression that can be used for arbitrary boundaries. In PyPy this helps massively, but in CPython it's still a 20% performance improvement. Will open a PR for it.
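
To make the idea concrete, here is a minimal sketch of the two approaches. It is not the actual patch: the helper names (`make_boundary_matcher_slow`, `make_boundary_matcher`, `_BOUNDARY_TAIL`) are hypothetical, and the pattern is only an illustration of what a boundary-line regex might look like.

```python
import re

# Illustrative pattern for what may follow the separator on a boundary line:
# an optional closing "--", optional trailing whitespace, optional line ending.
# It does not depend on the boundary, so it is compiled once at import time.
_TAIL = r'(?P<end>--)?(?P<ws>[ \t]*)(?P<linesep>\r\n|\r|\n)?$'
_BOUNDARY_TAIL = re.compile(_TAIL)


def make_boundary_matcher_slow(boundary):
    """Per-message approach: compile a fresh regex for each boundary."""
    pattern = re.compile(re.escape('--' + boundary) + _TAIL)
    return pattern.match


def make_boundary_matcher(boundary):
    """Cheaper approach: str.startswith plus one shared, generic regex."""
    separator = '--' + boundary

    def match(line):
        # Plain string comparison, no boundary-specific regex needed.
        if not line.startswith(separator):
            return None
        # Match the remainder of the line with the precompiled pattern.
        return _BOUNDARY_TAIL.match(line, len(separator))

    return match


if __name__ == '__main__':
    fast = make_boundary_matcher('abc')
    slow = make_boundary_matcher_slow('abc')
    for line in ['--abc\r\n', '--abc--\n', '--abcdef\n', 'body text\n']:
        # Both matchers should agree on whether the line is a boundary.
        print(repr(line), fast(line) is not None, slow(line) is not None)
```

Both variants accept the same lines in this sketch; the difference is that the second one does no regex compilation per message, which is the cost described above.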
