pygettext --docstrings doesn't actually extract module docstring due to tokenize returning ENCODING token #95731

Closed
Labels: type-bug (An unexpected behavior, bug, or error)
@Jackenmen

Bug report

When running pygettext --docstrings file.py on Python 3.7 and above, the module docstring does not get extracted.

Reproduction steps:

  1. Create repro.py with the following contents (actually you can omit everything but the first three lines):

         """Module docstring"""


         class X:
             """class docstring"""

             def method(self):
                 """method docstring"""


         def function():
             """function docstring"""

  2. Try running: python pygettext.py --docstrings repro.py
  3. Look at the messages.pot that was created and see that it doesn't contain the module docstring:

         # SOME DESCRIPTIVE TITLE.
         # Copyright (C) YEAR ORGANIZATION
         # FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
         #
         msgid ""
         msgstr ""
         "Project-Id-Version: PACKAGE VERSION\n"
         "POT-Creation-Date: 2022-08-06 00:54+0200\n"
         "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
         "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
         "Language-Team: LANGUAGE <LL@li.org>\n"
         "MIME-Version: 1.0\n"
         "Content-Type: text/plain; charset=UTF-8\n"
         "Content-Transfer-Encoding: 8bit\n"
         "Generated-By: pygettext.py 1.5\n"

         #: repro.py:6
         #, docstring
         msgid "class docstring"
         msgstr ""

         #: repro.py:9
         #, docstring
         msgid "method docstring"
         msgstr ""

         #: repro.py:13
         #, docstring
         msgid "function docstring"
         msgstr ""
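The last step can also be checked mechanically rather than by eye. The following is only a rough sketch: `extracted_msgids` is a hypothetical helper (not part of pygettext), it handles single-line msgid entries only, and the sample text is the set of entries from the messages.pot shown above with the header left out.

```python
import re


def extracted_msgids(pot_text):
    """Return the non-empty, single-line msgid strings found in .pot text."""
    msgids = re.findall(r'^msgid "(.*)"$', pot_text, flags=re.MULTILINE)
    return [m for m in msgids if m]


# Entries copied from the messages.pot shown above (header omitted).
sample_pot = '''\
#: repro.py:6
#, docstring
msgid "class docstring"
msgstr ""

#: repro.py:9
#, docstring
msgid "method docstring"
msgstr ""

#: repro.py:13
#, docstring
msgid "function docstring"
msgstr ""
'''

found = extracted_msgids(sample_pot)
print(found)
print("Module docstring" in found)  # False: the module docstring is missing
```

To check a freshly generated file, the same helper can be fed the text of messages.pot instead of the sample string.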

The reason for this appears to be that pygettext doesn't account for token.ENCODING which was added in Python 3.7.
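A quick way to see the token in question, assuming pygettext reads the file in binary mode through tokenize.tokenize() (the snippet below is an illustration, not code taken from pygettext.py):

```python
import io
import tokenize

source = b'"""Module docstring"""\n'

# tokenize.tokenize() takes a readline callable that returns bytes.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

for tok in tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string))

# The very first token is ENCODING; the docstring's STRING token comes second.
first, second = tokens[0], tokens[1]
```

Any check that expects the module docstring to be the first "real" token therefore sees ENCODING first.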

A simple solution for this would be to skip tokenize.ENCODING here:

    elif ttype not in (tokenize.COMMENT, tokenize.NL):
        self.__freshmodule = 0
    return

This actually reveals another bug, which is caused by the return on line 340: when the module docstring check runs, pygettext swallows one token without handling it. This means that for code like this:

    class X:
        """class docstring"""

pygettext will not extract the docstring of class X once the solution gets applied if proper care isn't taken. I'm mentioning it so that the fix is tested with both of these cases.
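To make the two cases concrete, here is a toy model of a token-driven docstring extractor. It is a sketch under stated assumptions, not pygettext's code and not the fix that was eventually applied: `DocstringFinder` and `find_docstrings` are hypothetical names, and the state machine is far simpler than the real one. It skips ENCODING while looking for the module docstring, and it lets the first non-docstring token fall through to normal handling instead of returning early.

```python
import io
import tokenize


class DocstringFinder:
    """Toy model of a token-driven docstring extractor (not pygettext's code)."""

    def __init__(self):
        self.fresh_module = True   # still looking for a module docstring
        self.depth = 0             # bracket nesting inside a class/def header
        self.found = []            # raw token text, quotes included
        self.state = self.waiting

    def waiting(self, ttype, tstring):
        if self.fresh_module:
            if ttype == tokenize.STRING:
                self.found.append(tstring)
                self.fresh_module = False
                return  # the docstring token is fully handled
            if ttype in (tokenize.ENCODING, tokenize.COMMENT, tokenize.NL):
                return  # tokens that may precede a module docstring
            # Any other token ends the search. Fall through so the token is
            # still handled below instead of being swallowed.
            self.fresh_module = False
        if ttype == tokenize.NAME and tstring in ("class", "def"):
            self.depth = 0
            self.state = self.suite_seen

    def suite_seen(self, ttype, tstring):
        # Skip the header up to the colon that opens the suite.
        if ttype != tokenize.OP:
            return
        if tstring in ("(", "[", "{"):
            self.depth += 1
        elif tstring in (")", "]", "}"):
            self.depth -= 1
        elif tstring == ":" and self.depth == 0:
            self.state = self.suite_docstring

    def suite_docstring(self, ttype, tstring):
        if ttype == tokenize.STRING:
            self.found.append(tstring)
            self.state = self.waiting
        elif ttype not in (tokenize.NEWLINE, tokenize.INDENT,
                           tokenize.COMMENT, tokenize.NL):
            self.state = self.waiting


def find_docstrings(source):
    """Run the toy extractor over source bytes and return what it found."""
    finder = DocstringFinder()
    for tok in tokenize.tokenize(io.BytesIO(source).readline):
        finder.state(tok.type, tok.string)
    return finder.found


print(find_docstrings(b'"""Module docstring"""\n'))
print(find_docstrings(b'class X:\n    """class docstring"""\n'))
```

Replacing the fall-through in `waiting` with an unconditional return reproduces the second problem in this model: the `class` token is consumed while the module docstring check is still active, so the class docstring is never reached.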

Your environment

  • CPython versions tested on: 3.7.13 (installed from deadsnakes ppa), 3.10.4 (default Python on my system)
  • Operating system and architecture: Ubuntu 22.04 LTS
    The pygettext.py script was taken directly from this repository; I'm not sure that my distro even has a package that ships it.
