
bedevere-appbot mentioned this pull request

pygettext: Wrapping to width is not implemented for msgids #130703

Open

Contributor Author

StanFromIreland commented Feb 28, 2025

Requesting @tomasr8 @serhiy-storchaka :-)

StanFromIreland added 2 commits

February 28, 2025 19:31

Fix NEWS name -- We don't want milliseconds

33149ed

Change extract func in test

0e35e36

serhiy-storchaka reviewed

Member

serhiy-storchaka left a comment

This does not work.

It can break escape sequences.
The normalized message can already be multiline. Splitting it again will produce too short lines and even empty lines.
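
To illustrate the first point, here is a minimal sketch of how an escape sequence can end up split. It assumes the wrapping cuts the already-escaped string at a fixed width, which may not be exactly what this revision of the PR did; `naive_split` is a hypothetical helper, not part of pygettext.

```python
def naive_split(escaped, width):
    """Cut an already-escaped string into fixed-width pieces (hypothetical helper)."""
    return [escaped[i:i + width] for i in range(0, len(escaped), width)]


# 'ab\ncd' after escaping is the six characters: a b \ n c d
escaped = 'ab\\ncd'
chunks = naive_split(escaped, 3)

# The cut lands between the backslash and the 'n', so the first piece ends
# in a lone backslash that would escape the closing quote in the .pot file.
for chunk in chunks:
    print(f'"{chunk}"')
```

Splitting the unescaped text into words first, and escaping each word as a unit, avoids this class of problem.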

StanFromIreland marked this pull request as draft

February 28, 2025 20:10

bedevere-appbot removed the awaiting review label

Contributor Author

StanFromIreland commented Feb 28, 2025

I need to update normalize so that it wraps on word boundaries.

picnixz reviewed

Member

picnixz left a comment (edited)

Can't you use textwrap.wrap for wrapping? It's not perfect but it ought to do most of the job?
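
For context, a small sketch of what `textwrap.wrap` does with its default settings: whitespace at a line break is dropped, so joining the wrapped lines does not reproduce the original string. The example only shows the default behaviour; whether other keyword arguments would be enough for msgids is not settled here.

```python
import textwrap

msgid = 'foo bar baz'
lines = textwrap.wrap(msgid, width=7)

# With the default drop_whitespace=True, the space at the break is discarded,
# so concatenating the pieces no longer gives back the original msgid.
rejoined = ''.join(lines)
print(lines)
print(rejoined == msgid)
```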

Member

tomasr8 commented Feb 28, 2025 (edited)

I'm afraid textwrap won't always work. I suggest adding the wrapping logic to the normalize function. pybabel does it in a similar way, you can have a look at their implementation: https://github.com/python-babel/babel/blob/master/babel/messages/pofile.py#L464
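
A minimal sketch of the idea, under stated assumptions: it is neither babel's code nor the code in this PR. `escape` and `wrap_chunks` are hypothetical names, the escaping is deliberately simplified, and the width check ignores the surrounding quotes. The text is split into words first and each word is escaped as a unit, so no escape sequence is cut in half and all whitespace is kept.

```python
import re


def escape(s):
    """Simplified PO-style escaping (assumption: only these four cases)."""
    return (s.replace('\\', '\\\\')
             .replace('"', '\\"')
             .replace('\n', '\\n')
             .replace('\t', '\\t'))


def wrap_chunks(s, width):
    """Return escaped lines that fit in width wherever a word break allows it."""
    lines = []
    for line in s.splitlines(keepends=True):
        buf, size = [], 0
        # Each chunk is a word plus its trailing whitespace,
        # or a leading run of whitespace.
        for chunk in re.findall(r'\S+\s*|\s+', line):
            esc = escape(chunk)
            if buf and size + len(esc) > width:
                lines.append(''.join(buf))
                buf, size = [], 0
            buf.append(esc)
            size += len(esc)
        if buf:
            lines.append(''.join(buf))
    return lines


print(wrap_chunks('foo bar baz\n', 8))
```

A single word longer than the width is left on one line rather than being cut.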

Use a modified version of pybabel's code in normalize

92f227f

StanFromIreland requested review from serhiy-storchaka and picnixz

March 1, 2025 09:51

Contributor Author

StanFromIreland commented Mar 1, 2025

Implemented pybabel's method.

Minor tweak

f0ee9c4

picnixz reviewed

Member

picnixz left a comment (edited)

I'm not really the best person to review this, but I can review the implementation. Please don't just apply my suggestions as is; decide which ones are best.

Tools/i18n/pygettext.py (outdated, resolved)

StanFromIreland and others added 2 commits

March 1, 2025 10:17

Update argparse snapshot

843e3fa

Bénédikt's suggestions

7fc34ca

StanFromIreland requested a review from picnixz

March 1, 2025 10:19

StanFromIreland marked this pull request as ready for review

March 1, 2025 10:20

StanFromIreland requested a review from savannahostrowski as a code owner

March 1, 2025 10:20

bedevere-appbot added the awaiting review label

picnixz reviewed

Tools/i18n/pygettext.py (outdated, resolved)

StanFromIreland and others added 2 commits

March 1, 2025 11:03

Preserve spaces and remove unnecessary checks

8d319b4

Improve comment

9197688

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>

picnixz reviewed

Member

picnixz left a comment

I'm not sure that the regex approach is correct. It would gobble up consecutive spaces, right?
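
The concern can be checked with a short sketch. It assumes the splitting is done with `re.split`; the exact pattern used in the PR at this point is not shown in this thread, so both patterns below are illustrative only.

```python
import re

text = 'a  b   c'

# Splitting on whitespace without a capture group throws the separators away,
# so runs of several spaces cannot be restored afterwards.
lossy = re.split(r'\s+', text)

# Wrapping the pattern in a group keeps each whitespace run as its own item.
lossless = re.split(r'(\s+)', text)

print(lossy)
print(lossless)
print(''.join(lossless) == text)
```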

Tools/i18n/pygettext.py (outdated, resolved)

Add test and sort imports

7c8637e

StanFromIreland requested a review from picnixz

March 1, 2025 11:19

More of Benedikt's suggestions

4b02678

StanFromIreland requested a review from picnixz

March 2, 2025 09:59

Member

tomasr8 commented Mar 2, 2025

I really recommend creating a dummy file with some gettext calls and comparing the output of pygettext, xgettext and babel. There are some differences that should be considered. Here are two I noticed:

The header is not wrapped but both xgettext and babel do wrap it.
This file:

_('foos')

run with --width=3 produces this output:

msgid ""
""
"foos"
msgstr ""

while xgettext and babel give me this (i.e. they don't insert two extra "" when the line does not get wrapped):

msgid "foos"
msgstr ""
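
The rule described here can be sketched as follows. `format_msgid` is a hypothetical helper, not the function in pygettext; it takes lines that are already escaped and wrapped.

```python
def format_msgid(lines):
    """Format already-escaped, already-wrapped lines as a msgid entry."""
    if len(lines) <= 1:
        # Nothing was wrapped: keep the string on the msgid line itself.
        return 'msgid "' + ''.join(lines) + '"'
    # Multi-line form: an empty first string, then one quoted string per line.
    return 'msgid ""\n' + '\n'.join(f'"{line}"' for line in lines)


print(format_msgid(['foos']))
print(format_msgid(['foo ', 'bar']))
```

With this rule a single word that exceeds the width stays in the one-line form, which matches the xgettext and babel output quoted above.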

Don't wrap for single words

8d03cbf

Contributor Author

StanFromIreland commented Mar 2, 2025 (edited)

As for the header, this will conflict with my implementation of --omit-header, could that get merged first (or vice versa)?

Contributor Author

StanFromIreland commented Mar 2, 2025 (edited)

The test failure is unrelated.

Wrapping the header will require a separate function, like so:

Subject: [PATCH] Wrap header
---
Index: Tools/i18n/pygettext.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/Tools/i18n/pygettext.py b/Tools/i18n/pygettext.py
--- a/Tools/i18n/pygettext.py	(revision 8d03cbf141068c4ac9812a967a4c9f5942e22d75)
+++ b/Tools/i18n/pygettext.py	(date 1740913002600)
@@ -589,12 +589,36 @@
     def _is_string_const(self, node):
         return isinstance(node, ast.Constant) and isinstance(node.value, str)
 
+
+def _wrap_header(s, options):
+    lines = []
+    for line in s.splitlines():
+        if len(line) > options.width and ' ' in line:
+            words = _space_splitter(line)
+            words.reverse()
+            buf = []
+            size = 0
+            while words:
+                word = words.pop()
+                if size + len(word) <= options.width:
+                    buf.append(word)
+                    size += len(word)
+                else:
+                    lines.append(''.join(buf))
+                    buf = [word]
+                    size = len(word)
+            lines.append(''.join(buf))
+        else:
+            lines.append(line)
+    return "\n".join(lines) + "\n"
+
+
 def write_pot_file(messages, options, fp):
     timestamp = time.strftime('%Y-%m-%d %H:%M%z')
     encoding = fp.encoding if fp.encoding else 'UTF-8'
-    print(pot_header % {'time': timestamp, 'version': __version__,
+    print(_wrap_header(pot_header % {'time': timestamp, 'version': __version__,
                         'charset': encoding,
-                        'encoding': '8bit'}, file=fp)
+                        'encoding': '8bit'}, options), file=fp)
 
     # Sort locations within each message by filename and lineno
     sorted_keys = [
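
A self-contained variant of the loop in the patch, for experimenting outside pygettext. Assumptions: `_space_splitter` is taken to be a regex split that keeps whitespace runs (its definition is not shown in this thread), and `options` is stood in for by a `SimpleNamespace`. As far as I can tell, the loop in the patch appends an empty line when the first word alone exceeds the width; this sketch only flushes a non-empty buffer to avoid that.

```python
import re
from types import SimpleNamespace

# Assumption: splits on whitespace runs and keeps them as separate items.
_space_splitter = re.compile(r'(\s+)').split


def wrap_header(s, options):
    """Wrap each over-long line of s at whitespace, keeping every character."""
    lines = []
    for line in s.splitlines():
        if len(line) > options.width and ' ' in line:
            buf, size = [], 0
            for word in _space_splitter(line):
                # Flush only when there is something buffered, so an
                # over-long first word never yields an empty line.
                if buf and size + len(word) > options.width:
                    lines.append(''.join(buf))
                    buf, size = [], 0
                buf.append(word)
                size += len(word)
            lines.append(''.join(buf))
        else:
            lines.append(line)
    return '\n'.join(lines) + '\n'


options = SimpleNamespace(width=10)  # stand-in for the parsed CLI options
wrapped = wrap_header('# SOME DESCRIPTIVE TITLE.\n', options)
print(wrapped)
```

Continuation lines produced this way carry no `#` prefix or quoting, so a real header wrapper would still need to account for the comment and string syntax of the header lines.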

serhiy-storchaka reviewed

Mar 2, 2025

Tools/i18n/pygettext.py (outdated, resolved)

Address Serhiy's suggestions

fbe5b93

StanFromIreland requested a review from serhiy-storchaka

March 2, 2025 15:02

Use more complex pattern

8d5f84f

serhiy-storchaka reviewed

Mar 2, 2025

Tools/i18n/pygettext.py (outdated, resolved)

Tools/i18n/pygettext.py (resolved)

Tools/i18n/pygettext.py (outdated, resolved)

picnixz removed their request for review

March 2, 2025 17:18

Serhiy's suggestions

ae53774

StanFromIreland requested a review from serhiy-storchaka

March 2, 2025 17:22

serhiy-storchaka reviewed

Mar 3, 2025