feat: add scripts that help insert Google-translated content into .po file #378
Merged
7 commits
e9e54ee feat(script): add script for generating po with google-translted content (mattwang44)
796e24b deps(script): adopt poetry (mattwang44)
e6bf0a8 feat(script): add shell script for running the google translation helper (mattwang44)
ee50c42 feat(script): move from_cn.sh to .scripts (mattwang44)
f9ca7db fix(script): add poetry lock cmd to script and rm -q option when install (mattwang44)
9637921 Merge branch '3.11' into googletrans-utils (josix)
ab33603 Merge branch '3.11' into googletrans-utils (josix)
.scripts/README.md (20 additions, 0 deletions)
@@ -0,0 +1,20 @@
# Scripts

Useful scripts for the translation.

## From Google Translation

Translate all untranslated entries of the given .po file with Google Translate.

```sh
.scripts/google_translate.sh library/csv.po
```

## From zh_CN Translation

If a specific doc has been translated into Simplified Chinese (zh_CN) and you'd like to adopt it as a base, you can run the command:

```sh
.scripts/from_cn.sh library/csv.po
```
.scripts/from_cn.sh (44 additions, 0 deletions)
@@ -0,0 +1,44 @@
#!/bin/bash
# bash is required: the script uses [[ ]], source, and read -p

cd .scripts
source utils/install_poetry.sh

# check if OpenCC is installed
if [[ ! -x "`which opencc 2>/dev/null`" ]]
then
    echo "You do not have OpenCC installed. Please install it first."
    echo "Instruction: https://github.com/BYVoid/OpenCC/wiki/Download"
    exit 1
fi

# clone pydoc zh_CN repo and pull from remote
CN_REPO=.python-docs-zh-cn
if [[ ! -d $CN_REPO ]]
then
    read -p "You do not have a clone of zh_CN repo. Clone now? (y/N)" choice
    case "$choice" in
        y|Y ) git clone --depth 1 --no-single-branch https://github.com/python/python-docs-zh-cn $CN_REPO ;;
        n|N|* ) echo "Aborted"; exit 1 ;;
    esac
fi
git -C $CN_REPO checkout 3.10 # the current latest version of CN repo
git -C $CN_REPO pull

# convert zh_CN po content and merge into zh_TW po
TARGET=$1
CN_PATH=$CN_REPO/$TARGET
TW_PATH=../$TARGET

poetry lock
poetry install
poetry run bash -c "
    opencc -i $CN_PATH -c s2twp.json -o /tmp/tmp.po
    pofilter --nonotes --excludefilter unchanged --excludefilter untranslated /tmp/tmp.po | msgattrib --set-fuzzy -o /tmp/tmp.po
    pomerge -t $CN_PATH -i /tmp/tmp.po -o /tmp/tmp.po
    pofilter --nonotes --excludefilter untranslated $TW_PATH /tmp/tmp2.po
    pomerge -t /tmp/tmp.po -i /tmp/tmp2.po -o /tmp/tmp3.po
    msgcat --lang zh_TW /tmp/tmp3.po -o $TW_PATH
"

rm /tmp/tmp.po /tmp/tmp2.po /tmp/tmp3.po
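
For comparison, the OpenCC availability check at the top of from_cn.sh could be expressed in Python with the standard library's `shutil.which`. This is only a sketch: the helper name `missing_tools` and the list of tools passed to it are assumptions for illustration, not part of this PR.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found as executables on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical tool list: the external commands from_cn.sh calls directly.
    missing = missing_tools(["opencc", "git", "msgattrib", "msgcat"])
    if missing:
        print("Please install first: " + ", ".join(missing))
```

Checking every external command up front, rather than only OpenCC, would let a script like this fail before it clones or modifies anything.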
.scripts/google_translate.sh (17 additions, 0 deletions)
@@ -0,0 +1,17 @@
#!/bin/bash
# bash is required: the script uses source

WORK_DIR=.scripts
cd $WORK_DIR

source utils/install_poetry.sh

TEMP=tmp.po
TARGET=../$1

poetry lock
poetry install
poetry run bash -c "
    python google_translate/main.py $TARGET > $TEMP
    pomerge -t $TARGET -i $TEMP -o $TARGET
"

rm $TEMP
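
The script above writes its intermediate output to a fixed `tmp.po` and removes it at the end. One possible alternative, sketched here in Python under the assumption that the helper would be driven from Python rather than shell, is to let `tempfile.TemporaryDirectory` own the scratch file so that cleanup happens even when a step raises. The name `with_temp_po` is hypothetical and not part of this PR.

```python
import tempfile
from pathlib import Path
from typing import Callable


def with_temp_po(work: Callable[[Path], object]) -> Path:
    """Call work(temp_path) with a scratch .po path that is deleted afterwards.

    The path is returned only so callers can confirm it no longer exists.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        temp_po = Path(tmp_dir) / "tmp.po"
        work(temp_po)
    # The directory and everything in it are removed on leaving the with block.
    return temp_po
```

A caller would pass a function that writes the translated entries to the given path and then merges them into the target file.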
.scripts/google_translate/main.py (51 additions, 0 deletions)
@@ -0,0 +1,51 @@
import argparse
import logging
from pathlib import Path
from typing import List

import polib
from googletrans import Translator

from utils import refine_translations


def _get_po_paths(path: Path) -> List[Path]:
    """Find all .po files in given path"""
    if not path.exists():
        logging.error(f"The path '{path.absolute()}' does not exist!")

    # return 1-element list if it's a file
    if path.is_file():
        return [path.resolve()]

    # find all .po files
    po_paths = [p.resolve() for p in path.glob("**/*.po")]
    return po_paths


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "path",
        help="the path of a PO file or a directory containing PO files"
    )
    args = parser.parse_args()

    translator = Translator()
    po_files = _get_po_paths(Path(args.path).resolve())
    errors = []
    for path in po_files:
        try:
            pofile = polib.pofile(path)
        except OSError:
            errors.append(f"{path} doesn't seem to be a .po file")
            continue

        for entry in pofile.untranslated_entries()[::-1]:
            translation = translator.translate(entry.msgid, src='en', dest='zh-TW')

            print(
                '#, fuzzy\n'
                f'msgid "{repr(entry.msgid)[1:-1]}"\n'
                f'msgstr "{repr(refine_translations(translation.text))[1:-1]}"\n'
            )
.scripts/google_translate/utils.py (56 additions, 0 deletions)
@@ -0,0 +1,56 @@
MAPPING_ZH_TW_COMMON_TRANSLATION_ERROR = {
    '創建': '建立',  # create
    '代碼': '程式碼',  # code
    '信息': '資訊',  # information
    '模塊': '模組',  # module
    '標誌': '旗標',  # flag
    '異常': '例外',  # exception
    '解釋器': '直譯器',  # interpreter
    '頭文件': '標頭檔',  # header
    '對象': '物件',  # object
    '支持': '支援',  # support
    '默認': '預設',  # default
    '兼容': '相容',  # compatible
    '字符串': '字串',  # string
    '宏': '巨集',  # macro
    '描述符': '描述器',  # descriptor
    '字節': '位元組',  # bytes
    '緩存': '快取',  # cache
    '調用': '呼叫',  # call
    '哈希': '雜湊',  # hash
    '類型': '型別',  # type
    '子類': '子類別',  # subclass
    '實現': '實作',  # implement
    '數據': '資料',  # data
    '返回': '回傳',  # return
    '指針': '指標',  # pointer
    '字段': '欄位',  # field
    '擴展': '擴充',  # extension
    '遞歸': '遞迴',  # recursive
    '用戶': '使用者',  # user
    '算法': '演算法',  # algorithm
    '優化': '最佳化',  # optimize
    '字符': '字元',  # character
    '設置': '設定',  # setting/configure
    '線程': '執行緒',  # thread
    '進程': '行程',  # process
    '迭代': '疊代',  # iterate
    '內存': '記憶體',  # memory
    '打印': '印出',  # print
    '異步': '非同步',  # async
    '調試': '除錯',  # debug
    '堆棧': '堆疊',  # stack
    '回調': '回呼',  # callback
    '公共': '公開',  # public
    '函數': '函式',  # function
    '變量': '變數',  # variable
    '常量': '常數',  # constant
    '添加': '新增',  # add
    '基類': '基底類別',  # base class
}


def refine_translations(s: str) -> str:
    for original, target in MAPPING_ZH_TW_COMMON_TRANSLATION_ERROR.items():
        s = s.replace(original, target)
    return s
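
Because `refine_translations` applies the replacements one after another in dictionary order (insertion order, which Python guarantees from 3.7 on), the position of overlapping keys matters: '字符串' has to be handled before its prefix '字符'. The sketch below illustrates this with a three-entry subset of the mapping; `SAMPLE_MAPPING`, `WRONG_ORDER`, and `refine` are names made up for the example.

```python
# Illustrative subset of the mapping above, in the same relative order.
SAMPLE_MAPPING = {
    '字符串': '字串',  # string
    '字符': '字元',  # character
    '函數': '函式',  # function
}

# Same pairs, but the shorter key comes first.
WRONG_ORDER = {
    '字符': '字元',  # character
    '字符串': '字串',  # string
    '函數': '函式',  # function
}


def refine(s: str, mapping: dict) -> str:
    """Apply each replacement in the mapping's insertion order."""
    for original, target in mapping.items():
        s = s.replace(original, target)
    return s


text = '這個函數處理字符串中的字符'
print(refine(text, SAMPLE_MAPPING))  # 這個函式處理字串中的字元
print(refine(text, WRONG_ORDER))     # 這個函式處理字元串中的字元
```

With the shorter key first, '字符串' is rewritten to '字元串' and the intended replacement never matches, so new entries that share a prefix with an existing key are safest placed with the longer key earlier.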