Extract and build a translation dictionary for terminologies across different po files #1105
Merged
Changes from all commits

5 commits:
- c63f0d6 Initial plan (Copilot)
- 516c417 Implement terminology extraction tools and generate translation dicti… (Copilot)
- f722995 Remove extraction scripts, keep CSV dictionaries as requested (Copilot)
- 2400cb0 Regenerate CSV files with proper Python terminology and consolidation… (Copilot)
- 217fb79 Apply translation improvements from @mattwang44 feedback (Copilot)
7 changes: 6 additions & 1 deletion in .scripts/README.md
79 changes: 79 additions & 0 deletions in TERMINOLOGY_DICTIONARY.md
@@ -0,0 +1,79 @@
# Python Documentation Translation Dictionary

This document describes the terminology dictionaries used to keep translations consistent across the Python documentation project.

## Overview

The translation dictionary project provides curated key terms and their translations so that translators use the same terminology across documents. The dictionaries are curated with the help of a large language model (LLM), which is used to identify and categorize important Python terminology.

## Generated Files

### terminology_dictionary.csv

The complete terminology dictionary, containing important terms identified from the Python documentation. Columns:

- **source_term**: The original English term
- **translated_term**: The corresponding Traditional Chinese translation
- **frequency**: Number of occurrences across all files
- **files_count**: Number of different files containing this term
- **source_file**: Example file where this term was found
- **directory**: Directory of the source file
- **example_files**: List of up to 5 files containing this term

Total entries: ~196 essential Python terms
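
As a rough illustration of the column layout above, the following sketch loads such a file with Python's standard `csv` module. The sample rows, their counts, and the `; ` separator inside `example_files` are assumptions made for the example, not values taken from the real CSV; `load_terms` is a hypothetical helper name.

```python
import csv
import io

# Column names as documented above for terminology_dictionary.csv.
COLUMNS = [
    "source_term", "translated_term", "frequency", "files_count",
    "source_file", "directory", "example_files",
]

# Hypothetical sample data; the real file's rows and numbers will differ.
SAMPLE_CSV = (
    "source_term,translated_term,frequency,files_count,"
    "source_file,directory,example_files\n"
    "module,模組,120,45,library/os.po,library,library/os.po; library/sys.po\n"
    "iterator,疊代器,80,30,glossary.po,.,glossary.po\n"
)


def load_terms(stream):
    """Read dictionary rows from a text stream, keyed by source_term."""
    reader = csv.DictReader(stream)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    terms = {}
    for row in reader:
        # The two count columns are stored as text in CSV; convert them.
        row["frequency"] = int(row["frequency"])
        row["files_count"] = int(row["files_count"])
        terms[row["source_term"]] = row
    return terms


terms = load_terms(io.StringIO(SAMPLE_CSV))
print(terms["module"]["translated_term"])
```

For a file on disk, the same function could be called on `open("terminology_dictionary.csv", encoding="utf-8", newline="")`.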

### focused_terminology_dictionary.csv

A curated subset of ~118 terms focusing on the most important Python terminology. Includes additional columns:

- **priority**: High/Medium priority classification
- **category**: Term classification

#### Categories

- **Core Concepts** (7 terms): class, function, method, module, package, object, type
- **Built-in Types** (9 terms): int, str, list, dict, tuple, set, float, bool, complex
- **Keywords/Constants** (25 terms): None, True, False, return, import, def, async, await, and other Python keywords
- **Exceptions** (29 terms): Common `*Error` and `*Exception` classes
- **Code Elements** (14 terms): Magic methods like `__init__`, `__str__`, etc.
- **Common Terms** (34 terms): Important technical concepts like decorator, generator, iterator
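
One way to use the `priority` and `category` columns is to group terms by category and optionally keep only one priority level. This is a minimal sketch: the sample rows show only a subset of the columns, the translations are illustrative, and it assumes the `category` values match the category names listed above. `group_by_category` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows with only the columns this example needs.
SAMPLE_FOCUSED_CSV = (
    "source_term,translated_term,priority,category\n"
    "class,類別,High,Core Concepts\n"
    "function,函式,High,Core Concepts\n"
    "decorator,裝飾器,Medium,Common Terms\n"
)


def group_by_category(stream, priority=None):
    """Map each category to its source terms, optionally filtered by priority."""
    groups = defaultdict(list)
    for row in csv.DictReader(stream):
        if priority is None or row["priority"] == priority:
            groups[row["category"]].append(row["source_term"])
    return dict(groups)


all_groups = group_by_category(io.StringIO(SAMPLE_FOCUSED_CSV))
high_only = group_by_category(io.StringIO(SAMPLE_FOCUSED_CSV), priority="High")
print(high_only)
```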

## Maintenance

The terminology dictionaries are maintained using LLM knowledge to identify important Python terms and their translations. The dictionaries can be updated as needed to reflect new terminology or improved translations.

## Integration with Translation Workflow

### For New Translators

1. Start with `focused_terminology_dictionary.csv`
2. Learn standard translations for core Python concepts
3. Reference high-frequency terms for consistency

### For Translation Review

1. Check new translations against the dictionary
2. Verify consistent terminology usage
3. Update dictionary when establishing new standard translations
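
The first review step, checking a translation against the dictionary, could be approximated as below. This is a simplified sketch, not tooling from this pull request: `find_inconsistencies` and the glossary entries are hypothetical, matching is a plain whole-word search that also accepts a trailing "s", and a real check would need to handle inflected forms and terms inside markup.

```python
import re

# Hypothetical glossary; in practice it would be loaded from the CSV.
GLOSSARY = {"module": "模組", "iterator": "疊代器"}


def find_inconsistencies(msgid, msgstr, glossary):
    """Return (source, expected) pairs whose expected translation is absent."""
    problems = []
    for source, expected in glossary.items():
        # Whole-word, case-insensitive match, allowing a simple plural "s".
        pattern = rf"\b{re.escape(source)}s?\b"
        if re.search(pattern, msgid, re.IGNORECASE) and expected not in msgstr:
            problems.append((source, expected))
    return problems


flagged = find_inconsistencies("Import the module.", "引入該模塊。", GLOSSARY)
clean = find_inconsistencies("Import the module.", "匯入該模組。", GLOSSARY)
print(flagged, clean)
```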

### For Project Management

1. Track translation progress for key technical terms
2. Identify terminology needing standardization
3. Prioritize translation efforts using frequency data
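
Prioritizing by frequency can be as simple as sorting the rows. The sketch below assumes rows whose `frequency` and `files_count` values are already integers; the numbers are made up for illustration and `top_terms` is a hypothetical helper.

```python
# Hypothetical rows; the counts are invented for the example.
ROWS = [
    {"source_term": "iterator", "frequency": 80, "files_count": 30},
    {"source_term": "module", "frequency": 120, "files_count": 45},
    {"source_term": "decorator", "frequency": 80, "files_count": 12},
]


def top_terms(rows, n):
    """Return the n most frequent terms, breaking ties by files_count."""
    ranked = sorted(
        rows,
        key=lambda row: (row["frequency"], row["files_count"]),
        reverse=True,
    )
    return [row["source_term"] for row in ranked[:n]]


ranking = top_terms(ROWS, 3)
print(ranking)
```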

### Output Format

The CSV files use UTF-8 encoding so that Chinese characters are handled properly. They can be opened in Excel, Google Sheets, and other spreadsheet applications.
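
The encoding point can be illustrated with a small round trip. The sketch writes the same hypothetical row once as plain UTF-8 and once with the `utf-8-sig` codec, which prepends a byte order mark; some Excel versions are reported to detect UTF-8 more reliably when that mark is present, which is an assumption about the reader's tooling rather than something this project specifies. Reading with `utf-8-sig` accepts both variants.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["source_term", "translated_term"]
# Hypothetical row used only to exercise non-ASCII text.
ROWS = [{"source_term": "generator", "translated_term": "產生器"}]


def write_csv(path, rows, encoding="utf-8"):
    """Write rows with a header line using the given text encoding."""
    with open(path, "w", encoding=encoding, newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def read_csv(path):
    """Read rows back; utf-8-sig skips a byte order mark if one is present."""
    with open(path, encoding="utf-8-sig", newline="") as handle:
        return [dict(row) for row in csv.DictReader(handle)]


with tempfile.TemporaryDirectory() as tmp:
    plain_path = Path(tmp) / "plain.csv"
    bom_path = Path(tmp) / "with_bom.csv"
    write_csv(plain_path, ROWS)
    write_csv(bom_path, ROWS, encoding="utf-8-sig")
    plain_rows = read_csv(plain_path)
    bom_rows = read_csv(bom_path)
    bom_prefix = bom_path.read_bytes()[:3]
```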

## Maintenance

### Adding New Terms

New terms can be identified and added based on:

- Frequency of appearance in documentation
- Importance to Python concepts
- Consistency needs across translation files

### Manual Curation Process

The dictionaries are maintained through careful analysis of:

- Core Python terminology in official documentation
- Existing translation patterns in .po files
- Category-based organization for translator efficiency

### Quality Assurance

- Regular review of term translations for consistency
- Cross-reference with official Python terminology
- Validation against established translation conventions
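
A consistency review like the one listed above could start from a check for source terms that appear with more than one translation. This is a minimal sketch under the assumption that rows carry the `source_term` and `translated_term` columns; the conflicting sample entries are invented and `conflicting_translations` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows, including one deliberately inconsistent pair.
ROWS = [
    {"source_term": "iterator", "translated_term": "疊代器"},
    {"source_term": "iterator", "translated_term": "迭代器"},
    {"source_term": "module", "translated_term": "模組"},
]


def conflicting_translations(rows):
    """Return source terms that map to more than one distinct translation."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["source_term"].strip()].add(row["translated_term"].strip())
    return {term: found for term, found in seen.items() if len(found) > 1}


conflicts = conflicting_translations(ROWS)
print(conflicts)
```

Terms reported here are candidates for manual review; the check cannot tell which translation is the established one.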

This document describes how to maintain and use the translation dictionaries so that Python documentation translations stay consistent.