Translation Management Scripts
This directory contains Python scripts for managing frontend translations in Stirling PDF. These tools help analyze, merge, and manage translations against the en-GB golden truth file.
Scripts Overview
1. translation_analyzer.py
Analyzes translation files to find missing translations, untranslated entries, and provides completion statistics.
Usage:
# Analyze all languages
python scripts/translations/translation_analyzer.py
# Analyze specific language
python scripts/translations/translation_analyzer.py --language fr-FR
# Show only missing translations
python scripts/translations/translation_analyzer.py --missing-only
# Show only untranslated entries
python scripts/translations/translation_analyzer.py --untranslated-only
# Show summary only
python scripts/translations/translation_analyzer.py --summary
# JSON output format
python scripts/translations/translation_analyzer.py --format json
Features:
- Finds missing translation keys
- Identifies untranslated entries (values identical to en-GB, or carrying an [UNTRANSLATED] marker)
- Shows accurate completion percentages using ignore patterns
- Identifies extra keys not in en-GB
- Supports JSON and text output formats
- Uses scripts/ignore_translation.toml for language-specific exclusions
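The comparison behind these statistics can be sketched roughly as follows. This is an illustration only, not the code of translation_analyzer.py: the function name `analyze`, the flat key-to-text input, and the exact completion formula are assumptions made for the example.

```python
from collections.abc import Collection

UNTRANSLATED_MARKER = "[UNTRANSLATED]"  # marker text as used in this README


def analyze(reference: dict, target: dict, ignored: Collection = ()) -> dict:
    """Compare flat key->text maps for en-GB (reference) and one target language.

    Hypothetical helper: assumes both files were already flattened to
    dot-notation keys and that ignored keys do not count towards completion.
    """
    relevant = [k for k in reference if k not in ignored]
    missing = [k for k in relevant if k not in target]
    untranslated = [
        k for k in relevant
        if k in target
        and (target[k] == reference[k] or target[k].startswith(UNTRANSLATED_MARKER))
    ]
    extra = [k for k in target if k not in reference]
    done = len(relevant) - len(missing) - len(untranslated)
    completion = 100.0 * done / len(relevant) if relevant else 100.0
    return {
        "missing": missing,
        "untranslated": untranslated,
        "extra": extra,
        "completion": completion,
    }
```

Keeping ignored keys out of both the numerator and the denominator is what stops entries such as language.direction from lowering the percentage.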
2. translation_merger.py
Merges missing translations from en-GB into target language files and manages translation workflows.
Usage:
# Add missing translations from en-GB to French
python scripts/translations/translation_merger.py fr-FR add-missing
# Add without marking as [UNTRANSLATED]
python scripts/translations/translation_merger.py fr-FR add-missing --no-mark-untranslated
# Extract untranslated entries to a file
python scripts/translations/translation_merger.py fr-FR extract-untranslated --output fr_untranslated.json
# Create a template for AI translation
python scripts/translations/translation_merger.py fr-FR create-template --output fr_template.json
# Apply translations from a file
python scripts/translations/translation_merger.py fr-FR apply-translations --translations-file fr_translated.json
Features:
- Adds missing keys from en-GB with optional [UNTRANSLATED] markers
- Extracts untranslated entries for external translation
- Creates structured templates for AI translation
- Applies translated content back to language files
- Automatic backup creation
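As a rough illustration of the add-missing step, the sketch below fills gaps in a nested target dictionary from the en-GB dictionary. The function name `add_missing` and the "[UNTRANSLATED] text" prefix layout are assumptions for the example; the real script may store the marker differently.

```python
import copy

UNTRANSLATED_MARKER = "[UNTRANSLATED]"  # marker text as used in this README


def add_missing(reference: dict, target: dict, mark_untranslated: bool = True) -> dict:
    """Return a copy of target with every key missing relative to reference filled in.

    Hypothetical helper: existing translations are never overwritten, and
    the input dictionaries are left unmodified.
    """
    merged = copy.deepcopy(target)
    for key, ref_value in reference.items():
        if isinstance(ref_value, dict):
            existing = merged.get(key)
            # A non-dict value where en-GB has an object is replaced by an object.
            child = existing if isinstance(existing, dict) else {}
            merged[key] = add_missing(ref_value, child, mark_untranslated)
        elif key not in merged:
            merged[key] = (
                f"{UNTRANSLATED_MARKER} {ref_value}" if mark_untranslated else ref_value
            )
    return merged
```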
3. ai_translation_helper.py
Specialized tool for AI-assisted translation workflows with batch processing and validation.
Usage:
# Create batch file for AI translation (multiple languages)
python scripts/translations/ai_translation_helper.py create-batch --languages fr-FR de-DE es-ES --output batch.json --max-entries 50
# Validate AI translations
python scripts/translations/ai_translation_helper.py validate batch.json
# Apply validated AI translations
python scripts/translations/ai_translation_helper.py apply-batch batch.json
# Export for external translation services
python scripts/translations/ai_translation_helper.py export --languages fr-FR de-DE --format csv
Features:
- Creates batch files for AI translation of multiple languages
- Prioritizes important translation keys
- Validates translations for placeholders and artifacts
- Applies batch translations with validation
- Exports to CSV/JSON for external translation services
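A CSV export for external translators could look something like the sketch below. The column layout (key, en-GB text, then one column per language) and the function name `export_csv` are assumptions for illustration; the file produced by ai_translation_helper.py may be laid out differently.

```python
import csv
import io


def export_csv(reference: dict, targets: dict) -> str:
    """Build CSV text with one row per key.

    Hypothetical layout: reference maps flat keys to en-GB text, and targets
    maps a language code to that language's flat key->text map. Missing
    translations are exported as empty cells.
    """
    languages = sorted(targets)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["key", "en-GB", *languages])
    for key, english in reference.items():
        writer.writerow(
            [key, english, *(targets[lang].get(key, "") for lang in languages)]
        )
    return buffer.getvalue()
```

Using the csv module rather than joining strings by hand keeps values that contain commas or quotes intact.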
4. compact_translator.py
Extracts untranslated entries in minimal JSON format for character-limited AI services.
Usage:
# Extract all untranslated entries
python scripts/translations/compact_translator.py it-IT --output to_translate.json
Features:
- Produces minimal JSON output with no extra whitespace
- Automatic ignore patterns for cleaner output
- Batch size control for manageable chunks
- 50-80% fewer characters than other extraction methods
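The whitespace saving comes from how the JSON is serialized. A minimal sketch using only the standard library, where the function name `to_compact_json` is made up for the example:

```python
import json


def to_compact_json(entries: dict) -> str:
    """Serialize without spaces after separators and without escaping non-ASCII text."""
    return json.dumps(entries, ensure_ascii=False, separators=(",", ":"))


entries = {"key1": "English text", "key2": "Another text"}
print(to_compact_json(entries))
# {"key1":"English text","key2":"Another text"}
```

How much this saves in practice depends on the file; the example only shows the mechanism.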
5. json_beautifier.py
Restructures and beautifies translation JSON files to match en-GB structure exactly.
Usage:
# Restructure single language to match en-GB structure
python scripts/translations/json_beautifier.py --language de-DE
# Restructure all languages
python scripts/translations/json_beautifier.py --all-languages
# Validate structure without modifying files
python scripts/translations/json_beautifier.py --language de-DE --validate-only
# Skip backup creation
python scripts/translations/json_beautifier.py --language de-DE --no-backup
Features:
- Restructures JSON to match en-GB nested structure exactly
- Preserves key ordering for line-by-line comparison
- Creates automatic backups before modification
- Validates structure and key ordering
- Handles flattened dot-notation keys (e.g., "key.subkey") properly
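One way to picture the restructuring: walk en-GB in order and pull each value from the target file, whether it is stored nested or under a flattened dotted key. The sketch below is an assumption about the approach, not the code of json_beautifier.py; `lookup` and `restructure` are hypothetical names, and keys that exist only in the target are simply dropped here.

```python
def lookup(node, path):
    """Find a value by nested path, accepting flattened "a.b.c" keys at any level."""
    if not path:
        return node
    if not isinstance(node, dict):
        return None
    # Try the longest dotted prefix first: "a.b.c", then "a.b", then "a".
    for cut in range(len(path), 0, -1):
        key = ".".join(path[:cut])
        if key in node:
            found = lookup(node[key], path[cut:])
            if found is not None:
                return found
    return None


def restructure(reference: dict, target: dict, path=()) -> dict:
    """Rebuild target so nesting and key order follow reference (en-GB)."""
    result = {}
    for key, ref_value in reference.items():
        here = [*path, key]
        if isinstance(ref_value, dict):
            child = restructure(ref_value, target, here)
            if child:
                result[key] = child
        else:
            value = lookup(target, here)
            if isinstance(value, str):
                result[key] = value
    return result
```

Because Python dictionaries preserve insertion order, iterating over the reference is enough to give the output the same key order as en-GB.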
Translation Workflows
Method 1: Compact Translation Workflow (RECOMMENDED for AI)
Best for character-limited AI services like Claude or ChatGPT
Step 1: Check Current Status
python scripts/translations/translation_analyzer.py --language it-IT --summary
Step 2: Extract Untranslated Entries
python scripts/translations/compact_translator.py it-IT --output to_translate.json
Output format: Compact JSON with minimal whitespace
{"key1":"English text","key2":"Another text","key3":"More text"}
Step 3: AI Translation
- Copy the compact JSON output
- Give it to your AI with instructions:
Translate this JSON to Italian. Keep the same structure, translate only the values. Preserve placeholders like {n}, {total}, {filename}, {{variable}}.
- Save the AI's response as translated.json
Step 4: Apply Translations
python scripts/translations/translation_merger.py it-IT apply-translations --translations-file translated.json
Step 5: Verify Results
python scripts/translations/translation_analyzer.py --language it-IT --summary
Method 2: Batch Translation Workflow
For complete language translation from scratch or major updates
Step 1: Analyze Current State
python scripts/translations/translation_analyzer.py --language de-DE --summary
Step 2: Create Translation Batches
# Create batches of 100 entries each for systematic translation
python scripts/translations/ai_translation_helper.py create-batch --languages de-DE --output de_batch_1.json --max-entries 100
Step 3: Translate Batch with AI
Edit the batch file and fill in ALL translated fields:
- Preserve all placeholders like {n}, {total}, {filename}, {{toolName}}
- Keep technical terms consistent
- Maintain JSON structure exactly
- Consider context provided for each entry
Step 4: Apply Translations
# Skip validation if using legitimate placeholders ({{variable}})
python scripts/translations/ai_translation_helper.py apply-batch de_batch_1.json --skip-validation
Step 5: Check Progress and Continue
python scripts/translations/translation_analyzer.py --language de-DE --summary
Repeat steps 2-5 until 100% complete.
Method 3: Quick Translation Workflow (Legacy)
For small updates or existing translations
Step 1: Add Missing Translations
python scripts/translations/translation_merger.py fr-FR add-missing --mark-untranslated
Step 2: Create AI Template
python scripts/translations/translation_merger.py fr-FR create-template --output fr_template.json
Step 3: Apply Translations
python scripts/translations/translation_merger.py fr-FR apply-translations --translations-file fr_translated.json
Translation File Structure
Translation files are located in frontend/public/locales/{language}/translation.json with nested JSON structure:
{
"addPageNumbers": {
"title": "Add Page Numbers",
"selectText": {
"1": "Select PDF file:",
"2": "Margin Size"
}
}
}
Keys use dot notation internally (e.g., addPageNumbers.selectText.1).
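The mapping from the nested file to dot-notation keys can be sketched with a short recursive helper. The name `flatten` is hypothetical, and the scripts may implement this differently.

```python
def flatten(nested: dict, prefix: str = "") -> dict:
    """Turn {"a": {"b": "x"}} into {"a.b": "x"}, keeping the original key order."""
    flat = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


example = {
    "addPageNumbers": {
        "title": "Add Page Numbers",
        "selectText": {"1": "Select PDF file:", "2": "Margin Size"},
    }
}
print(list(flatten(example)))
# ['addPageNumbers.title', 'addPageNumbers.selectText.1', 'addPageNumbers.selectText.2']
```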
Key Features
Placeholder Preservation
All scripts preserve placeholders like {n}, {total}, {filename} in translations:
"customNumberDesc": "Defaults to {n}, also accepts 'Page {n} of {total}'"
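A placeholder check of this kind can be approximated by comparing the placeholders found in the English text with those in the translation. The regular expression and the function name `placeholder_diff` below are assumptions for illustration; the validator in the scripts may recognise a different set of patterns.

```python
import re
from collections import Counter

# Matches {{variable}} first, then single-brace placeholders such as {n}.
PLACEHOLDER = re.compile(r"\{\{\s*\w+\s*\}\}|\{\w+\}")


def placeholder_diff(original: str, translated: str):
    """Return (missing, extra) placeholders of the translation relative to the original."""
    before = Counter(PLACEHOLDER.findall(original))
    after = Counter(PLACEHOLDER.findall(translated))
    missing = sorted((before - after).elements())
    extra = sorted((after - before).elements())
    return missing, extra


english = "Defaults to {n}, also accepts 'Page {n} of {total}'"
german = "Standard ist {n}, akzeptiert auch 'Seite {n} von {gesamt}'"
print(placeholder_diff(english, german))
# (['{total}'], ['{gesamt}'])
```

Counting occurrences rather than comparing sets also catches a translation that drops one of two {n} placeholders.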
Automatic Backups
Scripts create timestamped backups before modifying files:
translation.backup.20241201_143022.json
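A backup name in this format can be produced with strftime. The sketch below assumes the backup is written next to the original file; the function names `backup_path` and `create_backup` are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_path(path: Path, now: Optional[datetime] = None) -> Path:
    """Return e.g. translation.backup.20241201_143022.json beside the original file."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return path.with_name(f"{path.stem}.backup.{stamp}{path.suffix}")


def create_backup(path: Path) -> Path:
    """Copy the file to its timestamped backup name and return the new path."""
    destination = backup_path(path)
    shutil.copy2(path, destination)  # copy2 also preserves file metadata
    return destination
```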
Context-Aware Translation
Scripts provide context information to help with accurate translations:
{
"addPageNumbers.title": {
"original": "Add Page Numbers",
"context": "Feature for adding page numbers to PDFs"
}
}
Priority-Based Translation
Important keys (title, submit, error messages) are prioritized when limiting translation batch sizes.
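Prioritisation when a batch is capped could work along these lines. The ranking rule below (match on a key segment named title, submit, or error) is an assumption made for the example, as are the names `priority` and `select_batch`; the actual priority list lives in the scripts.

```python
# Hypothetical priority order: lower index means translated earlier.
PRIORITY_SEGMENTS = ("title", "submit", "error")


def priority(key: str) -> int:
    """Rank a dot-notation key by the first priority segment it contains."""
    segments = key.lower().split(".")
    for rank, name in enumerate(PRIORITY_SEGMENTS):
        if name in segments:
            return rank
    return len(PRIORITY_SEGMENTS)


def select_batch(keys: list, max_entries: int) -> list:
    """Pick the highest-priority keys; sorted() is stable, so ties keep file order."""
    return sorted(keys, key=priority)[:max_entries]


keys = [
    "addPageNumbers.selectText.1",
    "addPageNumbers.submit",
    "addPageNumbers.title",
    "error.pdfPassword",
]
print(select_batch(keys, 3))
# ['addPageNumbers.title', 'addPageNumbers.submit', 'error.pdfPassword']
```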
Ignore Patterns System
The scripts/ignore_translation.toml file defines keys that should be ignored for each language, improving completion accuracy.
Common ignore patterns:
- language.direction: Text direction (ltr/rtl) - universal
- lang.*: Language code entries not relevant to specific locales
- pipeline.title, home.devApi.title: Technical terms kept in English
- Specific technical IDs, version numbers, and system identifiers
Format:
[de_DE]
ignore = [
'language.direction',
'pipeline.title',
'lang.afr',
'lang.ceb',
# ... more patterns
]
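Reading a section in this format and applying it to a key might look like the sketch below. It assumes Python 3.11+ for tomllib (with the third-party tomli package as a fallback) and treats entries as shell-style wildcards so that a pattern like lang.* would match; whether the real scripts support wildcards is an assumption, and the function names `load_ignore_patterns` and `is_ignored` are hypothetical.

```python
import fnmatch

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API


def load_ignore_patterns(toml_text: str, language: str) -> list:
    """Return the ignore list for a language code such as de-DE (section [de_DE])."""
    section = language.replace("-", "_")
    return tomllib.loads(toml_text).get(section, {}).get("ignore", [])


def is_ignored(key: str, patterns: list) -> bool:
    """True if the dot-notation key matches any pattern, case-sensitively."""
    return any(fnmatch.fnmatchcase(key, pattern) for pattern in patterns)


config = """
[de_DE]
ignore = [
    'language.direction',
    'pipeline.title',
    'lang.*',
]
"""
patterns = load_ignore_patterns(config, "de-DE")
print(is_ignored("lang.afr", patterns), is_ignored("addPageNumbers.title", patterns))
# True False
```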
Best Practices & Lessons Learned
Critical Rules for Translation
- NEVER skip entries: Translate ALL entries in each batch to avoid [UNTRANSLATED] pollution
- Use appropriate batch sizes: 100 entries for systematic translation, unlimited for compact method
- Skip validation for placeholders: Use --skip-validation when batch contains {{variable}} patterns
- Check progress between batches: Use --summary flag to track completion percentage
- Preserve all placeholders: Keep {n}, {total}, {filename}, {{toolName}} exactly as-is
Workflow Comparison
| Method | Best For | Character Usage | Complexity | Speed |
|---|---|---|---|---|
| Compact | AI services | Minimal (50-80% less) | Simple | Fastest |
| Batch | Systematic translation | Moderate | Medium | Medium |
| Quick | Small updates | High | Low | Slow |
Common Issues and Solutions
[UNTRANSLATED] Pollution
Problem: Hundreds of [UNTRANSLATED] markers from incomplete translation attempts
Solution:
- Only translate complete batches of manageable size
- Use the analyzer, which counts [UNTRANSLATED] entries as missing translations
- Restore from backup if pollution occurs
Validation False Positives
Problem: Validator flags legitimate {{variable}} placeholders as artifacts
Solution: Use --skip-validation flag when applying batches with template variables
JSON Structure Mismatches
Problem: Flattened dot-notation keys instead of proper nested objects
Solution: Use json_beautifier.py to restructure files to match en-GB exactly
Real-World Examples
Complete Italian Translation (Compact Method)
# Check status
python scripts/translations/translation_analyzer.py --language it-IT --summary
# Result: 46.8% complete, 1147 missing
# Extract all entries for translation
python scripts/translations/compact_translator.py it-IT --output batch1.json
# [Translate batch1.json with AI, save as batch1_translated.json]
# Apply translations
python scripts/translations/translation_merger.py it-IT apply-translations --translations-file batch1_translated.json
# Result: Applied 1147 translations
# Check progress
python scripts/translations/translation_analyzer.py --language it-IT --summary
# Result: 100% complete, 0 missing
German Translation (Batch Method)
Starting from 46.3% completion, reaching 60.3% with batch method:
# Initial analysis
python scripts/translations/translation_analyzer.py --language de-DE --summary
# Result: 46.3% complete, 1142 missing entries
# Batch 1 (100 entries)
python scripts/translations/ai_translation_helper.py create-batch --languages de-DE --output de_batch_1.json --max-entries 100
# [Translate all 100 entries in batch file]
python scripts/translations/ai_translation_helper.py apply-batch de_batch_1.json --skip-validation
# Progress: 46.6% → 51.2%
# Continue with more batches until 100% complete
Error Handling
- Missing Files: Scripts create new files when language directories don't exist
- Invalid JSON: Clear error messages with line numbers
- Placeholder Mismatches: Validation warnings for missing or extra placeholders
- [UNTRANSLATED] Entries: Counted as missing translations to prevent pollution
- Backup Failures: Graceful handling with user notification
Integration with Development
These scripts integrate with the existing translation system:
- Works with the current frontend/public/locales/ structure
- Compatible with the i18n system used in the React frontend
- Respects the JSON format expected by the translation loader
- Maintains the nested structure required by the UI components
Language-Specific Notes
German Translation Notes
- Technical terms: Use the established German form, which for common abbreviations is unchanged (PDF → PDF, API → API)
- UI actions: "hochladen" (upload), "herunterladen" (download), "speichern" (save)
- Error messages: Consistent pattern "Ein Fehler ist beim [action] aufgetreten"
- Formal address: Use "Sie" form for user-facing text
Italian Translation Notes
- Keep technical terms in English when commonly used (PDF, API, URL)
- Use formal address ("Lei" form) for user-facing text
- Error messages: "Si è verificato un errore durante [action]"
- UI actions: "carica" (upload), "scarica" (download), "salva" (save)
Common Use Cases
- Complete Language Translation: Use Compact Workflow for fastest AI-assisted translation
- New Language Addition: Start with compact workflow for comprehensive coverage
- Updating Existing Language: Use analyzer to find gaps, then compact or batch method
- Quality Assurance: Use analyzer with --summary for completion metrics and issue detection
- External Translation Services: Use export functionality to generate CSV files for translators
- Structure Maintenance: Use json_beautifier to keep files aligned with en-GB structure