fix(translations): improve translation merger CLI and sync missing UI strings across locales (#5309)

# Description of Changes

This pull request updates the Arabic translation file
(`frontend/public/locales/ar-AR/translation.toml`) with a large number
of new and improved strings, adding support for new features and
enhancing clarity and coverage across the application. Additionally, it
makes several improvements to the TOML language check script
(`.github/scripts/check_language_toml.py`) and updates the corresponding
GitHub Actions workflow to better track and validate translation
changes.

**Translation updates and enhancements:**

* Added translations for new features and UI elements, including
annotation tools, PDF/A-3b conversion, line art compression, background
removal, split modes, onboarding tours, and more.
* Improved and expanded coverage for settings, security, onboarding, and
help menus, including detailed descriptions and tooltips for new and
existing features.
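
The title mentions syncing missing UI strings across locales. The following is a minimal sketch of how such a fill-in could work, loosely modeled on `update_missing_keys` in the diff of this commit. The helper names `flatten` and `fill_missing` and the sample dictionaries are hypothetical and only serve as illustration; the real script reads and writes TOML files instead of in-memory dictionaries.

```python
def flatten(d, parent_key="", sep="."):
    """Flatten nested tables into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    items = {}
    for key, value in d.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.update(flatten(value, full_key, sep))
        else:
            items[full_key] = value
    return items


def fill_missing(reference, current):
    """Keep existing translations; fall back to the reference text for missing keys.

    Hypothetical helper: keys that exist only in `current` are dropped,
    because only the reference keys are iterated.
    """
    ref_flat = flatten(reference)
    cur_flat = flatten(current)
    return {key: cur_flat.get(key, ref_value) for key, ref_value in ref_flat.items()}


# Made-up sample data standing in for the en-GB reference and the ar-AR file.
reference = {"home": {"title": "Home", "desc": "Welcome"}, "save": "Save"}
arabic = {"home": {"title": "الرئيسية"}}
merged = fill_missing(reference, arabic)
```

In this sketch `merged` keeps the existing Arabic `home.title` and takes the English text for `home.desc` and `save`, so untranslated strings remain visible in the locale file until someone translates them.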

**TOML language check script improvements:**

* Increased the maximum allowed TOML file size from 500 KB to 570 KB to
accommodate larger translation files.
* Improved file validation logic: files are now skipped when they live in
the reference locale directory, are not `.toml` files, or do not sit in a
locale folder directly under `locales`, and each skipped file is reported
with a `Skipping file: <path>` print statement.
* Enhanced reporting in the difference check: instead of raising
exceptions for unsafe or oversized files, the script now appends a
warning to the report, marks the run as having differences, and
continues with the next file, improving robustness and clarity in CI
reports.
* Moved the `File Check` report line ahead of the safety checks so that
warnings appear under the file they refer to in the generated report.
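
The validation behavior described in the list above can be sketched roughly as follows. This is a simplified illustration, not the script's actual code: the function name `validate_translation_file`, its return convention, and the `reference_locale` parameter are assumptions made for the example. Only the 570 KB limit, the `locales` base directory, and the `translation.toml` file name come from the diff. `Path.is_relative_to` requires Python 3.9 or newer.

```python
import tempfile
from pathlib import Path

MAX_FILE_SIZE = 570 * 1024  # size limit taken from the diff


def validate_translation_file(file_path, base_dir, reference_locale="en-GB"):
    """Return (status, message), where status is "ok", "skip", or "warn".

    Hypothetical helper: problems are reported as warnings instead of
    raising, so the caller can keep processing the remaining files.
    """
    file_path = Path(file_path)
    label = f"{file_path.parent.name}/{file_path.name}"

    # Reject anything that resolves outside the expected locales directory.
    if not file_path.resolve().is_relative_to(Path(base_dir).resolve()):
        return "warn", f"Unsafe file found: {label}"
    # Reject oversized files before parsing them.
    if file_path.stat().st_size > MAX_FILE_SIZE:
        return "warn", f"The file {label} is too large"
    # The reference locale is never compared against itself.
    if file_path.parent.name == reference_locale:
        return "skip", f"Skipping reference file: {label}"
    if file_path.suffix != ".toml" or file_path.name != "translation.toml":
        return "skip", f"Skipping file: {label}"
    return "ok", label


# Usage with a throwaway directory layout (paths are made up for the demo).
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "frontend" / "public" / "locales"
    good = base / "de-DE" / "translation.toml"
    good.parent.mkdir(parents=True)
    good.write_text('save = "Speichern"\n', encoding="utf-8")
    outside = Path(tmp) / "elsewhere" / "translation.toml"

    results = [validate_translation_file(p, base) for p in (good, outside)]
```

Returning a status instead of raising keeps one bad path from aborting the whole CI report, which is the motivation the description gives for the change.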

**Workflow and CI improvements:**

* Updated the GitHub Actions workflow
(`.github/workflows/check_toml.yml`) to trigger on changes to the
translation script and workflow files, in addition to translation TOMLs,
ensuring all relevant changes are validated.

These changes collectively improve the translation quality and coverage
for Arabic users, enhance the reliability and clarity of the translation
validation process, and ensure smoother CI/CD workflows for localization
updates.

<img width="654" height="133" alt="image"
src="https://github.com/user-attachments/assets/9f3e505d-927f-4dc0-9098-cee70bbe85ca"
/>


---

## Checklist

### General

- [ ] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [ ] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [ ] I have performed a self-review of my own code
- [ ] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.
Ludy
2026-01-14 01:31:05 +01:00
committed by GitHub
parent db049a3467
commit 472ee54098
46 changed files with 16654 additions and 2060 deletions


@@ -11,13 +11,16 @@ adjusting the format.
 Usage:
     python check_language_toml.py --reference-file <path_to_reference_file> --branch <branch_name> [--actor <actor_name>] [--files <list_of_changed_files>]
 """
+# Sample for Windows:
+# python .github/scripts/check_language_toml.py --reference-file frontend/public/locales/en-GB/translation.toml --branch "" --files frontend/public/locales/de-DE/translation.toml frontend/public/locales/fr-FR/translation.toml
+import argparse
 import glob
 import os
-import argparse
 import re
+from pathlib import Path
 import tomllib  # Python 3.11+ (stdlib)
 import tomli_w  # For writing TOML files
@@ -36,7 +39,8 @@ def find_duplicate_keys(file_path, keys=None, prefix=""):
     duplicates = []
     # Load TOML file
-    with open(file_path, "rb") as file:
+    file_path = Path(file_path)
+    with file_path.open("rb") as file:
         data = tomllib.load(file)
     def process_dict(obj, current_prefix=""):
@@ -55,8 +59,8 @@ def find_duplicate_keys(file_path, keys=None, prefix=""):
     return duplicates
-# Maximum size for TOML files (e.g., 500 KB)
-MAX_FILE_SIZE = 500 * 1024
+# Maximum size for TOML files (e.g., 570 KB)
+MAX_FILE_SIZE = 570 * 1024
 def parse_toml_file(file_path):
@@ -65,7 +69,8 @@ def parse_toml_file(file_path):
     :param file_path: Path to the TOML file.
     :return: Dictionary with flattened keys.
     """
-    with open(file_path, "rb") as file:
+    file_path = Path(file_path)
+    with file_path.open("rb") as file:
         data = tomllib.load(file)
     def flatten_dict(d, parent_key="", sep="."):
@@ -108,7 +113,8 @@ def write_toml_file(file_path, updated_properties):
     """
     nested_data = unflatten_dict(updated_properties)
-    with open(file_path, "wb") as file:
+    file_path = Path(file_path)
+    with file_path.open("wb") as file:
         tomli_w.dump(nested_data, file)
@@ -119,18 +125,23 @@ def update_missing_keys(reference_file, file_list, branch=""):
     :param file_list: List of translation files to update.
     :param branch: Branch where the files are located.
     """
+    reference_file = Path(reference_file)
     reference_properties = parse_toml_file(reference_file)
+    branch_path = Path(branch) if branch else Path()
     for file_path in file_list:
-        basename_current_file = os.path.basename(os.path.join(branch, file_path))
+        file_path = Path(file_path)
+        language_dir = file_path.parent.name
+        reference_lang_dir = reference_file.parent.name
         if (
-            basename_current_file == os.path.basename(reference_file)
-            or not file_path.endswith(".toml")
-            or not os.path.dirname(file_path).endswith("locales")
+            language_dir == reference_lang_dir
+            or file_path.suffix != ".toml"
+            or file_path.parents[1].name != "locales"
         ):
+            print(f"Skipping file: {file_path}")
             continue
-        current_properties = parse_toml_file(os.path.join(branch, file_path))
+        current_properties = parse_toml_file(branch_path / file_path)
         updated_properties = {}
         for ref_key, ref_value in reference_properties.items():
@@ -141,7 +152,7 @@ def update_missing_keys(reference_file, file_list, branch=""):
                 # Add missing key with reference value
                 updated_properties[ref_key] = ref_value
-        write_toml_file(os.path.join(branch, file_path), updated_properties)
+        write_toml_file(branch_path / file_path, updated_properties)
 def check_for_missing_keys(reference_file, file_list, branch):
@@ -149,14 +160,17 @@ def check_for_missing_keys(reference_file, file_list, branch):
 def read_toml_keys(file_path):
-    if os.path.isfile(file_path) and os.path.exists(file_path):
+    file_path = Path(file_path)
+    if file_path.is_file():
         return parse_toml_file(file_path)
     return {}
 def check_for_differences(reference_file, file_list, branch, actor):
     reference_branch = branch
-    basename_reference_file = os.path.basename(reference_file)
+    reference_file = Path(reference_file)
+    basename_reference_file = reference_file.name
+    branch_path = Path(branch) if branch else Path()
     report = []
     report.append(f"#### 🔄 Reference Branch: `{reference_branch}`")
@@ -170,39 +184,39 @@ def check_for_differences(reference_file, file_list, branch, actor):
     if len(file_list) == 1:
         file_arr = file_list[0].split()
-    base_dir = os.path.abspath(
-        os.path.join(os.getcwd(), "frontend", "public", "locales")
-    )
+    base_dir = Path.cwd() / "frontend" / "public" / "locales"
     for file_path in file_arr:
-        file_normpath = os.path.normpath(file_path)
-        absolute_path = os.path.abspath(file_normpath)
+        file_path = Path(file_path)
+        file_normpath = file_path
+        absolute_path = file_normpath.resolve()
+        basename_current_file = (branch_path / file_normpath).name
+        locale_dir = file_normpath.parent.name
+        report.append(f"#### 📃 **File Check:** `{locale_dir}/{basename_current_file}`")
         # Verify that file is within the expected directory
-        if not absolute_path.startswith(base_dir):
-            raise ValueError(f"Unsafe file found: {file_normpath}")
+        if not absolute_path.is_relative_to(base_dir):
+            has_differences = True
+            report.append(f"\n⚠️ Unsafe file found: `{locale_dir}/{basename_current_file}`\n\n---\n")
+            continue
         # Verify file size before processing
-        if os.path.getsize(os.path.join(branch, file_normpath)) > MAX_FILE_SIZE:
-            raise ValueError(
-                f"The file {file_normpath} is too large and could pose a security risk."
+        if (branch_path / file_normpath).stat().st_size > MAX_FILE_SIZE:
+            has_differences = True
+            report.append(
+                f"\n⚠️ The file `{locale_dir}/{basename_current_file}` is too large and could pose a security risk.\n\n---\n"
             )
-        basename_current_file = os.path.basename(os.path.join(branch, file_normpath))
-        locale_dir = os.path.basename(os.path.dirname(file_normpath))
+            continue
         if basename_current_file == basename_reference_file and locale_dir == "en-GB":
             continue
-        if (
-            not file_normpath.endswith(".toml")
-            or basename_current_file != "translation.toml"
-        ):
+        if file_normpath.suffix != ".toml" or basename_current_file != "translation.toml":
             continue
         only_reference_file = False
-        report.append(f"#### 📃 **File Check:** `{locale_dir}/{basename_current_file}`")
-        current_keys = read_toml_keys(os.path.join(branch, file_path))
+        current_keys = read_toml_keys(branch_path / file_path)
         reference_key_count = len(reference_keys)
         current_key_count = len(current_keys)
@@ -247,13 +261,13 @@ def check_for_differences(reference_file, file_list, branch, actor):
         else:
             report.append("2. **Test Status:** ✅ **_Passed_**")
-        if find_duplicate_keys(os.path.join(branch, file_normpath)):
+        if find_duplicate_keys(branch_path / file_normpath):
             has_differences = True
             output = "\n".join(
                 [
                     f" - `{key}`: first at {first}, duplicate at `{duplicate}`"
                     for key, first, duplicate in find_duplicate_keys(
-                        os.path.join(branch, file_normpath)
+                        branch_path / file_normpath
                     )
                 ]
             )