mirror of
https://github.com/Frooodle/Stirling-PDF.git
synced 2026-02-17 13:52:14 +01:00
V1 merge (#5193)
# Description of Changes

<!--
Please provide a summary of the changes, including:

- What was changed
- Why the change was made
- Any challenges encountered

Closes #(issue_number)
-->

---

## Checklist

### General

- [ ] I have read the [Contribution Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [ ] I have read the [Stirling-PDF Developer Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md) (if applicable)
- [ ] I have read the [How to add new languages to Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md) (if applicable)
- [ ] I have performed a self-review of my own code
- [ ] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/) (if functionality has heavily changed)
- [ ] I have read the section [Add New Translation Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags) (for new translation tags only)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached (e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing) for more details.
---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
Signed-off-by: stirlingbot[bot] <stirlingbot[bot]@users.noreply.github.com>
Co-authored-by: ConnorYoh <40631091+ConnorYoh@users.noreply.github.com>
Co-authored-by: Connor Yoh <connor@stirlingpdf.com>
Co-authored-by: OUNZAR Aymane <aymane.ounzar@imt-atlantique.net>
Co-authored-by: YAOU Reda <yaoureda24@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: stirlingbot[bot] <195170888+stirlingbot[bot]@users.noreply.github.com>
Co-authored-by: Balázs Szücs <127139797+balazs-szucs@users.noreply.github.com>
Co-authored-by: Ludy <Ludy87@users.noreply.github.com>
Co-authored-by: tkymmm <136296842+tkymmm@users.noreply.github.com>
Co-authored-by: Peter Dave Hello <hsu@peterdavehello.org>
Co-authored-by: albanobattistella <34811668+albanobattistella@users.noreply.github.com>
Co-authored-by: PingLin8888 <88387490+PingLin8888@users.noreply.github.com>
Co-authored-by: FdaSilvaYY <FdaSilvaYY@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: OteJlo <106060728+OteJlo@users.noreply.github.com>
Co-authored-by: Angel <41905618+TheShadowAngel@users.noreply.github.com>
Co-authored-by: Ricardo Catarino <ricardomicc@gmail.com>
Co-authored-by: Luis Antonio Argüelles González <luis.arguelles@encora.com>
Co-authored-by: Dawid Urbański <31166488+urbaned121@users.noreply.github.com>
Co-authored-by: Stephan Paternotte <Stephan-P@users.noreply.github.com>
Co-authored-by: Leonardo Santos Paulucio <leonardo.paulucio@hotmail.com>
Co-authored-by: hamza khalem <72972114+hamzakhalem@users.noreply.github.com>
Co-authored-by: IT Creativity + Art Team <admin@it-playground.net>
Co-authored-by: Reece Browne <74901996+reecebrowne@users.noreply.github.com>
Co-authored-by: James Brunton <jbrunton96@gmail.com>
Co-authored-by: Victor Villarreal <133383186+vvillarreal-cfee@users.noreply.github.com>
@@ -1,21 +1,56 @@
|
||||
"""A script to update language progress status in README.md based on
|
||||
TOML translation file comparison.
|
||||
"""
|
||||
A script to update language progress status in README.md based on
|
||||
properties file comparison.
|
||||
|
||||
This script compares the default translation TOML file with others in the locales directory to
|
||||
determine language progress.
|
||||
It then updates README.md based on provided progress list.
|
||||
This script compares the default (reference) properties file, usually
|
||||
`messages_en_GB.properties`, with other translation files in the
|
||||
`app/core/src/main/resources/` directory.
|
||||
It determines how many lines are fully translated and automatically updates
|
||||
progress badges in the `README.md`.
|
||||
|
||||
Additionally, it maintains a TOML configuration file
|
||||
(`scripts/ignore_translation.toml`) that defines which keys are ignored
|
||||
during comparison (e.g., static values like `language.direction`).
|
||||
|
||||
Author: Ludy87
|
||||
Updated for TOML format
|
||||
|
||||
Example:
|
||||
To use this script, simply run it from command line:
|
||||
$ python counter_translation_v3.py
|
||||
""" # noqa: D205
|
||||
Usage:
|
||||
Run this script directly from the project root.
|
||||
|
||||
# --- Compare all translation files and update README.md ---
|
||||
$ python scripts/counter_translation.py
|
||||
|
||||
This will:
|
||||
• Compare all files matching messages_*.properties
|
||||
• Update progress badges in README.md
|
||||
• Update/format ignore_translation.toml automatically
|
||||
|
||||
# --- Check a single language file ---
|
||||
$ python scripts/counter_translation.py --lang messages_fr_FR.properties
|
||||
|
||||
This will:
|
||||
• Compare the French translation file against the English reference
|
||||
• Print the translation percentage in the console
|
||||
|
||||
# --- Print ONLY the percentage (for CI pipelines or automation) ---
|
||||
$ python scripts/counter_translation.py --lang messages_fr_FR.properties --show-percentage
|
||||
|
||||
Example output:
|
||||
87
|
||||
|
||||
Arguments:
|
||||
-l, --lang <file> Specific properties file to check
|
||||
(relative or absolute path).
|
||||
--show-percentage Print only the percentage (no formatting, ideal for CI/CD).
|
||||
--show-missing-keys Show the list of missing keys when checking a single language file.
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import glob
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from typing import Iterable
|
||||
|
||||
import tomlkit
|
||||
import tomlkit.toml_file
|
||||
@@ -23,14 +58,15 @@ import tomlkit.toml_file
|
||||
|
||||
def convert_to_multiline(data: tomlkit.TOMLDocument) -> tomlkit.TOMLDocument:
|
||||
"""Converts 'ignore' and 'missing' arrays to multiline arrays and sorts the first-level keys of the TOML document.
|
||||
|
||||
Enhances readability and consistency in the TOML file by ensuring arrays contain unique and sorted entries.
|
||||
|
||||
Parameters:
|
||||
Args:
|
||||
data (tomlkit.TOMLDocument): The original TOML document containing the data.
|
||||
|
||||
Returns:
|
||||
tomlkit.TOMLDocument: A new TOML document with sorted keys and properly formatted arrays.
|
||||
""" # noqa: D205
|
||||
"""
|
||||
sorted_data = tomlkit.document()
|
||||
for key in sorted(data.keys()):
|
||||
value = data[key]
|
||||
@@ -53,16 +89,19 @@ def convert_to_multiline(data: tomlkit.TOMLDocument) -> tomlkit.TOMLDocument:
|
||||
|
||||
|
||||
def write_readme(progress_list: list[tuple[str, int]]) -> None:
|
||||
"""Updates the progress status in the README.md file based
|
||||
on the provided progress list.
|
||||
"""Updates the progress status in the README.md file based on the provided progress list.
|
||||
|
||||
Parameters:
|
||||
This function reads the existing README.md content, identifies lines containing
|
||||
language-specific progress badges, and replaces the percentage values and URLs
|
||||
with the new progress data.
|
||||
|
||||
Args:
|
||||
progress_list (list[tuple[str, int]]): A list of tuples containing
|
||||
language and progress percentage.
|
||||
language codes (e.g., 'fr_FR') and progress percentages (integers from 0 to 100).
|
||||
|
||||
Returns:
|
||||
None
|
||||
""" # noqa: D205
|
||||
"""
|
||||
with open("README.md", encoding="utf-8") as file:
|
||||
content = file.readlines()
|
||||
|
||||
@@ -80,70 +119,111 @@ def write_readme(progress_list: list[tuple[str, int]]) -> None:
|
||||
file.writelines(content)
|
||||
|
||||
|
||||
def parse_toml_file(file_path):
|
||||
"""
|
||||
Parses a TOML translation file and returns a flat dictionary of all keys.
|
||||
:param file_path: Path to the TOML file.
|
||||
:return: Dictionary with flattened keys and values.
|
||||
"""
|
||||
with open(file_path, "r", encoding="utf-8") as file:
|
||||
data = tomlkit.parse(file.read())
|
||||
def load_reference_keys(default_file_path: str) -> set[str]:
|
||||
"""Reads all keys from the reference properties file (excluding comments and empty lines).
|
||||
|
||||
def flatten_dict(d, parent_key="", sep="."):
|
||||
items = {}
|
||||
for k, v in d.items():
|
||||
new_key = f"{parent_key}{sep}{k}" if parent_key else k
|
||||
if isinstance(v, dict):
|
||||
items.update(flatten_dict(v, new_key, sep=sep))
|
||||
else:
|
||||
items[new_key] = v
|
||||
return items
|
||||
This function skips the first 5 lines (assumed to be headers or metadata) and then
|
||||
extracts keys from lines containing '=' separators, ignoring comments (#) and empty lines.
|
||||
It also handles potential BOM (Byte Order Mark) characters.
|
||||
|
||||
return flatten_dict(data)
|
||||
Args:
|
||||
default_file_path (str): The path to the default (reference) properties file.
|
||||
|
||||
Returns:
|
||||
set[str]: A set of unique keys found in the reference file.
|
||||
"""
|
||||
keys: set[str] = set()
|
||||
with open(default_file_path, encoding="utf-8") as f:
|
||||
# Skip the first 5 lines (headers)
|
||||
for _ in range(5):
|
||||
try:
|
||||
next(f)
|
||||
except StopIteration:
|
||||
break
|
||||
|
||||
for line in f:
|
||||
s = line.strip()
|
||||
if not s or s.startswith("#") or "=" not in s:
|
||||
continue
|
||||
k, _ = s.split("=", 1)
|
||||
keys.add(k.strip().replace("\ufeff", "")) # BOM protection
|
||||
return keys
|
||||
|
||||
|
||||
def _lang_from_path(file_path: str) -> str:
|
||||
"""Extracts the language code from a properties file path.
|
||||
|
||||
Assumes the filename format is 'messages_<language>.properties', where <language>
|
||||
is the code like 'fr_FR'.
|
||||
|
||||
Args:
|
||||
file_path (str): The full path to the properties file.
|
||||
|
||||
Returns:
|
||||
str: The extracted language code.
|
||||
"""
|
||||
return (
|
||||
os.path.basename(file_path).split("messages_", 1)[1].split(".properties", 1)[0]
|
||||
)
|
||||
|
||||
|
||||
def compare_files(
|
||||
default_file_path, file_paths, ignore_translation_file
|
||||
default_file_path: str,
|
||||
file_paths: Iterable[str],
|
||||
ignore_translation_file: str,
|
||||
show_missing_keys: bool = False,
|
||||
show_percentage: bool = False,
|
||||
) -> list[tuple[str, int]]:
|
||||
"""Compares the default TOML translation file with other
|
||||
translation files in the locales directory.
|
||||
"""Compares the default properties file with other properties files in the directory.
|
||||
|
||||
Parameters:
|
||||
default_file_path (str): The path to the default translation TOML file.
|
||||
file_paths (list): List of paths to translation TOML files.
|
||||
ignore_translation_file (str): Path to the TOML file with ignore rules.
|
||||
This function calculates translation progress for each language file by comparing
|
||||
keys and values line-by-line, skipping headers. It accounts for ignored keys defined
|
||||
in a TOML configuration file and updates that file with cleaned ignore lists.
|
||||
English variants (en_GB, en_US) are hardcoded to 100% progress.
|
||||
|
||||
Args:
|
||||
default_file_path (str): The path to the default properties file (reference).
|
||||
file_paths (Iterable[str]): Iterable of paths to properties files to compare.
|
||||
ignore_translation_file (str): Path to the TOML file with ignore/missing configurations per language.
|
||||
show_missing_keys (bool, optional): If True, prints the list of missing keys for each file. Defaults to False.
|
||||
show_percentage (bool, optional): If True, suppresses detailed output and focuses on percentage calculation. Defaults to False.
|
||||
|
||||
Returns:
|
||||
list[tuple[str, int]]: A list of tuples containing
|
||||
language and progress percentage.
|
||||
""" # noqa: D205
|
||||
default_keys = parse_toml_file(default_file_path)
|
||||
num_keys = len(default_keys)
|
||||
list[tuple[str, int]]: A sorted list of tuples containing language codes and progress percentages
|
||||
(descending order by percentage). Duplicates are removed.
|
||||
"""
|
||||
# Count total translatable lines in reference (excluding empty and comments)
|
||||
num_lines = sum(
|
||||
1
|
||||
for line in open(default_file_path, encoding="utf-8")
|
||||
if line.strip() and not line.strip().startswith("#")
|
||||
)
|
||||
|
||||
result_list = []
|
||||
ref_keys: set[str] = load_reference_keys(default_file_path)
|
||||
|
||||
result_list: list[tuple[str, int]] = []
|
||||
sort_ignore_translation: tomlkit.TOMLDocument
|
||||
|
||||
# read toml
|
||||
with open(ignore_translation_file, encoding="utf-8") as f:
|
||||
sort_ignore_translation = tomlkit.parse(f.read())
|
||||
# Read or initialize TOML config
|
||||
if os.path.exists(ignore_translation_file):
|
||||
with open(ignore_translation_file, encoding="utf-8") as f:
|
||||
sort_ignore_translation = tomlkit.parse(f.read())
|
||||
else:
|
||||
sort_ignore_translation = tomlkit.document()
|
||||
|
||||
for file_path in file_paths:
|
||||
# Extract language code from directory name
|
||||
locale_dir = os.path.basename(os.path.dirname(file_path))
|
||||
language = _lang_from_path(file_path)
|
||||
|
||||
# Convert locale format from hyphen to underscore for TOML compatibility
|
||||
# e.g., en-GB -> en_GB, sr-LATN-RS -> sr_LATN_RS
|
||||
language = locale_dir.replace("-", "_")
|
||||
|
||||
fails = 0
|
||||
if language in ["en_GB", "en_US"]:
|
||||
result_list.append(("en_GB", 100))
|
||||
result_list.append(("en_US", 100))
|
||||
# Hardcode English variants to 100%
|
||||
if "en_GB" in language or "en_US" in language:
|
||||
result_list.append((language, 100))
|
||||
continue
|
||||
|
||||
# Initialize language table in TOML if missing
|
||||
if language not in sort_ignore_translation:
|
||||
sort_ignore_translation[language] = tomlkit.table()
|
||||
|
||||
# Ensure default ignore list if empty
|
||||
if (
|
||||
"ignore" not in sort_ignore_translation[language]
|
||||
or len(sort_ignore_translation[language].get("ignore", [])) < 1
|
||||
@@ -152,53 +232,182 @@ def compare_files(
|
||||
["language.direction"]
|
||||
)
|
||||
|
||||
current_keys = parse_toml_file(file_path)
|
||||
# Clean up ignore list to only include keys present in reference
|
||||
sort_ignore_translation[language]["ignore"] = [
|
||||
key
|
||||
for key in sort_ignore_translation[language]["ignore"]
|
||||
if key in ref_keys or key == "language.direction"
|
||||
]
|
||||
|
||||
# Compare keys
|
||||
for default_key, default_value in default_keys.items():
|
||||
if default_key not in current_keys:
|
||||
# Key is missing entirely
|
||||
if default_key not in sort_ignore_translation[language]["ignore"]:
|
||||
print(f"{language}: Key '{default_key}' is missing.")
|
||||
fails += 1
|
||||
elif (
|
||||
default_value == current_keys[default_key]
|
||||
and default_key not in sort_ignore_translation[language]["ignore"]
|
||||
fails = 0
|
||||
missing_str_keys: list[str] = []
|
||||
with (
|
||||
open(default_file_path, encoding="utf-8") as default_file,
|
||||
open(file_path, encoding="utf-8") as file,
|
||||
):
|
||||
# Skip headers (first 5 lines) in both files
|
||||
for _ in range(5):
|
||||
next(default_file)
|
||||
try:
|
||||
next(file)
|
||||
except StopIteration:
|
||||
fails = num_lines
|
||||
break
|
||||
|
||||
for line_num, (line_default, line_file) in enumerate(
|
||||
zip(default_file, file), start=6
|
||||
):
|
||||
# Key exists but value is untranslated (same as reference)
|
||||
print(f"{language}: Key '{default_key}' is missing the translation.")
|
||||
fails += 1
|
||||
elif default_value != current_keys[default_key]:
|
||||
# Key is translated, remove from ignore list if present
|
||||
if default_key in sort_ignore_translation[language]["ignore"]:
|
||||
sort_ignore_translation[language]["ignore"].remove(default_key)
|
||||
try:
|
||||
# Ignoring empty lines and lines starting with #
|
||||
if line_default.strip() == "" or line_default.startswith("#"):
|
||||
continue
|
||||
|
||||
default_key, default_value = line_default.split("=", 1)
|
||||
file_key, file_value = line_file.split("=", 1)
|
||||
default_key = default_key.strip()
|
||||
default_value = default_value.strip()
|
||||
file_key = file_key.strip()
|
||||
file_value = file_value.strip()
|
||||
|
||||
if (
|
||||
default_value == file_value
|
||||
and default_key
|
||||
not in sort_ignore_translation[language]["ignore"]
|
||||
):
|
||||
# Missing translation (same as default and not ignored)
|
||||
fails += 1
|
||||
missing_str_keys.append(default_key)
|
||||
if default_value != file_value:
|
||||
if default_key in sort_ignore_translation[language]["ignore"]:
|
||||
# Remove from ignore if actually translated
|
||||
sort_ignore_translation[language]["ignore"].remove(
|
||||
default_key
|
||||
)
|
||||
except ValueError as e:
|
||||
print(f"Error processing line {line_num} in {file_path}: {e}")
|
||||
print(f"{line_default}|{line_file}")
|
||||
sys.exit(1)
|
||||
except IndexError:
|
||||
# Handle mismatched line counts
|
||||
fails += 1
|
||||
continue
|
||||
|
||||
if show_missing_keys:
|
||||
if len(missing_str_keys) > 0:
|
||||
print(f" Missing keys: {missing_str_keys}")
|
||||
else:
|
||||
print(" No missing keys!")
|
||||
|
||||
if not show_percentage:
|
||||
print(f"{language}: {fails} out of {num_lines} lines are not translated.")
|
||||
|
||||
print(f"{language}: {fails} out of {num_keys} keys are not translated.")
|
||||
result_list.append(
|
||||
(
|
||||
language,
|
||||
int((num_keys - fails) * 100 / num_keys),
|
||||
int((num_lines - fails) * 100 / num_lines),
|
||||
)
|
||||
)
|
||||
|
||||
# Write cleaned and formatted TOML back
|
||||
ignore_translation = convert_to_multiline(sort_ignore_translation)
|
||||
with open(ignore_translation_file, "w", encoding="utf-8", newline="\n") as file:
|
||||
file.write(tomlkit.dumps(ignore_translation))
|
||||
|
||||
# Remove duplicates and sort by percentage descending
|
||||
unique_data = list(set(result_list))
|
||||
unique_data.sort(key=lambda x: x[1], reverse=True)
|
||||
|
||||
return unique_data
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
directory = os.path.join(os.getcwd(), "frontend", "public", "locales")
|
||||
translation_file_paths = glob.glob(os.path.join(directory, "*", "translation.toml"))
|
||||
reference_file = os.path.join(directory, "en-GB", "translation.toml")
|
||||
def main() -> None:
|
||||
"""Main entry point for the script.
|
||||
|
||||
scripts_directory = os.path.join(os.getcwd(), "scripts")
|
||||
Parses command-line arguments and either processes a single language file
|
||||
(with optional percentage output) or all files and updates the README.md.
|
||||
|
||||
Command-line options:
|
||||
--lang, -l <file>: Specific properties file to check (e.g., 'messages_fr_FR.properties').
|
||||
--show-percentage: Print only the translation percentage for --lang and exit.
|
||||
--show-missing-keys: Show the list of missing keys when checking a single language file.
|
||||
"""
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Compare i18n property files and optionally update README badges."
|
||||
)
|
||||
parser.add_argument(
|
||||
"--lang",
|
||||
"-l",
|
||||
help=(
|
||||
"Specific properties file to check, e.g. 'messages_fr_FR.properties'. "
|
||||
"If a relative filename is given, it is resolved against the resources directory."
|
||||
),
|
||||
)
|
||||
parser.add_argument(
|
||||
"--show-percentage",
|
||||
"-sp",
|
||||
action="store_true",
|
||||
help="Print ONLY the translation percentage for --lang and exit.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--show-missing-keys",
|
||||
"-smk",
|
||||
action="store_true",
|
||||
help="Show the list of missing keys when checking a single language file.",
|
||||
)
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Project layout assumptions
|
||||
cwd = os.getcwd()
|
||||
resources_dir = os.path.join(cwd, "app", "core", "src", "main", "resources")
|
||||
reference_file = os.path.join(resources_dir, "messages_en_GB.properties")
|
||||
scripts_directory = os.path.join(cwd, "scripts")
|
||||
translation_state_file = os.path.join(scripts_directory, "ignore_translation.toml")
|
||||
|
||||
write_readme(
|
||||
compare_files(reference_file, translation_file_paths, translation_state_file)
|
||||
if args.lang:
|
||||
# Resolve provided path
|
||||
lang_input = args.lang
|
||||
if os.path.isabs(lang_input) or os.path.exists(lang_input):
|
||||
lang_file = lang_input
|
||||
else:
|
||||
lang_file = os.path.join(resources_dir, lang_input)
|
||||
|
||||
if not os.path.exists(lang_file):
|
||||
print(f"ERROR: Could not find language file: {lang_file}")
|
||||
sys.exit(2)
|
||||
|
||||
results = compare_files(
|
||||
reference_file,
|
||||
[lang_file],
|
||||
translation_state_file,
|
||||
args.show_missing_keys,
|
||||
args.show_percentage,
|
||||
)
|
||||
# Find the exact tuple for the requested language
|
||||
wanted_key = _lang_from_path(lang_file)
|
||||
for lang, pct in results:
|
||||
if lang == wanted_key:
|
||||
if args.show_percentage:
|
||||
# Print ONLY the number
|
||||
print(pct)
|
||||
return
|
||||
else:
|
||||
print(f"{lang}: {pct}% translated")
|
||||
return
|
||||
|
||||
# Fallback (should not happen)
|
||||
print("ERROR: Language not found in results.")
|
||||
sys.exit(3)
|
||||
|
||||
# Default behavior (no --lang): process all and update README
|
||||
messages_file_paths = glob.glob(
|
||||
os.path.join(resources_dir, "messages_*.properties")
|
||||
)
|
||||
progress = compare_files(
|
||||
reference_file, messages_file_paths, translation_state_file
|
||||
)
|
||||
write_readme(progress)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
|
||||
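The `compare_files` function in the hunk above pairs reference and translation lines with `zip`, so its result depends on both files listing their keys in the same order. As a hedged illustration of the scoring rule it applies (an entry counts as untranslated when its value equals the reference value and its key is not in the ignore list), here is a minimal key-based sketch. It is not the script's implementation: `parse_properties` and `translation_progress` are hypothetical helper names, the inputs are in-memory strings rather than files, the five-line header skip is left out, the percentage is taken over reference keys rather than reference lines, and treating a missing key as untranslated is an assumption.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    entries: dict[str, str] = {}
    for line in text.splitlines():
        stripped = line.strip().replace("\ufeff", "")  # drop a stray BOM
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, value = stripped.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def translation_progress(
    reference: dict[str, str],
    translation: dict[str, str],
    ignored: set[str],
) -> tuple[int, list[str]]:
    """Return (percentage, untranslated keys) for one language.

    A key is untranslated when it is not ignored and its value is either
    missing (assumption) or identical to the reference value.
    """
    untranslated = [
        key
        for key, ref_value in reference.items()
        if key not in ignored and translation.get(key, ref_value) == ref_value
    ]
    total = len(reference)
    if total == 0:
        return 100, []
    return int((total - len(untranslated)) * 100 / total), untranslated


# Hypothetical sample data, not taken from the real message files.
reference = parse_properties(
    "# header\nhome.title=Home\nlanguage.direction=ltr\npro=Pro\nsave=Save\n"
)
french = parse_properties("home.title=Accueil\nlanguage.direction=ltr\npro=Pro\n")
percent, missing = translation_progress(
    reference, french, ignored={"language.direction"}
)
print(percent, missing)
```

Looking keys up by name means a reordered or shorter translation file still gets scored per key, at the cost of no longer matching the line-based count that the script prints.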
@@ -29,6 +29,11 @@ if /I not "%confirm%"=="Y" (
|
||||
|
||||
echo Starting generation...
|
||||
|
||||
echo Generating .github\scripts\requirements_dev.txt
|
||||
pip-compile --allow-unsafe --generate-hashes --upgrade --strip-extras ^
|
||||
--output-file=".github\scripts\requirements_dev.txt" ^
|
||||
".github\scripts\requirements_dev.in"
|
||||
|
||||
echo Generating .github\scripts\requirements_pre_commit.txt
|
||||
pip-compile --generate-hashes --upgrade --strip-extras ^
|
||||
--output-file=".github\scripts\requirements_pre_commit.txt" ^
|
||||
|
||||
@@ -3,6 +3,7 @@ ignore = [
|
||||
'lang.div',
|
||||
'lang.dzo',
|
||||
'lang.que',
|
||||
'language.direction',
|
||||
]
|
||||
|
||||
[az_AZ]
|
||||
@@ -36,10 +37,7 @@ ignore = [
|
||||
|
||||
[bg_BG]
|
||||
ignore = [
|
||||
'lang.div',
|
||||
'lang.dzo',
|
||||
'lang.iku',
|
||||
'lang.que',
|
||||
'language.direction',
|
||||
]
|
||||
|
||||
@@ -50,6 +48,7 @@ ignore = [
|
||||
|
||||
[ca_CA]
|
||||
ignore = [
|
||||
'adminUserSettings.admin',
|
||||
'lang.amh',
|
||||
'lang.ceb',
|
||||
'lang.chr',
|
||||
@@ -190,7 +189,8 @@ ignore = [
|
||||
ignore = [
|
||||
'AddStampRequest.alphabet',
|
||||
'AddStampRequest.position',
|
||||
'PDFToBook.selectText.1',
|
||||
'PDFToText.tags',
|
||||
'addPageNumbers.selectText.3',
|
||||
'adminUserSettings.team',
|
||||
'alphabet',
|
||||
'audit.dashboard.modal.id',
|
||||
@@ -200,6 +200,7 @@ ignore = [
|
||||
'audit.dashboard.table.details',
|
||||
'audit.dashboard.table.id',
|
||||
'certSign.name',
|
||||
'cookieBanner.popUp.acceptAllBtn',
|
||||
'endpointStatistics.top10',
|
||||
'endpointStatistics.top20',
|
||||
'fileChooser.dragAndDrop',
|
||||
@@ -236,11 +237,11 @@ ignore = [
|
||||
'pipelineOptions.pipelineHeader',
|
||||
'pro',
|
||||
'redact.zoom',
|
||||
'scannerEffect.quality.medium',
|
||||
'sponsor',
|
||||
'team.status',
|
||||
'text',
|
||||
'update.version',
|
||||
'validateSignature.cert.bits',
|
||||
'validateSignature.cert.version',
|
||||
'validateSignature.status',
|
||||
'watermark.type.1',
|
||||
@@ -262,11 +263,17 @@ ignore = [
|
||||
|
||||
[es_ES]
|
||||
ignore = [
|
||||
'audit.dashboard.export.csv',
|
||||
'audit.dashboard.export.json',
|
||||
'audit.dashboard.modal.id',
|
||||
'audit.dashboard.table.id',
|
||||
'error',
|
||||
'fileChooser.click',
|
||||
'lang.ceb',
|
||||
'lang.chr',
|
||||
'lang.div',
|
||||
'lang.dzo',
|
||||
'lang.epo',
|
||||
'lang.fil',
|
||||
'lang.guj',
|
||||
'lang.iku',
|
||||
@@ -274,6 +281,7 @@ ignore = [
|
||||
'lang.lao',
|
||||
'lang.mal',
|
||||
'lang.ori',
|
||||
'lang.que',
|
||||
'lang.snd',
|
||||
'lang.tam',
|
||||
'lang.tel',
|
||||
@@ -281,7 +289,12 @@ ignore = [
|
||||
'lang.yor',
|
||||
'language.direction',
|
||||
'no',
|
||||
'pro',
|
||||
'redact.zoom',
|
||||
'scannerEffect.colorspace.color',
|
||||
'showJS.tags',
|
||||
'update.priority.normal',
|
||||
'validateSignature.cert.bits',
|
||||
]
|
||||
|
||||
[eu_ES]
|
||||
@@ -307,50 +320,88 @@ ignore = [
|
||||
]
|
||||
|
||||
[fa_IR]
|
||||
ignore = []
|
||||
ignore = [
|
||||
'language.direction',
|
||||
]
|
||||
|
||||
[fr_FR]
|
||||
ignore = [
|
||||
'AddStampRequest.alphabet',
|
||||
'AddStampRequest.position',
|
||||
'AddStampRequest.rotation',
|
||||
'PDFToBook.selectText.1',
|
||||
'addPageNumbers.selectText.3',
|
||||
'adminUserSettings.actions',
|
||||
'alphabet',
|
||||
'audit.dashboard.modal.id',
|
||||
'audit.dashboard.modal.type',
|
||||
'audit.dashboard.pagination.pageInfo1',
|
||||
'audit.dashboard.table.id',
|
||||
'audit.dashboard.table.type',
|
||||
'compare.document.1',
|
||||
'compare.document.2',
|
||||
'cookieBanner.preferencesModal.analytics.posthog.label',
|
||||
'cookieBanner.preferencesModal.analytics.scarf.label',
|
||||
'cookieBanner.preferencesModal.serviceCounterLabel',
|
||||
'endpointStatistics.top',
|
||||
'endpointStatistics.top10',
|
||||
'endpointStatistics.top20',
|
||||
'home.pipeline.title',
|
||||
'lang.afr',
|
||||
'lang.ben',
|
||||
'lang.bre',
|
||||
'lang.cat',
|
||||
'lang.ceb',
|
||||
'lang.chr',
|
||||
'lang.div',
|
||||
'lang.dzo',
|
||||
'lang.eus',
|
||||
'lang.guj',
|
||||
'lang.hin',
|
||||
'lang.iku',
|
||||
'lang.kan',
|
||||
'lang.kaz',
|
||||
'lang.khm',
|
||||
'lang.lao',
|
||||
'lang.ltz',
|
||||
'lang.lat',
|
||||
'lang.mal',
|
||||
'lang.mar',
|
||||
'lang.mri',
|
||||
'lang.oci',
|
||||
'lang.ori',
|
||||
'lang.osd',
|
||||
'lang.pan',
|
||||
'lang.pus',
|
||||
'lang.que',
|
||||
'lang.san',
|
||||
'lang.snd',
|
||||
'lang.swa',
|
||||
'lang.tel',
|
||||
'lang.tam',
|
||||
'lang.tat',
|
||||
'lang.tgl',
|
||||
'lang.tir',
|
||||
'lang.yid',
|
||||
'lang.yor',
|
||||
'language.direction',
|
||||
'licenses.license',
|
||||
'licenses.module',
|
||||
'licenses.nav',
|
||||
'licenses.version',
|
||||
'multiTool.page',
|
||||
'page',
|
||||
'pages',
|
||||
'pdfOrganiser.mode',
|
||||
'pipeline.title',
|
||||
'pro',
|
||||
'redact.pageRedactionNumbers.title',
|
||||
'redact.zoom',
|
||||
'showJS.tags',
|
||||
'split.desc.3',
|
||||
'split.desc.6',
|
||||
'split.desc.7',
|
||||
'split.desc.8',
|
||||
'update.version',
|
||||
'validateSignature.cert.bits',
|
||||
'validateSignature.cert.version',
|
||||
'validateSignature.date',
|
||||
'validateSignature.signature',
|
||||
'watermark.type.2',
|
||||
]
|
||||
|
||||
@@ -384,7 +435,6 @@ ignore = [
|
||||
|
||||
[hr_HR]
|
||||
ignore = [
|
||||
'PDFToBook.selectText.1',
|
||||
'lang.ceb',
|
||||
'lang.chr',
|
||||
'lang.dzo',
|
||||
@@ -400,13 +450,14 @@ ignore = [
|
||||
|
||||
[hu_HU]
|
||||
ignore = [
|
||||
'audit.dashboard.export.json',
|
||||
'audit.dashboard.modal.id',
|
||||
'audit.dashboard.table.id',
|
||||
'endpointStatistics.top10',
|
||||
'endpointStatistics.top20',
|
||||
'home.pipeline.title',
|
||||
'language.direction',
|
||||
'pipeline.title',
|
||||
'pipelineOptions.pipelineHeader',
|
||||
'pro',
|
||||
'showJS.tags',
|
||||
]
|
||||
@@ -515,11 +566,6 @@ ignore = [
|
||||
'language.direction',
|
||||
]
|
||||
|
||||
[ml_ML]
|
||||
ignore = [
|
||||
'language.direction',
|
||||
]
|
||||
|
||||
[nl_NL]
|
||||
ignore = [
|
||||
'compare.document.1',
|
||||
@@ -556,13 +602,11 @@ ignore = [
|
||||
'lang.urd',
|
||||
'lang.yor',
|
||||
'language.direction',
|
||||
'navbar.allTools',
|
||||
'sponsor',
|
||||
]
|
||||
|
||||
[no_NB]
|
||||
ignore = [
|
||||
'PDFToBook.selectText.1',
|
||||
'adminUserSettings.admin',
|
||||
'info',
|
||||
'lang.afr',
|
||||
@@ -609,12 +653,12 @@ ignore = [
|
||||
'lang.urd',
|
||||
'lang.yor',
|
||||
'language.direction',
|
||||
'oops',
|
||||
'sponsor',
|
||||
]
|
||||
|
||||
[pl_PL]
|
||||
ignore = [
|
||||
'PDFToBook.selectText.1',
|
||||
'lang.afr',
|
||||
'lang.bre',
|
||||
'lang.ceb',
|
||||
@@ -684,6 +728,12 @@ ignore = [
|
||||
|
||||
[pt_PT]
|
||||
ignore = [
|
||||
'audit.dashboard.table.id',
|
||||
'endpointStatistics.endpoint',
|
||||
'endpointStatistics.login',
|
||||
'endpointStatistics.top',
|
||||
'endpointStatistics.top10',
|
||||
'endpointStatistics.top20',
|
||||
'lang.bre',
|
||||
'lang.ceb',
|
||||
'lang.chr',
|
||||
@@ -710,6 +760,8 @@ ignore = [
|
||||
'lang.uzb',
|
||||
'lang.yid',
|
||||
'language.direction',
|
||||
'pro',
|
||||
'update.priority.normal',
|
||||
]
|
||||
|
||||
[ro_RO]
|
||||
@@ -763,6 +815,7 @@ ignore = [
|
||||
[sk_SK]
|
||||
ignore = [
|
||||
'adminUserSettings.admin',
|
||||
'home.multiTool.title',
|
||||
'info',
|
||||
'lang.ceb',
|
||||
'lang.chr',
|
||||
@@ -784,6 +837,7 @@ ignore = [
|
||||
'lang.urd',
|
||||
'lang.uzb',
|
||||
'language.direction',
|
||||
'navbar.sections.security',
|
||||
'text',
|
||||
'watermark.type.1',
|
||||
]
|
||||
@@ -833,6 +887,7 @@ ignore = [
|
||||
'endpointStatistics.top10',
|
||||
'endpointStatistics.top20',
|
||||
'font',
|
||||
'info',
|
||||
'lang.div',
|
||||
'lang.epo',
|
||||
'lang.hin',
|
||||
@@ -890,6 +945,7 @@ ignore = [
|
||||
'lang.tir',
|
||||
'lang.uzb_cyrl',
|
||||
'language.direction',
|
||||
'pipelineOptions.pipelineHeader',
|
||||
'showJS.tags',
|
||||
]
|
||||
|
||||
@@ -963,15 +1019,11 @@ ignore = [
|
||||
'lang.yid',
|
||||
'lang.yor',
|
||||
'language.direction',
|
||||
'pipeline.title',
|
||||
'pipelineOptions.pipelineHeader',
|
||||
'showJS.tags',
|
||||
]
|
||||
|
||||
[zh_BO]
|
||||
ignore = [
|
||||
'language.direction',
|
||||
]
|
||||
|
||||
[zh_CN]
|
||||
ignore = [
|
||||
'language.direction',
|
||||
@@ -980,5 +1032,6 @@ ignore = [
|
||||
[zh_TW]
|
||||
ignore = [
|
||||
'language.direction',
|
||||
'poweredBy',
|
||||
'showJS.tags',
|
||||
]
|
||||
|
||||
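The `ignore_translation.toml` changes above are consistent with what the script's docstrings describe: an empty `ignore` list falls back to `language.direction` (compare the `[fa_IR]` entry), keys that no longer exist in the reference file are dropped, and entries are kept unique and sorted. The sketch below illustrates those three rules on plain dictionaries. It is an assumption-laden illustration, not the script's code: `normalize_ignore_config` is a hypothetical name, the real script works on `tomlkit` documents and also handles a `missing` array, and the sample data is invented.

```python
def normalize_ignore_config(
    config: dict[str, dict[str, list[str]]],
    reference_keys: set[str],
) -> dict[str, dict[str, list[str]]]:
    """Return a copy with sorted languages and cleaned 'ignore' lists."""
    normalized: dict[str, dict[str, list[str]]] = {}
    for language in sorted(config):
        # Fall back to the default entry when the list is missing or empty.
        ignore = config[language].get("ignore") or ["language.direction"]
        # Keep only keys still present in the reference file; the set
        # comprehension also removes duplicates.
        kept = {
            key
            for key in ignore
            if key in reference_keys or key == "language.direction"
        }
        normalized[language] = {"ignore": sorted(kept)}
    return normalized


# Hypothetical sample configuration and reference keys.
config = {
    "fr_FR": {"ignore": ["pro", "language.direction", "pro", "removed.key"]},
    "fa_IR": {"ignore": []},
}
result = normalize_ignore_config(config, reference_keys={"pro", "home.title"})
print(result)
```

Returning a new dictionary instead of mutating the input keeps the sketch easy to test; the script itself updates the parsed TOML document in place and writes it back to disk.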
@@ -1,41 +1,188 @@
|
||||
#!/bin/bash
|
||||
# This script initializes Stirling PDF without OCR features.
|
||||
set -euo pipefail
|
||||
|
||||
export JAVA_TOOL_OPTIONS="${JAVA_BASE_OPTS} ${JAVA_CUSTOM_OPTS}"
|
||||
echo "running with JAVA_TOOL_OPTIONS ${JAVA_BASE_OPTS} ${JAVA_CUSTOM_OPTS}"
|
||||
log() { printf '%s\n' "$*" >&2; }
|
||||
command_exists() { command -v "$1" >/dev/null 2>&1; }
|
||||
|
||||
# Update the user and group IDs as per environment variables
if [ ! -z "$PUID" ] && [ "$PUID" != "$(id -u stirlingpdfuser)" ]; then
  usermod -o -u "$PUID" stirlingpdfuser || true
SU_EXEC_BIN=""
if command_exists su-exec; then
  SU_EXEC_BIN="su-exec"
elif command_exists gosu; then
  SU_EXEC_BIN="gosu"
fi

CURRENT_USER="$(id -un)"
CURRENT_UID="$(id -u)"
SWITCH_USER_WARNING_EMITTED=false

if [ ! -z "$PGID" ] && [ "$PGID" != "$(getent group stirlingpdfgroup | cut -d: -f3)" ]; then
  groupmod -o -g "$PGID" stirlingpdfgroup || true
fi
umask "$UMASK" || true
warn_switch_user_once() {
  if [ "$SWITCH_USER_WARNING_EMITTED" = false ]; then
    log "WARNING: Unable to switch to user ${RUNTIME_USER:-stirlingpdfuser}; running command as ${CURRENT_USER}."
    SWITCH_USER_WARNING_EMITTED=true
  fi
}

if [[ "$INSTALL_BOOK_AND_ADVANCED_HTML_OPS" == "true" && "$FAT_DOCKER" != "true" ]]; then
  echo "issue with calibre in current version, feature currently disabled on Stirling-PDF"
  #apk add --no-cache calibre@testing
run_as_runtime_user() {
  if [ "$CURRENT_USER" = "$RUNTIME_USER" ]; then
    "$@"
  elif [ "$CURRENT_UID" -eq 0 ] && [ -n "$SU_EXEC_BIN" ]; then
    "$SU_EXEC_BIN" "$RUNTIME_USER" "$@"
  else
    warn_switch_user_once
    "$@"
  fi
}

# ---------- VERSION_TAG ----------
# Load VERSION_TAG from file if not provided via environment.
if [ -z "${VERSION_TAG:-}" ] && [ -f /etc/stirling_version ]; then
  VERSION_TAG="$(tr -d '\r\n' < /etc/stirling_version)"
  export VERSION_TAG
fi

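The VERSION_TAG block above only reads the file when the environment has not already supplied a value. A minimal sketch of that behaviour follows. The `load_version_tag` wrapper, the temporary file, and the `1.2.3` value are assumptions made for illustration and are not part of the diff.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical wrapper (not in the diff): load VERSION_TAG from a file
# unless the environment already provides a non-empty value.
load_version_tag() {
  local version_file="$1"
  if [ -z "${VERSION_TAG:-}" ] && [ -f "$version_file" ]; then
    # tr removes CR and LF, so a file written with CRLF line endings
    # still yields a tag without trailing control characters.
    VERSION_TAG="$(tr -d '\r\n' < "$version_file")"
    export VERSION_TAG
  fi
}

# Demo: a temporary file stands in for /etc/stirling_version.
demo_file="$(mktemp)"
printf '1.2.3\r\n' > "$demo_file"   # illustrative value only
unset VERSION_TAG
load_version_tag "$demo_file"
printf 'VERSION_TAG=%s\n' "$VERSION_TAG"
rm -f "$demo_file"
```

Using `${VERSION_TAG:-}` rather than `$VERSION_TAG` matters here because the script runs under `set -u`, where expanding an unset variable would abort.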
# Security jar is now built into the application jar during Docker build
# No need to download it separately
# ---------- JAVA_OPTS ----------
# Configure Java runtime options.
export JAVA_TOOL_OPTIONS="${JAVA_BASE_OPTS:-} ${JAVA_CUSTOM_OPTS:-}"
export JAVA_TOOL_OPTIONS="-Djava.awt.headless=true ${JAVA_TOOL_OPTIONS}"
log "running with JAVA_TOOL_OPTIONS=${JAVA_TOOL_OPTIONS}"
log "Running Stirling PDF with DISABLE_ADDITIONAL_FEATURES=${DISABLE_ADDITIONAL_FEATURES:-} and VERSION_TAG=${VERSION_TAG:-<unset>}"

if [[ -n "$LANGS" ]]; then
  /scripts/installFonts.sh $LANGS
fi
# ---------- UMASK ----------
# Set default permissions mask.
UMASK_VAL="${UMASK:-022}"
umask "$UMASK_VAL" 2>/dev/null || umask 022

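The UMASK lines above replace the older `umask "$UMASK" || true`, which would abort under `set -u` when `UMASK` is unset and would leave the mask unchanged on a bad value. A small sketch of the new fallback behaviour, assuming bash; the `apply_umask` wrapper is hypothetical and exists only to make the two lines callable:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical wrapper around the two UMASK lines from the diff.
apply_umask() {
  local requested="${1:-022}"          # default when no value is supplied
  # An invalid mask makes the first umask call fail; the second call
  # then applies the safe default instead of aborting the script.
  umask "$requested" 2>/dev/null || umask 022
}

apply_umask 027
umask                                  # prints the mask now in effect

apply_umask "not-a-mask"
umask                                  # prints the fallback mask
```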
echo "Setting permissions and ownership for necessary directories..."
# Ensure temp directory exists and has correct permissions
mkdir -p /tmp/stirling-pdf || true
# Attempt to change ownership of directories and files
if chown -R stirlingpdfuser:stirlingpdfgroup $HOME /logs /scripts /usr/share/fonts/opentype/noto /configs /customFiles /pipeline /tmp/stirling-pdf /app.jar; then
  chmod -R 755 /logs /scripts /usr/share/fonts/opentype/noto /configs /customFiles /pipeline /tmp/stirling-pdf /app.jar || true
  # If chown succeeds, execute the command as stirlingpdfuser
  exec su-exec stirlingpdfuser "$@"
# ---------- XDG_RUNTIME_DIR ----------
# Create the runtime directory, respecting UID/GID settings.
RUNTIME_USER="stirlingpdfuser"
if id -u "$RUNTIME_USER" >/dev/null 2>&1; then
  RUID="$(id -u "$RUNTIME_USER")"
  RGRP="$(id -gn "$RUNTIME_USER")"
else
  # If chown fails, execute the command without changing the user context
  echo "[WARN] Chown failed, running as host user"
  exec "$@"
  RUID="$(id -u)"
  RGRP="$(id -gn)"
  RUNTIME_USER="$(id -un)"
fi
CURRENT_USER="$(id -un)"
CURRENT_UID="$(id -u)"

export XDG_RUNTIME_DIR="/tmp/xdg-${RUID}"
mkdir -p "${XDG_RUNTIME_DIR}" || true
if [ "$(id -u)" -eq 0 ]; then
  chown "${RUNTIME_USER}:${RGRP}" "${XDG_RUNTIME_DIR}" 2>/dev/null || true
fi
chmod 700 "${XDG_RUNTIME_DIR}" 2>/dev/null || true
log "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}"

# ---------- Optional ----------
# Disable advanced HTML operations if required.
if [[ "${INSTALL_BOOK_AND_ADVANCED_HTML_OPS:-false}" == "true" && "${FAT_DOCKER:-true}" != "true" ]]; then
  log "issue with calibre in current version, feature currently disabled on Stirling-PDF"
fi

# Download security JAR in non-fat builds.
if [[ "${FAT_DOCKER:-true}" != "true" && -x /scripts/download-security-jar.sh ]]; then
  /scripts/download-security-jar.sh || true
fi

# ---------- UID/GID remap ----------
# Remap user/group IDs to match container runtime settings.
if [ "$(id -u)" -eq 0 ]; then
  if id -u stirlingpdfuser >/dev/null 2>&1; then
    if [ -n "${PUID:-}" ] && [ "$PUID" != "$(id -u stirlingpdfuser)" ]; then
      usermod -o -u "$PUID" stirlingpdfuser || true
      chown stirlingpdfuser:stirlingpdfgroup "${XDG_RUNTIME_DIR}" 2>/dev/null || true
    fi
  fi
  if getent group stirlingpdfgroup >/dev/null 2>&1; then
    if [ -n "${PGID:-}" ] && [ "$PGID" != "$(getent group stirlingpdfgroup | cut -d: -f3)" ]; then
      groupmod -o -g "$PGID" stirlingpdfgroup || true
    fi
  fi
fi

# ---------- Permissions ----------
# Ensure required directories exist and set correct permissions.
log "Setting permissions..."
mkdir -p /tmp/stirling-pdf /logs /configs /customFiles /pipeline || true
CHOWN_PATHS=("$HOME" "/logs" "/scripts" "/configs" "/customFiles" "/pipeline" "/tmp/stirling-pdf" "/app.jar")
[ -d /usr/share/fonts/truetype ] && CHOWN_PATHS+=("/usr/share/fonts/truetype")
CHOWN_OK=true
for p in "${CHOWN_PATHS[@]}"; do
  if [ -e "$p" ]; then
    chown -R "stirlingpdfuser:stirlingpdfgroup" "$p" 2>/dev/null || CHOWN_OK=false
    chmod -R 755 "$p" 2>/dev/null || true
  fi
done

# ---------- Xvfb ----------
# Start a virtual framebuffer for GUI-based LibreOffice interactions.
if command_exists Xvfb; then
  log "Starting Xvfb on :99"
  Xvfb :99 -screen 0 1024x768x24 -ac +extension GLX +render -noreset > /dev/null 2>&1 &
  export DISPLAY=:99
  sleep 1
else
  log "Xvfb not installed; skipping virtual display setup"
fi

# ---------- unoserver ----------
# Start LibreOffice UNO server for document conversions.
UNOSERVER_BIN="$(command -v unoserver || true)"
UNOCONVERT_BIN="$(command -v unoconvert || true)"
UNOSERVER_PID=""

if [ -n "$UNOSERVER_BIN" ] && [ -n "$UNOCONVERT_BIN" ]; then
  LIBREOFFICE_PROFILE="${HOME:-/home/${RUNTIME_USER}}/.libreoffice_uno_${RUID}"
  run_as_runtime_user mkdir -p "$LIBREOFFICE_PROFILE"

  log "Starting unoserver on 127.0.0.1:2003"
  run_as_runtime_user "$UNOSERVER_BIN" \
    --interface 127.0.0.1 \
    --port 2003 \
    --uno-port 2004 \
    &
  UNOSERVER_PID=$!
  log "unoserver PID: $UNOSERVER_PID (Profile: $LIBREOFFICE_PROFILE)"

  # Wait until UNO server is ready.
  log "Waiting for unoserver..."
  for _ in {1..20}; do
    if run_as_runtime_user "$UNOCONVERT_BIN" --version >/dev/null 2>&1; then
      log "unoserver is ready!"
      break
    fi
    sleep 1
  done

  if ! run_as_runtime_user "$UNOCONVERT_BIN" --version >/dev/null 2>&1; then
    log "ERROR: unoserver failed!"
    if [ -n "$UNOSERVER_PID" ]; then
      kill "$UNOSERVER_PID" 2>/dev/null || true
      wait "$UNOSERVER_PID" 2>/dev/null || true
    fi
    exit 1
  fi
else
  log "unoserver/unoconvert not installed; skipping UNO setup"
fi

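The unoserver section polls a probe command up to 20 times, one second apart, and then runs the probe once more to decide whether to fail. The same retry pattern can be sketched as a reusable function. The `wait_until_ready` name, its arguments, and the marker-file probe are assumptions for illustration; the diff itself inlines the loop and probes with `unoconvert --version`.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: run a probe command up to <attempts> times,
# sleeping <delay> seconds after each failure.
# Returns 0 as soon as the probe succeeds, 1 if it never does.
wait_until_ready() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Demo: a background job creates a marker file after one second,
# standing in for a service that takes a moment to come up.
marker_dir="$(mktemp -d)"
( sleep 1; : > "$marker_dir/ready" ) &
if wait_until_ready 5 1 test -e "$marker_dir/ready"; then
  echo "probe succeeded"
else
  echo "probe never succeeded" >&2
fi
wait
rm -rf "$marker_dir"
```

Returning a status from the helper lets the caller decide how to react, so the final re-check used in the diff is not needed in this variant.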
# ---------- Java ----------
# Start Stirling PDF Java application.
log "Starting Stirling PDF"
JAVA_CMD=(
  java
  -Dfile.encoding=UTF-8
  -Djava.io.tmpdir=/tmp/stirling-pdf
  -jar /app.jar
)

if [ "$CURRENT_USER" = "$RUNTIME_USER" ]; then
  exec "${JAVA_CMD[@]}"
elif [ "$CURRENT_UID" -eq 0 ] && [ -n "$SU_EXEC_BIN" ]; then
  exec "$SU_EXEC_BIN" "$RUNTIME_USER" "${JAVA_CMD[@]}"
else
  warn_switch_user_once
  exec "${JAVA_CMD[@]}"
fi

120
scripts/init.sh
@@ -1,36 +1,110 @@
#!/bin/bash
# This script initializes environment variables and paths,
# prepares Tesseract data directories, and then runs the main init script.

# Copy the original tesseract-ocr files to the volume directory without overwriting existing files
echo "Copying original files without overwriting existing files"
mkdir -p /usr/share/tessdata
cp -rn /usr/share/tessdata-original/* /usr/share/tessdata
set -euo pipefail

if [ -d /usr/share/tesseract-ocr/4.00/tessdata ]; then
  cp -r /usr/share/tesseract-ocr/4.00/tessdata/* /usr/share/tessdata || true;
append_env_path() {
  local target="$1" current="$2" separator=":"
  if [ -d "$target" ] && [[ ":${current}:" != *":${target}:"* ]]; then
    if [ -n "$current" ]; then
      printf '%s' "${target}${separator}${current}"
    else
      printf '%s' "${target}"
    fi
  else
    printf '%s' "$current"
  fi
}

python_site_dir() {
  local venv_dir="$1"
  local python_bin="$venv_dir/bin/python"
  if [ -x "$python_bin" ]; then
    local py_tag
    if py_tag="$("$python_bin" -c 'import sys; print(f"python{sys.version_info.major}.{sys.version_info.minor}")' 2>/dev/null)" \
      && [ -n "$py_tag" ] \
      && [ -d "$venv_dir/lib/$py_tag/site-packages" ]; then
      printf '%s' "$venv_dir/lib/$py_tag/site-packages"
    fi
  fi
}

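Despite its name, `append_env_path` places the new directory in front of the existing list, and it leaves the list unchanged when the directory is missing or already present. A short sketch of those three cases follows. The function body is copied from the diff; the temporary directory and the `/usr/bin:/bin` starting value are illustrative assumptions.

```shell
#!/bin/bash
set -euo pipefail

# Copied from the diff: emit "<target>:<current>" when <target> is an
# existing directory not already listed in <current>; otherwise emit
# <current> unchanged.
append_env_path() {
  local target="$1" current="$2" separator=":"
  if [ -d "$target" ] && [[ ":${current}:" != *":${target}:"* ]]; then
    if [ -n "$current" ]; then
      printf '%s' "${target}${separator}${current}"
    else
      printf '%s' "${target}"
    fi
  else
    printf '%s' "$current"
  fi
}

demo_dir="$(mktemp -d)"     # stands in for a directory such as /opt/venv/bin
base="/usr/bin:/bin"        # illustrative starting value

first="$(append_env_path "$demo_dir" "$base")"            # new entry goes first
second="$(append_env_path "$demo_dir" "$first")"          # already listed: unchanged
missing="$(append_env_path "$demo_dir/absent" "$base")"   # not a directory: unchanged

printf '%s\n' "$first" "$second" "$missing"
rmdir "$demo_dir"
```

Wrapping the list in colons before matching (`":${current}:"`) is what prevents a partial match, for example `/opt/venv` matching inside `/opt/venv/bin`.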
# === LD_LIBRARY_PATH ===
# Adjust the library path depending on CPU architecture.
ARCH=$(uname -m)
case "$ARCH" in
  x86_64)
    [ -d /usr/lib/x86_64-linux-gnu ] && export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    ;;
  aarch64)
    [ -d /usr/lib/aarch64-linux-gnu ] && export LD_LIBRARY_PATH="/usr/lib/aarch64-linux-gnu${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    ;;
esac

# Add LibreOffice program directory to library path if available.
if [ -d /usr/lib/libreoffice/program ]; then
  export LD_LIBRARY_PATH="/usr/lib/libreoffice/program${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
fi

# === Python PATH ===
# Add virtual environments to PATH and PYTHONPATH.
for dir in /opt/venv/bin /opt/unoserver-venv/bin; do
  PATH="$(append_env_path "$dir" "$PATH")"
done
export PATH

PYTHON_PATH_ENTRIES=()
for venv in /opt/venv /opt/unoserver-venv; do
  if [ -d "$venv" ]; then
    site_dir="$(python_site_dir "$venv")"
    [ -n "${site_dir:-}" ] && PYTHON_PATH_ENTRIES+=("$site_dir")
  fi
done
if [ ${#PYTHON_PATH_ENTRIES[@]} -gt 0 ]; then
  PYTHONPATH="$(IFS=:; printf '%s' "${PYTHON_PATH_ENTRIES[*]}")${PYTHONPATH:+:$PYTHONPATH}"
  export PYTHONPATH
fi

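Two expansions carry the path handling above: `${VAR:+:$VAR}` adds a colon and the old value only when the variable is non-empty, and `"${ARRAY[*]}"` with `IFS=:` joins array entries with colons. A minimal sketch of both, where the helper names `prepend_path` and `join_by_colon` are hypothetical and the paths are placeholders:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: put <dir> in front of a colon-separated list.
# ${list:+:$list} expands to nothing when the list is empty, so no
# stray leading or trailing colon is produced.
prepend_path() {
  local dir="$1" list="${2:-}"
  printf '%s' "${dir}${list:+:$list}"
}

# Hypothetical helper: join all arguments with ':'.
# IFS is declared local, so the caller's word splitting is unaffected.
join_by_colon() {
  local IFS=:
  printf '%s' "$*"
}

printf '%s\n' "$(prepend_path /usr/lib/libreoffice/program "")"
printf '%s\n' "$(prepend_path /usr/lib/libreoffice/program /usr/lib/x86_64-linux-gnu)"
printf '%s\n' "$(join_by_colon /opt/venv/site /opt/unoserver-venv/site)"
```

The `:+` form is also safe under `set -u`: it does not raise an error when the variable is unset, which is why the diff can reference `LD_LIBRARY_PATH` and `PYTHONPATH` without defaults.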
# # === tessdata ===
# # Prepare Tesseract OCR data directory.
REAL_TESSDATA="/usr/share/tesseract-ocr/5/tessdata"
SEC_TESSDATA="/usr/share/tessdata"

log_warn() {
  echo "[init][warn] $*" >&2
}

if [ -d "$REAL_TESSDATA" ] && [ -w "$REAL_TESSDATA" ]; then
  log_warn "Skipping tessdata adjustments; directory writable: $REAL_TESSDATA"
else
  log_warn "Skipping tessdata adjustments; directory missing or not writable: $REAL_TESSDATA"
fi

if [ -d /usr/share/tesseract-ocr/5/tessdata ]; then
  cp -r /usr/share/tesseract-ocr/5/tessdata/* /usr/share/tessdata || true;
  REAL_TESSDATA="/usr/share/tesseract-ocr/5/tessdata"
  log_warn "Using /usr/share/tesseract-ocr/5/tessdata as TESSDATA_PREFIX"
elif [ -d /usr/share/tessdata ]; then
  REAL_TESSDATA="/usr/share/tessdata"
  log_warn "Using /usr/share/tessdata as TESSDATA_PREFIX"
elif [ -d /tessdata ]; then
  REAL_TESSDATA="/tessdata"
  log_warn "Using /tessdata as TESSDATA_PREFIX"
else
  REAL_TESSDATA=""
  log_warn "No tessdata directory found"
fi

# Check if TESSERACT_LANGS environment variable is set and is not empty
if [[ -n "$TESSERACT_LANGS" ]]; then
  # Convert comma-separated values to a space-separated list
  SPACE_SEPARATED_LANGS=$(echo $TESSERACT_LANGS | tr ',' ' ')
  pattern='^[a-zA-Z]{2,4}(_[a-zA-Z]{2,4})?$'
  # Install each language pack
  for LANG in $SPACE_SEPARATED_LANGS; do
    if [[ $LANG =~ $pattern ]]; then
      apk add --no-cache "tesseract-ocr-data-$LANG"
    else
      echo "Skipping invalid language code"
    fi
  done
if [ -n "$REAL_TESSDATA" ]; then
  export TESSDATA_PREFIX="$REAL_TESSDATA"
fi

# Ensure temp directory exists with correct permissions before running main init
mkdir -p /tmp/stirling-pdf || true
# === Temp dir ===
# Ensure the temporary directory exists and has proper permissions.
mkdir -p /tmp/stirling-pdf
chown -R stirlingpdfuser:stirlingpdfgroup /tmp/stirling-pdf || true
chmod -R 755 /tmp/stirling-pdf || true

/scripts/init-without-ocr.sh "$@"
# === Start application ===
# Run the main init script that handles the full startup logic.
exec /scripts/init-without-ocr.sh
