OCR fix and Mobile QR changes (#5433)

# Description of Changes
## OCR / Tesseract path handling

Makes tessDataPath resolution deterministic with priority: config >
TESSDATA_PREFIX env > default.
Updates language discovery to use runtimePathConfig.getTessDataPath()
instead of raw config value.
Ensure default OCR dir is debian based not alpine

## Mobile scanner: feature gating + new conversion settings
Adds system.mobileScannerSettings (convert-to-PDF + resolution + page
format + stretch) exposed via backend config and configurable in the
proprietary admin UI.
Enforces enableMobileScanner on the MobileScannerController endpoints
(403 when disabled).
Frontend mobile upload flow can now optionally convert received images
to PDF (pdf-lib + canvas).

## Desktop/Tauri connectivity work
Expands tauri-plugin-http permissions and enables dangerous-settings.
Adds a very comprehensive multi-stage server connection diagnostic
routine (with lots of logging).


<img width="688" height="475" alt="image"
src="https://github.com/user-attachments/assets/6f9c1aec-58c7-449b-96b0-52f25430d741"
/>


---

## Checklist

### General

- [ ] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [ ] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [ ] I have performed a self-review of my own code
- [ ] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.
This commit is contained in:
Anthony Stirling
2026-01-12 11:18:37 +00:00
committed by GitHub
parent 0ae108ca11
commit d2677e64dd
20 changed files with 1478 additions and 133 deletions

View File

@@ -14,17 +14,20 @@ echo "==================================="
setup_ocr() {
echo "Setting up OCR languages..."
# Copy tessdata
mkdir -p /usr/share/tessdata
cp -rn /usr/share/tessdata-original/* /usr/share/tessdata 2>/dev/null || true
# In Alpine, tesseract uses /usr/share/tessdata
TESSDATA_DIR="/usr/share/tessdata"
if [ -d /usr/share/tesseract-ocr/4.00/tessdata ]; then
cp -r /usr/share/tesseract-ocr/4.00/tessdata/* /usr/share/tessdata 2>/dev/null || true
# Create tessdata directory
mkdir -p "$TESSDATA_DIR"
# Restore system languages from backup (Dockerfile moved them to tessdata-original)
if [ -d /usr/share/tessdata-original ]; then
echo "Restoring system tessdata from backup..."
cp -rn /usr/share/tessdata-original/* "$TESSDATA_DIR"/ 2>/dev/null || true
fi
if [ -d /usr/share/tesseract-ocr/5/tessdata ]; then
cp -r /usr/share/tesseract-ocr/5/tessdata/* /usr/share/tessdata 2>/dev/null || true
fi
# Note: If user mounted custom languages to /usr/share/tessdata, they'll be overlaid here.
# The cp -rn above won't overwrite user files, just adds missing system files.
# Install additional languages if specified
if [[ -n "$TESSERACT_LANGS" ]]; then
@@ -32,10 +35,15 @@ setup_ocr() {
pattern='^[a-zA-Z]{2,4}(_[a-zA-Z]{2,4})?$'
for LANG in $SPACE_SEPARATED_LANGS; do
if [[ $LANG =~ $pattern ]]; then
echo "Installing tesseract language: $LANG"
apk add --no-cache "tesseract-ocr-data-$LANG" 2>/dev/null || true
fi
done
fi
# Point to the consolidated location
export TESSDATA_PREFIX="$TESSDATA_DIR"
echo "Using TESSDATA_PREFIX=$TESSDATA_PREFIX"
}
# Function to setup user permissions (from init-without-ocr.sh)