Stirling-PDF/testing/test.sh
Ludy 886f9b379e
feat(docker-runtime): unified Debian-based images, dynamic path resolution & enhanced UNO/LibreOffice handling (#4880)
# Description of Changes

### What was changed

This PR introduces a major refinement to the Docker runtime, system path
resolution, conversion tooling, and integration logic across the
codebase. Key improvements include:

- Migration of **Dockerfile** and **Dockerfile.fat** to a unified
Debian-based environment.
- Introduction of **RuntimePathConfig** enhancements to dynamically
resolve:
  - `weasyprint`, `unoconvert`, `calibre`, `ocrmypdf`, `soffice`
  - Tesseract `tessdata` paths with Docker-aware defaults.
- Support for **UNO server (unoserver/unoconvert)** as primary document
converter with automatic fallback to `soffice`.
- Isolation of Python environments for WeasyPrint and UNO tooling.
- Updated controllers and services to correctly inject
`RuntimePathConfig`.
- Improved process execution logic in converters and OCR handling.
- Major updates to `init.sh` and `init-without-ocr.sh`:
  - Unified environment initialization
  - Proper UID/GID remapping
  - Safer permissions handling
  - Automatic Tesseract path detection
  - Reliable startup of headless LibreOffice + Xvfb + UNO server
- Full test suite updates:
  - Adaptation to new conversion paths
  - Mocking of UNO and LibreOffice commands
  - More robust Docker test logic
- Updated example docker-compose files referencing GHCR test images.
- Expanded configuration schema for new operations paths.
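
The "primary converter with automatic fallback" behaviour listed above can be pictured roughly as follows. This is a minimal shell sketch, not the project's Java implementation: `convert_with_fallback`, `convert_via_uno` and `convert_via_soffice` are hypothetical names introduced here for illustration, and the sketch assumes that a `unoserver` instance is already running and that PDF is the target format.

```shell
#!/bin/bash
# Sketch only: all function names below are hypothetical and are not part
# of the Stirling-PDF codebase.

# Primary converter: unoconvert talks to an already running unoserver.
convert_via_uno() {
    local input=$1 output=$2
    unoconvert --convert-to pdf "$input" "$output"
}

# Fallback converter: start a one-off headless LibreOffice process.
# soffice names the result after the input file, so move it if needed.
convert_via_soffice() {
    local input=$1 output=$2
    local outdir base produced
    outdir=$(dirname "$output")
    base=$(basename "$input")
    soffice --headless --convert-to pdf --outdir "$outdir" "$input" || return 1
    produced="$outdir/${base%.*}.pdf"
    [ "$produced" = "$output" ] || mv "$produced" "$output"
}

# Try each converter in order and print the name of the first one that
# succeeds. Returns non-zero if every converter fails.
convert_with_fallback() {
    local input=$1 output=$2
    shift 2
    local converter
    for converter in "$@"; do
        if "$converter" "$input" "$output"; then
            echo "$converter"
            return 0
        fi
        echo "Converter $converter failed, trying next..." >&2
    done
    return 1
}
```

A call such as `convert_with_fallback report.docx report.pdf convert_via_uno convert_via_soffice` would try UNO first and only start `soffice` if that attempt fails. Passing the converters as arguments keeps the ordering explicit and makes the helper easy to exercise with stub functions.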

### Why the change was made

These changes address long-standing issues around:

- Inconsistent or missing binary paths between image variants.
- Unreliable document conversions (UNO vs. soffice).
- Lack of uniform runtime initialization across Docker images.
- Repetitive environment setup logic split across multiple scripts.
- Fragile test scenarios tied to Alpine-based images.

Switching to a unified Debian-based runtime significantly improves:

- Compatibility with LibreOffice, Calibre, WebEngine and graphics stack.
- UNO stability for document conversions.
- Deterministic Tesseract behavior.
- Debuggability and reliability of CI/CD Docker-based tests.

The improvements to `RuntimePathConfig` ensure all system binaries are
fully configurable and correctly detected at runtime.
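
As a rough illustration of what "configurable and detected at runtime" can mean, the shell sketch below resolves a binary from an explicit override first, then from a list of candidate locations, and finally from `PATH`. It is not the `RuntimePathConfig` logic itself (which lives in the Java code); `resolve_binary`, `resolve_tessdata_dir`, the `SOFFICE_PATH` variable and the candidate paths in the usage comments are assumptions made for this example.

```shell
#!/bin/bash
# Sketch only: resolve_binary and resolve_tessdata_dir are hypothetical
# helpers, not functions from the Stirling-PDF codebase.

# Resolution order: explicit override, then the first executable candidate
# path, then a PATH lookup by name. Prints the resolved path.
resolve_binary() {
    local override=$1 name=$2
    shift 2
    if [ -n "$override" ] && [ -x "$override" ]; then
        echo "$override"
        return 0
    fi
    local candidate
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    # Falls through to PATH; returns non-zero if nothing is found.
    command -v "$name"
}

# Print the first existing directory among the candidates.
resolve_tessdata_dir() {
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Example usage (variable name and paths are illustrative assumptions):
#   soffice_bin=$(resolve_binary "$SOFFICE_PATH" soffice /usr/bin/soffice)
#   tessdata=$(resolve_tessdata_dir "$TESSDATA_PREFIX" \
#       /usr/share/tesseract-ocr/5/tessdata /usr/share/tessdata)
```

Keeping the override first means a user-supplied setting always wins, while the candidate list and `PATH` lookup provide image-specific defaults.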

---

## Checklist

### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [x] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.
2025-11-24 23:07:54 +00:00

#!/bin/bash
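# Find project root by locating build.gradle
find_root() {
    local dir="$PWD"
    while [[ "$dir" != "/" ]]; do
        if [[ -f "$dir/build.gradle" ]]; then
            echo "$dir"
            return 0
        fi
        dir="$(dirname "$dir")"
    done
    echo "Error: build.gradle not found" >&2
    return 1
}
# find_root runs inside a command substitution (a subshell), so an exit there
# would not stop this script; check the status here instead.
PROJECT_ROOT=$(find_root) || exit 1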
# Function to check application readiness via HTTP instead of Docker's health status
check_health() {
    local container_name=$1 # real container name
    local compose_file=$2
    local timeout=80 # total timeout in seconds
    local interval=3 # poll interval in seconds
    local end=$((SECONDS + timeout))
    local last_code="000"
    echo "Waiting for $container_name to become reachable on http://localhost:8080/ (timeout ${timeout}s)..."
    while [ $SECONDS -lt $end ]; do
        # Optional: check if container is running at all (nice for debugging)
        if ! docker ps --format '{{.Names}}' | grep -Fxq "$container_name"; then
            echo " Container $container_name not running yet (still waiting)..."
        fi
        # Try simple HTTP GET on the root page
        last_code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:8080/") || last_code="000"
        # Treat any 2xx or 3xx as "ready"
        if [ "$last_code" -ge 200 ] && [ "$last_code" -lt 400 ]; then
            echo "$container_name is reachable over HTTP (status $last_code)."
            echo "Printing logs for $container_name:"
            docker logs "$container_name" || true
            return 0
        fi
        echo " Still waiting for HTTP readiness, current status: $last_code"
        sleep "$interval"
    done
    echo "$container_name did not become HTTP-ready within ${timeout}s (last HTTP status: $last_code)."
    # For extra debugging: show Docker health status, but DO NOT depend on it
    local docker_health
    docker_health=$(docker inspect --format='{{if .State.Health}}{{.State.Health.Status}}{{else}}(no healthcheck){{end}}' "$container_name" 2>/dev/null || echo "inspect failed")
    echo "Docker-reported health status for $container_name: $docker_health"
    echo "Printing logs for $container_name:"
    docker logs "$container_name" || true
    return 1
}
# Function to capture file list from a Docker container
capture_file_list() {
    local container_name=$1
    local output_file=$2
    echo "Capturing file list from $container_name..."
    # Get all files in one command, output directly from Docker to avoid path issues
    # Skip proc, sys, dev, and the specified LibreOffice config directory
    # Also skip PDFBox and LibreOffice temporary files
    docker exec "$container_name" sh -c "find / -type f \
        -not -path '*/proc/*' \
        -not -path '*/sys/*' \
        -not -path '*/dev/*' \
        -not -path '/config/*' \
        -not -path '/logs/*' \
        -not -path '*/home/stirlingpdfuser/.config/libreoffice/*' \
        -not -path '*/home/stirlingpdfuser/.pdfbox.cache' \
        -not -path '*/tmp/stirling-pdf/PDFBox*' \
        -not -path '*/tmp/stirling-pdf/hsperfdata_stirlingpdfuser/*' \
        -not -path '*/tmp/hsperfdata_stirlingpdfuser/*' \
        -not -path '*/tmp/stirling-pdf/lu*' \
        -not -path '*/tmp/stirling-pdf/tmp*' \
        2>/dev/null | xargs -I{} sh -c 'stat -c \"%n %s %Y\" \"{}\" 2>/dev/null || true' | sort" > "$output_file"
    # Check if the output file has content
    if [ ! -s "$output_file" ]; then
        echo "WARNING: Failed to capture file list or container returned empty list"
        echo "Trying alternative approach..."
        # Alternative simpler approach - just get paths as a fallback
        docker exec "$container_name" sh -c "find / -type f \
            -not -path '*/proc/*' \
            -not -path '*/sys/*' \
            -not -path '*/dev/*' \
            -not -path '/config/*' \
            -not -path '/logs/*' \
            -not -path '*/home/stirlingpdfuser/.config/libreoffice/*' \
            -not -path '*/home/stirlingpdfuser/.pdfbox.cache' \
            -not -path '*/tmp/PDFBox*' \
            -not -path '*/tmp/hsperfdata_stirlingpdfuser/*' \
            -not -path '*/tmp/stirling-pdf/hsperfdata_stirlingpdfuser/*' \
            -not -path '*/tmp/lu*' \
            -not -path '*/tmp/tmp*' \
            2>/dev/null | sort" > "$output_file"
        if [ ! -s "$output_file" ]; then
            echo "ERROR: All attempts to capture file list failed"
            # Create a dummy entry to prevent diff errors
            echo "NO_FILES_FOUND 0 0" > "$output_file"
        fi
    fi
    echo "File list captured to $output_file"
}
# Function to compare before and after file lists
compare_file_lists() {
    local before_file=$1
    local after_file=$2
    local diff_file=$3
    local container_name=$4 # Added container_name parameter
    echo "Comparing file lists..."
    # Check if files exist and have content
    if [ ! -s "$before_file" ] || [ ! -s "$after_file" ]; then
        echo "WARNING: One or both file lists are empty."
        if [ ! -s "$before_file" ]; then echo "Before file is empty: $before_file"; fi
        if [ ! -s "$after_file" ]; then echo "After file is empty: $after_file"; fi
        # Create empty diff file
        : > "$diff_file"
        # Check if we at least have the after file to look for temp files
        if [ -s "$after_file" ]; then
            echo "Checking for temp files in the after snapshot..."
            grep -i "tmp\|temp" "$after_file" > "${diff_file}.tmp"
            if [ -s "${diff_file}.tmp" ]; then
                echo "WARNING: Temporary files found:"
                cat "${diff_file}.tmp"
                echo "Printing docker logs due to temporary file detection:"
                docker logs "$container_name" # Print logs when temp files are found
                return 1
            else
                echo "No temporary files found in the after snapshot."
            fi
        fi
        return 0
    fi
    # Both files exist and have content, proceed with diff
    diff "$before_file" "$after_file" > "$diff_file"
    if [ -s "$diff_file" ]; then
        echo "Detected changes in files:"
        cat "$diff_file"
        # Extract only added files (lines starting with ">")
        grep "^>" "$diff_file" > "${diff_file}.added" || true
        if [ -s "${diff_file}.added" ]; then
            echo "New files created during test:"
            sed 's/^> //' "${diff_file}.added"
            # Check for tmp files
            grep -i "tmp\|temp" "${diff_file}.added" > "${diff_file}.tmp" || true
            if [ -s "${diff_file}.tmp" ]; then
                echo "WARNING: Temporary files detected:"
                cat "${diff_file}.tmp"
                echo "Printing docker logs due to temporary file detection:"
                docker logs "$container_name" # Print logs when temp files are found
                return 1
            fi
        fi
        # Extract only removed files (lines starting with "<")
        grep "^<" "$diff_file" > "${diff_file}.removed" || true
        if [ -s "${diff_file}.removed" ]; then
            echo "Files removed during test:"
            sed 's/^< //' "${diff_file}.removed"
        fi
    else
        echo "No file changes detected during test."
    fi
    return 0
}
# Get the expected version from Gradle once
get_expected_version() {
    ./gradlew printVersion --quiet | tail -1
}
# Function to verify the application version
verify_app_version() {
    local service_name=$1
    local base_url=$2
    echo "Checking version for $service_name (expecting $EXPECTED_VERSION)..."
    # Try to access the homepage and extract the version
    local response
    response=$(curl -s "$base_url")
    # Extract version from pixel tracking tag
    local actual_version
    actual_version=$(echo "$response" | grep -o 'appVersion=[0-9.]*' | head -1 | sed 's/appVersion=//')
    # If we couldn't find the version in the pixel tag, try other approaches
    if [ -z "$actual_version" ]; then
        # Check for "App Version:" format
        if echo "$response" | grep -q "App Version:"; then
            actual_version=$(echo "$response" | grep -o "App Version: [0-9.]*" | sed 's/App Version: //')
        else
            echo "❌ Version verification failed: Could not find version information"
            return 1
        fi
    fi
    # Check if the extracted version matches expected version
    if [ "$actual_version" = "$EXPECTED_VERSION" ]; then
        echo "✅ Version verification passed: $actual_version"
        return 0
    elif [ "$actual_version" = "0.0.0" ]; then
        echo "❌ Version verification failed: Found placeholder version 0.0.0"
        return 1
    else
        echo "❌ Version verification failed: Found $actual_version, expected $EXPECTED_VERSION"
        return 1
    fi
}
# Function to test a Docker Compose configuration
test_compose() {
    local compose_file=$1
    local test_name=$2
    local status=0
    echo "Testing ${compose_file} configuration..."
    # Start up the Docker Compose service
    docker-compose -f "$compose_file" up -d
    # Wait a moment for containers to appear
    sleep 3
    local container_name
    container_name=$(docker-compose -f "$compose_file" ps --format '{{.Names}}' --filter "status=running" | head -n1)
    if [[ -z "$container_name" ]]; then
        echo "ERROR: No running container found for ${compose_file}"
        docker-compose -f "$compose_file" ps
        return 1
    fi
    echo "Started container: $container_name"
    # Wait for the service to become healthy (HTTP-based)
    if check_health "$container_name" "$compose_file"; then
        echo "${test_name} test passed."
    else
        echo "${test_name} test failed."
        status=1
    fi
    return $status
}
# Keep track of which tests passed and failed
declare -a passed_tests
declare -a failed_tests
run_tests() {
    local test_name=$1
    local compose_file=$2
    if test_compose "$compose_file" "$test_name"; then
        passed_tests+=("$test_name")
    else
        failed_tests+=("$test_name")
    fi
}
# Main testing routine
main() {
    SECONDS=0
    # Abort early if the project root cannot be entered
    cd "$PROJECT_ROOT" || exit 1
    export DOCKER_CLI_EXPERIMENTAL=enabled
    export COMPOSE_DOCKER_CLI_BUILD=0
    # ==================================================================
    # 1. Ultra-Lite (no additional features)
    # ==================================================================
    export DISABLE_ADDITIONAL_FEATURES=true
    if ! ./gradlew clean build; then
        echo "Gradle build failed with security disabled, exiting script."
        exit 1
    fi
    # Get expected version after the build to ensure version.properties is created
    echo "Getting expected version from Gradle..."
    EXPECTED_VERSION=$(get_expected_version)
    echo "Expected version: $EXPECTED_VERSION"
    # Build Ultra-Lite image (GHCR tag, matching docker-compose-latest-ultra-lite.yml)
    docker build --build-arg VERSION_TAG=alpha \
        -t docker.stirlingpdf.com/stirlingtools/stirling-pdf:ultra-lite \
        -f ./Dockerfile.ultra-lite .
    # Test Ultra-Lite configuration
    run_tests "Stirling-PDF-Ultra-Lite" "./exampleYmlFiles/docker-compose-latest-ultra-lite.yml"
    echo "Testing webpage accessibility..."
    cd "testing"
    if ./test_webpages.sh -f webpage_urls.txt -b http://localhost:8080; then
        passed_tests+=("Webpage-Accessibility-lite")
    else
        failed_tests+=("Webpage-Accessibility-lite")
        echo "Webpage accessibility lite tests failed"
    fi
    cd "$PROJECT_ROOT"
    echo "Testing version verification..."
    if verify_app_version "Stirling-PDF-Ultra-Lite" "http://localhost:8080"; then
        passed_tests+=("Stirling-PDF-Ultra-Lite-Version-Check")
        echo "Version verification passed for Stirling-PDF-Ultra-Lite"
    else
        failed_tests+=("Stirling-PDF-Ultra-Lite-Version-Check")
        echo "Version verification failed for Stirling-PDF-Ultra-Lite"
    fi
    docker-compose -f "./exampleYmlFiles/docker-compose-latest-ultra-lite.yml" down -v
    # ==================================================================
    # 2. Full Fat + Security
    # ==================================================================
    export DISABLE_ADDITIONAL_FEATURES=false
    if ! ./gradlew clean build; then
        echo "Gradle build failed with security enabled, exiting script."
        exit 1
    fi
    echo "Getting expected version from Gradle (security enabled)..."
    EXPECTED_VERSION=$(get_expected_version)
    echo "Expected version with security enabled: $EXPECTED_VERSION"
    # Build Fat (Security) image for GHCR tag used in all 'fat' compose files
    docker build --no-cache --pull --build-arg VERSION_TAG=alpha \
        -t docker.stirlingpdf.com/stirlingtools/stirling-pdf:fat \
        -f ./Dockerfile.fat .
    # Test fat + security compose
    run_tests "Stirling-PDF-Security-Fat" "./exampleYmlFiles/docker-compose-latest-fat-security.yml"
    echo "Testing webpage accessibility..."
    cd "testing"
    if ./test_webpages.sh -f webpage_urls_full.txt -b http://localhost:8080; then
        passed_tests+=("Webpage-Accessibility-full")
    else
        failed_tests+=("Webpage-Accessibility-full")
        echo "Webpage accessibility full tests failed"
    fi
    cd "$PROJECT_ROOT"
    echo "Testing version verification..."
    if verify_app_version "Stirling-PDF-Security-Fat" "http://localhost:8080"; then
        passed_tests+=("Stirling-PDF-Security-Fat-Version-Check")
        echo "Version verification passed for Stirling-PDF-Security-Fat"
    else
        failed_tests+=("Stirling-PDF-Security-Fat-Version-Check")
        echo "Version verification failed for Stirling-PDF-Security-Fat"
    fi
    docker-compose -f "./exampleYmlFiles/docker-compose-latest-fat-security.yml" down -v
    # ==================================================================
    # 3. Regression test with login (test_cicd.yml)
    # ==================================================================
    run_tests "Stirling-PDF-Security-Fat-with-login" "./exampleYmlFiles/test_cicd.yml"
    # Only run behave tests if the container started successfully
    if [[ " ${passed_tests[*]} " =~ "Stirling-PDF-Security-Fat-with-login" ]]; then
        CONTAINER_NAME=$(docker-compose -f "./exampleYmlFiles/test_cicd.yml" ps --format '{{.Names}}' --filter "status=running" | head -n1)
        SNAPSHOT_DIR="$PROJECT_ROOT/testing/file_snapshots"
        mkdir -p "$SNAPSHOT_DIR"
        BEFORE_FILE="$SNAPSHOT_DIR/files_before_behave.txt"
        AFTER_FILE="$SNAPSHOT_DIR/files_after_behave.txt"
        DIFF_FILE="$SNAPSHOT_DIR/files_diff.txt"
        capture_file_list "$CONTAINER_NAME" "$BEFORE_FILE"
        cd "testing/cucumber"
        if python -m behave; then
            echo "Waiting 5 seconds for any file operations to complete..."
            sleep 5
            cd "$PROJECT_ROOT"
            capture_file_list "$CONTAINER_NAME" "$AFTER_FILE"
            if compare_file_lists "$BEFORE_FILE" "$AFTER_FILE" "$DIFF_FILE" "$CONTAINER_NAME"; then
                echo "No unexpected temporary files found."
            else
                echo "WARNING: Unexpected temporary files detected after behave tests!"
                failed_tests+=("Stirling-PDF-Regression-Temp-Files")
            fi
            # The behave run itself passed; record it once. The temp-file
            # check above is tracked as a separate failure entry.
            passed_tests+=("Stirling-PDF-Regression $CONTAINER_NAME")
        else
            failed_tests+=("Stirling-PDF-Regression $CONTAINER_NAME")
            echo "Printing docker logs of failed regression"
            docker logs "$CONTAINER_NAME"
            echo "Printed docker logs of failed regression"
            echo "Waiting 10 seconds before capturing file list..."
            sleep 10
            cd "$PROJECT_ROOT"
            capture_file_list "$CONTAINER_NAME" "$AFTER_FILE"
            compare_file_lists "$BEFORE_FILE" "$AFTER_FILE" "$DIFF_FILE" "$CONTAINER_NAME"
        fi
    fi
    docker-compose -f "./exampleYmlFiles/test_cicd.yml" down -v
    # ==================================================================
    # 4. Disabled Endpoints Test
    # ==================================================================
    run_tests "Stirling-PDF-Fat-Disable-Endpoints" "./exampleYmlFiles/docker-compose-latest-fat-endpoints-disabled.yml"
    echo "Testing disabled endpoints..."
    if ./testing/test_disabledEndpoints.sh -f ./testing/endpoints.txt -b http://localhost:8080; then
        passed_tests+=("Disabled-Endpoints")
    else
        failed_tests+=("Disabled-Endpoints")
        echo "Disabled Endpoints tests failed"
    fi
    echo "Testing version verification..."
    if verify_app_version "Stirling-PDF-Fat-Disable-Endpoints" "http://localhost:8080"; then
        passed_tests+=("Stirling-PDF-Fat-Disable-Endpoints-Version-Check")
        echo "Version verification passed for Stirling-PDF-Fat-Disable-Endpoints"
    else
        failed_tests+=("Stirling-PDF-Fat-Disable-Endpoints-Version-Check")
        echo "Version verification failed for Stirling-PDF-Fat-Disable-Endpoints"
    fi
    docker-compose -f "./exampleYmlFiles/docker-compose-latest-fat-endpoints-disabled.yml" down -v
    # ==================================================================
    # Final Report
    # ==================================================================
    echo "All tests completed in $SECONDS seconds."
    if [ ${#passed_tests[@]} -ne 0 ]; then
        echo "Passed tests:"
        for test in "${passed_tests[@]}"; do
            echo -e "\e[32m$test\e[0m"
        done
    fi
    if [ ${#failed_tests[@]} -ne 0 ]; then
        echo "Failed tests:"
        for test in "${failed_tests[@]}"; do
            echo -e "\e[31m$test\e[0m"
        done
    fi
    if [ ${#failed_tests[@]} -ne 0 ]; then
        echo "Some tests failed."
        exit 1
    else
        echo "All tests passed successfully."
        exit 0
    fi
}
main