Stirling-PDF/Dockerfile
Ludy 7f7d6fd1c9
feat(convert): add PDF to Video converter (FFmpeg) with MP4/WebM support (#4704)
# Description of Changes

This pull request introduces support for FFmpeg as a new external tool
in the application. It adds configuration options for FFmpeg session
limits and timeouts, updates the process execution and tool-checking
utilities to handle FFmpeg, and expands endpoint configuration to
include FFmpeg-dependent features. Corresponding unit tests have also
been added to ensure FFmpeg detection works as expected.

**FFmpeg Integration and Configuration:**

* Added FFmpeg session limit and timeout configuration options to
`ApplicationProperties`, with default values and getter methods.
[[1]](diffhunk://#diff-1c357db0a3e88cf5bedd4a5852415fadad83b8b3b9eb56e67059d8b9d8b10702R631)
[[2]](diffhunk://#diff-1c357db0a3e88cf5bedd4a5852415fadad83b8b3b9eb56e67059d8b9d8b10702R672-R675)
[[3]](diffhunk://#diff-1c357db0a3e88cf5bedd4a5852415fadad83b8b3b9eb56e67059d8b9d8b10702R702)
[[4]](diffhunk://#diff-1c357db0a3e88cf5bedd4a5852415fadad83b8b3b9eb56e67059d8b9d8b10702R743-R746)
* Updated `ProcessExecutor` to recognize FFmpeg as a process type, and
to use the new session limit and timeout configuration for FFmpeg
processes.
[[1]](diffhunk://#diff-8424a11112fff55cc28467c4d531e451a485911ed1aeb0aea772c9fa7dc3aa6aL305-R316)
[[2]](diffhunk://#diff-8424a11112fff55cc28467c4d531e451a485911ed1aeb0aea772c9fa7dc3aa6aR74-R78)
[[3]](diffhunk://#diff-8424a11112fff55cc28467c4d531e451a485911ed1aeb0aea772c9fa7dc3aa6aR133-R137)

**Tool Detection and Exception Handling:**

* Implemented `isFfmpegAvailable()` in `CheckProgramInstall` to detect
FFmpeg installation, with caching for efficiency.
[[1]](diffhunk://#diff-7b61807107c689e3824a5f8fd42c27ab072a67a5666f24445bd6895937351690R14)
[[2]](diffhunk://#diff-7b61807107c689e3824a5f8fd42c27ab072a67a5666f24445bd6895937351690R60-R78)
* Added a specific exception factory method for missing FFmpeg in
`ExceptionUtils`.

**Endpoint Configuration:**

* Registered the new `pdf-to-video` endpoint under the "Convert",
"Java", and "FFmpeg" groups, and updated the tool group logic to include
"FFmpeg".
[[1]](diffhunk://#diff-3cddb66d1cf93eeb8103ccd17cee8ed006e0c0ee006d0ee1cf42d512f177e437R265)
[[2]](diffhunk://#diff-3cddb66d1cf93eeb8103ccd17cee8ed006e0c0ee006d0ee1cf42d512f177e437R395)
[[3]](diffhunk://#diff-3cddb66d1cf93eeb8103ccd17cee8ed006e0c0ee006d0ee1cf42d512f177e437R452-R454)
[[4]](diffhunk://#diff-3cddb66d1cf93eeb8103ccd17cee8ed006e0c0ee006d0ee1cf42d512f177e437L496-R502)

**Testing Enhancements:**

* Added and updated unit tests in `CheckProgramInstallTest` to verify
FFmpeg detection, including scenarios for installed, not installed, and
caching behavior.
[[1]](diffhunk://#diff-0eaf917d935710f0f5e18f12db600be47b8439d628d65a97a3db34133231790eR29)
[[2]](diffhunk://#diff-0eaf917d935710f0f5e18f12db600be47b8439d628d65a97a3db34133231790eR38-R45)
[[3]](diffhunk://#diff-0eaf917d935710f0f5e18f12db600be47b8439d628d65a97a3db34133231790eR67-R75)
[[4]](diffhunk://#diff-0eaf917d935710f0f5e18f12db600be47b8439d628d65a97a3db34133231790eR222-R262)

---

## Checklist

### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.
2025-10-30 22:54:33 +00:00

103 lines
4.3 KiB
Docker

# Main stage
FROM alpine:3.22.2@sha256:4b7ce07002c69e8f3d704a9c5d6fd3053be500b7f1c69fc0d80990c2ad8dd412
# Copy necessary files
COPY scripts /scripts
COPY app/core/src/main/resources/static/fonts/*.ttf /usr/share/fonts/opentype/noto/
COPY app/core/build/libs/*.jar app.jar
ARG VERSION_TAG
LABEL org.opencontainers.image.title="Stirling-PDF"
LABEL org.opencontainers.image.description="A powerful locally hosted web-based PDF manipulation tool supporting 50+ operations including merging, splitting, conversion, OCR, watermarking, and more."
LABEL org.opencontainers.image.source="https://github.com/Stirling-Tools/Stirling-PDF"
LABEL org.opencontainers.image.licenses="MIT"
LABEL org.opencontainers.image.vendor="Stirling-Tools"
LABEL org.opencontainers.image.url="https://www.stirlingpdf.com"
LABEL org.opencontainers.image.documentation="https://docs.stirlingpdf.com"
LABEL maintainer="Stirling-Tools"
LABEL org.opencontainers.image.authors="Stirling-Tools"
LABEL org.opencontainers.image.version="${VERSION_TAG}"
LABEL org.opencontainers.image.keywords="PDF, manipulation, merge, split, convert, OCR, watermark"
# Set Environment Variables
ENV DISABLE_ADDITIONAL_FEATURES=true \
VERSION_TAG=$VERSION_TAG \
JAVA_BASE_OPTS="-XX:+UnlockExperimentalVMOptions -XX:MaxRAMPercentage=75 -XX:InitiatingHeapOccupancyPercent=20 -XX:+G1PeriodicGCInvokesConcurrent -XX:G1PeriodicGCInterval=10000 -XX:+UseStringDeduplication -XX:G1PeriodicGCSystemLoadThreshold=70" \
JAVA_CUSTOM_OPTS="" \
HOME=/home/stirlingpdfuser \
PUID=1000 \
PGID=1000 \
UMASK=022 \
PYTHONPATH=/usr/lib/libreoffice/program:/opt/venv/lib/python3.12/site-packages \
UNO_PATH=/usr/lib/libreoffice/program \
URE_BOOTSTRAP=file:///usr/lib/libreoffice/program/fundamentalrc \
PATH=$PATH:/opt/venv/bin \
STIRLING_TEMPFILES_DIRECTORY=/tmp/stirling-pdf \
TMPDIR=/tmp/stirling-pdf \
TEMP=/tmp/stirling-pdf \
TMP=/tmp/stirling-pdf
# JDK for app
RUN echo "@main https://dl-cdn.alpinelinux.org/alpine/edge/main" | tee -a /etc/apk/repositories && \
echo "@community https://dl-cdn.alpinelinux.org/alpine/edge/community" | tee -a /etc/apk/repositories && \
echo "@testing https://dl-cdn.alpinelinux.org/alpine/edge/testing" | tee -a /etc/apk/repositories && \
apk upgrade --no-cache -a && \
apk add --no-cache \
ca-certificates \
tzdata \
tini \
bash \
curl \
shadow \
su-exec \
openssl \
openssl-dev \
openjdk21-jre \
ffmpeg \
# Doc conversion
gcompat \
libc6-compat \
libreoffice \
# pdftohtml
poppler-utils \
# OCR MY PDF (unpaper for descew and other advanced features)
tesseract-ocr-data-eng \
tesseract-ocr-data-chi_sim \
tesseract-ocr-data-deu \
tesseract-ocr-data-fra \
tesseract-ocr-data-por \
unpaper \
# CV
py3-opencv \
python3 \
ocrmypdf \
py3-pip \
py3-pillow@testing \
py3-pdf2image@testing \
# URW Base 35 fonts for better PDF rendering
font-urw-base35 && \
python3 -m venv /opt/venv && \
/opt/venv/bin/pip install --no-cache-dir --upgrade pip setuptools && \
/opt/venv/bin/pip install --no-cache-dir --upgrade unoserver weasyprint && \
ln -s /usr/lib/libreoffice/program/uno.py /opt/venv/lib/python3.12/site-packages/ && \
ln -s /usr/lib/libreoffice/program/unohelper.py /opt/venv/lib/python3.12/site-packages/ && \
ln -s /usr/lib/libreoffice/program /opt/venv/lib/python3.12/site-packages/LibreOffice && \
mv /usr/share/tessdata /usr/share/tessdata-original && \
mkdir -p $HOME /configs /logs /customFiles /pipeline/watchedFolders /pipeline/finishedFolders /tmp/stirling-pdf && \
# Configure URW Base 35 fonts
ln -s /usr/share/fontconfig/conf.avail/69-urw-*.conf /etc/fonts/conf.d/ && \
fc-cache -f -v && \
chmod +x /scripts/* && \
# User permissions
addgroup -S stirlingpdfgroup && adduser -S stirlingpdfuser -G stirlingpdfgroup && \
chown -R stirlingpdfuser:stirlingpdfgroup $HOME /scripts /usr/share/fonts/opentype/noto /configs /customFiles /pipeline /tmp/stirling-pdf && \
chown stirlingpdfuser:stirlingpdfgroup /app.jar
EXPOSE 8080/tcp
# Set user and run command
ENTRYPOINT ["tini", "--", "/scripts/init.sh"]
CMD ["sh", "-c", "java -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/tmp/stirling-pdf -jar /app.jar & /opt/venv/bin/unoserver --port 2003 --interface 127.0.0.1"]