Stirling-PDF/app
Balázs Szücs bdd8d2e6d4
feat(pdf-conversion): add support for PDF/A-3b, PDF/X formats improve current PDF/A conversion (#4844)
# Description of Changes

TLDR
- Updated `PdfToPdfARequest` to include PDF/X in supported output
formats
- Expanded input handling and model validation for PDF/A and PDF/X
- Added Ghostscript as a preferred backend for PDF/A and PDF/X
conversions
- Implemented PDF/X-specific conversion logic with detailed validation
- Updated UI templates to separate PDF/A and PDF/X format options
- Updated error handling and warnings during conversion processes


This PR replaces the PDF/A conversion system with Ghostscript as the
primary method, which produces output with fewer validation warnings
than the previous LibreOffice approach. It also adds PDF/X format
support for print production workflows.


### Better PDF/A Compliance
- Ghostscript produces standards-compliant PDF/A with fewer validation
errors
- The previous LibreOffice method generated files with structural errors
and validation warnings
- Automatic fallback to PDFBox/LibreOffice if Ghostscript is unavailable
- Built-in validation using PDFBox Preflight catches issues early

### New PDF/X Support
Print production workflows are now supported through the PDF/X-1,
PDF/X-3, and PDF/X-4 formats for professional printing requirements.

### More Reliable Output
- Deterministic conversion results
- Better font embedding and subsetting
- Proper ICC profile and color space handling
- Improved resource cleanup prevents memory leaks

### Ghostscript Integration
- `buildGhostscriptCommand()` / `buildGhostscriptCommandX()` -
Constructs CLI arguments
- `convertWithGhostscript()` / `convertWithGhostscriptX()` - Executes
conversion
- `isGhostscriptAvailable()` - Checks installation
- `prepareColorProfiles()` - Sets up ICC profiles
- `createPdfaDefFile()` - Generates PostScript definitions
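
As a rough illustration of what assembling the CLI arguments might look like, here is a minimal sketch. The class name, method signature, and exact flag selection are assumptions made for illustration, not the code in this PR. The flags themselves (`-dPDFA`, `-dPDFACompatibilityPolicy`, `-sColorConversionStrategy`, `-sDEVICE=pdfwrite`) are documented Ghostscript options, and the PostScript definition file is passed ahead of the input PDF.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class GhostscriptCommandSketch {

    // Hypothetical helper: the real buildGhostscriptCommand() may take
    // different parameters and add more flags.
    static List<String> buildPdfACommand(
            Path input, Path output, Path pdfaDefFile, int pdfaPart) {
        if (pdfaPart < 1 || pdfaPart > 3) {
            throw new IllegalArgumentException("PDF/A part must be 1, 2 or 3");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("gs"); // assumes the binary is on PATH under this name
        cmd.add("-dPDFA=" + pdfaPart); // PDF/A part (1, 2 or 3)
        cmd.add("-dBATCH");
        cmd.add("-dNOPAUSE");
        cmd.add("-sDEVICE=pdfwrite");
        cmd.add("-sColorConversionStrategy=RGB");
        cmd.add("-dPDFACompatibilityPolicy=1"); // drop non-conforming features
        cmd.add("-sOutputFile=" + output);
        cmd.add(pdfaDefFile.toString()); // PostScript definitions come first
        cmd.add(input.toString()); // the PDF to convert comes last
        return cmd;
    }
}
```

The resulting list can be handed to `ProcessBuilder`. Keeping each argument as its own list element avoids shell quoting problems with paths that contain spaces.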

### Conversion Flow
- `handlePdfAConversion()` - Routes PDF/A with Ghostscript primary,
PDFBox fallback
- `handlePdfXConversion()` - Routes PDF/X using Ghostscript
- `convertWithPdfBoxMethod()` - Refactored fallback method
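
The primary/fallback routing can be pictured roughly as follows. This is a simplified sketch, not the PR's implementation: the class and method names are hypothetical, the converters are modelled as plain functions over byte arrays, and falling back when the primary converter throws (rather than only when it is unavailable) is an assumption.

```java
import java.util.function.UnaryOperator;

final class FallbackRoutingSketch {

    // Hypothetical router: try the primary converter (Ghostscript in the PR)
    // when it is available, otherwise use the fallback (the PDFBox method).
    static byte[] convertWithFallback(
            byte[] input,
            boolean primaryAvailable,
            UnaryOperator<byte[]> primary,
            UnaryOperator<byte[]> fallback) {
        if (primaryAvailable) {
            try {
                return primary.apply(input);
            } catch (RuntimeException e) {
                // Assumption: a failed primary conversion also triggers the fallback.
            }
        }
        return fallback.apply(input);
    }
}
```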

### Validation
- `validatePdfaOutput()` - Validates using PDFBox Preflight
- `validateAndWarnPdfA()` - Logs warnings instead of failing
- `buildPreflightErrorMessage()` - Formats detailed errors
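
The "warn instead of fail" behaviour could look something like the sketch below. It does not depend on PDFBox Preflight: the `Issue` class is a hypothetical stand-in for a validation error with a code and a detail string, and the message format is invented for illustration.

```java
import java.util.List;
import java.util.logging.Logger;

final class PreflightReportSketch {
    private static final Logger LOG =
            Logger.getLogger(PreflightReportSketch.class.getName());

    // Hypothetical stand-in for a validator error (code plus details).
    static final class Issue {
        final String code;
        final String details;

        Issue(String code, String details) {
            this.code = code;
            this.details = details;
        }
    }

    // Formats all issues into one multi-line message.
    static String buildErrorMessage(List<Issue> issues) {
        StringBuilder sb = new StringBuilder();
        sb.append("PDF/A validation found ").append(issues.size()).append(" issue(s):");
        for (Issue issue : issues) {
            sb.append(System.lineSeparator())
                    .append("- [").append(issue.code).append("] ")
                    .append(issue.details);
        }
        return sb.toString();
    }

    // Logs a warning and reports the outcome instead of throwing.
    static boolean validateAndWarn(List<Issue> issues) {
        if (issues.isEmpty()) {
            return true;
        }
        LOG.warning(buildErrorMessage(issues));
        return false;
    }
}
```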

### Font Handling
Updated `embedMissingFonts()` prevents stream exhaustion by loading font
bytes once and creating fresh InputStreams for multiple load attempts.
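
The idea of reading the font bytes once and giving each load attempt its own stream can be sketched as below. The `FontLoader` interface and the method name are hypothetical; the actual `embedMissingFonts()` works against PDFBox font classes, which are left out here to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Optional;

final class FontBytesSketch {

    // Hypothetical loader abstraction, e.g. one loader per font format.
    @FunctionalInterface
    interface FontLoader<T> {
        T load(InputStream in) throws IOException;
    }

    static <T> Optional<T> loadWithFirstWorkingLoader(
            InputStream source, List<FontLoader<T>> loaders) {
        final byte[] fontBytes;
        try {
            fontBytes = source.readAllBytes(); // consume the source exactly once
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (FontLoader<T> loader : loaders) {
            // Each attempt gets a fresh stream positioned at byte 0, so a
            // failed attempt cannot leave a half-read stream for the next one.
            try (InputStream fresh = new ByteArrayInputStream(fontBytes)) {
                T font = loader.load(fresh);
                if (font != null) {
                    return Optional.of(font);
                }
            } catch (IOException e) {
                // This loader could not parse the bytes; try the next one.
            }
        }
        return Optional.empty();
    }
}
```

Passing the original stream to every loader would leave it exhausted after the first failed attempt, which is the failure mode described above.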

### Utilities
- `findUnembeddedFontNames()` - Identifies unembedded fonts
- `deleteQuietly()` - Recursively deletes temp directories
- `sanitizePdfA()` - Removes incompatible elements
- `removeElementsForPdfA()` - Removes Optional Content and transparency
- `mergeAndAddXmpMetadata()` - Handles XMP metadata
- `preProcessHighlights()` - Pre-processes annotations
- Transparency detection: `isTransparencyGroup()`,
`hasTransparentImages()`, `detectTransparentXObjects()`
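
For reference, a recursive "delete quietly" helper can be written with the standard library alone. This is a minimal sketch under the assumption that failures should simply be swallowed; the class name is hypothetical and the PR's `deleteQuietly()` may log or behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

final class TempCleanupSketch {

    // Deletes a file or directory tree, ignoring any I/O failure.
    static void deleteQuietly(Path root) {
        if (root == null || !Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order visits children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException ignored) {
                    // Best effort: leave the entry behind if it cannot be removed.
                }
            });
        } catch (IOException | UncheckedIOException ignored) {
            // Walking the tree failed; nothing more to do quietly.
        }
    }
}
```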



### PDF/A
- PDF/A-1b: Strict compliance
- PDF/A-2b: Extended features (default)
- PDF/A-3b: Embedded files support

### PDF/X
- PDF/X-1: Standard print exchange
- PDF/X-3: Color-managed with ICC profiles
- PDF/X-4: Transparency support

As mentioned, the greatest benefit is that the new Ghostscript
conversion delivers PDF/A files with fewer warnings and zero errors
compared to LibreOffice. Sometimes, however, both succeed without
warnings. Here are some samples:

<img width="1876" height="675" alt="image"
src="https://github.com/user-attachments/assets/ee71c2f3-e5ee-45ec-ba61-8d0ffc53b386"
/>
<img width="1876" height="675" alt="image"
src="https://github.com/user-attachments/assets/d620402b-cced-47b2-808d-01bde80eceb2"
/>
<img width="1876" height="675" alt="image"
src="https://github.com/user-attachments/assets/e3052d23-883b-43fc-9953-603067bee8bf"
/>
<img width="1876" height="675" alt="image"
src="https://github.com/user-attachments/assets/13251ab9-c449-4c4a-a326-521ef1929ad2"
/>

There is also some difference in file size (not sure why), but it
generally favors Ghostscript as well:

<img width="978" height="340" alt="image"
src="https://github.com/user-attachments/assets/5ccf4ea2-c6ef-4751-abd0-5b8445c90861"
/>


### Front-end

<img width="978" height="632" alt="image"
src="https://github.com/user-attachments/assets/74789d20-fb79-48d6-a35b-19f519a9f898"
/>



---

## Checklist

### General

- [X] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [X] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [X] I have performed a self-review of my own code
- [X] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [X] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [X] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.

---------

Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
2025-11-17 14:38:28 +00:00
common build(deps): bump org.springdoc:springdoc-openapi-starter-webmvc-ui from 2.8.13 to 2.8.14 (#4855) 2025-11-17 12:49:46 +00:00
core feat(pdf-conversion): add support for PDF/A-3b, PDF/X formats improve current PDF/A conversion (#4844) 2025-11-17 14:38:28 +00:00
proprietary refactor(common, core, proprietary): migrate boxed Booleans to primitive booleans and adopt is* accessors to reduce null checks/NPE risk (#4153) 2025-11-11 17:16:48 +00:00
allowed-licenses.json feat(cbr-to-pdf,pdf-to-cbr): add PDF to/from CBR conversion with ebook optimization option (#4581) 2025-10-04 11:15:23 +01:00