From 69b035c6e143f9eef2b0daf7d4c61369e0f39453 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bal=C3=A1zs=20Sz=C3=BCcs?= <127139797+balazs-szucs@users.noreply.github.com>
Date: Tue, 23 Dec 2025 23:29:30 +0100
Subject: [PATCH] [V2] feat(security): add PDF standards verification feature using veraPDF (#4874)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

# Description of Changes

- Implemented `PDFVerificationRequest` and `PDFVerificationResult` models for validation requests and responses
- Developed `VeraPDFService` to validate PDFs against specific or auto-detected standards
- Added `VerifyPDFController` with an endpoint for PDF verification
- Integrated veraPDF dependencies into the project build file
- Deprecated unused `/verify-pdf` form in `SecurityWebController`
- Updated `EndpointConfiguration` to include the new `verify-pdf` endpoint

This PR introduces a PDF standards verification feature to Stirling-PDF, powered by the industry-standard veraPDF validation library. This feature enables users to validate PDF files against multiple PDF standards including PDF/A (archival), PDF/UA (accessibility), and WTPDF standards.

### 1. PDF Standards Verification Endpoint

- New API Endpoint: `/api/v1/security/verify-pdf`
- Validates PDF files against multiple standards:
  - PDF/A (1a, 1b, 2a, 2b, 2u, 3a, 3b, 3u, 4, 4e, 4f) - Archival standards
  - PDF/UA-1 and PDF/UA-2 - Universal Accessibility standards
  - WTPDF - Well-Tagged PDF standards
- Auto-detection: Automatically detects and validates all standards declared in the PDF's XMP metadata

### 2. Validation Results

The verification returns detailed JSON results including:

- Compliance status: Whether the PDF meets the standard requirements
- Declared vs validated standards: Shows which standards the PDF declares versus which it actually conforms to
- Categorized issues:
  - Errors: Critical compliance failures that prevent certification
  - Warnings: Non-critical issues and recommendations
- Detailed issue information:
  - Rule IDs from the specification
  - Descriptive error messages
  - Location within the PDF where the issue occurs
  - Specification references (clause numbers, test numbers)

### 3. Smart Issue Classification

Implements intelligent classification of validation issues:

- Errors: Issues that prevent standard compliance (font problems, color space issues, structural problems)
- Warnings: Recommended but not required elements (metadata recommendations, optional features)
- Classification based on:
  - Rule ID patterns
  - Clause number prefixes
  - Message content analysis

### New Files Added

#### Controllers

- VerifyPDFController.java: REST API controller handling PDF verification requests
  - Handles multipart file uploads
  - Supports both single-standard and auto-detection modes
  - Comprehensive error handling for encrypted PDFs, parsing errors, and validation failures

#### Models

- PDFVerificationRequest.java: Request model for verification API
  - Extends standard PDFFile model
  - Optional `standard` parameter for manual standard selection
- PDFVerificationResult.java: Response model containing validation results
  - Includes standard information and validation profile details
  - Separate lists for errors and warnings
  - Nested `ValidationIssue` class for detailed issue reporting

#### Services

- VeraPDFService.java: Core service implementing veraPDF integration
  - Initializes veraPDF Greenfield engine
  - Extracts declared PDF/A standards from XMP metadata
  - Performs validation against specified or detected standards
  - Converts veraPDF results to application-specific format
  - Implements smart issue classification logic

### Endpoint Configuration Updates

#### EndpointConfiguration.java

- Added `verify-pdf` to the Security group
- Added `verify-pdf` to the Java group (no external tools required)
- Created new veraPDF dependency group for endpoint availability tracking
- Updated `isToolGroup()` method to recognize veraPDF as a tool dependency

### Supported Standards

#### PDF/A (Archival)

- PDF/A-1 (a, b): ISO 19005-1:2005
- PDF/A-2 (a, b, u): ISO 19005-2:2011
- PDF/A-3 (a, b, u): ISO 19005-3:2012
- PDF/A-4 (standard, e, f): ISO 19005-4:2020

#### PDF/UA (Universal Accessibility)

- PDF/UA-1: ISO 14289-1:2014
- PDF/UA-2: ISO 14289-2 (latest)

#### WTPDF (Well-Tagged PDF)

- WTPDF 1.0: Tagged PDF for accessibility and structure

### Testing Considerations

The following test scenarios should be validated:

1. Valid PDF/A documents (should return compliant)
2. Non-compliant PDF/A documents (should return errors)
3. PDFs without PDF/A declaration (should detect and report)
4. PDF/UA documents (should validate accessibility)
5. Encrypted PDFs (should return appropriate error)
6. Mixed standards (PDF/A + PDF/UA) (should validate both)
7. Empty standard parameter (should auto-detect)
8. Invalid standard parameter (should return error)

### API Usage Examples

```bash
curl -X POST http://localhost:8080/api/v1/security/verify-pdf \
  -F "fileInput=@document.pdf"
```

### Example Response

```json
[
  {
    "standard": "3b",
    "standardName": "PDF/A-ISO 19005-3:2012B compliant",
    "validationProfile": "3b",
    "validationProfileName": "PDF/A-ISO 19005-3:2012B",
    "complianceSummary": "PDF/A-ISO 19005-3:2012B compliant",
    "declaredPdfa": true,
    "compliant": true,
    "totalFailures": 0,
    "totalWarnings": 0,
    "failures": [],
    "warnings": []
  }
]
```

```json
[
  {
    "standard": "2b",
    "standardName": "PDF/A-ISO 19005-2:2011B with errors",
    "validationProfile": "2b",
    "validationProfileName": "PDF/A-ISO 19005-2:2011B",
    "complianceSummary": "PDF/A-ISO 19005-2:2011B with errors",
    "declaredPdfa": true,
    "compliant": false,
    "totalFailures": 2,
    "totalWarnings": 0,
    "failures": [
      {
        "ruleId": "RuleId [specification=ISO 19005-2:2011, clause=6.2.11.4.1, testNumber=1]",
        "message": "The font programs for all fonts used for rendering within a conforming file shall be embedded within that file, as defined in ISO 32000-1:2008, 9.9",
        "location": "Location [level=CosDocument, context=root/document[0]/pages[0](3 0 obj PDPage)/contentStream[0](105 0 obj PDContentStream)/operators[60]/font[0](ArialMT)]",
        "specification": "ISO 19005-2:2011",
        "clause": "6.2.11.4.1",
        "testNumber": "1"
      },
      {
        "ruleId": "RuleId [specification=ISO 19005-2:2011, clause=6.3.2, testNumber=1]",
        "message": "Except for annotation dictionaries whose Subtype value is Popup, all annotation dictionaries shall contain the F key",
        "location": "Location [level=CosDocument, context=root/document[0]/pages[0](3 0 obj PDPage)/annots[4](107 0 obj PDLinkAnnot)]",
        "specification": "ISO 19005-2:2011",
        "clause": "6.3.2",
        "testNumber": "1"
      }
    ],
    "warnings": []
  }
]
```

---

## Checklist

### General

- [ ] I have read the [Contribution Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [ ] I have read the [Stirling-PDF Developer Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md) (if applicable)
- [ ] I have read the [How to add new languages to Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md) (if applicable)
- [ ] I have performed a self-review of my own code
- [ ] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/) (if functionality has heavily changed)
- [ ] I have read the section [Add New Translation Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags) (for new translation tags only)

### Translations (if applicable)

- [ ] I ran [`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached (e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing) for more details.
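
### Java Client Sketch

The curl call under "API Usage Examples" can also be issued from Java. The following is a minimal sketch, assuming JDK 11+ and its built-in `java.net.http` client. The class `VerifyPdfClient`, its method names, and the boundary string are hypothetical and not part of this PR; only the endpoint path and the `fileInput` form field are taken from the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical client sketch; not part of this PR.
class VerifyPdfClient {

    // Builds a multipart/form-data body containing a single "fileInput" part.
    static byte[] buildMultipartBody(String boundary, String fileName, byte[] pdfBytes) {
        byte[] head =
                ("--" + boundary + "\r\n"
                                + "Content-Disposition: form-data; name=\"fileInput\"; filename=\""
                                + fileName + "\"\r\n"
                                + "Content-Type: application/pdf\r\n\r\n")
                        .getBytes(StandardCharsets.UTF_8);
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(pdfBytes, 0, pdfBytes.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Builds the POST request for the verification endpoint described above.
    static HttpRequest buildRequest(String baseUrl, String boundary, byte[] body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/security/verify-pdf"))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    // Sends the PDF and returns the raw JSON response body as a string.
    static String verify(String baseUrl, Path pdf) throws IOException, InterruptedException {
        String boundary = "----verify-pdf-sketch"; // arbitrary value chosen for this sketch
        byte[] body =
                buildMultipartBody(
                        boundary, pdf.getFileName().toString(), Files.readAllBytes(pdf));
        HttpResponse<String> response =
                HttpClient.newHttpClient()
                        .send(
                                buildRequest(baseUrl, boundary, body),
                                HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Calling `VerifyPdfClient.verify("http://localhost:8080", Path.of("document.pdf"))` against a running instance should return a JSON array shaped like the example responses above. Parsing is left to whichever JSON library the caller already uses.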
--------- Signed-off-by: Balázs Szücs --- app/allowed-licenses.json | 16 + .../SPDF/config/EndpointConfiguration.java | 10 +- app/core/build.gradle | 7 + .../api/security/VerifyPDFController.java | 91 +++++ .../api/security/PDFVerificationRequest.java | 10 + .../api/security/PDFVerificationResult.java | 48 +++ .../software/SPDF/service/VeraPDFService.java | 319 ++++++++++++++++++ 7 files changed, 500 insertions(+), 1 deletion(-) create mode 100644 app/core/src/main/java/stirling/software/SPDF/controller/api/security/VerifyPDFController.java create mode 100644 app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationRequest.java create mode 100644 app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationResult.java create mode 100644 app/core/src/main/java/stirling/software/SPDF/service/VeraPDFService.java diff --git a/app/allowed-licenses.json b/app/allowed-licenses.json index 830ae037a..315c6bb18 100644 --- a/app/allowed-licenses.json +++ b/app/allowed-licenses.json @@ -96,6 +96,22 @@ "moduleName": ".*", "moduleLicense": "MPL 2.0" }, + { + "moduleName": ".*", + "moduleLicense": "Mozilla Public License, Version 2.0" + }, + { + "moduleName": ".*", + "moduleLicense": "Mozilla Public License 2.0 (MPL-2.0)" + }, + { + "moduleName": ".*", + "moduleLicense": "CDDL+GPL License" + }, + { + "moduleName": ".*", + "moduleLicense": "BSD" + }, { "moduleName": ".*", "moduleLicense": "UnboundID SCIM2 SDK Free Use License" diff --git a/app/common/src/main/java/stirling/software/SPDF/config/EndpointConfiguration.java b/app/common/src/main/java/stirling/software/SPDF/config/EndpointConfiguration.java index 80331405d..bffd00fda 100644 --- a/app/common/src/main/java/stirling/software/SPDF/config/EndpointConfiguration.java +++ b/app/common/src/main/java/stirling/software/SPDF/config/EndpointConfiguration.java @@ -353,6 +353,7 @@ public class EndpointConfiguration { addEndpointToGroup("Security", "auto-redact"); 
addEndpointToGroup("Security", "redact"); addEndpointToGroup("Security", "validate-signature"); + addEndpointToGroup("Security", "verify-pdf"); addEndpointToGroup("Security", "stamp"); addEndpointToGroup("Security", "sign"); @@ -479,6 +480,8 @@ public class EndpointConfiguration { addEndpointToGroup("Java", "pdf-to-json"); addEndpointToGroup("Java", "json-to-pdf"); addEndpointToGroup("rar", "pdf-to-cbr"); + addEndpointToGroup("Java", "pdf-to-video"); + addEndpointToGroup("Java", "verify-pdf"); // Javascript addEndpointToGroup("Javascript", "pdf-organizer"); @@ -535,6 +538,9 @@ public class EndpointConfiguration { addEndpointToGroup("Weasyprint", "markdown-to-pdf"); addEndpointToGroup("Weasyprint", "eml-to-pdf"); + // veraPDF dependent endpoints + addEndpointToGroup("veraPDF", "verify-pdf"); + // Pdftohtml dependent endpoints addEndpointToGroup("Pdftohtml", "pdf-to-html"); addEndpointToGroup("Pdftohtml", "pdf-to-markdown"); @@ -585,7 +591,9 @@ public class EndpointConfiguration { || "Weasyprint".equals(group) || "Pdftohtml".equals(group) || "ImageMagick".equals(group) - || "rar".equals(group); + || "rar".equals(group) + || "FFmpeg".equals(group) + || "veraPDF".equals(group); } private boolean isEndpointEnabledDirectly(String endpoint) { diff --git a/app/core/build.gradle b/app/core/build.gradle index 9035b44a7..87706537d 100644 --- a/app/core/build.gradle +++ b/app/core/build.gradle @@ -71,6 +71,13 @@ dependencies { implementation "org.apache.pdfbox:preflight:$pdfboxVersion" implementation "org.apache.pdfbox:xmpbox:$pdfboxVersion" + implementation 'org.verapdf:validation-model:1.28.2' + + // veraPDF still uses javax.xml.bind, not the new jakarta namespace + implementation 'javax.xml.bind:jaxb-api:2.3.1' + implementation 'com.sun.xml.bind:jaxb-impl:2.3.9' + implementation 'com.sun.xml.bind:jaxb-core:2.3.0.1' + // https://mvnrepository.com/artifact/technology.tabula/tabula implementation ('technology.tabula:tabula:1.0.5') { exclude group: 'org.slf4j', module: 
'slf4j-simple' diff --git a/app/core/src/main/java/stirling/software/SPDF/controller/api/security/VerifyPDFController.java b/app/core/src/main/java/stirling/software/SPDF/controller/api/security/VerifyPDFController.java new file mode 100644 index 000000000..abf7ae0c3 --- /dev/null +++ b/app/core/src/main/java/stirling/software/SPDF/controller/api/security/VerifyPDFController.java @@ -0,0 +1,91 @@ +package stirling.software.SPDF.controller.api.security; + +import java.io.IOException; +import java.util.List; + +import org.springframework.http.MediaType; +import org.springframework.http.ResponseEntity; +import org.springframework.web.bind.annotation.ModelAttribute; +import org.springframework.web.bind.annotation.PostMapping; +import org.springframework.web.bind.annotation.RequestMapping; +import org.springframework.web.bind.annotation.RestController; +import org.springframework.web.multipart.MultipartFile; +import org.verapdf.core.EncryptedPdfException; +import org.verapdf.core.ModelParsingException; +import org.verapdf.core.ValidationException; + +import io.swagger.v3.oas.annotations.Operation; +import io.swagger.v3.oas.annotations.tags.Tag; + +import lombok.RequiredArgsConstructor; +import lombok.extern.slf4j.Slf4j; + +import stirling.software.SPDF.model.api.security.PDFVerificationRequest; +import stirling.software.SPDF.model.api.security.PDFVerificationResult; +import stirling.software.SPDF.service.VeraPDFService; +import stirling.software.common.util.ExceptionUtils; + +@RestController +@RequestMapping("/api/v1/security") +@Tag(name = "Security", description = "Security APIs") +@RequiredArgsConstructor +@Slf4j +public class VerifyPDFController { + + private final VeraPDFService veraPDFService; + + @Operation( + summary = "Verify PDF Standards Compliance", + description = + "Validates PDF files against the standards declared in their metadata. 
" + "Automatically detects PDF/A, PDF/UA-1, PDF/UA-2, and WTPDF standards " + "from the document's XMP metadata and validates compliance. " + "Input:PDF Output:JSON Type:SISO") + @PostMapping(value = "/verify-pdf", consumes = MediaType.MULTIPART_FORM_DATA_VALUE) + public ResponseEntity<List<PDFVerificationResult>> verifyPDF( + @ModelAttribute PDFVerificationRequest request) { + + MultipartFile file = request.getFileInput(); + + if (file == null || file.isEmpty()) { + throw ExceptionUtils.createRuntimeException( + "error.pdfRequired", "PDF file is required", null); + } + + try { + log.info("Detecting and verifying standards in PDF '{}'", file.getOriginalFilename()); + + List<PDFVerificationResult> results = veraPDFService.validatePDF(file.getInputStream()); + + log.info( + "Verification complete for '{}': {} standard(s) checked", + file.getOriginalFilename(), + results.size()); + + return ResponseEntity.ok(results); + + } catch (ValidationException e) { + log.error("Validation exception for file: {}", file.getOriginalFilename(), e); + throw ExceptionUtils.createRuntimeException( + "error.validationFailed", "PDF validation failed: {0}", e, e.getMessage()); + } catch (ModelParsingException e) { + log.error("Model parsing exception for file: {}", file.getOriginalFilename(), e); + throw ExceptionUtils.createRuntimeException( + "error.modelParsingFailed", "PDF model parsing failed: {0}", e, e.getMessage()); + } catch (EncryptedPdfException e) { + log.error("Encrypted PDF exception for file: {}", file.getOriginalFilename(), e); + throw ExceptionUtils.createRuntimeException( + "error.encryptedPdf", + "Cannot verify encrypted PDF. 
Please remove password first: {0}", + e, + e.getMessage()); + } catch (IOException e) { + log.error("IO exception for file: {}", file.getOriginalFilename(), e); + throw ExceptionUtils.createRuntimeException( + "error.ioException", + "IO error during PDF verification: {0}", + e, + e.getMessage()); + } + } +} diff --git a/app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationRequest.java b/app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationRequest.java new file mode 100644 index 000000000..23c0e3c2e --- /dev/null +++ b/app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationRequest.java @@ -0,0 +1,10 @@ +package stirling.software.SPDF.model.api.security; + +import lombok.Data; +import lombok.EqualsAndHashCode; + +import stirling.software.common.model.api.PDFFile; + +@Data +@EqualsAndHashCode(callSuper = true) +public class PDFVerificationRequest extends PDFFile {} diff --git a/app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationResult.java b/app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationResult.java new file mode 100644 index 000000000..9845681b2 --- /dev/null +++ b/app/core/src/main/java/stirling/software/SPDF/model/api/security/PDFVerificationResult.java @@ -0,0 +1,48 @@ +package stirling.software.SPDF.model.api.security; + +import java.util.ArrayList; +import java.util.List; + +import lombok.AllArgsConstructor; +import lombok.Data; +import lombok.NoArgsConstructor; + +@Data +@NoArgsConstructor +@AllArgsConstructor +public class PDFVerificationResult { + + private String standard; + private String standardName; + private String validationProfile; + private String validationProfileName; + private String complianceSummary; + private boolean declaredPdfa; + private boolean compliant; + private int totalFailures; + private int totalWarnings; + private List<ValidationIssue> failures = new ArrayList<>(); + private List<ValidationIssue> warnings = new ArrayList<>(); + 
+ public void addFailure(ValidationIssue failure) { + this.failures.add(failure); + this.totalFailures = this.failures.size(); + } + + public void addWarning(ValidationIssue warning) { + this.warnings.add(warning); + this.totalWarnings = this.warnings.size(); + } + + @Data + @NoArgsConstructor + @AllArgsConstructor + public static class ValidationIssue { + private String ruleId; + private String message; + private String location; + private String specification; + private String clause; + private String testNumber; + } +} diff --git a/app/core/src/main/java/stirling/software/SPDF/service/VeraPDFService.java b/app/core/src/main/java/stirling/software/SPDF/service/VeraPDFService.java new file mode 100644 index 000000000..f590e4373 --- /dev/null +++ b/app/core/src/main/java/stirling/software/SPDF/service/VeraPDFService.java @@ -0,0 +1,319 @@ +package stirling.software.SPDF.service; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; + +import org.springframework.stereotype.Service; +import org.verapdf.core.EncryptedPdfException; +import org.verapdf.core.ModelParsingException; +import org.verapdf.core.ValidationException; +import org.verapdf.gf.foundry.VeraGreenfieldFoundryProvider; +import org.verapdf.pdfa.Foundries; +import org.verapdf.pdfa.PDFAParser; +import org.verapdf.pdfa.PDFAValidator; +import org.verapdf.pdfa.flavours.PDFAFlavour; +import org.verapdf.pdfa.flavours.PDFFlavours; +import org.verapdf.pdfa.results.TestAssertion; +import org.verapdf.pdfa.results.ValidationResult; + +import jakarta.annotation.PostConstruct; + +import lombok.extern.slf4j.Slf4j; + +import stirling.software.SPDF.model.api.security.PDFVerificationResult; + +@Service +@Slf4j +public class VeraPDFService { + + private static final String NOT_PDFA_STANDARD_ID = "not-pdfa"; + private static final String NOT_PDFA_STANDARD_NAME = + "Not PDF/A (no PDF/A identification metadata)"; + + private 
static PDFVerificationResult convertToVerificationResult( + ValidationResult result, PDFAFlavour declaredFlavour, PDFAFlavour validationFlavour) { + PDFVerificationResult verificationResult = new PDFVerificationResult(); + + PDFAFlavour validationProfile = + validationFlavour != null ? validationFlavour : result.getPDFAFlavour(); + boolean validationIsPdfa = isPdfaFlavour(validationProfile); + + if (validationProfile != null) { + verificationResult.setValidationProfile(validationProfile.getId()); + verificationResult.setValidationProfileName(getStandardName(validationProfile)); + } + + if (declaredFlavour != null) { + verificationResult.setStandard(declaredFlavour.getId()); + verificationResult.setDeclaredPdfa(isPdfaFlavour(declaredFlavour)); + } else if (validationProfile != null && !validationIsPdfa) { + verificationResult.setStandard(validationProfile.getId()); + verificationResult.setDeclaredPdfa(false); + } else { + verificationResult.setStandard(NOT_PDFA_STANDARD_ID); + verificationResult.setDeclaredPdfa(false); + } + + for (TestAssertion assertion : result.getTestAssertions()) { + if (assertion.getStatus() == TestAssertion.Status.FAILED) { + PDFVerificationResult.ValidationIssue issue = createValidationIssue(assertion); + verificationResult.addFailure(issue); + } + } + + verificationResult.setCompliant(result.isCompliant()); + + String baseName; + if (declaredFlavour != null) { + baseName = getStandardName(declaredFlavour); + } else if (validationIsPdfa) { + baseName = NOT_PDFA_STANDARD_NAME; + } else if (validationProfile != null) { + baseName = getStandardName(validationProfile); + } else { + baseName = "Unknown standard"; + } + + String standardDisplay = + formatStandardDisplay( + baseName, + verificationResult.getTotalFailures(), + isPdfaFlavour(declaredFlavour), + validationIsPdfa && declaredFlavour == null); + verificationResult.setStandardName(standardDisplay); + verificationResult.setComplianceSummary(standardDisplay); + + log.debug( + "Validation 
complete for profile {} (declared: {}): {} failures", + validationProfile != null ? validationProfile.getId() : "unknown", + declaredFlavour != null ? declaredFlavour.getId() : NOT_PDFA_STANDARD_ID, + verificationResult.getTotalFailures()); + + return verificationResult; + } + + private static PDFVerificationResult.ValidationIssue createValidationIssue( + TestAssertion assertion) { + PDFVerificationResult.ValidationIssue issue = new PDFVerificationResult.ValidationIssue(); + + if (assertion.getRuleId() != null) { + issue.setRuleId(assertion.getRuleId().toString()); + issue.setClause(assertion.getRuleId().getClause()); + + if (assertion.getRuleId().getSpecification() != null) { + issue.setSpecification(assertion.getRuleId().getSpecification().toString()); + } + + int testNumber = assertion.getRuleId().getTestNumber(); + if (testNumber > 0) { + issue.setTestNumber(String.valueOf(testNumber)); + } + } + + issue.setMessage(assertion.getMessage()); + issue.setLocation( + assertion.getLocation() != null ? assertion.getLocation().toString() : "Unknown"); + + return issue; + } + + private static PDFVerificationResult createNoPdfaDeclarationResult() { + PDFVerificationResult result = new PDFVerificationResult(); + result.setStandard(NOT_PDFA_STANDARD_ID); + result.setStandardName(NOT_PDFA_STANDARD_NAME); + result.setComplianceSummary(NOT_PDFA_STANDARD_NAME); + result.setCompliant(false); + result.setDeclaredPdfa(false); + + PDFVerificationResult.ValidationIssue issue = new PDFVerificationResult.ValidationIssue(); + issue.setMessage("Document does not declare PDF/A compliance in its XMP metadata."); + issue.setSpecification("XMP pdfaid"); + result.addFailure(issue); + + return result; + } + + private static PDFVerificationResult buildErrorResult( + PDFAFlavour declaredFlavour, PDFAFlavour validationFlavour, String errorMessage) { + + PDFVerificationResult errorResult = new PDFVerificationResult(); + + PDFAFlavour declaredForResult = + isPdfaFlavour(validationFlavour) ? 
declaredFlavour : validationFlavour; + + if (declaredForResult != null) { + errorResult.setStandard(declaredForResult.getId()); + errorResult.setStandardName(getStandardName(declaredForResult) + " with errors"); + errorResult.setDeclaredPdfa(isPdfaFlavour(declaredForResult)); + } else if (isPdfaFlavour(validationFlavour)) { + errorResult.setStandard(NOT_PDFA_STANDARD_ID); + errorResult.setStandardName(NOT_PDFA_STANDARD_NAME); + errorResult.setDeclaredPdfa(false); + } else { + errorResult.setStandard( + validationFlavour != null ? validationFlavour.getId() : NOT_PDFA_STANDARD_ID); + errorResult.setStandardName( + (validationFlavour != null + ? getStandardName(validationFlavour) + : "Unknown standard") + + " with errors"); + errorResult.setDeclaredPdfa(false); + } + + errorResult.setValidationProfile( + validationFlavour != null ? validationFlavour.getId() : NOT_PDFA_STANDARD_ID); + errorResult.setValidationProfileName( + validationFlavour != null + ? getStandardName(validationFlavour) + : "Unknown standard"); + errorResult.setComplianceSummary(errorResult.getStandardName()); + errorResult.setCompliant(false); + + PDFVerificationResult.ValidationIssue failure = new PDFVerificationResult.ValidationIssue(); + failure.setMessage(errorMessage); + errorResult.addFailure(failure); + + return errorResult; + } + + @PostConstruct + public void initialize() { + try { + VeraGreenfieldFoundryProvider.initialise(); + log.info("VeraPDF Greenfield initialized successfully"); + } catch (Exception e) { + log.error("Failed to initialize VeraPDF", e); + } + } + + public List<PDFVerificationResult> validatePDF(InputStream pdfStream) + throws IOException, ValidationException, ModelParsingException, EncryptedPdfException { + + byte[] pdfBytes = pdfStream.readAllBytes(); + List<PDFVerificationResult> results = new ArrayList<>(); + + PDFAFlavour declaredFlavour; + List<PDFAFlavour> detectedFlavours; + + try (PDFAParser detectionParser = + Foundries.defaultInstance().createParser(new ByteArrayInputStream(pdfBytes))) { + declaredFlavour = 
detectionParser.getFlavour(); + detectedFlavours = detectionParser.getFlavours(); + } + + List<PDFAFlavour> flavoursToValidate = new ArrayList<>(); + boolean hasPdfaDeclaration = isPdfaFlavour(declaredFlavour); + + if (declaredFlavour != null) { + flavoursToValidate.add(declaredFlavour); + } + + for (PDFAFlavour flavour : detectedFlavours) { + if (flavour.equals(declaredFlavour)) { + continue; + } + + if (PDFFlavours.isFlavourFamily(flavour, PDFAFlavour.SpecificationFamily.PDF_A)) { + if (hasPdfaDeclaration) { + flavoursToValidate.add(flavour); + } else { + log.debug( + "Ignoring detected PDF/A flavour {} because no PDF/A declaration exists in XMP", + flavour.getId()); + } + } else if (PDFFlavours.isFlavourFamily(flavour, PDFAFlavour.SpecificationFamily.PDF_UA) + || PDFFlavours.isFlavourFamily( + flavour, PDFAFlavour.SpecificationFamily.WTPDF)) { + flavoursToValidate.add(flavour); + } + } + + if (!hasPdfaDeclaration) { + results.add(createNoPdfaDeclarationResult()); + } + + if (flavoursToValidate.isEmpty()) { + log.info("No verifiable PDF/A, PDF/UA, or WTPDF standards declared via XMP metadata"); + return results; + } + + for (PDFAFlavour flavour : flavoursToValidate) { + try (PDFAParser parser = + Foundries.defaultInstance() + .createParser(new ByteArrayInputStream(pdfBytes), flavour)) { + + PDFAFlavour parserDeclared = parser.getFlavour(); + PDFAValidator validator = + Foundries.defaultInstance().createValidator(flavour, false); + ValidationResult result = validator.validate(parser); + + PDFAFlavour declaredForResult = + PDFFlavours.isFlavourFamily(flavour, PDFAFlavour.SpecificationFamily.PDF_A) + ? 
parserDeclared + : flavour; + + results.add(convertToVerificationResult(result, declaredForResult, flavour)); + } catch (Exception e) { + log.error("Error validating standard {}: {}", flavour.getId(), e.getMessage()); + results.add( + buildErrorResult( + declaredFlavour, flavour, "Validation error: " + e.getMessage())); + } + } + + return results; + } + + private static boolean isPdfaFlavour(PDFAFlavour flavour) { + return PDFFlavours.isFlavourFamily(flavour, PDFAFlavour.SpecificationFamily.PDF_A); + } + + private static String formatStandardDisplay( + String baseName, + int errorCount, + boolean declaredPdfa, + boolean inferredPdfaWithoutDeclaration) { + + if (inferredPdfaWithoutDeclaration) { + return NOT_PDFA_STANDARD_NAME; + } + + if (!declaredPdfa && NOT_PDFA_STANDARD_NAME.equals(baseName)) { + return NOT_PDFA_STANDARD_NAME; + } + + if (errorCount > 0) { + return baseName + " with errors"; + } + + return baseName + " compliant"; + } + + private static String getStandardName(PDFAFlavour flavour) { + String id = flavour.getId(); + String part = flavour.getPart().toString(); + String level = flavour.getLevel().toString(); + + // PDF/A standards - Fixed: proper length check and parentheses + if (!id.isEmpty() + && (id.charAt(0) == '1' + || id.charAt(0) == '2' + || id.charAt(0) == '3' + || id.charAt(0) == '4')) { + return "PDF/A-" + part + (level.isEmpty() ? "" : level); + } + // PDF/UA standards + else if (id.contains("ua")) { + return "PDF/UA-" + part; + } + // WTPDF standards + else if (id.contains("wtpdf")) { + return "WTPDF " + part; + } + + return flavour.toString(); + } +}