mirror of
https://github.com/Frooodle/Stirling-PDF.git
synced 2026-02-01 20:10:35 +01:00
# Description of Changes

- Implemented `PDFVerificationRequest` and `PDFVerificationResult` models for validation requests and responses
- Developed `VeraPDFService` to validate PDFs against specific or auto-detected standards
- Added `VerifyPDFController` with an endpoint for PDF verification
- Integrated veraPDF dependencies into project build file
- Deprecated unused `/verify-pdf` form in `SecurityWebController`
- Updated `EndpointConfiguration` to include the new `verify-pdf` endpoint

This PR introduces a PDF standards verification feature to Stirling-PDF, powered by the industry-standard veraPDF validation library. This feature enables users to validate PDF files against multiple PDF standards including PDF/A (archival), PDF/UA (accessibility), and WTPDF standards.

### 1. PDF Standards Verification Endpoint

- New API Endpoint: `/api/v1/security/verify-pdf`
- Validates PDF files against multiple standards:
  - PDF/A (1b, 2a, 2b, 2u, 3a, 3b, 3u, 4, 4e, 4f) - Archival standards
  - PDF/UA-1 and PDF/UA-2 - Universal Accessibility standards
  - WTPDF - Well-Tagged PDF standards
- Auto-detection: Automatically detects and validates all standards declared in the PDF's XMP metadata

### 2. Validation Results

The verification returns detailed JSON results including:

- Compliance status: Whether the PDF meets the standard requirements
- Declared vs validated standards: Shows what the PDF claims to be vs what it actually is
- Categorized issues:
  - Errors: Critical compliance failures that prevent certification
  - Warnings: Non-critical issues and recommendations
- Detailed issue information:
  - Rule IDs from the specification
  - Descriptive error messages
  - Location within the PDF where the issue occurs
  - Specification references (clause numbers, test numbers)

### 3. Detect Issue Classification

Implements intelligent classification of validation issues:

- Errors: Issues that prevent standard compliance (font problems, color space issues, structural problems)
- Warnings: Recommended but not required elements (metadata recommendations, optional features)
- Classification based on:
  - Rule ID patterns
  - Clause number prefixes
  - Message content analysis

### New Files Added

#### Controllers

- VerifyPDFController.java: REST API controller handling PDF verification requests
  - Handles multipart file uploads
  - Supports both single-standard and auto-detection modes
  - Comprehensive error handling for encrypted PDFs, parsing errors, and validation failures

#### Models

- PDFVerificationRequest.java: Request model for verification API
  - Extends standard PDFFile model
  - Optional `standard` parameter for manual standard selection
- PDFVerificationResult.java: Response model containing validation results
  - Includes standard information and validation profile details
  - Separate lists for errors and warnings
  - Nested `ValidationIssue` class for detailed issue reporting

#### Services

- VeraPDFService.java: Core service implementing veraPDF integration
  - Initializes veraPDF Greenfield engine
  - Extracts declared PDF/A standards from XMP metadata
  - Performs validation against specified or detected standards
  - Converts veraPDF results to application-specific format
  - Implements smart issue classification logic

### Endpoint Configuration Updates

#### EndpointConfiguration.java

- Added `verify-pdf` to the Security group
- Added `verify-pdf` to the Java group (no external tools required)
- Created new veraPDF dependency group for endpoint availability tracking
- Updated `isToolGroup()` method to recognize veraPDF as a tool dependency

### Supported Standards

#### PDF/A (Archival)

- PDF/A-1 (a, b): ISO 19005-1:2005
- PDF/A-2 (a, b, u): ISO 19005-2:2011
- PDF/A-3 (a, b, u): ISO 19005-3:2012
- PDF/A-4 (standard, e, f): ISO 19005-4:2020

#### PDF/UA (Universal Accessibility)

- PDF/UA-1: ISO 14289-1:2014
- PDF/UA-2: ISO 14289-2 (latest)

#### WTPDF (Well-Tagged PDF)

- WTPDF 1.0: Tagged PDF for accessibility and structure

### Security Considerations

The following test scenarios should be validated:

1. Valid PDF/A documents (should return compliant)
2. Non-compliant PDF/A documents (should return errors)
3. PDFs without PDF/A declaration (should detect and report)
4. PDF/UA documents (should validate accessibility)
5. Encrypted PDFs (should return appropriate error)
6. Mixed standards (PDF/A + PDF/UA) (should validate both)
7. Empty standard parameter (should auto-detect)
8. Invalid standard parameter (should return error)

### API Usage Examples

```bash
curl -X POST http://localhost:8080/api/v1/security/verify-pdf \
  -F "fileInput=@document.pdf"
```

### Example Response

```json
[
  {
    "standard": "3b",
    "standardName": "PDF/A-ISO 19005-3:2012B compliant",
    "validationProfile": "3b",
    "validationProfileName": "PDF/A-ISO 19005-3:2012B",
    "complianceSummary": "PDF/A-ISO 19005-3:2012B compliant",
    "declaredPdfa": true,
    "compliant": true,
    "totalFailures": 0,
    "totalWarnings": 0,
    "failures": [],
    "warnings": []
  }
]
```

```json
[
  {
    "standard": "2b",
    "standardName": "PDF/A-ISO 19005-2:2011B with errors",
    "validationProfile": "2b",
    "validationProfileName": "PDF/A-ISO 19005-2:2011B",
    "complianceSummary": "PDF/A-ISO 19005-2:2011B with errors",
    "declaredPdfa": true,
    "compliant": false,
    "totalFailures": 2,
    "totalWarnings": 0,
    "failures": [
      {
        "ruleId": "RuleId [specification=ISO 19005-2:2011, clause=6.2.11.4.1, testNumber=1]",
        "message": "The font programs for all fonts used for rendering within a conforming file shall be embedded within that file, as defined in ISO 32000-1:2008, 9.9",
        "location": "Location [level=CosDocument, context=root/document[0]/pages[0](3 0 obj PDPage)/contentStream[0](105 0 obj PDContentStream)/operators[60]/font[0](ArialMT)]",
        "specification": "ISO 19005-2:2011",
        "clause": "6.2.11.4.1",
        "testNumber": "1"
      },
      {
        "ruleId": "RuleId [specification=ISO 19005-2:2011, clause=6.3.2, testNumber=1]",
        "message": "Except for annotation dictionaries whose Subtype value is Popup, all annotation dictionaries shall contain the F key",
        "location": "Location [level=CosDocument, context=root/document[0]/pages[0](3 0 obj PDPage)/annots[4](107 0 obj PDLinkAnnot)]",
        "specification": "ISO 19005-2:2011",
        "clause": "6.3.2",
        "testNumber": "1"
      }
    ],
    "warnings": []
  }
]
```

<!--
Please provide a summary of the changes, including:

- What was changed
- Why the change was made
- Any challenges encountered

Closes #(issue_number)
-->

---

## Checklist

### General

- [ ] I have read the [Contribution Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [ ] I have read the [Stirling-PDF Developer Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md) (if applicable)
- [ ] I have read the [How to add new languages to Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md) (if applicable)
- [ ] I have performed a self-review of my own code
- [ ] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/) (if functionality has heavily changed)
- [ ] I have read the section [Add New Translation Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags) (for new translation tags only)

### Translations (if applicable)

- [ ] I ran [`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached (e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing) for more details.

---------

Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
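As an illustration of how a client might consume the verification response described in this commit message, here is a minimal Python sketch. It assumes only the field names visible in the example responses (`validationProfileName`, `compliant`, `totalFailures`, `failures`, `clause`); the `summarize` helper and the trimmed `sample` data are hypothetical and not part of Stirling-PDF.

```python
# Hypothetical helper: condenses a verify-pdf style response into one line
# per validated standard. Field names are taken from the example responses;
# the function itself is an illustration, not Stirling-PDF code.

def summarize(results):
    lines = []
    for result in results:
        status = "compliant" if result["compliant"] else "NOT compliant"
        line = result["validationProfileName"] + ": " + status
        # Collect the distinct clause numbers of all reported failures.
        clauses = sorted({issue["clause"] for issue in result.get("failures", [])})
        if clauses:
            joined = ", ".join(clauses)
            line += " ({} failure(s), clauses: {})".format(result["totalFailures"], joined)
        lines.append(line)
    return lines


# Trimmed copies of the two example responses (only the fields used above).
sample = [
    {
        "validationProfileName": "PDF/A-ISO 19005-3:2012B",
        "compliant": True,
        "totalFailures": 0,
        "failures": [],
    },
    {
        "validationProfileName": "PDF/A-ISO 19005-2:2011B",
        "compliant": False,
        "totalFailures": 2,
        "failures": [{"clause": "6.2.11.4.1"}, {"clause": "6.3.2"}],
    },
]

if __name__ == "__main__":
    for line in summarize(sample):
        print(line)
```

In practice the list passed to `summarize` would be the parsed JSON body returned by the endpoint; the sketch keeps the data inline so it runs without a server.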
193 lines
4.3 KiB
JSON
{
  "allowedLicenses": [
    {
      "moduleName": ".*",
      "moduleLicense": "BSD License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "The BSD License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "BSD-2-Clause"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "BSD 2-Clause License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "The 2-Clause BSD License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "BSD-3-Clause"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "The BSD 3-Clause License (BSD3)"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "BSD-4 License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "MIT"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "MIT License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "The MIT License"
    },
    {
      "moduleName": "com.github.jai-imageio:jai-imageio-core",
      "moduleLicense": "LICENSE.txt"
    },
    {
      "moduleName": "com.github.jai-imageio:jai-imageio-jpeg2000",
      "moduleLicense": "LICENSE-JJ2000.txt, LICENSE-Sun.txt"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Apache 2"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Apache 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Apache-2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Apache-2.0 License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Apache License 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Apache License Version 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Apache License, Version 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "The Apache License, Version 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "The Apache Software License, Version 2.0"
    },
    {
      "moduleName": "com.nimbusds:oauth2-oidc-sdk",
      "moduleLicense": "\"Apache License, version 2.0\";link=\"https://www.apache.org/licenses/LICENSE-2.0.html\""
    },
    {
      "moduleName": ".*",
      "moduleLicense": "MPL 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Mozilla Public License, Version 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Mozilla Public License 2.0 (MPL-2.0)"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "CDDL+GPL License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "BSD"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "UnboundID SCIM2 SDK Free Use License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "GPL2 w/ CPE"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "GPLv2+CE"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "GNU GENERAL PUBLIC LICENSE, Version 2 + Classpath Exception"
    },
    {
      "moduleName": "com.martiansoftware:jsap",
      "moduleLicense": "LGPL"
    },
    {
      "moduleName": "org.hibernate.orm:hibernate-core",
      "moduleLicense": "GNU Library General Public License v2.1 or later"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "COMMON DEVELOPMENT AND DISTRIBUTION LICENSE (CDDL) Version 1.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Eclipse Public License 1.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Eclipse Public License - v 1.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Eclipse Public License v2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Eclipse Public License v. 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Eclipse Public License - v 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Eclipse Public License - Version 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Eclipse Public License, Version 2.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Ubuntu Font Licence 1.0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Bouncy Castle Licence"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "Public Domain, per Creative Commons CC0"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "The W3C License"
    },
    {
      "moduleName": ".*",
      "moduleLicense": "UnRar License"
    }
  ]
}
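
One way to read an allow-list of this shape is sketched below in Python. The matching rules here are assumptions made for illustration: `moduleName` is treated as a regular expression that must match the whole module coordinate, and `moduleLicense` is compared as an exact string. The build tool that actually consumes this file may match differently. The `is_allowed` helper and the `org.example:lib` coordinate are hypothetical.

```python
import json
import re

# A three-entry excerpt of the allow-list above, embedded so the sketch
# is self-contained.
ALLOWED_JSON = """
{
  "allowedLicenses": [
    {"moduleName": ".*", "moduleLicense": "MIT"},
    {"moduleName": ".*", "moduleLicense": "Apache-2.0"},
    {"moduleName": "com.martiansoftware:jsap", "moduleLicense": "LGPL"}
  ]
}
"""


def is_allowed(module, license_name, allowed):
    """Return True if any entry permits this (module, license) pair.

    Assumption: moduleName is a regex over the full coordinate and
    moduleLicense is an exact string; the real checker may differ.
    """
    for entry in allowed:
        name_matches = re.fullmatch(entry["moduleName"], module) is not None
        if name_matches and entry["moduleLicense"] == license_name:
            return True
    return False


allowed = json.loads(ALLOWED_JSON)["allowedLicenses"]

if __name__ == "__main__":
    # "org.example:lib" is a made-up coordinate used only for illustration.
    print(is_allowed("org.example:lib", "MIT", allowed))
    print(is_allowed("org.example:lib", "LGPL", allowed))
    print(is_allowed("com.martiansoftware:jsap", "LGPL", allowed))
```

Under these assumptions, the module-specific entries (such as the `LGPL` one) permit a license for a single dependency only, while the `.*` entries permit a license for every dependency.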