mirror of
https://github.com/Frooodle/Stirling-PDF.git
synced 2026-02-17 13:52:14 +01:00
feat(get-info-on-pdf): use PDFBox preflight to validate PDF compliancy level, and parse in compliancy type (#4595)
# Description of Changes

- Refactored methods for parsing and extracting PDF/A conformance levels from XMP metadata.
- Implemented PDF/A validation using Preflight from Apache PDFBox.
- Enhanced PDF information generation to include PDF/A conformance level and validation results.
- Updated compliance checks and JSON output to reflect new PDF/A capabilities.

### Test files:

[lorem-ipsum_PDFA1b.pdf](https://github.com/user-attachments/files/22687689/lorem-ipsum_PDFA1b.pdf)
[lorem-ipsum_PDFA_2b.pdf](https://github.com/user-attachments/files/22687692/lorem-ipsum_PDFA_2b.pdf)
[lorem-ipsum_PD⁄A3a.pdf](https://github.com/user-attachments/files/22687693/lorem-ipsum_PD.A3a.pdf)

### New results:

<img width="699" height="257" alt="image" src="https://github.com/user-attachments/assets/b8cb5510-2908-4e08-97f6-d5799e0e1be7" />
<img width="699" height="257" alt="image" src="https://github.com/user-attachments/assets/d7af3731-ad19-4524-b1c1-32f47776e6af" />
<img width="699" height="257" alt="image" src="https://github.com/user-attachments/assets/6e48e65b-2ebc-402a-a222-bfdbf783e45d" />

I also validated the results with online tools. Should be good now!

I was also thinking of moving this to GeneralUtils; it may be useful for the PDF/A converter in the future, or for other features. Not sure yet; for now I think this is good.

Closes #4568

---

## Checklist

### General

- [x] I have read the [Contribution Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md) (if applicable)
- [ ] I have read the [How to add new languages to Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md) (if applicable)
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/) (if functionality has heavily changed)
- [ ] I have read the section [Add New Translation Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags) (for new translation tags only)

### UI Changes (if applicable)

- [x] Screenshots or videos demonstrating the UI changes are attached (e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [x] I have tested my changes locally. Refer to the [Testing Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing) for more details.

---------

Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
@@ -447,7 +447,20 @@ public final class RegexPatternUtils {
        return getPattern("@\\s*([^\\s\\(]+(?:\\.[a-zA-Z0-9]+)?)");
    }

    // API doc parsing patterns
    /** Pattern for matching pdfaid:part attribute in XMP metadata */
    public Pattern getPdfAidPartPattern() {
        return getPattern("pdfaid:part[\"\\s]*=[\"\\s]*([0-9]+)");
    }

    /** Pattern for matching pdfaid:conformance attribute in XMP metadata */
    public Pattern getPdfAidConformancePattern() {
        return getPattern("pdfaid:conformance[\"\\s]*=[\"\\s]*([A-Za-z]+)");
    }

    /** Pattern for matching slash in page mode description */
    public Pattern getPageModePattern() {
        return getPattern("/");
    }

    /**
     * Pre-compile commonly used patterns for immediate availability. This eliminates first-call
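
As a rough illustration of how the two `pdfaid` patterns in this hunk might be applied, the sketch below runs the same regular expressions against a raw XMP string and combines the captures into a conformance label. The class `PdfAConformanceSketch`, the method `parse`, and the `PDF/A-<part><conformance>` label format are assumptions made for this example; they are not taken from the commit, which routes the patterns through `getPattern(...)` and may combine the results differently.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the commit): applies the two pdfaid patterns to XMP text. */
final class PdfAConformanceSketch {

    // Same expressions as getPdfAidPartPattern() and getPdfAidConformancePattern() in the diff,
    // compiled directly here instead of going through the project's pattern cache.
    private static final Pattern PART =
            Pattern.compile("pdfaid:part[\"\\s]*=[\"\\s]*([0-9]+)");
    private static final Pattern CONFORMANCE =
            Pattern.compile("pdfaid:conformance[\"\\s]*=[\"\\s]*([A-Za-z]+)");

    /**
     * Returns a label such as "PDF/A-2B" (assumed format), or empty when no
     * pdfaid:part attribute is found in the given XMP text.
     */
    static Optional<String> parse(String xmp) {
        Matcher part = PART.matcher(xmp);
        if (!part.find()) {
            return Optional.empty();
        }
        StringBuilder label = new StringBuilder("PDF/A-").append(part.group(1));
        Matcher conformance = CONFORMANCE.matcher(xmp);
        if (conformance.find()) {
            // Normalise the conformance letter to upper case for display.
            label.append(conformance.group(1).toUpperCase(Locale.ROOT));
        }
        return Optional.of(label.toString());
    }

    public static void main(String[] args) {
        String xmp = "<rdf:Description pdfaid:part=\"2\" pdfaid:conformance=\"B\"/>";
        System.out.println(parse(xmp).orElse("not PDF/A"));
    }
}
```

Both expressions require an `=` after the property name, so they match the attribute form (`pdfaid:part="2"`) but not the element form (`<pdfaid:part>2</pdfaid:part>`). In this sketch, XMP written in element form yields an empty result.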