feat(conversion): refactor EML parser to use Simple Java Mail library and add MSG support (#5427)

# Description of Changes


Note on Simple Java Mail (SJM):
- SJM includes Angus/Jakarta Mail.
- SJM is a very thin layer on top of Angus Mail; see
https://github.com/bbottema/simple-java-mail
- SJM provides high-level methods for parsing email more reliably via Angus
Mail, and also offers many other features.
- SJM is licensed under Apache 2.0.

This pull request updates the email processing utilities to add support
for parsing and validating Outlook MSG files, refactors the
`EmlProcessingUtils` utility class to use instance methods and improved
resource management, and enhances the handling and styling of generated
email HTML. The changes also introduce external CSS resource loading
with a fallback mechanism, and update dependencies to support MSG file
parsing.

**MSG file support and validation:**
- Added `simple-java-mail` and `outlook-module` dependencies to enable
EML and MSG file parsing, and updated validation logic to recognize and
accept MSG files by checking their magic bytes.
(`app/common/build.gradle`, `EmlProcessingUtils.java`)
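
As a rough illustration of the magic-byte check mentioned above: Outlook `.msg` files are OLE2 Compound File Binary containers, which begin with the 8-byte signature `D0 CF 11 E0 A1 B1 1A E1`. The sketch below is an assumption about how such a check could look; the names `MsgMagicCheck` and `looksLikeMsg` are hypothetical and are not taken from `EmlProcessingUtils.java`.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only; not the PR's actual code.
final class MsgMagicCheck {
    // OLE2 Compound File Binary signature that .msg files start with.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private MsgMagicCheck() {}

    // Returns true when the first 8 bytes match the OLE2 signature.
    static boolean looksLikeMsg(byte[] fileBytes) {
        if (fileBytes == null || fileBytes.length < OLE2_MAGIC.length) {
            return false;
        }
        // Range-based comparison (Java 9+) of the file prefix against the signature.
        return Arrays.equals(
                fileBytes, 0, OLE2_MAGIC.length,
                OLE2_MAGIC, 0, OLE2_MAGIC.length);
    }
}
```

The same signature is shared by other OLE2 formats (for example legacy `.doc` and `.xls`), so a check like this narrows the input rather than proving it is an email; the `.eml`/`.msg` extension check in the controller remains a separate step.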

**Refactoring and modernization of `EmlProcessingUtils`:**
- Converted static methods and fields in `EmlProcessingUtils` to
instance methods/fields, improving testability and future extensibility.
(`EmlProcessingUtils.java`)

**Enhanced HTML/CSS styling for email rendering:**
- Updated HTML generation to use consistent formatting and improved
style variable usage, and refactored CSS injection to load from an
external resource (`email-pdf-styles.css`) with a synchronized cache and
a minimal fallback if the resource is missing.
(`EmlProcessingUtils.java`)
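
The cached CSS loading with a fallback could be sketched roughly as follows. This is a minimal sketch under assumptions, not the code in `EmlProcessingUtils.java`: the class name `EmailCssLoader`, the fallback rule, and the constructor argument are hypothetical, and `System.err` stands in for the SLF4J logger so the example needs only the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical loader, for illustration only.
final class EmailCssLoader {
    // Assumed minimal fallback; the PR's real fallback styles are not shown here.
    static final String FALLBACK_CSS = "body { font-family: sans-serif; }";

    private final String resourcePath;
    private String cachedCss; // guarded by "this"

    EmailCssLoader(String resourcePath) {
        this.resourcePath = resourcePath;
    }

    // Synchronized so concurrent callers load the resource at most once.
    synchronized String getCss() {
        if (cachedCss == null) {
            cachedCss = loadOrFallback();
        }
        return cachedCss;
    }

    private String loadOrFallback() {
        // try-with-resources tolerates a null resource, so the null check is safe inside.
        try (InputStream in = EmailCssLoader.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                System.err.println("CSS resource not found, using fallback: " + resourcePath);
                return FALLBACK_CSS;
            }
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            System.err.println("Failed to read CSS resource, using fallback: " + e.getMessage());
            return FALLBACK_CSS;
        }
    }
}
```

A caller would construct it once, for example `new EmailCssLoader("/email-pdf-styles.css")`, and call `getCss()` whenever HTML is generated. Note that in this sketch the fallback is cached too, so a missing resource is reported once rather than on every conversion.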

**Attachment and content rendering improvements:**
- Improved the formatting of meta-information (e.g., CC, BCC, Date) and
attachment sections in generated email HTML, and ensured more robust
handling of empty or missing content. (`EmlProcessingUtils.java`)

**General code cleanup and logging:**
- Added SLF4J logging for error handling when loading CSS resources, and
cleaned up imports and method signatures for clarity and
maintainability. (`EmlProcessingUtils.java`)


<img width="367" height="991" alt="image"
src="https://github.com/user-attachments/assets/0cfb959c-da92-4cff-9e52-ff4ab7fa806e"
/>



---

## Checklist

### General

- [X] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [X] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [X] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [X] I have performed a self-review of my own code
- [X] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [X] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.

---------

Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
Balázs Szücs
2026-01-13 22:17:40 +01:00
committed by GitHub
parent daf27b6128
commit 84ed1d7ecb
12 changed files with 764 additions and 779 deletions


@@ -42,12 +42,12 @@ public class ConvertEmlToPDF {
 @AutoJobPostMapping(consumes = MediaType.MULTIPART_FORM_DATA_VALUE, value = "/eml/pdf")
 @StandardPdfResponse
 @Operation(
-summary = "Convert EML to PDF",
+summary = "Convert EML/MSG to PDF",
 description =
-"This endpoint converts EML (email) files to PDF format with extensive"
-+ " customization options. Features include font settings, image"
-+ " constraints, display modes, attachment handling, and HTML debug output."
-+ " Input: EML file, Output: PDF or HTML file. Type: SISO")
+"This endpoint converts EML (email) and MSG (Outlook) files to PDF format"
++ " with extensive customization options. Features include font settings,"
++ " image constraints, display modes, attachment handling, and HTML debug"
++ " output. Input: EML or MSG file, Output: PDF or HTML file. Type: SISO")
 public ResponseEntity<byte[]> convertEmlToPdf(@ModelAttribute EmlToPdfRequest request) {
 MultipartFile inputFile = request.getFileInput();
@@ -55,7 +55,7 @@ public class ConvertEmlToPDF {
 // Validate input
 if (inputFile.isEmpty()) {
-log.error("No file provided for EML to PDF conversion.");
+log.error("No file provided for EML/MSG to PDF conversion.");
 return ResponseEntity.badRequest()
 .body("No file provided".getBytes(StandardCharsets.UTF_8));
 }
@@ -66,12 +66,12 @@ public class ConvertEmlToPDF {
 .body("Please provide a valid filename".getBytes(StandardCharsets.UTF_8));
 }
-// Validate file type - support EML
+// Validate file type - support EML and MSG (Outlook) files
 String lowerFilename = originalFilename.toLowerCase(Locale.ROOT);
-if (!lowerFilename.endsWith(".eml")) {
-log.error("Invalid file type for EML to PDF: {}", originalFilename);
+if (!lowerFilename.endsWith(".eml") && !lowerFilename.endsWith(".msg")) {
+log.error("Invalid file type for EML/MSG to PDF: {}", originalFilename);
 return ResponseEntity.badRequest()
-.body("Please upload a valid EML file".getBytes(StandardCharsets.UTF_8));
+.body("Please upload a valid EML or MSG file".getBytes(StandardCharsets.UTF_8));
 }
 String baseFilename = Filenames.toSimpleFileName(originalFilename); // Use Filenames utility
@@ -82,7 +82,7 @@ public class ConvertEmlToPDF {
 if (request.isDownloadHtml()) {
 try {
 String htmlContent = EmlToPdf.convertEmlToHtml(fileBytes, request);
-log.info("Successfully converted EML to HTML: {}", originalFilename);
+log.info("Successfully converted email to HTML: {}", originalFilename);
 return WebResponseUtils.bytesToWebResponse(
 htmlContent.getBytes(StandardCharsets.UTF_8),
 baseFilename + ".html",
@@ -96,12 +96,11 @@ public class ConvertEmlToPDF {
 }
 }
-// Convert EML to PDF with enhanced options
+// Convert EML/MSG to PDF with enhanced options
 try {
 byte[] pdfBytes =
 EmlToPdf.convertEmlToPdf(
-runtimePathConfig
-.getWeasyPrintPath(), // Use configured WeasyPrint path
+runtimePathConfig.getWeasyPrintPath(),
 request,
 fileBytes,
 originalFilename,
@@ -116,19 +115,19 @@ public class ConvertEmlToPDF {
 "PDF conversion failed - empty output"
 .getBytes(StandardCharsets.UTF_8));
 }
-log.info("Successfully converted EML to PDF: {}", originalFilename);
+log.info("Successfully converted email to PDF: {}", originalFilename);
 return WebResponseUtils.bytesToWebResponse(
 pdfBytes, baseFilename + ".pdf", MediaType.APPLICATION_PDF);
 } catch (InterruptedException e) {
 Thread.currentThread().interrupt();
-log.error("EML to PDF conversion was interrupted for {}", originalFilename, e);
+log.error("Email to PDF conversion was interrupted for {}", originalFilename, e);
 return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
 .body("Conversion was interrupted".getBytes(StandardCharsets.UTF_8));
 } catch (IllegalArgumentException e) {
 String errorMessage = buildErrorMessage(e, originalFilename);
 log.error(
-"EML to PDF conversion failed for {}: {}",
+"Email to PDF conversion failed for {}: {}",
 originalFilename,
 errorMessage,
 e);
@@ -137,7 +136,7 @@ public class ConvertEmlToPDF {
 } catch (RuntimeException e) {
 String errorMessage = buildErrorMessage(e, originalFilename);
 log.error(
-"EML to PDF conversion failed for {}: {}",
+"Email to PDF conversion failed for {}: {}",
 originalFilename,
 errorMessage,
 e);
@@ -146,7 +145,7 @@ public class ConvertEmlToPDF {
 }
 } catch (IOException e) {
-log.error("File processing error for EML to PDF: {}", originalFilename, e);
+log.error("File processing error for email to PDF: {}", originalFilename, e);
 return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
 .body("File processing error".getBytes(StandardCharsets.UTF_8));
 }