feat(crop): add auto-crop functionality to detect and remove white space (#4847)

# Description of Changes

TLDR:
- Implemented automatic content bounds detection in CropController
- Added new auto-crop option in CropPdfForm and HTML template
- Updated JavaScript for better PDF rendering and interaction handling
- Also updated error handling and resource management in cropping
functions

Introduces auto-crop feature for automatic whitespace detection and
removal, plus resource management improvements.

### Auto-Crop Detection
- Automatically detects and removes whitespace using content boundary
detection
- Adaptive pixel sampling for images >2000px
- Sequential page processing for memory efficiency
- New opt-in checkbox in crop interface
- 150 DPI rendering for accurate detection
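
The detection idea listed above can be sketched roughly as follows. This is a simplified illustration, not the controller's code: the class `ContentBoundsSketch` and its method names are hypothetical, it scans every pixel with no sampling step, and it assumes a rendered page is available as a `BufferedImage`. The white threshold of 250 and the 150 DPI figure come from this change.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;

/** Hypothetical sketch: bounding box of non-white pixels, converted to PDF points. */
final class ContentBoundsSketch {
    static final int WHITE_THRESHOLD = 250; // a pixel is white if every channel is >= this

    static boolean isWhite(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return r >= WHITE_THRESHOLD && g >= WHITE_THRESHOLD && b >= WHITE_THRESHOLD;
    }

    /** Returns {left, top, right, bottom} in image coordinates (y grows downward), or null if blank. */
    static int[] detect(BufferedImage image) {
        int left = Integer.MAX_VALUE, top = Integer.MAX_VALUE, right = -1, bottom = -1;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if (!isWhite(image.getRGB(x, y))) {
                    left = Math.min(left, x);
                    right = Math.max(right, x);
                    top = Math.min(top, y);
                    bottom = Math.max(bottom, y);
                }
            }
        }
        return right < 0 ? null : new int[] {left, top, right, bottom};
    }

    /** Converts pixel bounds to {x, y, width, height} in PDF points; the PDF y axis grows upward. */
    static float[] toPdfRect(int[] px, int imageHeight, float pointsPerPixel) {
        float x = px[0] * pointsPerPixel;
        float y = (imageHeight - 1 - px[3]) * pointsPerPixel; // flip the bottom pixel row
        float width = (px[2] - px[0]) * pointsPerPixel;
        float height = (px[3] - px[1]) * pointsPerPixel;
        return new float[] {x, y, width, height};
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 100; y++) {
            for (int x = 0; x < 100; x++) {
                boolean content = x >= 25 && x < 75 && y >= 10 && y < 40;
                img.setRGB(x, y, content ? 0x000000 : 0xFFFFFF);
            }
        }
        int[] bounds = detect(img);
        System.out.println(Arrays.toString(bounds));
        // 72 points per inch divided by 150 pixels per inch
        System.out.println(Arrays.toString(toPdfRect(bounds, 100, 72f / 150f)));
    }
}
```

The vertical flip in `toPdfRect` is the part that is easy to get wrong: image rows count down from the top, while PDF coordinates count up from the bottom.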

### Resource Management
- Fixed document lifecycle to prevent resource access violations
- Source documents remain open during LayerUtility operations
- Proper nested try-with-resources implementation
- Prevents crashes with `LayerUtility.importPageAsForm()`
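
As a rough illustration of the lifecycle point above, the following sketch shows how nested try-with-resources keeps the outer resource open while the inner one is used, and closes them in reverse order. `TrackedResource` is a hypothetical stand-in for a document object. No PDFBox types are used here.

```java
import java.util.ArrayList;
import java.util.List;

final class NestedResourcesSketch {
    /** Hypothetical stand-in for a document; records when it is closed. */
    static final class TrackedResource implements AutoCloseable {
        private final String name;
        private final List<String> log;
        private boolean open = true;

        TrackedResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
            log.add("close " + name);
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (TrackedResource source = new TrackedResource("source", log)) {
            try (TrackedResource target = new TrackedResource("target", log)) {
                // Work that reads from the source while writing into the target
                log.add("import page, source open = " + source.isOpen());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The inner block exits first, so the target is closed before the source, and the source cannot be closed while page import is still running.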

### Memory Optimization
- Early exit for images <1px
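- Adaptive sampling: checks every 2nd pixel when either dimension exceeds 2000px, cutting pixel reads roughly fourfold on large pages (the scan stays proportional to the pixel count)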
- Explicit BufferedImage nulling after processing
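
To give a feel for what the sampling step saves, here is a small sketch that counts worst-case pixel reads for one full pass over a page (an all-white page, where every sampled pixel is visited). `SamplingSketch` and its method names are hypothetical. The step rule mirrors the one in this change (step 2 once either side exceeds 2000px), and the page sizes are only examples.

```java
final class SamplingSketch {
    /** Step rule as described in this change: every 2nd pixel once either side exceeds 2000px. */
    static int stepFor(int width, int height) {
        return (width > 2000 || height > 2000) ? 2 : 1;
    }

    /** Worst-case pixel reads for one full pass over the image. */
    static long worstCaseReads(int width, int height) {
        int step = stepFor(width, height);
        long cols = (width + step - 1) / step; // ceil(width / step)
        long rows = (height + step - 1) / step; // ceil(height / step)
        return cols * rows;
    }

    public static void main(String[] args) {
        // US Letter (8.5 x 11 in) is about 1275 x 1650 px at 150 DPI: step stays 1
        System.out.println(worstCaseReads(1275, 1650));
        // The same page at 300 DPI is 2550 x 3300 px: step becomes 2
        System.out.println(worstCaseReads(2550, 3300));
    }
}
```

Under these assumptions a Letter page rendered at 150 DPI stays below the 2000px threshold, so sampling only applies to larger page sizes or higher resolutions.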

### Error Handling
- Proper try-with-resources for temporary files
- Defensive null checks and interrupt handling
- Updated exception messages
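
The interrupt-handling and temp-file points above might look roughly like this in isolation. `InterruptHandlingSketch` and `ExternalStep` are hypothetical names standing in for the external process call, and the sketch uses a single temp file for brevity.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class InterruptHandlingSketch {
    /** Hypothetical stand-in for an external process call that may be interrupted. */
    interface ExternalStep {
        void run(Path input) throws InterruptedException, IOException;
    }

    static void process(ExternalStep step) throws IOException {
        Path temp = null; // stays null if creation fails, so cleanup can skip it
        try {
            temp = Files.createTempFile("sketch_input", ".pdf");
            step.run(temp);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag for callers
            throw new IOException("Processing was interrupted", e);
        } finally {
            if (temp != null) {
                try {
                    Files.deleteIfExists(temp);
                } catch (IOException e) {
                    // Best effort: a failed cleanup should not mask the real result
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        process(path -> System.out.println("working on " + path.getFileName()));
    }
}
```

Re-setting the interrupt flag before rethrowing lets code further up the stack still observe that the thread was asked to stop.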

### CropPdfForm
- Changed coordinates from primitive to wrapper types (Float) for
nullable support
- Added `autoCrop` boolean field (default false)
- Maintains backward compatibility
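
A minimal sketch of why the wrapper types matter: with `Float` fields, a missing coordinate is `null` rather than `0`, so the request can be validated only when the manual path is taken. `CropRequestSketch` and `chooseMode` are hypothetical names, and the real form class has more fields than shown.

```java
final class CropRequestSketch {
    // Wrapper types: null means the client did not send the value
    Float x, y, width, height;
    boolean autoCrop = false; // opt-in, so existing clients keep the manual path

    /** Returns "auto" or "manual"; rejects a manual request with missing coordinates. */
    static String chooseMode(CropRequestSketch r) {
        if (r.autoCrop) {
            return "auto"; // coordinates are ignored in this mode
        }
        if (r.x == null || r.y == null || r.width == null || r.height == null) {
            throw new IllegalArgumentException(
                    "Crop coordinates (x, y, width, height) are required when auto-crop is not enabled");
        }
        return "manual";
    }

    public static void main(String[] args) {
        CropRequestSketch auto = new CropRequestSketch();
        auto.autoCrop = true;
        System.out.println(chooseMode(auto));
    }
}
```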


### crop.js
- Track and cancel render tasks to prevent memory leaks
- Improved resize handling without file reloading
- Fixed nested event listeners
- Canvas clearing before rendering prevents artifacts
- DOM layout stability improvements


### Front-end

<img width="956" height="802" alt="image"
src="https://github.com/user-attachments/assets/2e8e5bd2-4948-4df1-9937-3358b36d03a0"
/>


### Samples:

<img width="960" height="775" alt="image"
src="https://github.com/user-attachments/assets/b27d3c65-1517-4318-b3d2-ca2d9864abf9"
/>


[test_cropped-2.pdf](https://github.com/user-attachments/files/23436674/test_cropped-2.pdf)

<img width="960" height="775" alt="image"
src="https://github.com/user-attachments/assets/095374fb-9e89-4ea1-a5c7-4287c909e20a"
/>


[pdf_hyperlink_example_cropped.pdf](https://github.com/user-attachments/files/23436678/pdf_hyperlink_example_cropped.pdf)

<img width="960" height="775" alt="image"
src="https://github.com/user-attachments/assets/b01e3633-15b7-4eea-a99b-09d875a380b4"
/>


[Sample-Fillable-PDF_cropped.pdf](https://github.com/user-attachments/files/23436680/Sample-Fillable-PDF_cropped.pdf)

<img width="960" height="858" alt="image"
src="https://github.com/user-attachments/assets/74824e8f-2f45-4e9d-9bd3-ed1248146f81"
/>


[sample-2_cropped.pdf](https://github.com/user-attachments/files/23436684/sample-2_cropped.pdf)

Closes: #1351

---

## Checklist

### General

- [X] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [X] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [X] I have performed a self-review of my own code
- [X] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [X] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [X] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.

---------

Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
Balázs Szücs 2025-11-17 12:53:15 +01:00 committed by GitHub
parent 7acc716b80
commit c895b09142
6 changed files with 979 additions and 93 deletions

@@ -1,5 +1,6 @@
package stirling.software.SPDF.controller.api;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
@@ -13,6 +14,7 @@ import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ModelAttribute;
@@ -39,8 +41,86 @@ import stirling.software.common.util.WebResponseUtils;
@Slf4j
public class CropController {
private static final int DEFAULT_RENDER_DPI = 150;
private static final int WHITE_THRESHOLD = 250;
private static final String TEMP_INPUT_PREFIX = "crop_input";
private static final String TEMP_OUTPUT_PREFIX = "crop_output";
private static final String PDF_EXTENSION = ".pdf";
private final CustomPDFDocumentFactory pdfDocumentFactory;
private static int[] detectContentBounds(BufferedImage image) {
int width = image.getWidth();
int height = image.getHeight();
// Early exit if image is too small
if (width < 1 || height < 1) {
return new int[] {0, 0, width - 1, height - 1};
}
// Sample every nth pixel for large images to reduce processing time
int step = (width > 2000 || height > 2000) ? 2 : 1;
int top = 0;
boolean found = false;
for (int y = 0; y < height && !found; y += step) {
for (int x = 0; x < width; x += step) {
if (!isWhite(image.getRGB(x, y), WHITE_THRESHOLD)) {
top = y;
found = true;
break;
}
}
}
int bottom = height - 1;
found = false;
for (int y = height - 1; y >= 0 && !found; y -= step) {
for (int x = 0; x < width; x += step) {
if (!isWhite(image.getRGB(x, y), WHITE_THRESHOLD)) {
bottom = y;
found = true;
break;
}
}
}
int left = 0;
found = false;
for (int x = 0; x < width && !found; x += step) {
for (int y = top; y <= bottom; y += step) {
if (!isWhite(image.getRGB(x, y), WHITE_THRESHOLD)) {
left = x;
found = true;
break;
}
}
}
int right = width - 1;
found = false;
for (int x = width - 1; x >= 0 && !found; x -= step) {
for (int y = top; y <= bottom; y += step) {
if (!isWhite(image.getRGB(x, y), WHITE_THRESHOLD)) {
right = x;
found = true;
break;
}
}
}
// Return bounds in format: [left, bottom, right, top]
// Note: Image coordinates are top-down, PDF coordinates are bottom-up
return new int[] {left, height - bottom - 1, right, height - top - 1};
}
private static boolean isWhite(int rgb, int threshold) {
int r = (rgb >> 16) & 0xFF;
int g = (rgb >> 8) & 0xFF;
int b = rgb & 0xFF;
return r >= threshold && g >= threshold && b >= threshold;
}
@PostMapping(value = "/crop", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
@Operation(
summary = "Crops a PDF document",
@@ -48,6 +128,18 @@ public class CropController {
"This operation takes an input PDF file and crops it according to the given"
+ " coordinates. Input:PDF Output:PDF Type:SISO")
public ResponseEntity<byte[]> cropPdf(@ModelAttribute CropPdfForm request) throws IOException {
if (request.isAutoCrop()) {
return cropWithAutomaticDetection(request);
}
if (request.getX() == null
|| request.getY() == null
|| request.getWidth() == null
|| request.getHeight() == null) {
throw new IllegalArgumentException(
"Crop coordinates (x, y, width, height) are required when auto-crop is not enabled");
}
if (request.isRemoveDataOutsideCrop()) {
return cropWithGhostscript(request);
} else {
@@ -55,86 +147,145 @@ public class CropController {
  }
  }
+ private ResponseEntity<byte[]> cropWithAutomaticDetection(@ModelAttribute CropPdfForm request)
+ throws IOException {
+ try (PDDocument sourceDocument = pdfDocumentFactory.load(request)) {
+ try (PDDocument newDocument =
+ pdfDocumentFactory.createNewDocumentBasedOnOldDocument(sourceDocument)) {
+ PDFRenderer renderer = new PDFRenderer(sourceDocument);
+ LayerUtility layerUtility = new LayerUtility(newDocument);
+ for (int i = 0; i < sourceDocument.getNumberOfPages(); i++) {
+ PDPage sourcePage = sourceDocument.getPage(i);
+ PDRectangle mediaBox = sourcePage.getMediaBox();
+ BufferedImage image = renderer.renderImageWithDPI(i, DEFAULT_RENDER_DPI);
+ int[] bounds = detectContentBounds(image);
+ float scaleX = mediaBox.getWidth() / image.getWidth();
+ float scaleY = mediaBox.getHeight() / image.getHeight();
+ CropBounds cropBounds = CropBounds.fromPixels(bounds, scaleX, scaleY);
+ PDPage newPage = new PDPage(mediaBox);
+ newDocument.addPage(newPage);
+ try (PDPageContentStream contentStream =
+ new PDPageContentStream(
+ newDocument, newPage, AppendMode.OVERWRITE, true, true)) {
+ PDFormXObject formXObject =
+ layerUtility.importPageAsForm(sourceDocument, i);
+ contentStream.saveGraphicsState();
+ contentStream.addRect(
+ cropBounds.x, cropBounds.y, cropBounds.width, cropBounds.height);
+ contentStream.clip();
+ contentStream.drawForm(formXObject);
+ contentStream.restoreGraphicsState();
+ }
+ newPage.setMediaBox(
+ new PDRectangle(
+ cropBounds.x,
+ cropBounds.y,
+ cropBounds.width,
+ cropBounds.height));
+ }
+ ByteArrayOutputStream baos = new ByteArrayOutputStream();
+ newDocument.save(baos);
+ byte[] pdfContent = baos.toByteArray();
+ return WebResponseUtils.bytesToWebResponse(
+ pdfContent,
+ GeneralUtils.generateFilename(
+ request.getFileInput().getOriginalFilename(), "_cropped.pdf"));
+ }
+ }
+ }
  private ResponseEntity<byte[]> cropWithPDFBox(@ModelAttribute CropPdfForm request)
  throws IOException {
- PDDocument sourceDocument = pdfDocumentFactory.load(request);
+ try (PDDocument sourceDocument = pdfDocumentFactory.load(request)) {
- PDDocument newDocument =
- pdfDocumentFactory.createNewDocumentBasedOnOldDocument(sourceDocument);
+ try (PDDocument newDocument =
+ pdfDocumentFactory.createNewDocumentBasedOnOldDocument(sourceDocument)) {
+ int totalPages = sourceDocument.getNumberOfPages();
+ LayerUtility layerUtility = new LayerUtility(newDocument);
- int totalPages = sourceDocument.getNumberOfPages();
+ for (int i = 0; i < totalPages; i++) {
+ PDPage sourcePage = sourceDocument.getPage(i);
- LayerUtility layerUtility = new LayerUtility(newDocument);
+ // Create a new page with the size of the source page
+ PDPage newPage = new PDPage(sourcePage.getMediaBox());
+ newDocument.addPage(newPage);
+ try (PDPageContentStream contentStream =
+ new PDPageContentStream(
+ newDocument, newPage, AppendMode.OVERWRITE, true, true)) {
+ // Import the source page as a form XObject
+ PDFormXObject formXObject =
+ layerUtility.importPageAsForm(sourceDocument, i);
- for (int i = 0; i < totalPages; i++) {
- PDPage sourcePage = sourceDocument.getPage(i);
+ contentStream.saveGraphicsState();
- // Create a new page with the size of the source page
- PDPage newPage = new PDPage(sourcePage.getMediaBox());
- newDocument.addPage(newPage);
- PDPageContentStream contentStream =
- new PDPageContentStream(newDocument, newPage, AppendMode.OVERWRITE, true, true);
+ // Define the crop area
+ contentStream.addRect(
+ request.getX(),
+ request.getY(),
+ request.getWidth(),
+ request.getHeight());
+ contentStream.clip();
- // Import the source page as a form XObject
- PDFormXObject formXObject = layerUtility.importPageAsForm(sourceDocument, i);
+ // Draw the entire formXObject
+ contentStream.drawForm(formXObject);
- contentStream.saveGraphicsState();
+ contentStream.restoreGraphicsState();
+ }
- // Define the crop area
- contentStream.addRect(
- request.getX(), request.getY(), request.getWidth(), request.getHeight());
- contentStream.clip();
+ // Now, set the new page's media box to the cropped size
+ newPage.setMediaBox(
+ new PDRectangle(
+ request.getX(),
+ request.getY(),
+ request.getWidth(),
+ request.getHeight()));
+ }
- // Draw the entire formXObject
- contentStream.drawForm(formXObject);
+ ByteArrayOutputStream baos = new ByteArrayOutputStream();
+ newDocument.save(baos);
- contentStream.restoreGraphicsState();
- contentStream.close();
- // Now, set the new page's media box to the cropped size
- newPage.setMediaBox(
- new PDRectangle(
- request.getX(),
- request.getY(),
- request.getWidth(),
- request.getHeight()));
+ byte[] pdfContent = baos.toByteArray();
+ return WebResponseUtils.bytesToWebResponse(
+ pdfContent,
+ GeneralUtils.generateFilename(
+ request.getFileInput().getOriginalFilename(), "_cropped.pdf"));
+ }
  }
- ByteArrayOutputStream baos = new ByteArrayOutputStream();
- newDocument.save(baos);
- newDocument.close();
- sourceDocument.close();
- byte[] pdfContent = baos.toByteArray();
- return WebResponseUtils.bytesToWebResponse(
- pdfContent,
- GeneralUtils.generateFilename(
- request.getFileInput().getOriginalFilename(), "_cropped.pdf"));
  }
  private ResponseEntity<byte[]> cropWithGhostscript(@ModelAttribute CropPdfForm request)
  throws IOException {
- PDDocument sourceDocument = pdfDocumentFactory.load(request);
+ Path tempInputFile = null;
+ Path tempOutputFile = null;
- for (int i = 0; i < sourceDocument.getNumberOfPages(); i++) {
- PDPage page = sourceDocument.getPage(i);
- PDRectangle cropBox =
- new PDRectangle(
- request.getX(),
- request.getY(),
- request.getWidth(),
- request.getHeight());
- page.setCropBox(cropBox);
- }
+ try (PDDocument sourceDocument = pdfDocumentFactory.load(request)) {
+ for (int i = 0; i < sourceDocument.getNumberOfPages(); i++) {
+ PDPage page = sourceDocument.getPage(i);
+ PDRectangle cropBox =
+ new PDRectangle(
+ request.getX(),
+ request.getY(),
+ request.getWidth(),
+ request.getHeight());
+ page.setCropBox(cropBox);
+ }
- Path tempInputFile = Files.createTempFile("crop_input", ".pdf");
- Path tempOutputFile = Files.createTempFile("crop_output", ".pdf");
+ tempInputFile = Files.createTempFile(TEMP_INPUT_PREFIX, PDF_EXTENSION);
+ tempOutputFile = Files.createTempFile(TEMP_OUTPUT_PREFIX, PDF_EXTENSION);
- try {
  // Save the source document with crop boxes
  sourceDocument.save(tempInputFile.toFile());
- sourceDocument.close();
  // Execute Ghostscript to process the crop boxes
  ProcessExecutor processExecutor =
  ProcessExecutor.getInstance(ProcessExecutor.Processes.GHOSTSCRIPT);
  List<String> command =
@@ -152,19 +303,34 @@ public class CropController {
return WebResponseUtils.bytesToWebResponse(
pdfContent,
request.getFileInput().getOriginalFilename().replaceFirst("[.][^.]+$", "")
+ "_cropped.pdf");
GeneralUtils.generateFilename(
request.getFileInput().getOriginalFilename(), "_cropped.pdf"));
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new IOException("Ghostscript processing was interrupted", e);
} finally {
try {
if (tempInputFile != null) {
Files.deleteIfExists(tempInputFile);
}
if (tempOutputFile != null) {
Files.deleteIfExists(tempOutputFile);
}
} catch (IOException e) {
log.debug("Failed to delete temporary files", e);
}
}
}
private record CropBounds(float x, float y, float width, float height) {
static CropBounds fromPixels(int[] pixelBounds, float scaleX, float scaleY) {
if (pixelBounds.length != 4) {
throw new IllegalArgumentException(
"pixelBounds array must contain exactly 4 elements: [x1, y1, x2, y2]");
}
float x = pixelBounds[0] * scaleX;
float y = pixelBounds[1] * scaleY;
float width = (pixelBounds[2] - pixelBounds[0]) * scaleX;
float height = (pixelBounds[3] - pixelBounds[1]) * scaleY;
return new CropBounds(x, y, width, height);
}
}
}

@@ -14,21 +14,24 @@ public class CropPdfForm extends PDFFile {
  @Schema(
  description = "The x-coordinate of the top-left corner of the crop area",
  type = "number")
- private float x;
+ private Float x;
  @Schema(
  description = "The y-coordinate of the top-left corner of the crop area",
  type = "number")
- private float y;
+ private Float y;
  @Schema(description = "The width of the crop area", type = "number")
- private float width;
+ private Float width;
  @Schema(description = "The height of the crop area", type = "number")
- private float height;
+ private Float height;
  @Schema(
  description = "Whether to remove text outside the crop area (keeps images)",
  type = "boolean")
  private boolean removeDataOutsideCrop = true;
+ @Schema(description = "Enable auto-crop to detect and remove white space", type = "boolean")
+ private boolean autoCrop = false;
  }

@@ -1143,7 +1143,7 @@ adjustContrast.download=Download
crop.title=Crop
crop.header=Crop PDF
crop.submit=Submit
crop.autoCrop=Auto-crop (detect and remove white space)
#autoSplitPDF
autoSplitPDF.title=Auto Split PDF

@@ -2,7 +2,6 @@ let pdfCanvas = document.getElementById('cropPdfCanvas');
  let overlayCanvas = document.getElementById('overlayCanvas');
  let canvasesContainer = document.getElementById('canvasesContainer');
  canvasesContainer.style.display = 'none';
- let containerRect = canvasesContainer.getBoundingClientRect();
  let context = pdfCanvas.getContext('2d');
  let overlayContext = overlayCanvas.getContext('2d');
@@ -30,9 +29,16 @@ let rectHeight = 0;
  let pageScale = 1; // The scale which the pdf page renders
  let timeId = null; // timeout id for resizing canvases event
+ let currentRenderTask = null; // Track current PDF render task to cancel if needed
  function renderPageFromFile(file) {
  if (file.type === 'application/pdf') {
+ // Cancel any ongoing render task when loading a new file
+ if (currentRenderTask) {
+ currentRenderTask.cancel();
+ currentRenderTask = null;
+ }
  let reader = new FileReader();
  reader.onload = function (ev) {
  let typedArray = new Uint8Array(reader.result);
@@ -51,7 +57,7 @@ window.addEventListener('resize', function () {
  clearTimeout(timeId);
  timeId = setTimeout(function () {
- if (fileInput.files.length == 0) return;
+ if (!pdfDoc) return; // Only resize if we have a PDF loaded
  let canvasesContainer = document.getElementById('canvasesContainer');
  let containerRect = canvasesContainer.getBoundingClientRect();
@@ -59,35 +65,33 @@ window.addEventListener('resize', function () {
  overlayContext.clearRect(0, 0, overlayCanvas.width, overlayCanvas.height);
- pdfCanvas.width = containerRect.width;
- pdfCanvas.height = containerRect.height;
- overlayCanvas.width = containerRect.width;
- overlayCanvas.height = containerRect.height;
- let file = fileInput.files[0];
- renderPageFromFile(file);
+ // Re-render with new container size
+ renderPage(currentPage);
  }, 1000);
  });
- fileInput.addEventListener('change', function (e) {
- fileInput.addEventListener('file-input-change', async (e) => {
- const {allFiles} = e.detail;
- if (allFiles && allFiles.length > 0) {
- canvasesContainer.style.display = 'block'; // set for visual purposes
+ fileInput.addEventListener('file-input-change', async (e) => {
+ if (!e.detail) return; // Guard against null detail
+ const {allFiles} = e.detail;
+ if (allFiles && allFiles.length > 0) {
+ canvasesContainer.style.display = 'block'; // set for visual purposes
+ // Wait for the layout to be updated before rendering
+ setTimeout(() => {
  let file = allFiles[0];
  renderPageFromFile(file);
- }
- });
+ }, 100);
+ }
  });
  cropForm.addEventListener('submit', function (e) {
  if (xInput.value == '' && yInput.value == '' && widthInput.value == '' && heightInput.value == '') {
- // Ορίστε συντεταγμένες για ολόκληρη την επιφάνεια του PDF
+ // Set coordinates for the entire PDF surface
+ let currentContainerRect = canvasesContainer.getBoundingClientRect();
  xInput.value = 0;
  yInput.value = 0;
- widthInput.value = containerRect.width;
- heightInput.value = containerRect.height;
+ widthInput.value = currentContainerRect.width;
+ heightInput.value = currentContainerRect.height;
  }
  });
@@ -135,16 +139,24 @@ overlayCanvas.addEventListener('mouseup', function (e) {
  });
  function renderPage(pageNumber) {
+ // Cancel any ongoing render task
+ if (currentRenderTask) {
+ currentRenderTask.cancel();
+ currentRenderTask = null;
+ }
  pdfDoc.getPage(pageNumber).then(function (page) {
  let canvasesContainer = document.getElementById('canvasesContainer');
  let containerRect = canvasesContainer.getBoundingClientRect();
  pageScale = containerRect.width / page.getViewport({scale: 1}).width; // The new scale
- let viewport = page.getViewport({scale: containerRect.width / page.getViewport({scale: 1}).width});
+ // Normalize rotation to 0, 90, 180, or 270 degrees
+ let normalizedRotation = ((page.rotate % 360) + 360) % 360;
+ let viewport = page.getViewport({scale: pageScale, rotation: normalizedRotation});
- canvasesContainer.width = viewport.width;
- canvasesContainer.height = viewport.height;
+ // Don't set container width, let CSS handle it
+ canvasesContainer.style.height = viewport.height + 'px';
  pdfCanvas.width = viewport.width;
  pdfCanvas.height = viewport.height;
@@ -152,8 +164,21 @@ function renderPage(pageNumber) {
  overlayCanvas.width = viewport.width; // Match overlay canvas size with PDF canvas
  overlayCanvas.height = viewport.height;
+ context.clearRect(0, 0, pdfCanvas.width, pdfCanvas.height);
+ context.fillStyle = 'white';
+ context.fillRect(0, 0, pdfCanvas.width, pdfCanvas.height);
  let renderContext = {canvasContext: context, viewport: viewport};
- page.render(renderContext);
- pdfCanvas.classList.add('shadow-canvas');
+ currentRenderTask = page.render(renderContext);
+ currentRenderTask.promise.then(function() {
+ currentRenderTask = null;
+ pdfCanvas.classList.add('shadow-canvas');
+ }).catch(function(error) {
+ if (error.name !== 'RenderingCancelledException') {
+ console.error('PDF rendering error:', error);
+ }
+ currentRenderTask = null;
+ });
  });
  }

@@ -23,6 +23,12 @@
<input id="width" type="hidden" name="width">
<input id="height" type="hidden" name="height">
<div class="form-check mb-3">
<input id="autoCrop" name="autoCrop" type="checkbox">
<label for="autoCrop" th:text="#{crop.autoCrop}"></label>
<input name="autoCrop" type="hidden" value="false" />
</div>
<button type="submit" id="submitBtn" class="btn btn-primary" th:text="#{crop.submit}"></button>
</form>
<div id="canvasesContainer" style="position: relative; margin: 20px 0; width: auto;">

@@ -0,0 +1,686 @@
package stirling.software.SPDF.controller.api;
import static org.assertj.core.api.Assertions.*;
import static org.mockito.Mockito.*;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.lang.reflect.Method;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.junit.jupiter.api.*;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.io.TempDir;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;
import org.junit.jupiter.params.provider.ValueSource;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.mock.web.MockMultipartFile;
import stirling.software.SPDF.model.api.general.CropPdfForm;
import stirling.software.common.service.CustomPDFDocumentFactory;
@ExtendWith(MockitoExtension.class)
@DisplayName("CropController Tests")
class CropControllerTest {
@TempDir Path tempDir;
@Mock private CustomPDFDocumentFactory pdfDocumentFactory;
@InjectMocks private CropController cropController;
private TestPdfFactory pdfFactory;
@BeforeEach
void setUp() {
pdfFactory = new TestPdfFactory();
}
private static class CropRequestBuilder {
private final CropPdfForm form = new CropPdfForm();
CropRequestBuilder withFile(MockMultipartFile file) {
form.setFileInput(file);
return this;
}
CropRequestBuilder withCoordinates(float x, float y, float width, float height) {
form.setX(x);
form.setY(y);
form.setWidth(width);
form.setHeight(height);
return this;
}
CropRequestBuilder withAutoCrop(boolean autoCrop) {
form.setAutoCrop(autoCrop);
return this;
}
CropRequestBuilder withRemoveDataOutsideCrop(boolean remove) {
form.setRemoveDataOutsideCrop(remove);
return this;
}
CropPdfForm build() {
return form;
}
}
private class TestPdfFactory {
private static final PDType1Font HELVETICA =
new PDType1Font(Standard14Fonts.FontName.HELVETICA);
MockMultipartFile createStandardPdf(String filename) throws IOException {
return createPdf(filename, PDRectangle.LETTER, null);
}
MockMultipartFile createPdfWithContent(String filename, String content) throws IOException {
return createPdf(filename, PDRectangle.LETTER, content);
}
MockMultipartFile createPdfWithSize(String filename, PDRectangle size) throws IOException {
return createPdf(filename, size, null);
}
MockMultipartFile createPdf(String filename, PDRectangle pageSize, String content)
throws IOException {
Path testPdfPath = tempDir.resolve(filename);
try (PDDocument doc = new PDDocument()) {
PDPage page = new PDPage(pageSize);
doc.addPage(page);
if (content != null && !content.isEmpty()) {
try (PDPageContentStream contentStream = new PDPageContentStream(doc, page)) {
contentStream.beginText();
contentStream.setFont(HELVETICA, 12);
contentStream.newLineAtOffset(50, pageSize.getHeight() - 50);
contentStream.showText(content);
contentStream.endText();
}
}
doc.save(testPdfPath.toFile());
}
return new MockMultipartFile(
"fileInput",
filename,
MediaType.APPLICATION_PDF_VALUE,
Files.readAllBytes(testPdfPath));
}
MockMultipartFile createPdfWithCenteredContent(String filename, String content)
throws IOException {
Path testPdfPath = tempDir.resolve(filename);
PDRectangle pageSize = PDRectangle.LETTER;
try (PDDocument doc = new PDDocument()) {
PDPage page = new PDPage(pageSize);
doc.addPage(page);
if (content != null && !content.isEmpty()) {
try (PDPageContentStream contentStream = new PDPageContentStream(doc, page)) {
contentStream.beginText();
contentStream.setFont(HELVETICA, 12);
float x = pageSize.getWidth() / 2 - 50;
float y = pageSize.getHeight() / 2;
contentStream.newLineAtOffset(x, y);
contentStream.showText(content);
contentStream.endText();
}
}
doc.save(testPdfPath.toFile());
}
return new MockMultipartFile(
"fileInput",
filename,
MediaType.APPLICATION_PDF_VALUE,
Files.readAllBytes(testPdfPath));
}
}
@Nested
@DisplayName("Manual Crop with PDFBox")
class ManualCropPDFBoxTests {
@Test
@DisplayName(
"Should successfully crop PDF using PDFBox when removeDataOutsideCrop is false")
void shouldCropPdfSuccessfullyWithPDFBox() throws IOException {
MockMultipartFile testFile = pdfFactory.createStandardPdf("test.pdf");
CropPdfForm request =
new CropRequestBuilder()
.withFile(testFile)
.withCoordinates(50f, 50f, 512f, 692f)
.withRemoveDataOutsideCrop(false)
.withAutoCrop(false)
.build();
PDDocument mockDocument = mock(PDDocument.class);
PDDocument newDocument = mock(PDDocument.class);
when(pdfDocumentFactory.load(request)).thenReturn(mockDocument);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(mockDocument))
.thenReturn(newDocument);
ResponseEntity<byte[]> response = cropController.cropPdf(request);
assertThat(response)
.isNotNull()
.extracting(ResponseEntity::getStatusCode, ResponseEntity::getBody)
.satisfies(
tuple -> {
assertThat(tuple.get(0)).isEqualTo(HttpStatus.OK);
assertThat(tuple.get(1)).isNotNull();
});
verify(pdfDocumentFactory).load(request);
verify(pdfDocumentFactory).createNewDocumentBasedOnOldDocument(mockDocument);
verify(mockDocument, times(1)).close();
verify(newDocument, times(1)).close();
}
@ParameterizedTest
@CsvSource({"50, 50, 512, 692", "0, 0, 300, 400", "100, 100, 400, 600"})
@DisplayName("Should handle various coordinate sets correctly")
void shouldHandleVariousCoordinates(float x, float y, float width, float height)
throws IOException {
MockMultipartFile testFile = pdfFactory.createStandardPdf("test.pdf");
CropPdfForm request =
new CropRequestBuilder()
.withFile(testFile)
.withCoordinates(x, y, width, height)
.withRemoveDataOutsideCrop(false)
.withAutoCrop(false)
.build();
PDDocument mockDocument = mock(PDDocument.class);
PDDocument newDocument = mock(PDDocument.class);
when(pdfDocumentFactory.load(request)).thenReturn(mockDocument);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(mockDocument))
.thenReturn(newDocument);
ResponseEntity<byte[]> response = cropController.cropPdf(request);
assertThat(response).isNotNull();
assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
assertThat(response.getBody()).isNotNull();
verify(pdfDocumentFactory).load(request);
verify(mockDocument, times(1)).close();
verify(newDocument, times(1)).close();
}
}
@Nested
@DisplayName("Auto Crop Functionality")
@Tag("integration")
class AutoCropTests {
private TestPdfFactory autoCropPdfFactory;
@BeforeEach
void setUp() {
autoCropPdfFactory = new TestPdfFactory();
}
@Test
@DisplayName("Should auto-crop PDF with content successfully")
void shouldAutoCropPdfSuccessfully() throws IOException {
MockMultipartFile testFile =
autoCropPdfFactory.createPdfWithCenteredContent(
"test_autocrop.pdf", "Test Content for Auto Crop");
CropPdfForm request =
new CropRequestBuilder().withFile(testFile).withAutoCrop(true).build();
// Mock the pdfDocumentFactory to load real PDFs
try (PDDocument sourceDoc = Loader.loadPDF(testFile.getBytes());
PDDocument newDoc = new PDDocument()) {
when(pdfDocumentFactory.load(request)).thenReturn(sourceDoc);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(sourceDoc))
.thenReturn(newDoc);
ResponseEntity<byte[]> response = cropController.cropPdf(request);
assertThat(response).isNotNull();
assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
assertThat(response.getBody()).isNotEmpty();
try (PDDocument result = Loader.loadPDF(response.getBody())) {
assertThat(result.getNumberOfPages()).isEqualTo(1);
PDPage page = result.getPage(0);
assertThat(page).isNotNull();
assertThat(page.getMediaBox()).isNotNull();
}
}
}
@Test
@DisplayName("Should handle PDF with minimal content")
void shouldHandleMinimalContentPdf() throws IOException {
MockMultipartFile testFile =
autoCropPdfFactory.createPdfWithContent("minimal.pdf", "X");
CropPdfForm request =
new CropRequestBuilder().withFile(testFile).withAutoCrop(true).build();
// Mock the pdfDocumentFactory to load real PDFs
try (PDDocument sourceDoc = Loader.loadPDF(testFile.getBytes());
PDDocument newDoc = new PDDocument()) {
when(pdfDocumentFactory.load(request)).thenReturn(sourceDoc);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(sourceDoc))
.thenReturn(newDoc);
ResponseEntity<byte[]> response = cropController.cropPdf(request);
assertThat(response).isNotNull();
assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
Assertions.assertNotNull(response.getBody());
try (PDDocument result = Loader.loadPDF(response.getBody())) {
assertThat(result.getNumberOfPages()).isEqualTo(1);
}
}
}
}
@Nested
@DisplayName("Content Bounds Detection")
class ContentBoundsDetectionTests {
private Method detectContentBoundsMethod;
private static BufferedImage createWhiteImage(int width, int height) {
BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
for (int x = 0; x < width; x++) {
for (int y = 0; y < height; y++) {
image.setRGB(x, y, 0xFFFFFF);
}
}
return image;
}
private static BufferedImage createImageFilledWith(int width, int height, int color) {
BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
for (int x = 0; x < width; x++) {
for (int y = 0; y < height; y++) {
image.setRGB(x, y, color);
}
}
return image;
}
private static void drawBlackRectangle(
BufferedImage image, int x1, int y1, int x2, int y2) {
// Fills [x1, x2) x [y1, y2) with black; the upper bounds are exclusive.
drawDarkerRectangle(image, x1, y1, x2, y2, 0x000000);
}
private static void drawDarkerRectangle(
BufferedImage image, int x1, int y1, int x2, int y2, int color) {
for (int x = x1; x < x2; x++) {
for (int y = y1; y < y2; y++) {
image.setRGB(x, y, color);
}
}
}
@BeforeEach
void setUp() throws NoSuchMethodException {
detectContentBoundsMethod =
CropController.class.getDeclaredMethod(
"detectContentBounds", BufferedImage.class);
detectContentBoundsMethod.setAccessible(true);
}
@Test
@DisplayName("Should detect full image bounds for all white image")
void shouldDetectFullBoundsForWhiteImage() throws Exception {
BufferedImage whiteImage = createWhiteImage(100, 100);
int[] bounds = (int[]) detectContentBoundsMethod.invoke(null, whiteImage);
assertThat(bounds).containsExactly(0, 0, 99, 99);
}
@Test
@DisplayName("Should detect black rectangle bounds correctly")
void shouldDetectBlackRectangleBounds() throws Exception {
BufferedImage image = createWhiteImage(100, 100);
drawBlackRectangle(image, 25, 25, 75, 75);
int[] bounds = (int[]) detectContentBoundsMethod.invoke(null, image);
assertThat(bounds).containsExactly(25, 25, 74, 74);
}
@Test
@DisplayName("Should detect content at image edges")
void shouldDetectContentAtEdges() throws Exception {
BufferedImage image = createWhiteImage(100, 100);
image.setRGB(0, 0, 0x000000);
image.setRGB(99, 0, 0x000000);
image.setRGB(0, 99, 0x000000);
image.setRGB(99, 99, 0x000000);
int[] bounds = (int[]) detectContentBoundsMethod.invoke(null, image);
assertThat(bounds).containsExactly(0, 0, 99, 99);
}
@Test
@DisplayName("Should include noise pixels in bounds detection")
void shouldIncludeNoiseInBounds() throws Exception {
BufferedImage image = createWhiteImage(100, 100);
image.setRGB(10, 10, 0xF0F0F0);
image.setRGB(90, 90, 0xF0F0F0);
drawBlackRectangle(image, 30, 30, 70, 70);
int[] bounds = (int[]) detectContentBoundsMethod.invoke(null, image);
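// The expected y values (9, 89) rather than (10, 90) are consistent with the y axis
// being flipped in the returned bounds: 100 - 90 - 1 = 9 and 100 - 10 - 1 = 89.
assertThat(bounds).containsExactly(10, 9, 90, 89);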
}
@Test
@DisplayName("Should treat gray pixels below threshold as content")
void shouldTreatGrayPixelsAsContent() throws Exception {
BufferedImage image = createImageFilledWith(50, 50, 0xF0F0F0);
drawDarkerRectangle(image, 20, 20, 30, 30, 0xC0C0C0);
int[] bounds = (int[]) detectContentBoundsMethod.invoke(null, image);
assertThat(bounds).containsExactly(0, 0, 49, 49);
}
}
@Nested
@DisplayName("White Pixel Detection")
class WhitePixelDetectionTests {
private Method isWhiteMethod;
@BeforeEach
void setUp() throws NoSuchMethodException {
isWhiteMethod = CropController.class.getDeclaredMethod("isWhite", int.class, int.class);
isWhiteMethod.setAccessible(true);
}
@Test
@DisplayName("Should identify pixels at or above the threshold as white")
void shouldIdentifyWhitePixels() throws Exception {
assertThat((Boolean) isWhiteMethod.invoke(null, 0xFFFFFFFF, 250)).isTrue();
assertThat((Boolean) isWhiteMethod.invoke(null, 0xFFF0F0F0, 240)).isTrue();
}
@Test
@DisplayName("Should identify black pixels as non-white")
void shouldIdentifyBlackPixels() throws Exception {
assertThat((Boolean) isWhiteMethod.invoke(null, 0xFF000000, 250)).isFalse();
assertThat((Boolean) isWhiteMethod.invoke(null, 0xFF101010, 250)).isFalse();
}
@ParameterizedTest
@ValueSource(ints = {0xFFFFFFFF, 0xFFFAFAFA, 0xFFF5F5F5})
@DisplayName("Should identify various white shades")
void shouldIdentifyVariousWhiteShades(int pixelColor) throws Exception {
assertThat((Boolean) isWhiteMethod.invoke(null, pixelColor, 240)).isTrue();
}
@ParameterizedTest
@ValueSource(ints = {0xFF000000, 0xFF101010, 0xFF808080})
@DisplayName("Should identify various non-white shades")
void shouldIdentifyNonWhiteShades(int pixelColor) throws Exception {
assertThat((Boolean) isWhiteMethod.invoke(null, pixelColor, 250)).isFalse();
}
}
@Nested
@DisplayName("CropBounds Conversion")
class CropBoundsTests {
private Class<?> cropBoundsClass;
private Method fromPixelsMethod;
@BeforeEach
void setUp() throws ClassNotFoundException, NoSuchMethodException {
cropBoundsClass =
Class.forName(
"stirling.software.SPDF.controller.api.CropController$CropBounds");
fromPixelsMethod =
cropBoundsClass.getDeclaredMethod(
"fromPixels", int[].class, float.class, float.class);
fromPixelsMethod.setAccessible(true);
}
@Test
@DisplayName("Should convert pixel bounds to PDF coordinates correctly")
void shouldConvertPixelBoundsToPdfCoordinates() throws Exception {
int[] pixelBounds = {10, 20, 110, 120};
float scaleX = 0.5f;
float scaleY = 0.5f;
Object bounds = fromPixelsMethod.invoke(null, pixelBounds, scaleX, scaleY);
assertThat(getFloatField(bounds, "x")).isCloseTo(5.0f, within(0.01f));
assertThat(getFloatField(bounds, "y")).isCloseTo(10.0f, within(0.01f));
assertThat(getFloatField(bounds, "width")).isCloseTo(50.0f, within(0.01f));
assertThat(getFloatField(bounds, "height")).isCloseTo(50.0f, within(0.01f));
}
@ParameterizedTest
@CsvSource({
"0, 0, 100, 100, 1.0, 1.0",
"10, 20, 50, 80, 2.0, 2.0",
"5, 5, 25, 25, 0.5, 0.5"
})
@DisplayName("Should handle various scale factors")
void shouldHandleVariousScaleFactors(
int x1, int y1, int x2, int y2, float scaleX, float scaleY) throws Exception {
int[] pixelBounds = {x1, y1, x2, y2};
Object bounds = fromPixelsMethod.invoke(null, pixelBounds, scaleX, scaleY);
assertThat(bounds).isNotNull();
assertThat(getFloatField(bounds, "width")).isGreaterThan(0);
assertThat(getFloatField(bounds, "height")).isGreaterThan(0);
}
@Test
@DisplayName("Should throw exception for invalid pixel bounds array")
void shouldThrowExceptionForInvalidArray() {
int[] invalidBounds = {10, 20, 30};
// Method.invoke wraps exceptions thrown by the target in InvocationTargetException.
assertThatThrownBy(() -> fromPixelsMethod.invoke(null, invalidBounds, 1.0f, 1.0f))
.isInstanceOf(java.lang.reflect.InvocationTargetException.class)
.hasCauseInstanceOf(IllegalArgumentException.class)
.cause()
.hasMessageContaining("pixelBounds array must contain exactly 4 elements");
}
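// Reads a value through the no-arg accessor named after the field (x(), y(), ...).
private float getFloatField(Object obj, String fieldName) throws Exception {
Method getter = cropBoundsClass.getDeclaredMethod(fieldName);
// CropBounds is a nested class, so make the accessor reachable regardless of visibility.
getter.setAccessible(true);
return (Float) getter.invoke(obj);
}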
}
@Nested
@DisplayName("Error Handling")
class ErrorHandlingTests {
@Test
@DisplayName("Should throw exception for corrupt PDF file")
void shouldThrowExceptionForCorruptPdf() throws IOException {
MockMultipartFile corruptFile =
new MockMultipartFile(
"fileInput",
"corrupt.pdf",
MediaType.APPLICATION_PDF_VALUE,
"not a valid pdf content"
.getBytes(java.nio.charset.StandardCharsets.UTF_8));
CropPdfForm request =
new CropRequestBuilder()
.withFile(corruptFile)
.withCoordinates(50f, 50f, 512f, 692f)
.withRemoveDataOutsideCrop(false)
.withAutoCrop(false)
.build();
when(pdfDocumentFactory.load(request)).thenThrow(new IOException("Invalid PDF format"));
assertThatThrownBy(() -> cropController.cropPdf(request))
.isInstanceOf(IOException.class)
.hasMessageContaining("Invalid PDF format");
verify(pdfDocumentFactory).load(request);
}
@Test
@DisplayName("Should throw exception when coordinates are missing for manual crop")
void shouldThrowExceptionForMissingCoordinates() throws IOException {
MockMultipartFile testFile = pdfFactory.createStandardPdf("test.pdf");
CropPdfForm request =
new CropRequestBuilder().withFile(testFile).withAutoCrop(false).build();
assertThatThrownBy(() -> cropController.cropPdf(request))
.isInstanceOf(IllegalArgumentException.class)
.hasMessage(
"Crop coordinates (x, y, width, height) are required when auto-crop is not enabled");
}
@Test
@DisplayName("Should handle negative coordinates gracefully")
void shouldHandleNegativeCoordinates() throws IOException {
MockMultipartFile testFile = pdfFactory.createStandardPdf("test.pdf");
CropPdfForm request =
new CropRequestBuilder()
.withFile(testFile)
.withCoordinates(-10f, 50f, 512f, 692f)
.withRemoveDataOutsideCrop(false)
.withAutoCrop(false)
.build();
PDDocument mockDocument = mock(PDDocument.class);
PDDocument newDocument = mock(PDDocument.class);
when(pdfDocumentFactory.load(request)).thenReturn(mockDocument);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(mockDocument))
.thenReturn(newDocument);
assertThatCode(() -> cropController.cropPdf(request)).doesNotThrowAnyException();
verify(mockDocument, times(1)).close();
verify(newDocument, times(1)).close();
}
@Test
@DisplayName("Should handle zero width or height")
void shouldHandleZeroDimensions() throws IOException {
MockMultipartFile testFile = pdfFactory.createStandardPdf("test.pdf");
CropPdfForm request =
new CropRequestBuilder()
.withFile(testFile)
.withCoordinates(50f, 50f, 0f, 692f)
.withRemoveDataOutsideCrop(false)
.withAutoCrop(false)
.build();
PDDocument mockDocument = mock(PDDocument.class);
PDDocument newDocument = mock(PDDocument.class);
when(pdfDocumentFactory.load(request)).thenReturn(mockDocument);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(mockDocument))
.thenReturn(newDocument);
assertThatCode(() -> cropController.cropPdf(request)).doesNotThrowAnyException();
verify(mockDocument, times(1)).close();
verify(newDocument, times(1)).close();
}
}
@Nested
@DisplayName("PDF Content Verification")
@Tag("integration")
class PdfContentVerificationTests {
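private static PDRectangle getPageSize(String name) {
return switch (name) {
case "LETTER" -> PDRectangle.LETTER;
case "A4" -> PDRectangle.A4;
case "LEGAL" -> PDRectangle.LEGAL;
// Fail fast on a typo in the @CsvSource instead of silently testing LETTER.
default -> throw new IllegalArgumentException("Unknown page size: " + name);
};
}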
@Test
@DisplayName("Should return OK when cropping to explicit dimensions")
// The documents are mocks, so only the response status can be checked here.
void shouldReturnOkForExplicitDimensions() throws IOException {
MockMultipartFile testFile = pdfFactory.createStandardPdf("test.pdf");
float expectedWidth = 400f;
float expectedHeight = 500f;
CropPdfForm request =
new CropRequestBuilder()
.withFile(testFile)
.withCoordinates(50f, 50f, expectedWidth, expectedHeight)
.withRemoveDataOutsideCrop(false)
.withAutoCrop(false)
.build();
PDDocument mockDocument = mock(PDDocument.class);
PDDocument newDocument = mock(PDDocument.class);
when(pdfDocumentFactory.load(request)).thenReturn(mockDocument);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(mockDocument))
.thenReturn(newDocument);
ResponseEntity<byte[]> response = cropController.cropPdf(request);
assertThat(response).isNotNull();
assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
}
@ParameterizedTest
@CsvSource({"test1.pdf, LETTER", "test2.pdf, A4", "test3.pdf, LEGAL"})
@DisplayName("Should handle different page sizes")
void shouldHandleDifferentPageSizes(String filename, String pageSizeName)
throws IOException {
PDRectangle pageSize = getPageSize(pageSizeName);
MockMultipartFile testFile = pdfFactory.createPdfWithSize(filename, pageSize);
CropPdfForm request =
new CropRequestBuilder()
.withFile(testFile)
.withCoordinates(50f, 50f, 300f, 400f)
.withRemoveDataOutsideCrop(false)
.withAutoCrop(false)
.build();
PDDocument mockDocument = mock(PDDocument.class);
PDDocument newDocument = mock(PDDocument.class);
when(pdfDocumentFactory.load(request)).thenReturn(mockDocument);
when(pdfDocumentFactory.createNewDocumentBasedOnOldDocument(mockDocument))
.thenReturn(newDocument);
ResponseEntity<byte[]> response = cropController.cropPdf(request);
assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
verify(mockDocument, times(1)).close();
verify(newDocument, times(1)).close();
}
}
}