mirror of
https://github.com/Frooodle/Stirling-PDF.git
synced 2025-11-16 01:21:16 +01:00
docs update
This commit is contained in:
parent
a7f17947df
commit
a1b69b4b8b
@ -5216,7 +5216,6 @@ public class PdfJsonConversionService {
|
|||||||
* Stores PDF bytes for lazy page loading. Each page is extracted on-demand by re-loading the
|
* Stores PDF bytes for lazy page loading. Each page is extracted on-demand by re-loading the
|
||||||
* PDF from bytes.
|
* PDF from bytes.
|
||||||
*/
|
*/
|
||||||
@lombok.Data
|
|
||||||
private static class CachedPdfDocument {
|
private static class CachedPdfDocument {
|
||||||
private final byte[] pdfBytes;
|
private final byte[] pdfBytes;
|
||||||
private final PdfJsonDocumentMetadata metadata;
|
private final PdfJsonDocumentMetadata metadata;
|
||||||
@ -5243,6 +5242,27 @@ public class PdfJsonConversionService {
|
|||||||
this.timestamp = System.currentTimeMillis();
|
this.timestamp = System.currentTimeMillis();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Getters return defensive copies to prevent external mutation
|
||||||
|
public byte[] getPdfBytes() {
|
||||||
|
return pdfBytes;
|
||||||
|
}
|
||||||
|
|
||||||
|
public PdfJsonDocumentMetadata getMetadata() {
|
||||||
|
return metadata;
|
||||||
|
}
|
||||||
|
|
||||||
|
public Map<String, PdfJsonFont> getFonts() {
|
||||||
|
return new java.util.concurrent.ConcurrentHashMap<>(fonts);
|
||||||
|
}
|
||||||
|
|
||||||
|
public Map<Integer, Map<PDFont, String>> getPageFontResources() {
|
||||||
|
return new java.util.concurrent.ConcurrentHashMap<>(pageFontResources);
|
||||||
|
}
|
||||||
|
|
||||||
|
public long getTimestamp() {
|
||||||
|
return timestamp;
|
||||||
|
}
|
||||||
|
|
||||||
public CachedPdfDocument withUpdatedPdfBytes(byte[] nextBytes) {
|
public CachedPdfDocument withUpdatedPdfBytes(byte[] nextBytes) {
|
||||||
return withUpdatedFonts(nextBytes, this.fonts);
|
return withUpdatedFonts(nextBytes, this.fonts);
|
||||||
}
|
}
|
||||||
|
|||||||
@ -39,3 +39,18 @@
|
|||||||
- `applyWeightStyle()`: Applies appropriate suffix (handles both "italic" and "oblique" naming)
|
- `applyWeightStyle()`: Applies appropriate suffix (handles both "italic" and "oblique" naming)
|
||||||
- All fonts consolidated from Type3 library into main fonts directory for unified fallback support
|
- All fonts consolidated from Type3 library into main fonts directory for unified fallback support
|
||||||
- Benefits: Comprehensive visual consistency when editing text in bold/italic fonts across many font families
|
- Benefits: Comprehensive visual consistency when editing text in bold/italic fonts across many font families
|
||||||
|
|
||||||
|
- **Font Text Color Support**
|
||||||
|
- Add support for reading and preserving text color information from PDF content streams
|
||||||
|
- Enable color editing in the editor UI
|
||||||
|
- Ensure proper round-trip conversion maintains color fidelity
|
||||||
|
|
||||||
|
- **Space Character Handling**
|
||||||
|
- Improve handling of space characters as proper text elements
|
||||||
|
- Ensure spaces are correctly preserved during text extraction and reconstruction
|
||||||
|
- Fix any issues with space positioning and width calculations
|
||||||
|
|
||||||
|
- **Textbox Selection Enhancement**
|
||||||
|
- Improve textbox selection behavior in the editor
|
||||||
|
- Enhance user experience for selecting and manipulating text boxes
|
||||||
|
- Address any selection precision or interaction issues
|
||||||
|
|||||||
@ -10,243 +10,21 @@
|
|||||||
|
|
||||||
This analysis identifies **CRITICAL** security vulnerabilities, thread safety issues, and performance problems in the PDF JSON editor codebase. The service contains:
|
This analysis identifies **CRITICAL** security vulnerabilities, thread safety issues, and performance problems in the PDF JSON editor codebase. The service contains:
|
||||||
|
|
||||||
- **1 CRITICAL security vulnerability** (cache poisoning/information disclosure)
|
- **2 CRITICAL issues** requiring immediate attention
|
||||||
- **3 CRITICAL threading issues** causing data corruption
|
|
||||||
- **2 HIGH severity resource leaks** causing memory exhaustion
|
- **2 HIGH severity resource leaks** causing memory exhaustion
|
||||||
- **Multiple performance bottlenecks** limiting scalability
|
- **Multiple performance bottlenecks** limiting scalability
|
||||||
|
|
||||||
**Immediate Action Required:**
|
**Immediate Action Required:**
|
||||||
1. Fix user-supplied jobId security vulnerability (Issue #1)
|
1. Fix pageFontResources PDFont key issue (Issue #1)
|
||||||
2. Fix cache mutation race conditions (Issues #2, #3, #4)
|
2. Replace unbounded thread spawning (Issue #2)
|
||||||
3. Replace unbounded thread spawning (Issue #5)
|
3. Add cache size limits (Issue #3)
|
||||||
4. Add cache size limits (Issue #6)
|
4. Fix Type3 cache race condition (Issue #4)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## CRITICAL ISSUES
|
## CRITICAL ISSUES
|
||||||
|
|
||||||
### Issue #1: User-Supplied jobId - CRITICAL SECURITY VULNERABILITY ⚠️
|
### Issue #1: pageFontResources Keyed by PDFont Instances - CRITICAL
|
||||||
|
|
||||||
**Location:** `ConvertPdfJsonController.java:95, 120, 148, 160`
|
|
||||||
|
|
||||||
**Severity:** CRITICAL (Security Vulnerability)
|
|
||||||
|
|
||||||
**Type:** Cache Poisoning, Information Disclosure, Access Control Bypass
|
|
||||||
|
|
||||||
**Verified:** ✅ TRUE
|
|
||||||
|
|
||||||
**Description:**
|
|
||||||
```java
|
|
||||||
// Line 95
|
|
||||||
public ResponseEntity<byte[]> extractPdfMetadata(
|
|
||||||
@ModelAttribute PDFFile request,
|
|
||||||
@RequestParam(required = true) String jobId) // User-controlled!
|
|
||||||
|
|
||||||
// Lines 120, 148, 160
|
|
||||||
@PathVariable String jobId // User-controlled!
|
|
||||||
```
|
|
||||||
|
|
||||||
**Security Issues:**
|
|
||||||
1. **Information Disclosure:** Users can guess/enumerate jobIds to access other users' cached PDFs
|
|
||||||
2. **Cache Poisoning:** Users can overwrite other users' cache entries by supplying the same jobId
|
|
||||||
3. **No Authentication/Authorization:** No check to verify jobId ownership
|
|
||||||
4. **Predictable IDs:** If users choose sequential IDs ("job1", "job2"), enumeration is trivial
|
|
||||||
|
|
||||||
**Attack Scenarios:**
|
|
||||||
```
|
|
||||||
User A: Uploads sensitive PDF with jobId="company-financials"
|
|
||||||
User B: Calls /pdf/json/metadata/company-financials
|
|
||||||
-> Gets User A's PDF metadata
|
|
||||||
User B: Calls /pdf/json/page/company-financials/1
|
|
||||||
-> Gets User A's PDF page content
|
|
||||||
```
|
|
||||||
|
|
||||||
**Impact:**
|
|
||||||
- **Confidentiality Breach:** Unauthorized access to other users' documents
|
|
||||||
- **Data Integrity:** Users can corrupt each other's cached documents
|
|
||||||
- **Cache Collision:** Legitimate users get wrong data when IDs collide
|
|
||||||
|
|
||||||
**Recommendation:**
|
|
||||||
```java
|
|
||||||
// Server-generated UUIDs with user/session binding
|
|
||||||
@PostMapping("/pdf/json/metadata")
|
|
||||||
public ResponseEntity<byte[]> extractPdfMetadata(
|
|
||||||
@ModelAttribute PDFFile request,
|
|
||||||
HttpSession session) {
|
|
||||||
String jobId = UUID.randomUUID().toString();
|
|
||||||
String userId = session.getId(); // or authenticated user ID
|
|
||||||
|
|
||||||
// Store with composite key
|
|
||||||
String cacheKey = userId + ":" + jobId;
|
|
||||||
|
|
||||||
// Validate ownership on retrieval
|
|
||||||
if (!validateOwnership(cacheKey, session)) {
|
|
||||||
throw new AccessDeniedException("Invalid jobId");
|
|
||||||
}
|
|
||||||
|
|
||||||
return service.extractDocumentMetadata(request, cacheKey);
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Issue #2: Mutable Cache Maps - CRITICAL Race Condition
|
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:5009-5037, 4844-4852, 5170-5176`
|
|
||||||
|
|
||||||
**Severity:** CRITICAL (Data Corruption)
|
|
||||||
|
|
||||||
**Type:** Race Condition, Cache Mutation
|
|
||||||
|
|
||||||
**Verified:** ✅ TRUE
|
|
||||||
|
|
||||||
**Description:**
|
|
||||||
```java
|
|
||||||
// Line 5010-5037: CachedPdfDocument stores mutable maps
|
|
||||||
@lombok.Data
|
|
||||||
private static class CachedPdfDocument {
|
|
||||||
private final Map<String, PdfJsonFont> fonts; // Mutable reference!
|
|
||||||
private final Map<Integer, Map<PDFont, String>> pageFontResources; // Mutable!
|
|
||||||
}
|
|
||||||
|
|
||||||
// Line 5170-5176: Cached maps passed directly to stripper
|
|
||||||
TextCollectingStripper stripper = new TextCollectingStripper(
|
|
||||||
document,
|
|
||||||
cached.getFonts(), // ← Mutable map shared across threads!
|
|
||||||
textByPage,
|
|
||||||
cached.getPageFontResources(),
|
|
||||||
new IdentityHashMap<>());
|
|
||||||
|
|
||||||
// Line 4844-4852: registerFont() mutates the shared map
|
|
||||||
private String registerFont(PDFont font) throws IOException {
|
|
||||||
if (!fonts.containsKey(key)) {
|
|
||||||
fonts.put(key, buildFontModel(...)); // ← MUTATION!
|
|
||||||
}
|
|
||||||
return fontId;
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Race Condition:**
|
|
||||||
```
|
|
||||||
Time | Thread A (extractSinglePage job1, page1) | Thread B (extractSinglePage job1, page2)
|
|
||||||
-----|-------------------------------------------|------------------------------------------
|
|
||||||
T1 | Gets cached.getFonts() -> Map@123 |
|
|
||||||
T2 | | Gets cached.getFonts() -> Map@123 (same!)
|
|
||||||
T3 | registerFont() checks containsKey("1:F1") |
|
|
||||||
T4 | | registerFont() checks containsKey("2:F1")
|
|
||||||
T5 | fonts.put("1:F1", fontModel) |
|
|
||||||
T6 | | fonts.put("2:F1", fontModel)
|
|
||||||
T7 | Iterator over fonts |
|
|
||||||
T8 | | Concurrent modification! ← EXCEPTION
|
|
||||||
```
|
|
||||||
|
|
||||||
**Impact:**
|
|
||||||
- `ConcurrentModificationException` thrown to users
|
|
||||||
- Font metadata corruption (partial writes visible)
|
|
||||||
- Cache inconsistency between requests
|
|
||||||
- Non-deterministic failures under load
|
|
||||||
|
|
||||||
**Evidence of Mutation:**
|
|
||||||
- `@lombok.Data` generates `getFonts()` that returns the same mutable reference
|
|
||||||
- No defensive copying when retrieving from cache
|
|
||||||
- Multiple threads call `extractSinglePage` with same jobId simultaneously
|
|
||||||
|
|
||||||
**Recommendation:**
|
|
||||||
```java
|
|
||||||
// Option 1: Deep copy on retrieval
|
|
||||||
private static class CachedPdfDocument {
|
|
||||||
public Map<String, PdfJsonFont> getFonts() {
|
|
||||||
return new LinkedHashMap<>(fonts); // Defensive copy
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Option 2: Immutable maps
|
|
||||||
private static class CachedPdfDocument {
|
|
||||||
private final ImmutableMap<String, PdfJsonFont> fonts;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Option 3: Synchronize mutations (less performant)
|
|
||||||
private synchronized String registerFont(PDFont font) { ... }
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Issue #3: Font UID Not Globally Unique - CRITICAL Cache Collision
|
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:715-722, 961, 147-148`
|
|
||||||
|
|
||||||
**Severity:** CRITICAL (Cache Corruption)
|
|
||||||
|
|
||||||
**Type:** Cache Key Collision, Memory Leak
|
|
||||||
|
|
||||||
**Verified:** ✅ TRUE
|
|
||||||
|
|
||||||
**Description:**
|
|
||||||
```java
|
|
||||||
// Line 715-722: UID construction
|
|
||||||
private String buildFontKey(int pageNumber, String fontId) {
|
|
||||||
return pageNumber + ":" + fontId; // Only unique within one document!
|
|
||||||
}
|
|
||||||
|
|
||||||
// Line 961: Used as cache key
|
|
||||||
.uid(buildFontKey(pageNumber, fontId))
|
|
||||||
|
|
||||||
// Line 147-148: Global singleton caches
|
|
||||||
private final Map<String, PDFont> type3NormalizedFontCache = new ConcurrentHashMap<>();
|
|
||||||
private final Map<String, Set<Integer>> type3GlyphCoverageCache = new ConcurrentHashMap<>();
|
|
||||||
```
|
|
||||||
|
|
||||||
**Problem:**
|
|
||||||
Font UIDs are only unique within a single document. The format is `"pageNumber:fontId"` (e.g., `"1:F1"`). When multiple jobs run concurrently:
|
|
||||||
|
|
||||||
```
|
|
||||||
Job A (invoice.pdf): Font UID = "1:F1" (Times-Roman)
|
|
||||||
Job B (report.pdf): Font UID = "1:F1" (Helvetica)
|
|
||||||
|
|
||||||
Both jobs share the SAME cache entry!
|
|
||||||
```
|
|
||||||
|
|
||||||
**Race Condition:**
|
|
||||||
```
|
|
||||||
T1: Job A converts PDF with font "1:F1" (Times-Roman)
|
|
||||||
T2: Job A caches PDFont@AAA in type3NormalizedFontCache["1:F1"]
|
|
||||||
T3: Job B converts PDF with font "1:F1" (Helvetica)
|
|
||||||
T4: Job B finds cache hit for "1:F1"
|
|
||||||
T5: Job B uses Times-Roman font (WRONG!) ← DATA CORRUPTION
|
|
||||||
```
|
|
||||||
|
|
||||||
**Memory Leak:**
|
|
||||||
```java
|
|
||||||
// PDFont objects hold references to their source PDDocument
|
|
||||||
PDFont cachedFont = ...; // Created from documentA
|
|
||||||
documentA.close(); // Document closed
|
|
||||||
// cachedFont still references freed native resources!
|
|
||||||
```
|
|
||||||
|
|
||||||
**Impact:**
|
|
||||||
- Wrong fonts used across different documents
|
|
||||||
- PDFBox native memory leaks (PDDocument held by cached PDFont)
|
|
||||||
- Cache grows unbounded (never cleared except during JSON→PDF)
|
|
||||||
- Type3GlyphCoverageCache **NEVER CLEARED** anywhere in codebase
|
|
||||||
|
|
||||||
**Recommendation:**
|
|
||||||
```java
|
|
||||||
// Include jobId in font UID
|
|
||||||
private String buildFontKey(String jobId, int pageNumber, String fontId) {
|
|
||||||
return jobId + ":" + pageNumber + ":" + fontId;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Or scope caches per-job
|
|
||||||
private static class JobContext {
|
|
||||||
private final String jobId;
|
|
||||||
private final Map<String, PDFont> type3FontCache = new HashMap<>();
|
|
||||||
private final Map<String, Set<Integer>> glyphCoverageCache = new HashMap<>();
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
### Issue #4: pageFontResources Keyed by PDFont Instances - CRITICAL
|
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:5075-5081, 5158-5159`
|
**Location:** `PdfJsonConversionService.java:5075-5081, 5158-5159`
|
||||||
|
|
||||||
@ -314,7 +92,7 @@ Map<Integer, Set<String>> pageFontUids = new HashMap<>();
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Issue #5: Unbounded Thread Creation - CRITICAL Resource Leak
|
### Issue #2: Unbounded Thread Creation - CRITICAL Resource Leak
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:5550-5562`
|
**Location:** `PdfJsonConversionService.java:5550-5562`
|
||||||
|
|
||||||
@ -414,7 +192,7 @@ public class PdfJsonConversionService {
|
|||||||
|
|
||||||
## HIGH SEVERITY ISSUES
|
## HIGH SEVERITY ISSUES
|
||||||
|
|
||||||
### Issue #6: Unbounded Cache Growth - HIGH Memory Leak
|
### Issue #3: Unbounded Cache Growth - HIGH Memory Leak
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:147-148, 154`
|
**Location:** `PdfJsonConversionService.java:147-148, 154`
|
||||||
|
|
||||||
@ -502,7 +280,7 @@ private final Cache<String, CachedPdfDocument> documentCache =
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Issue #7: Type3 Cache Race Condition - HIGH
|
### Issue #4: Type3 Cache Race Condition - HIGH
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:3759-3773`
|
**Location:** `PdfJsonConversionService.java:3759-3773`
|
||||||
|
|
||||||
@ -584,7 +362,7 @@ private void loadNormalizedType3Font(...) throws IOException {
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Issue #8: PDDocument Resource Lifecycle - NEEDS INVESTIGATION
|
### Issue #5: PDDocument Resource Lifecycle - NEEDS INVESTIGATION
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:3766, 5158`
|
**Location:** `PdfJsonConversionService.java:3766, 5158`
|
||||||
|
|
||||||
@ -662,7 +440,7 @@ private PDFont getCachedFont(String fontUid) {
|
|||||||
|
|
||||||
## MEDIUM SEVERITY ISSUES
|
## MEDIUM SEVERITY ISSUES
|
||||||
|
|
||||||
### Issue #9: Full PDF Reload Per Page - MEDIUM Performance
|
### Issue #6: Full PDF Reload Per Page - MEDIUM Performance
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:5158-5159`
|
**Location:** `PdfJsonConversionService.java:5158-5159`
|
||||||
|
|
||||||
@ -728,7 +506,7 @@ private static class CachedPdfDocument {
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Issue #10: Large Base64 Operations - MEDIUM Performance
|
### Issue #7: Large Base64 Operations - MEDIUM Performance
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:1062, 1428, 3570, 3584, 3612, 3630`
|
**Location:** `PdfJsonConversionService.java:1062, 1428, 3570, 3584, 3612, 3630`
|
||||||
|
|
||||||
@ -788,7 +566,7 @@ CompletableFuture<String> encodeFuture = CompletableFuture.supplyAsync(
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Issue #11: File I/O on Request Threads - MEDIUM
|
### Issue #8: File I/O on Request Threads - MEDIUM
|
||||||
|
|
||||||
**Location:** `PdfJsonConversionService.java:276, 405, 5066`
|
**Location:** `PdfJsonConversionService.java:276, 405, 5066`
|
||||||
|
|
||||||
@ -843,7 +621,7 @@ try (InputStream in = file.getInputStream();
|
|||||||
|
|
||||||
## LOW SEVERITY ISSUES
|
## LOW SEVERITY ISSUES
|
||||||
|
|
||||||
### Issue #12: PdfLazyLoadingService Unused - LOW
|
### Issue #9: PdfLazyLoadingService Unused - LOW
|
||||||
|
|
||||||
**Location:** `PdfLazyLoadingService.java` (entire file)
|
**Location:** `PdfLazyLoadingService.java` (entire file)
|
||||||
|
|
||||||
@ -874,7 +652,7 @@ try (InputStream in = file.getInputStream();
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Issue #13: PdfJsonFontService Volatile Fields - LOW
|
### Issue #10: PdfJsonFontService Volatile Fields - LOW
|
||||||
|
|
||||||
**Location:** `PdfJsonFontService.java:46-47`
|
**Location:** `PdfJsonFontService.java:46-47`
|
||||||
|
|
||||||
@ -947,7 +725,7 @@ private EncodedImage getOrEncodeImage(PDImage pdImage) {
|
|||||||
|
|
||||||
## ARCHITECTURE ISSUES
|
## ARCHITECTURE ISSUES
|
||||||
|
|
||||||
### Issue #14: Singleton Service Architecture - MEDIUM
|
### Issue #11: Singleton Service Architecture - MEDIUM
|
||||||
|
|
||||||
**Location:** All `@Service` and `@Component` classes
|
**Location:** All `@Service` and `@Component` classes
|
||||||
|
|
||||||
@ -988,63 +766,45 @@ public class PdfJsonConversionService {
|
|||||||
## SUMMARY BY SEVERITY
|
## SUMMARY BY SEVERITY
|
||||||
|
|
||||||
### CRITICAL (Fix Immediately)
|
### CRITICAL (Fix Immediately)
|
||||||
1. ✅ **User-supplied jobId** (Issue #1) - Security vulnerability
|
1. **pageFontResources PDFont keys** - Broken feature
|
||||||
2. ✅ **Mutable cache maps** (Issue #2) - Data corruption
|
2. **Unbounded thread creation** - Resource exhaustion
|
||||||
3. ✅ **Font UID collisions** (Issue #3) - Cache corruption
|
|
||||||
4. ✅ **pageFontResources PDFont keys** (Issue #4) - Broken feature
|
|
||||||
5. ✅ **Unbounded thread creation** (Issue #5) - Resource exhaustion
|
|
||||||
|
|
||||||
### HIGH (Fix Soon)
|
### HIGH (Fix Soon)
|
||||||
6. ✅ **Unbounded cache growth** (Issue #6) - Memory leak
|
3. **Unbounded cache growth** - Memory leak
|
||||||
7. ✅ **Type3 cache race** (Issue #7) - Duplicate work
|
4. **Type3 cache race** - Duplicate work
|
||||||
8. ⚠️ **PDDocument lifecycle** (Issue #8) - Needs investigation (speculative)
|
5. **PDDocument lifecycle** - Needs investigation (speculative)
|
||||||
|
|
||||||
### MEDIUM (Plan and Address)
|
### MEDIUM (Plan and Address)
|
||||||
9. ✅ **Full PDF reload per page** (Issue #9) - Performance
|
6. **Full PDF reload per page** - Performance
|
||||||
10. ✅ **Large Base64 operations** (Issue #10) - Performance
|
7. **Large Base64 operations** - Performance
|
||||||
11. ✅ **File I/O blocking** (Issue #11) - Performance
|
8. **File I/O blocking** - Performance
|
||||||
12. ✅ **PdfLazyLoadingService unused** (Issue #12) - Dead code
|
9. **PdfLazyLoadingService unused** - Dead code
|
||||||
13. ✅ **Singleton architecture** (Issue #14) - Maintainability
|
10. **Singleton architecture** - Maintainability
|
||||||
|
|
||||||
### LOW (Monitor)
|
### LOW (Monitor)
|
||||||
14. ✅ **PdfJsonFontService volatile** (Issue #13) - Correctly implemented
|
11. **PdfJsonFontService volatile** - Correctly implemented (no action needed)
|
||||||
|
|
||||||
### VERIFIED FALSE
|
### VERIFIED FALSE
|
||||||
15. ❌ file.getBytes() called twice
|
12. file.getBytes() called twice
|
||||||
16. ❌ Image Base64 encoding per call
|
13. Image Base64 encoding per call
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## IMPLEMENTATION ROADMAP
|
## IMPLEMENTATION ROADMAP
|
||||||
|
|
||||||
### Phase 1: Critical Security & Data Integrity (1-2 weeks)
|
### Phase 1: Critical Fixes (1-2 weeks)
|
||||||
|
|
||||||
**1. Fix jobId Security (Issue #1)**
|
**1. Fix pageFontResources PDFont keys (Issue #1)**
|
||||||
```java
|
|
||||||
// Priority: CRITICAL
|
|
||||||
// Time: 2 days
|
|
||||||
// Risk: Low (straightforward fix)
|
|
||||||
|
|
||||||
// Generate server-side UUIDs
|
|
||||||
// Add session/user binding
|
|
||||||
// Validate ownership on all cache operations
|
|
||||||
```
|
|
||||||
|
|
||||||
**2. Fix Cache Mutation (Issues #2, #3, #4)**
|
|
||||||
```java
|
```java
|
||||||
// Priority: CRITICAL
|
// Priority: CRITICAL
|
||||||
// Time: 3-5 days
|
// Time: 3-5 days
|
||||||
// Risk: Medium (requires careful testing)
|
// Risk: Medium (requires careful testing)
|
||||||
|
|
||||||
// Make CachedPdfDocument immutable
|
|
||||||
// Scope font caches per-job
|
|
||||||
// Replace PDFont keys with String resource names
|
// Replace PDFont keys with String resource names
|
||||||
// Add defensive copying
|
// Update cache lookup logic
|
||||||
```
|
```
|
||||||
|
|
||||||
### Phase 2: Resource Management (1 week)
|
**2. Fix Thread Leaks (Issue #2)**
|
||||||
|
|
||||||
**3. Fix Thread Leaks (Issue #5)**
|
|
||||||
```java
|
```java
|
||||||
// Priority: CRITICAL
|
// Priority: CRITICAL
|
||||||
// Time: 1 day
|
// Time: 1 day
|
||||||
@ -1055,7 +815,9 @@ public class PdfJsonConversionService {
|
|||||||
// Monitor thread counts
|
// Monitor thread counts
|
||||||
```
|
```
|
||||||
|
|
||||||
**4. Add Cache Eviction (Issue #6, previously also listed as Issue #11)**
|
### Phase 2: Resource Management (1 week)
|
||||||
|
|
||||||
|
**3. Add Cache Eviction (Issue #3)**
|
||||||
```java
|
```java
|
||||||
// Priority: HIGH
|
// Priority: HIGH
|
||||||
// Time: 3 days
|
// Time: 3 days
|
||||||
@ -1067,9 +829,18 @@ public class PdfJsonConversionService {
|
|||||||
// Monitor cache metrics
|
// Monitor cache metrics
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**4. Fix Type3 Cache Race (Issue #4)**
|
||||||
|
```java
|
||||||
|
// Priority: HIGH
|
||||||
|
// Time: 1-2 days
|
||||||
|
// Risk: Low (straightforward fix)
|
||||||
|
|
||||||
|
// Use computeIfAbsent for atomic operations
|
||||||
|
```
|
||||||
|
|
||||||
### Phase 3: Performance Optimization (2-3 weeks)
|
### Phase 3: Performance Optimization (2-3 weeks)
|
||||||
|
|
||||||
**5. Optimize Lazy Loading (Issue #9)**
|
**5. Optimize Lazy Loading (Issue #6)**
|
||||||
```java
|
```java
|
||||||
// Priority: MEDIUM
|
// Priority: MEDIUM
|
||||||
// Time: 1 week
|
// Time: 1 week
|
||||||
@ -1080,7 +851,7 @@ public class PdfJsonConversionService {
|
|||||||
// Or: Pre-split pages at upload
|
// Or: Pre-split pages at upload
|
||||||
```
|
```
|
||||||
|
|
||||||
**6. Async I/O (Issues #10, #12)**
|
**6. Async I/O (Issues #7, #8)**
|
||||||
```java
|
```java
|
||||||
// Priority: MEDIUM
|
// Priority: MEDIUM
|
||||||
// Time: 3-5 days
|
// Time: 3-5 days
|
||||||
@ -1093,7 +864,7 @@ public class PdfJsonConversionService {
|
|||||||
|
|
||||||
### Phase 4: Code Quality (1 week)
|
### Phase 4: Code Quality (1 week)
|
||||||
|
|
||||||
**7. Remove Dead Code (Issue #12)**
|
**7. Remove Dead Code (Issue #9)**
|
||||||
```java
|
```java
|
||||||
// Priority: LOW
|
// Priority: LOW
|
||||||
// Time: 1 day
|
// Time: 1 day
|
||||||
@ -1402,30 +1173,27 @@ alerts:
|
|||||||
The PDF JSON editor has **CRITICAL** issues that must be fixed before production deployment:
|
The PDF JSON editor has **CRITICAL** issues that must be fixed before production deployment:
|
||||||
|
|
||||||
### Must-Fix Issues (Blocks Production):
|
### Must-Fix Issues (Blocks Production):
|
||||||
1. **User-supplied jobId** - Security vulnerability enabling cache poisoning and information disclosure
|
1. **pageFontResources broken** - Lazy page loading completely broken (PDFont instances as keys)
|
||||||
2. **Mutable cache maps** - Causes ConcurrentModificationException under load
|
2. **Thread leaks** - Unbounded thread creation causes OutOfMemoryError
|
||||||
3. **Font UID collisions** - Different documents overwrite each other's font caches
|
|
||||||
4. **pageFontResources broken** - Lazy page loading completely broken
|
|
||||||
5. **Thread leaks** - Unbounded thread creation causes OutOfMemoryError
|
|
||||||
|
|
||||||
### Should-Fix Issues (Prevents Scale):
|
### Should-Fix Issues (Prevents Scale):
|
||||||
6. **Unbounded cache growth** - Memory leaks require server restarts
|
3. **Unbounded cache growth** - Memory leaks require server restarts
|
||||||
7. **Type3 cache races** - Wasted CPU doing duplicate work
|
4. **Type3 cache races** - Wasted CPU doing duplicate work
|
||||||
8. **PDDocument lifecycle** - Needs investigation (no evidence of actual problems yet)
|
5. **PDDocument lifecycle** - Needs investigation (no evidence of actual problems yet)
|
||||||
|
|
||||||
### Performance Improvements (Nice-to-Have):
|
### Performance Improvements (Nice-to-Have):
|
||||||
9. Full PDF reload per page
|
6. Full PDF reload per page
|
||||||
10. Large Base64 operations
|
7. Large Base64 operations
|
||||||
11. Synchronous file I/O
|
8. Synchronous file I/O
|
||||||
|
|
||||||
### Code Quality Issues:
|
### Code Quality Issues:
|
||||||
12. PdfLazyLoadingService dead code
|
9. PdfLazyLoadingService dead code
|
||||||
13. Documentation of thread-safety requirements
|
10. Documentation of thread-safety requirements
|
||||||
|
|
||||||
**Estimated Effort:**
|
**Estimated Effort:**
|
||||||
- Critical fixes: 2-3 weeks
|
- Critical fixes: 1-2 weeks (Issues #1, #2)
|
||||||
- High priority: 1 week
|
- High priority: 1 week (Issues #3, #4)
|
||||||
- Performance: 2-3 weeks
|
- Performance: 2-3 weeks (Issues #6-8)
|
||||||
- **Total:** 5-7 weeks for complete remediation
|
- **Total:** 4-6 weeks for complete remediation
|
||||||
|
|
||||||
**Recommendation:** Fix Critical issues immediately, then address High priority issues before beta testing.
|
**Recommendation:** Fix critical issues (#1, #2) immediately, then address high priority issues before beta testing.
|
||||||
|
|||||||
Loading…
Reference in New Issue
Block a user