Stirling-PDF/CLAUDE.md
Reece Browne 507ad1dc61
Feature/v2/shared tool hooks (#4134)

Co-authored-by: Reece Browne <you@example.com>
2025-08-08 16:01:56 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Common Development Commands

Build and Test

  • Build project: ./gradlew clean build
  • Run locally: ./gradlew bootRun
  • Full test suite: ./test.sh (builds all Docker variants and runs comprehensive tests)
  • Code formatting: ./gradlew spotlessApply (runs automatically before compilation)

Docker Development

  • Build ultra-lite: docker build -t stirlingtools/stirling-pdf:latest-ultra-lite -f ./Dockerfile.ultra-lite .
  • Build standard: docker build -t stirlingtools/stirling-pdf:latest -f ./Dockerfile .
  • Build fat version: docker build -t stirlingtools/stirling-pdf:latest-fat -f ./Dockerfile.fat .
  • Example compose files: Located in exampleYmlFiles/ directory

Security Mode Development

Set the DOCKER_ENABLE_SECURITY=true environment variable to enable security features during development. This is required for testing the full version locally.

Frontend Development

  • Frontend dev server: cd frontend && npm run dev (requires backend on localhost:8080)
  • Tech Stack: Vite + React + TypeScript + Mantine UI + TailwindCSS
  • Proxy Configuration: Vite proxies /api/* calls to backend (localhost:8080)
  • Build Process: DO NOT run build scripts manually - builds are handled by CI/CD pipelines
  • Package Installation: DO NOT run npm install commands - package management handled separately
  • Deployment Options:
    • Desktop App: npm run tauri-build (native desktop application)
    • Web Server: npm run build then serve dist/ folder
    • Development: npm run tauri-dev for desktop dev mode
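
The proxy rule in the list above (forward /api/* to the backend on localhost:8080) can be pictured as a plain object in the shape Vite's server.proxy option accepts. This is a sketch only: the ProxyRule type and the resolveProxyTarget helper are hypothetical names added to make the mapping testable, and the repository's actual vite.config.ts may differ.

```typescript
// Hypothetical, minimal shape of one proxy rule (a subset of what Vite accepts).
interface ProxyRule {
  target: string;        // where matching requests are forwarded
  changeOrigin: boolean; // rewrite the Host header to match the target
}

// Assumption: every path starting with /api goes to the backend on localhost:8080.
const devProxy: Record<string, ProxyRule> = {
  "/api": { target: "http://localhost:8080", changeOrigin: true },
};

// Illustrative helper: returns the URL a dev-server request would be forwarded to,
// or null when the frontend dev server handles the path itself.
function resolveProxyTarget(path: string, rules: Record<string, ProxyRule>): string | null {
  for (const prefix of Object.keys(rules)) {
    if (path.startsWith(prefix)) {
      return rules[prefix].target + path;
    }
  }
  return null;
}
```

In a real config the devProxy object would sit under server.proxy; the helper exists only to show which requests leave the dev server.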

Multi-Tool Workflow Architecture

Frontend designed for stateful document processing:

  • Users upload PDFs once, then chain tools (split → merge → compress → view)
  • File state and processing results persist across tool switches
  • No file reloading between tools - performance critical for large PDFs (up to 100GB+)

FileContext - Central State Management

Location: src/contexts/FileContext.tsx (relative to frontend/)

  • Active files: Currently loaded PDFs and their variants
  • Tool navigation: Current mode (viewer/pageEditor/fileEditor/toolName)
  • Memory management: PDF document cleanup, blob URL lifecycle, Web Worker management
  • IndexedDB persistence: File storage with thumbnail caching
  • Preview system: Tools can preview results (e.g., Split → Viewer → back to Split) without context pollution

Critical: All file operations go through FileContext. Don't bypass with direct file handling.
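
To illustrate why a single owner of file state helps, here is a minimal reducer sketch in which active files and the current mode live in one state object. It is not taken from FileContext.tsx: the FileRecord shape, the action names, and fileReducer are all hypothetical.

```typescript
// Hypothetical record for a loaded PDF; the real context tracks much more.
interface FileRecord {
  id: string;
  fileName: string;
  sizeBytes: number;
}

// The modes listed above; a tool is represented by its name.
type Mode = "viewer" | "pageEditor" | "fileEditor" | { tool: string };

interface FileState {
  activeFiles: FileRecord[];
  mode: Mode;
}

type FileAction =
  | { type: "addFiles"; files: FileRecord[] }
  | { type: "removeFile"; id: string }
  | { type: "setMode"; mode: Mode };

const initialFileState: FileState = { activeFiles: [], mode: "fileEditor" };

// Every change goes through this one function, so no component mutates files directly.
function fileReducer(state: FileState, action: FileAction): FileState {
  switch (action.type) {
    case "addFiles":
      return { ...state, activeFiles: [...state.activeFiles, ...action.files] };
    case "removeFile":
      return { ...state, activeFiles: state.activeFiles.filter((f) => f.id !== action.id) };
    case "setMode":
      // Switching tools changes only the mode; loaded files are kept.
      return { ...state, mode: action.mode };
  }
}
```

Because setMode leaves activeFiles untouched, switching from one tool to another does not require reloading files.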

Processing Services

  • enhancedPDFProcessingService: Background PDF parsing and manipulation
  • thumbnailGenerationService: Web Worker-based with main-thread fallback
  • fileStorage: IndexedDB with LRU cache management
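
The LRU idea mentioned for fileStorage can be sketched in memory with a Map, which iterates in insertion order. This is an illustration of the eviction policy only: the LruCache class is hypothetical, and the real service persists to IndexedDB rather than holding values in a Map.

```typescript
// Hypothetical in-memory LRU cache; shows the eviction policy, not the storage layer.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A thumbnail cache built this way would drop the thumbnail that has gone unused the longest once the capacity is exceeded.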

Memory Management Strategy

Why manual cleanup exists: Large PDFs (up to 100GB+) passed through multiple tools accumulate:

  • PDF.js documents that need explicit .destroy() calls
  • Blob URLs from tool outputs that need revocation
  • Web Workers that need termination

Without cleanup: memory leaks build up until the browser crashes.
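
The three kinds of cleanup listed above can be collected behind one small tracker. The sketch below is hypothetical (ResourceTracker is not a class from this repository). In a browser the registered callbacks would wrap calls such as pdfDocument.destroy(), URL.revokeObjectURL(url), and worker.terminate().

```typescript
type Cleanup = () => void;

// Hypothetical helper: remembers cleanup callbacks and runs each exactly once.
class ResourceTracker {
  private cleanups: Cleanup[] = [];

  // Register a callback, e.g. () => pdfDocument.destroy(),
  // () => URL.revokeObjectURL(url), or () => worker.terminate().
  track(cleanup: Cleanup): void {
    this.cleanups.push(cleanup);
  }

  // Run all registered cleanups, newest first. Errors are collected and returned
  // so one failing cleanup does not stop the others from running.
  releaseAll(): unknown[] {
    const errors: unknown[] = [];
    const pending = this.cleanups.reverse();
    this.cleanups = [];
    for (const cleanup of pending) {
      try {
        cleanup();
      } catch (err) {
        errors.push(err);
      }
    }
    return errors;
  }
}
```

Calling releaseAll() when a tool finishes or a file is closed keeps the cleanup in one place instead of scattering it across components.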

Tool Development

Architecture: Modular hook-based system with clear separation of concerns:

  • useToolOperation (frontend/src/hooks/tools/shared/useToolOperation.ts): Main orchestrator hook

    • Coordinates all tool operations with consistent interface
    • Integrates with FileContext for operation tracking
    • Handles validation, error handling, and UI state management
  • Supporting Hooks:

    • useToolState: UI state management (loading, progress, error, files)
    • useToolApiCalls: HTTP requests and file processing
    • useToolResources: Blob URLs, thumbnails, ZIP downloads
  • Utilities:

    • toolErrorHandler: Standardized error extraction and i18n support
    • toolResponseProcessor: API response handling (single/zip/custom)
    • toolOperationTracker: FileContext integration utilities
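
As an illustration of what "standardized error extraction" could mean, the sketch below turns an unknown thrown value into a display string. It does not reproduce toolErrorHandler: the function name, the assumed error shape (an HTTP-client style error with response.data), and the fallback parameter are all hypothetical. The fallback would typically be an already translated message.

```typescript
// Hypothetical extractor: prefers a server-provided message, then the error's own
// message, then a caller-supplied (already translated) fallback.
function extractErrorMessage(error: unknown, fallback: string): string {
  if (typeof error === "string" && error.length > 0) return error;

  if (typeof error === "object" && error !== null) {
    // Assumed shape: HTTP client errors carry the response body in response.data.
    const candidate = error as { message?: unknown; response?: { data?: unknown } };
    const data = candidate.response?.data;

    if (typeof data === "string" && data.length > 0) return data;
    if (typeof data === "object" && data !== null) {
      const serverMessage = (data as { message?: unknown }).message;
      if (typeof serverMessage === "string" && serverMessage.length > 0) return serverMessage;
    }
    if (typeof candidate.message === "string" && candidate.message.length > 0) {
      return candidate.message;
    }
  }
  return fallback;
}
```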

Three Tool Patterns:

Pattern 1: Single-File Tools (Individual processing)

  • Backend processes one file per API call
  • Set multiFileEndpoint: false
  • Examples: Compress, Rotate

return useToolOperation({
  operationType: 'compress',
  endpoint: '/api/v1/misc/compress-pdf',
  buildFormData: (params, file: File) => { /* single file */ },
  multiFileEndpoint: false,
  filePrefix: 'compressed_'
});

Pattern 2: Multi-File Tools (Batch processing)

  • Backend accepts MultipartFile[] arrays in single API call
  • Set multiFileEndpoint: true
  • Examples: Split, Merge, Overlay

return useToolOperation({
  operationType: 'split',
  endpoint: '/api/v1/general/split-pages',
  buildFormData: (params, files: File[]) => { /* all files */ },
  multiFileEndpoint: true,
  filePrefix: 'split_'
});

Pattern 3: Complex Tools (Custom processing)

  • Tools with complex routing logic or non-standard processing
  • Provide customProcessor for full control
  • Examples: Convert, OCR

return useToolOperation({
  operationType: 'convert',
  customProcessor: async (params, files) => { /* custom logic */ },
  filePrefix: 'converted_'
});
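
The difference between the three patterns comes down to how many backend calls one run produces. The sketch below plans those calls from a config object. It is a simplified, hypothetical model (ToolConfig, PlannedCall, and planCalls are not names from the codebase), and the real useToolOperation hook is a React hook that also handles state, errors, and resources.

```typescript
// Hypothetical, reduced config: only the fields that decide the call pattern.
interface ToolConfig {
  endpoint?: string;
  multiFileEndpoint?: boolean;
  customProcessor?: (fileNames: string[]) => string[];
  filePrefix: string;
}

interface PlannedCall {
  endpoint: string;
  fileNames: string[];
}

// Decide which backend calls a run needs. Returns an empty list when a
// custom processor takes over (Pattern 3).
function planCalls(config: ToolConfig, fileNames: string[]): PlannedCall[] {
  if (config.customProcessor) {
    return [];
  }
  if (!config.endpoint) {
    throw new Error("endpoint is required when no customProcessor is given");
  }
  const endpoint = config.endpoint;
  if (config.multiFileEndpoint) {
    // Pattern 2: one call carrying every file.
    return [{ endpoint, fileNames }];
  }
  // Pattern 1: one call per file.
  return fileNames.map((fileName) => ({ endpoint, fileNames: [fileName] }));
}
```

With three input files, a single-file tool would plan three calls and a multi-file tool one call.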

Benefits:

  • No Timeouts: Operations run until completion (supports 100GB+ files)
  • Consistent: All tools follow same pattern and interface
  • Maintainable: Single responsibility hooks, easy to test and modify
  • i18n Ready: Built-in internationalization support
  • Type Safe: Full TypeScript support with generic interfaces
  • Memory Safe: Automatic resource cleanup and blob URL management

Architecture Overview

Project Structure

  • Backend: Spring Boot application with Thymeleaf templating
  • Frontend: React-based SPA in /frontend directory (Thymeleaf templates fully replaced)
    • File Storage: IndexedDB for client-side file persistence and thumbnails
    • Internationalization: JSON-based translations (converted from backend .properties)
  • PDF Processing: PDFBox for core PDF operations, LibreOffice for conversions, PDF.js for client-side rendering
  • Security: Spring Security with optional authentication (controlled by DOCKER_ENABLE_SECURITY)
  • Configuration: YAML-based configuration with environment variable overrides

Controller Architecture

  • API Controllers (src/main/java/.../controller/api/): REST endpoints for PDF operations
    • Organized by function: converters, security, misc, pipeline
    • Follow pattern: @RestController + @RequestMapping("/api/v1/...")
  • Web Controllers (src/main/java/.../controller/web/): Serve Thymeleaf templates
    • Pattern: @Controller + return template names

Key Components

  • SPDFApplication.java: Main application class with desktop UI and browser launching logic
  • ConfigInitializer: Handles runtime configuration and settings files
  • Pipeline System: Automated PDF processing workflows via PipelineController
  • Security Layer: Authentication, authorization, and user management (when enabled)

Component Architecture

  • React Components: Located in frontend/src/components/ and frontend/src/tools/
  • Static Assets: CSS, JS, and resources in src/main/resources/static/ (legacy) + frontend/public/ (modern)
  • Internationalization:
    • Backend: messages_*.properties files
    • Frontend: JSON files in frontend/public/locales/ (converted from .properties)
    • Conversion Script: scripts/convert_properties_to_json.py
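
To show the idea behind converting .properties translations to JSON, here is a simplified TypeScript sketch. It does not reproduce scripts/convert_properties_to_json.py: the function name is hypothetical, nesting keys on dots is an assumption about the output layout, and .properties features such as escapes, ":" separators, and line continuations are ignored.

```typescript
type TranslationTree = { [key: string]: string | TranslationTree };

// Simplified, hypothetical converter: "a.b=text" becomes { a: { b: "text" } }.
function propertiesToJson(source: string): TranslationTree {
  const root: TranslationTree = {};
  for (const rawLine of source.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments (# or !).
    if (line === "" || line.startsWith("#") || line.startsWith("!")) continue;

    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a key=value line

    const segments = line.slice(0, eq).trim().split(".");
    const leaf = segments.pop() as string; // split() always yields at least one item
    const value = line.slice(eq + 1).trim();

    let node = root;
    for (const segment of segments) {
      const child = node[segment];
      if (typeof child === "object") {
        node = child;
      } else {
        // Assumption: a nested key replaces an earlier plain value with the same name.
        const created: TranslationTree = {};
        node[segment] = created;
        node = created;
      }
    }
    node[leaf] = value;
  }
  return root;
}
```

For example, the lines home.title=Stirling-PDF and save=Save would become an object with a nested home.title entry and a top-level save entry.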

Configuration Modes

  • Ultra-lite: Basic PDF operations only
  • Standard: Full feature set
  • Fat: Pre-downloaded dependencies for air-gapped environments
  • Security Mode: Adds authentication, user management, and enterprise features

Testing Strategy

  • Integration Tests: Cucumber tests in testing/cucumber/
  • Docker Testing: test.sh validates all Docker variants
  • Manual Testing: No unit tests currently - relies on UI and API testing

Development Workflow

  1. Local Development:
    • Backend: ./gradlew bootRun (runs on localhost:8080)
    • Frontend: cd frontend && npm run dev (runs on localhost:5173, proxies to backend)
  2. Docker Testing: Use ./test.sh before submitting PRs
  3. Code Style: Spotless enforces Google Java Format automatically
  4. Translations:
    • Backend: Use helper scripts in /scripts for multi-language updates
    • Frontend: Update JSON files in frontend/public/locales/ or use conversion script
  5. Documentation: API docs auto-generated and available at /swagger-ui/index.html

Frontend Architecture Status

  • Core Status: React SPA architecture complete with multi-tool workflow support
  • State Management: FileContext handles all file operations and tool navigation
  • File Processing: Production-ready with memory management for large PDF workflows (up to 100GB+)
  • Tool Integration: Modular hook architecture with useToolOperation orchestrator
    • Individual hooks: useToolState, useToolApiCalls, useToolResources
    • Utilities: toolErrorHandler, toolResponseProcessor, toolOperationTracker
    • Pattern: Each tool creates focused operation hook, UI consumes state/actions
  • Preview System: Tool results can be previewed without polluting file context (Split tool example)
  • Performance: Web Worker thumbnails, IndexedDB persistence, background processing

Important Notes

  • Java Version: Minimum JDK 17, supports and recommends JDK 21
  • Lombok: Used extensively - ensure IDE plugin is installed
  • Desktop Mode: Set STIRLING_PDF_DESKTOP_UI=true for desktop application mode
  • File Persistence:
    • Backend: Designed to be stateless - files are processed in memory/temp locations only
    • Frontend: Uses IndexedDB for client-side file storage and caching (with thumbnails)
  • Security: When DOCKER_ENABLE_SECURITY=false, security-related classes are excluded from compilation
  • FileContext: All file operations MUST go through FileContext - never bypass with direct File handling
  • Memory Management: Manual cleanup required for PDF.js documents and blob URLs - don't remove cleanup code
  • Tool Development: New tools should follow useToolOperation hook pattern (see useCompressOperation.ts)
  • Performance Target: Must handle PDFs up to 100GB+ without browser crashes
  • Preview System: Tools can preview results without polluting main file context (see Split tool implementation)

Communication Style

  • Be direct and to the point
  • No apologies or conversational filler
  • Answer questions directly without preamble
  • Explain reasoning concisely when asked
  • Avoid unnecessary elaboration

Decision Making

  • Ask clarifying questions before making assumptions
  • Stop and ask when uncertain about project-specific details
  • Confirm approach before making structural changes
  • Request guidance on preferences (cross-platform vs specific tools, etc.)
  • Verify understanding of requirements before proceeding