mirror of https://github.com/Frooodle/Stirling-PDF.git synced 2026-05-01 23:16:31 +02:00

James Brunton 975f135217 Move engine/AGENTS.md into root AGENTS.md because Claude doesn't bother to read it (#6151)

# Description of Changes
Move `engine/AGENTS.md` into root `AGENTS.md` because Claude doesn't
bother to read it half the time.

2026-04-22 11:32:03 +02:00

AGENTS.md

This file provides guidance to AI agents working with code in this repository.

Taskfile (Recommended)

This project uses Task as a unified command runner. All build, dev, test, lint, and docker commands can be run from the repo root via task <command>. Run task --list to see all available commands.

Quick Reference

task install — install all dependencies
task dev — start backend + frontend concurrently
task dev:all — start backend + frontend + engine concurrently
task build — build all components
task test — run all tests (backend + frontend + engine)
task lint — run all linters
task format — auto-fix formatting across all components
task check — full quality gate (lint + typecheck + test)
task clean — clean all build artifacts
task docker:build — build standard Docker image
task docker:up — start Docker compose stack

Common Development Commands

Build and Test

Build project: task build
Run backend locally: task backend:dev
Run all tests: task test (or individually: task backend:test, task frontend:test, task engine:test)
Docker integration tests: ./test.sh (builds all Docker variants and runs comprehensive tests)
Code formatting: task format (or task backend:format for Java only)
Full quality gate: task check (runs lint + typecheck + test across all components)

After modifying any files in the project, you must run the relevant task check command that covers that area of the code. For example, when editing frontend files run task frontend:check; for Python engine files run task engine:check; for Java backend files run task backend:check.

Docker Development

Build standard: task docker:build (or docker build -t stirling-pdf -f docker/embedded/Dockerfile .)
Build fat version: task docker:build:fat
Build ultra-lite: task docker:build:ultra-lite
Start compose stack: task docker:up (or task docker:up:fat, task docker:up:ultra-lite)
Stop compose stack: task docker:down
View logs: task docker:logs
Example compose files: Located in exampleYmlFiles/ directory

Security Mode Development

Set DOCKER_ENABLE_SECURITY=true environment variable to enable security features during development. This is required for testing the full version locally.

Python Development (AI Engine)

The engine is a Python reasoning service for Stirling: it plans and interprets work, but it does not own durable state, and it does not execute Stirling PDF operations directly. Keep the service narrow: typed contracts in, typed contracts out, with AI only where it adds reasoning value. The frontend calls the Python engine via Java as a proxy.

Python Commands

All engine commands run from the repo root using Task:

task engine:check — run all checks (typecheck + lint + format-check + test)
task engine:fix — auto-fix lint + formatting
task engine:install — install Python dependencies via uv
task engine:dev — start FastAPI with hot reload (localhost:5001)
task engine:test — run pytest
task engine:lint — run ruff linting
task engine:typecheck — run pyright
task engine:format — format code with ruff
task engine:tool-models — generate tool_models.py from the Java OpenAPI spec

The project structure is defined in engine/pyproject.toml. Any new dependencies should be listed there, followed by running task engine:install.

Python Code Style

Keep task engine:check passing.
Use modern Python when it improves clarity.
Prefer explicit names to cleverness.
Avoid nested functions and nested classes unless the language construct requires them.
Prefer composition to inheritance when combining concepts.
Avoid speculative abstractions. Add a layer only when it removes real duplication or clarifies lifecycle.
Add comments sparingly and only when they explain non-obvious intent.

Python Typing and Models

Deserialize into Pydantic models as early as possible.
Serialize from Pydantic models as late as possible.
Do not pass raw dict[str, Any] or dict[str, object] across important boundaries when a typed model can exist instead.
Avoid Any wherever possible.
Avoid cast() wherever possible (reconsider the structure first).
All shared models should subclass stirling.models.ApiModel so the service behaves consistently.
Do not use string literals for any type annotations, including cast().

Python Configuration

Keep application-owned configuration in stirling.config.
Only add STIRLING_* environment variables that the engine itself truly owns.
Do not mirror third-party provider environment variables unless the engine is actually interpreting them.
Let pydantic-ai own provider authentication configuration when possible.

Python Architecture

Package roles:

stirling.contracts: request/response models and shared typed workflow contracts. If a shape crosses a module or service boundary, it probably belongs here.
stirling.models: shared model primitives and generated tool models.
stirling.agents: reasoning modules for individual capabilities.
stirling.api: HTTP layer, dependency access, and app startup wiring.
stirling.services: shared runtime and non-AI infrastructure.
stirling.config: application-owned settings.

Source of truth:

stirling.models.tool_models is the source of truth for operation IDs and parameter models.
Do not duplicate operation lists if they can be derived from tool_models.OPERATIONS.
Do not hand-maintain parallel parameter schemas when the generated tool models already define them.
If a tool ID must match a parameter model, validate that relationship explicitly in code.

Boundaries:

Keep the API layer thin. Route modules should bind requests, resolve dependencies, and call agents or services. They should not contain business logic.
Keep agents focused on one reasoning domain. They should not own FastAPI routing, persistence, or execution of Stirling operations.
Build long-lived runtime objects centrally at startup when possible rather than reconstructing heavy AI objects per request.
If an agent delegates to another agent, the delegated agent should remain the source of truth for its own domain output.

Python AI Usage

The system must work with any AI, including self-hosted models. Models must support structured outputs; beyond that requirement, minimise model-specific code.
Use AI for reasoning-heavy outputs, not deterministic glue.
Do not ask the model to invent data that Python can derive safely.
Do not fabricate fallback user-facing copy in code to hide incomplete model output.
AI output schemas should be impossible to instantiate incorrectly.
- Do not require the model to keep separate structures in sync. For example, instead of generating two lists which must be the same length, generate one list of a model containing the same data.
- Prefer Python to derive deterministic follow-up structure from a valid AI result.
Use NativeOutput(...) for structured model outputs.
Use ToolOutput(...) when the model should select and call delegate functions.

Python Testing

Test contracts directly.
Test agents directly where behaviour matters.
Test API routes as thin integration points.
Prefer dependency overrides or startup-state seams to monkeypatching random globals.

Frontend Development

Frontend dev server: task frontend:dev — requires backend on localhost:8080
Tech Stack: Vite + React + TypeScript + Mantine UI + TailwindCSS
Proxy Configuration: Vite proxies /api/* calls to backend (localhost:8080)
Build Process: DO NOT run build scripts manually - builds are handled by CI/CD pipelines
Package Installation: task frontend:install
Deployment Options:
- Desktop App: task desktop:build
- Web Server: task frontend:build then serve dist/ folder
- Development: task desktop:dev for desktop dev mode
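
As a rough illustration of the proxy rule above, a Vite dev-server configuration could look like the following. This is a sketch, not the repository's actual vite.config.ts: the target URL comes from this document, while the variable name devServerConfig and the changeOrigin setting are assumptions.

```typescript
// Sketch of a Vite dev-server proxy. In a real vite.config.ts this object
// would be the default export; here it is a plain constant so it can be
// inspected on its own.
const devServerConfig = {
  server: {
    proxy: {
      // Forward every request whose path starts with /api to the backend.
      "/api": {
        target: "http://localhost:8080",
        // Assumption: rewrite the Host header to match the target.
        changeOrigin: true,
      },
    },
  },
};
```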

Environment Variables

All VITE_* variables must be declared in the appropriate example file:
- frontend/config/.env.example — core, proprietary, and shared vars
- frontend/config/.env.saas.example — SaaS-only vars
- frontend/config/.env.desktop.example — desktop (Tauri)-only vars
Never use || 'hardcoded-fallback' inline — put defaults in the example files
task frontend:prepare / prepare:saas / prepare:desktop auto-create the env files from examples on first run, and error if any required keys are missing
Prepare runs automatically as a dependency of all dev*, build*, and desktop* tasks
See frontend/README.md#environment-variables for full documentation
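
One way to honour the no-inline-fallback rule is to read variables through an accessor that fails loudly when a key is missing. The sketch below is illustrative only: requireEnv, EnvSource, and VITE_API_BASE_URL are hypothetical names, and in real Vite code the env argument would be import.meta.env.

```typescript
// Hypothetical helper: read a VITE_* variable and throw if it is missing,
// instead of hiding the gap behind an inline `|| 'hardcoded-fallback'`.
type EnvSource = Record<string, string | undefined>;

function requireEnv(env: EnvSource, key: string): string {
  if (!key.startsWith("VITE_")) {
    throw new Error(`${key} is not a VITE_* variable`);
  }
  const value = env[key];
  if (value === undefined || value === "") {
    throw new Error(`Missing ${key}; declare it in the matching .env example file`);
  }
  return value;
}

// Usage with a stand-in for import.meta.env (the variable name is illustrative).
const sampleEnv: EnvSource = { VITE_API_BASE_URL: "http://localhost:8080" };
const apiBaseUrl = requireEnv(sampleEnv, "VITE_API_BASE_URL");
```

With this shape the default value lives in the example file and a missing key surfaces as an error rather than as silently wrong behaviour.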

Import Paths - CRITICAL

ALWAYS use @app/* for imports. Do not use @core/* or @proprietary/* unless explicitly wrapping/extending a lower layer implementation.

For a broader explanation of the frontend layering and override architecture, see frontend/DeveloperGuide.md.

// ✅ CORRECT - Use @app/* for all imports
import { AppLayout } from "@app/components/AppLayout";
import { useFileContext } from "@app/contexts/FileContext";
import { FileContext } from "@app/contexts/FileContext";

// ❌ WRONG - Do not use @core/* or @proprietary/* in normal code
import { AppLayout } from "@core/components/AppLayout";
import { useFileContext } from "@proprietary/contexts/FileContext";

Only use explicit aliases when:

Building a layer-specific override that wraps a lower layer's component
Example: import { AppProviders as CoreAppProviders } from "@core/components/AppProviders" when creating proprietary/AppProviders.tsx that extends the core version

The @app/* alias automatically resolves to the correct layer based on build target (core/proprietary/desktop) and handles the fallback cascade.

Component Override Pattern (Stub/Shadow)

Use this pattern for desktop-specific or proprietary-specific features WITHOUT runtime checks or conditionals.

How it works:

Core defines stub component (returns null or no-op)
Desktop/proprietary overrides with same path/name
Core imports via @app/* - higher layer "shadows" core in those builds
No @ts-ignore, no isTauri() checks, no runtime conditionals!

Example - Desktop-specific footer:

// core/components/rightRail/RightRailFooterExtensions.tsx (stub)
interface RightRailFooterExtensionsProps {
  className?: string;
}

export function RightRailFooterExtensions(_props: RightRailFooterExtensionsProps) {
  return null; // Stub - does nothing in web builds
}

// desktop/components/rightRail/RightRailFooterExtensions.tsx (real implementation)
import { Box } from '@mantine/core';
import { BackendHealthIndicator } from '@app/components/BackendHealthIndicator';

interface RightRailFooterExtensionsProps {
  className?: string;
}

export function RightRailFooterExtensions({ className }: RightRailFooterExtensionsProps) {
  return (
    <Box className={className}>
      <BackendHealthIndicator />
    </Box>
  );
}

// core/components/shared/RightRail.tsx (usage - works in ALL builds)
import { RightRailFooterExtensions } from '@app/components/rightRail/RightRailFooterExtensions';

export function RightRail() {
  return (
    <div>
      {/* In web builds: renders nothing (stub returns null) */}
      {/* In desktop builds: renders BackendHealthIndicator */}
      <RightRailFooterExtensions className="right-rail-footer" />
    </div>
  );
}

Build resolution:

Core build: @app/* → core/* → Gets stub (returns null)
Desktop build: @app/* → desktop/* → Gets real implementation (shadows core)
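
The resolution rules above can be modelled as a lookup that tries each layer in order and returns the first match. This is a simplified model, not the build tooling itself: resolveAppImport and LAYER_ORDER are hypothetical names, and the per-target layer order (in particular desktop falling back through proprietary to core) is an assumption.

```typescript
type BuildTarget = "core" | "proprietary" | "desktop";

// Assumed lookup order per build target, highest layer first.
const LAYER_ORDER: Record<BuildTarget, readonly string[]> = {
  core: ["core"],
  proprietary: ["proprietary", "core"],
  desktop: ["desktop", "proprietary", "core"],
};

// Resolve an "@app/..." specifier to the first layer that contains the file.
function resolveAppImport(
  specifier: string,
  target: BuildTarget,
  existingFiles: ReadonlySet<string>,
): string | undefined {
  const prefix = "@app/";
  if (!specifier.startsWith(prefix)) return undefined;
  const relativePath = specifier.slice(prefix.length);
  for (const layer of LAYER_ORDER[target]) {
    const candidate = `${layer}/${relativePath}`;
    if (existingFiles.has(candidate)) return candidate;
  }
  return undefined;
}

// The footer exists in core (stub) and desktop (real); RightRail only in core.
const layerFiles = new Set([
  "core/components/rightRail/RightRailFooterExtensions",
  "desktop/components/rightRail/RightRailFooterExtensions",
  "core/components/shared/RightRail",
]);
const footerSpecifier = "@app/components/rightRail/RightRailFooterExtensions";
const coreHit = resolveAppImport(footerSpecifier, "core", layerFiles);
const desktopHit = resolveAppImport(footerSpecifier, "desktop", layerFiles);
const fallbackHit = resolveAppImport("@app/components/shared/RightRail", "desktop", layerFiles);
```

In this model the desktop build picks up the desktop footer, while a file that exists only in core still resolves from a desktop build through the cascade.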

Benefits:

No runtime checks or feature flags
Type-safe across all builds
Clean, readable code
Build-time optimization (dead code elimination)

Multi-Tool Workflow Architecture

The frontend is designed for stateful document processing:

Users upload PDFs once, then chain tools (split → merge → compress → view)
File state and processing results persist across tool switches
No file reloading between tools - performance critical for large PDFs (up to 100GB+)

FileContext - Central State Management

Location: frontend/src/core/contexts/FileContext.tsx

Active files: Currently loaded PDFs and their variants
Tool navigation: Current mode (viewer/pageEditor/fileEditor/toolName)
Memory management: PDF document cleanup, blob URL lifecycle, Web Worker management
IndexedDB persistence: File storage with thumbnail caching
Preview system: Tools can preview results (e.g., Split → Viewer → back to Split) without context pollution

Critical: All file operations go through FileContext. Don't bypass with direct file handling.

Processing Services

enhancedPDFProcessingService: Background PDF parsing and manipulation
thumbnailGenerationService: Web Worker-based with main-thread fallback
fileStorage: IndexedDB with LRU cache management
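
To make the LRU idea concrete, here is a minimal in-memory sketch of least-recently-used eviction. It is not the fileStorage implementation and does not touch IndexedDB; LruCache is a hypothetical class that relies on Map preserving insertion order.

```typescript
// Minimal LRU cache: the least recently used entry is evicted when the
// cache grows past its capacity.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }
}

// Usage: a two-entry thumbnail cache (file names are illustrative).
const thumbnails = new LruCache<string, string>(2);
thumbnails.set("a.pdf", "thumb-a");
thumbnails.set("b.pdf", "thumb-b");
thumbnails.get("a.pdf"); // a.pdf becomes the most recently used entry
thumbnails.set("c.pdf", "thumb-c"); // evicts b.pdf, the least recently used
```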

Memory Management Strategy

Why manual cleanup exists: Large PDFs (up to 100GB+) passing through multiple tools accumulate:

PDF.js documents that need explicit .destroy() calls
Blob URLs from tool outputs that need revocation
Web Workers that need termination

Without cleanup, these resources leak memory until the browser crashes.
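
The three resource kinds above share one shape: each has a release step that must run once. A sketch of a registry that collects those steps and runs them together follows. CleanupRegistry and ReleaseFn are hypothetical names, not the repository's API, and the real release steps are replaced here by log entries so the sketch runs outside a browser.

```typescript
// A release step, e.g. () => pdfDocument.destroy(),
// () => URL.revokeObjectURL(url), or () => worker.terminate().
type ReleaseFn = () => void;

class CleanupRegistry {
  private readonly pending: ReleaseFn[] = [];

  track(release: ReleaseFn): void {
    this.pending.push(release);
  }

  // Run every release step once, newest first, and keep going if one throws.
  // Returns the errors that were caught so callers can report them.
  releaseAll(): unknown[] {
    const errors: unknown[] = [];
    while (this.pending.length > 0) {
      const release = this.pending.pop() as ReleaseFn;
      try {
        release();
      } catch (error) {
        errors.push(error);
      }
    }
    return errors;
  }
}

// Usage with stand-in release steps that only record what happened.
const releasedLog: string[] = [];
const cleanupRegistry = new CleanupRegistry();
cleanupRegistry.track(() => { releasedLog.push("pdf document destroyed"); });
cleanupRegistry.track(() => { releasedLog.push("blob URL revoked"); });
cleanupRegistry.track(() => { throw new Error("simulated cleanup failure"); });
cleanupRegistry.track(() => { releasedLog.push("worker terminated"); });
const cleanupErrors = cleanupRegistry.releaseAll();
```

Collecting errors instead of rethrowing means one failing release step does not leave the remaining resources allocated.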

Tool Development

Architecture: Modular hook-based system with clear separation of concerns:

useToolOperation (frontend/src/core/hooks/tools/shared/useToolOperation.ts): Main orchestrator hook
- Coordinates all tool operations with consistent interface
- Integrates with FileContext for operation tracking
- Handles validation, error handling, and UI state management
Supporting Hooks:
- useToolState: UI state management (loading, progress, error, files)
- useToolApiCalls: HTTP requests and file processing
- useToolResources: Blob URLs, thumbnails, ZIP downloads
Utilities:
- toolErrorHandler: Standardized error extraction and i18n support
- toolResponseProcessor: API response handling (single/zip/custom)
- toolOperationTracker: FileContext integration utilities

Three Tool Patterns:

Pattern 1: Single-File Tools (Individual processing)

Backend processes one file per API call
Set multiFileEndpoint: false
Examples: Compress, Rotate

return useToolOperation({
  operationType: 'compress',
  endpoint: '/api/v1/misc/compress-pdf',
  buildFormData: (params, file: File) => { /* single file */ },
  multiFileEndpoint: false,
});

Pattern 2: Multi-File Tools (Batch processing)

Backend accepts MultipartFile[] arrays in a single API call
Set multiFileEndpoint: true
Examples: Split, Merge, Overlay

return useToolOperation({
  operationType: 'split',
  endpoint: '/api/v1/general/split-pages',
  buildFormData: (params, files: File[]) => { /* all files */ },
  multiFileEndpoint: true,
  filePrefix: 'split_',
});

Pattern 3: Complex Tools (Custom processing)

Tools with complex routing logic or non-standard processing
Provide customProcessor for full control
Examples: Convert, OCR

return useToolOperation({
  operationType: 'convert',
  customProcessor: async (params, files) => { /* custom logic */ },
});
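
The practical difference between the three patterns is how many backend requests a batch of files produces. The sketch below models only that decision. planRequests, ToolRouting, RequestPlan, and InputFile are hypothetical names, and this is not how useToolOperation is implemented internally.

```typescript
interface InputFile {
  name: string;
}

interface ToolRouting {
  multiFileEndpoint?: boolean;
  hasCustomProcessor?: boolean;
}

type RequestPlan =
  | { kind: "custom" }
  | { kind: "requests"; batches: InputFile[][] };

// Decide how a set of files maps onto backend requests.
function planRequests(routing: ToolRouting, files: InputFile[]): RequestPlan {
  // Pattern 3: the tool's custom processor takes full control.
  if (routing.hasCustomProcessor) return { kind: "custom" };
  // Pattern 2: one request carrying every file.
  if (routing.multiFileEndpoint) return { kind: "requests", batches: [files] };
  // Pattern 1: one request per file.
  return { kind: "requests", batches: files.map((file) => [file]) };
}

const uploads: InputFile[] = [{ name: "a.pdf" }, { name: "b.pdf" }, { name: "c.pdf" }];
const compressPlan = planRequests({ multiFileEndpoint: false }, uploads);
const splitPlan = planRequests({ multiFileEndpoint: true }, uploads);
const convertPlan = planRequests({ hasCustomProcessor: true }, uploads);
```

For three uploaded files, the single-file plan holds three batches of one file each, the multi-file plan holds one batch of three, and the custom plan defers entirely to the tool.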

Benefits:

No Timeouts: Operations run until completion (supports 100GB+ files)
Consistent: All tools follow same pattern and interface
Maintainable: Single responsibility hooks, easy to test and modify
i18n Ready: Built-in internationalization support
Type Safe: Full TypeScript support with generic interfaces
Memory Safe: Automatic resource cleanup and blob URL management

Architecture Overview

Project Structure

Backend: Spring Boot application
Frontend: React-based SPA in /frontend directory
- File Storage: IndexedDB for client-side file persistence and thumbnails
- Internationalization: JSON-based translations (converted from backend .properties)
PDF Processing: PDFBox for core PDF operations, LibreOffice for conversions, PDF.js for client-side rendering
Security: Spring Security with optional authentication (controlled by DOCKER_ENABLE_SECURITY)
Configuration: YAML-based configuration with environment variable overrides

Controller Architecture

API Controllers (src/main/java/.../controller/api/): REST endpoints for PDF operations
- Organized by function: converters, security, misc, pipeline
- Follow pattern: @RestController + @RequestMapping("/api/v1/...")

Key Components

SPDFApplication.java: Main application class with desktop UI and browser launching logic
ConfigInitializer: Handles runtime configuration and settings files
Pipeline System: Automated PDF processing workflows via PipelineController
Security Layer: Authentication, authorization, and user management (when enabled)

Frontend Directory Structure

The frontend is organized with a clear separation of concerns:

frontend/src/core/: Main application code (shared, production-ready components)
- core/components/: React components organized by feature
  - core/components/tools/: Individual PDF tool implementations
  - core/components/viewer/: PDF viewer components
  - core/components/pageEditor/: Page manipulation UI
  - core/components/tooltips/: Help tooltips for tools
  - core/components/shared/: Reusable UI components
- core/contexts/: React Context providers
  - FileContext.tsx: Central file state management
  - file/: File reducer and selectors
  - toolWorkflow/: Tool workflow state
- core/hooks/: Custom React hooks
  - hooks/tools/: Tool-specific operation hooks (one directory per tool)
  - hooks/tools/shared/: Shared hook utilities (useToolOperation, etc.)
- core/constants/: Application constants and configuration
- core/data/: Static data (tool taxonomy, etc.)
- core/services/: Business logic services (PDF processing, storage, etc.)
frontend/src/desktop/: Desktop-specific (Tauri) code
frontend/src/proprietary/: Proprietary/licensed features
frontend/src-tauri/: Tauri (Rust) native desktop application code
frontend/public/: Static assets served directly
- public/locales/: Translation JSON files

Component Architecture

Static Assets: CSS, JS, and resources in src/main/resources/static/ (legacy) + frontend/public/ (modern)
Internationalization:
- Backend: messages_*.properties files
- Frontend: JSON files in frontend/public/locales/ (converted from .properties)
- Conversion Script: scripts/convert_properties_to_json.py

Configuration Modes

Ultra-lite: Basic PDF operations only
Standard: Full feature set
Fat: Pre-downloaded dependencies for air-gapped environments
Security Mode: Adds authentication, user management, and enterprise features

Testing Strategy

Integration Tests: Cucumber tests in testing/cucumber/
Docker Testing: test.sh validates all Docker variants
Manual Testing: No unit tests currently - relies on UI and API testing

Development Workflow

Local Development (using Taskfile):
- Backend + frontend: task dev
- All services (including AI engine): task dev:all
- Or individually: task backend:dev (localhost:8080), task frontend:dev (localhost:5173), task engine:dev (localhost:5001)
Quality Gate: Run task check before submitting PRs
Docker Testing: Use ./test.sh for full Docker integration tests
Code Style: Spotless enforces Google Java Format automatically (task backend:format)
Translations:
- Backend: Use helper scripts in /scripts for multi-language updates
- Frontend: Update JSON files in frontend/public/locales/ or use conversion script
Documentation: API docs auto-generated and available at /swagger-ui/index.html

Frontend Architecture Status

Core Status: React SPA architecture complete with multi-tool workflow support
State Management: FileContext handles all file operations and tool navigation
File Processing: Production-ready with memory management for large PDF workflows (up to 100GB+)
Tool Integration: Modular hook architecture with useToolOperation orchestrator
- Individual hooks: useToolState, useToolApiCalls, useToolResources
- Utilities: toolErrorHandler, toolResponseProcessor, toolOperationTracker
- Pattern: Each tool creates focused operation hook, UI consumes state/actions
Preview System: Tool results can be previewed without polluting file context (Split tool example)
Performance: Web Worker thumbnails, IndexedDB persistence, background processing

Translation Rules

CRITICAL: Always update translations in en-GB only, never en-US
Translation files are located in frontend/public/locales/

Important Notes

Java Version: JDK 21 is the minimum; JDK 25 is supported and recommended
Lombok: Used extensively - ensure IDE plugin is installed
File Persistence:
- Backend: Designed to be stateless - files are processed in memory/temp locations only
- Frontend: Uses IndexedDB for client-side file storage and caching (with thumbnails)
Security: When DOCKER_ENABLE_SECURITY=false, security-related classes are excluded from compilation
Import Paths: ALWAYS use @app/* for imports - never use @core/* or @proprietary/* unless explicitly wrapping/extending a lower layer
FileContext: All file operations MUST go through FileContext - never bypass with direct File handling
Memory Management: Manual cleanup required for PDF.js documents and blob URLs - don't remove cleanup code
Tool Development: New tools should follow useToolOperation hook pattern (see useCompressOperation.ts)
Performance Target: Must handle PDFs up to 100GB+ without browser crashes
Preview System: Tools can preview results without polluting main file context (see Split tool implementation)
Adding Tools: See ADDING_TOOLS.md for complete guide to creating new PDF tools

Communication Style

Be direct and to the point
No apologies or conversational filler
Answer questions directly without preamble
Explain reasoning concisely when asked
Avoid unnecessary elaboration

Decision Making

Ask clarifying questions before making assumptions
Stop and ask when uncertain about project-specific details
Confirm approach before making structural changes
Request guidance on preferences (cross-platform vs specific tools, etc.)
Verify understanding of requirements before proceeding
