diff --git a/CLAUDE.md b/CLAUDE.md
index 05bfb5254..8bdd7c235 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -25,23 +25,54 @@ Set `DOCKER_ENABLE_SECURITY=true` environment variable to enable security featur
 - **Proxy Configuration**: Vite proxies `/api/*` calls to backend (localhost:8080)
 - **Build Process**: DO NOT run build scripts manually - builds are handled by CI/CD pipelines
 - **Package Installation**: DO NOT run npm install commands - package management handled separately
+- **Deployment Options**:
+  - **Desktop App**: `npm run tauri-build` (native desktop application)
+  - **Web Server**: `npm run build` then serve dist/ folder
+  - **Development**: `npm run tauri-dev` for desktop dev mode
 
-#### Tailwind CSS Setup (if not already installed)
-```bash
-cd frontend
-npm install -D tailwindcss postcss autoprefixer
-npx tailwindcss init -p
-```
+#### Multi-Tool Workflow Architecture
+Frontend designed for **stateful document processing**:
+- Users upload PDFs once, then chain tools (split → merge → compress → view)
+- File state and processing results persist across tool switches
+- No file reloading between tools - performance critical for large PDFs (up to 100GB+)
+
+#### FileContext - Central State Management
+**Location**: `src/contexts/FileContext.tsx`
+- **Active files**: Currently loaded PDFs and their variants
+- **Tool navigation**: Current mode (viewer/pageEditor/fileEditor/toolName)
+- **Memory management**: PDF document cleanup, blob URL lifecycle, Web Worker management
+- **IndexedDB persistence**: File storage with thumbnail caching
+- **Preview system**: Tools can preview results (e.g., Split → Viewer → back to Split) without context pollution
+
+**Critical**: All file operations go through FileContext. Don't bypass with direct file handling.
+
+#### Processing Services
+- **enhancedPDFProcessingService**: Background PDF parsing and manipulation
+- **thumbnailGenerationService**: Web Worker-based with main-thread fallback
+- **fileStorage**: IndexedDB with LRU cache management
+
+#### Memory Management Strategy
+**Why manual cleanup exists**: Large PDFs (up to 100GB+) through multiple tools accumulate:
+- PDF.js documents that need explicit .destroy() calls
+- Blob URLs from tool outputs that need revocation
+- Web Workers that need termination
+Without cleanup: browser crashes with memory leaks.
+
+#### Tool Development
+- **Pattern**: Follow `src/tools/Split.tsx` as reference implementation
+- **File Access**: Tools receive `selectedFiles` prop (computed from activeFiles based on user selection)
+- **File Selection**: Users select files in FileEditor (tool mode) → stored as IDs → computed to File objects for tools
+- **Integration**: All files are part of FileContext ecosystem - automatic memory management and operation tracking
+- **Parameters**: Tool parameter handling patterns still being standardized
+- **Preview Integration**: Tools can implement preview functionality (see Split tool's thumbnail preview)
 
 ## Architecture Overview
 
 ### Project Structure
 - **Backend**: Spring Boot application with Thymeleaf templating
-- **Frontend**: React-based SPA in `/frontend` directory (replacing legacy Thymeleaf templates)
-  - **Current Status**: Active development to replace Thymeleaf UI with modern React SPA
+- **Frontend**: React-based SPA in `/frontend` directory (Thymeleaf templates fully replaced)
   - **File Storage**: IndexedDB for client-side file persistence and thumbnails
   - **Internationalization**: JSON-based translations (converted from backend .properties)
-  - **URL Parameters**: Deep linking support for tool states and configurations
 - **PDF Processing**: PDFBox for core PDF operations, LibreOffice for conversions, PDF.js for client-side rendering
 - **Security**: Spring Security with optional authentication (controlled by `DOCKER_ENABLE_SECURITY`)
 - **Configuration**: YAML-based configuration with environment variable overrides
@@ -59,9 +90,8 @@ npx tailwindcss init -p
 - **Pipeline System**: Automated PDF processing workflows via `PipelineController`
 - **Security Layer**: Authentication, authorization, and user management (when enabled)
 
-### Template System (Legacy + Modern)
-- **Legacy Thymeleaf Templates**: Located in `src/main/resources/templates/` (being phased out)
-- **Modern React Components**: Located in `frontend/src/components/` and `frontend/src/tools/`
+### Component Architecture
+- **React Components**: Located in `frontend/src/components/` and `frontend/src/tools/`
 - **Static Assets**: CSS, JS, and resources in `src/main/resources/static/` (legacy) + `frontend/public/` (modern)
 - **Internationalization**:
   - Backend: `messages_*.properties` files
@@ -91,13 +121,14 @@ npx tailwindcss init -p
    - Frontend: Update JSON files in `frontend/public/locales/` or use conversion script
 5. **Documentation**: API docs auto-generated and available at `/swagger-ui/index.html`
 
-## Frontend Migration Notes
+## Frontend Architecture Status
 
-- **Current Branch**: `feature/react-overhaul` - Active React SPA development
-- **Migration Status**: Core tools (Split, Merge, Compress) converted to React with URL parameter support
-- **File Management**: Implemented IndexedDB storage with thumbnail generation using PDF.js
-- **Tools Architecture**: Each tool receives `params` and `updateParams` for URL state synchronization
-- **Remaining Work**: Convert remaining Thymeleaf templates to React components
+- **Core Status**: React SPA architecture complete with multi-tool workflow support
+- **State Management**: FileContext handles all file operations and tool navigation
+- **File Processing**: Production-ready with memory management for large PDF workflows (up to 100GB+)
+- **Tool Integration**: Standardized tool interface - see `src/tools/Split.tsx` as reference
+- **Preview System**: Tool results can be previewed without polluting file context (Split tool example)
+- **Performance**: Web Worker thumbnails, IndexedDB persistence, background processing
 
 ## Important Notes
 
@@ -108,6 +139,11 @@ npx tailwindcss init -p
 - **Backend**: Designed to be stateless - files are processed in memory/temp locations only
 - **Frontend**: Uses IndexedDB for client-side file storage and caching (with thumbnails)
 - **Security**: When `DOCKER_ENABLE_SECURITY=false`, security-related classes are excluded from compilation
+- **FileContext**: All file operations MUST go through FileContext - never bypass with direct File handling
+- **Memory Management**: Manual cleanup required for PDF.js documents and blob URLs - don't remove cleanup code
+- **Tool Development**: New tools should follow Split tool pattern (`src/tools/Split.tsx`)
+- **Performance Target**: Must handle PDFs up to 100GB+ without browser crashes
+- **Preview System**: Tools can preview results without polluting main file context (see Split tool implementation)
 
 ## Communication Style
 - Be direct and to the point
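
Note on the "Memory Management Strategy" section added above: the diff says PDF.js documents, blob URLs, and Web Workers each need explicit cleanup, but it does not show how `FileContext` tracks them. The sketch below is one possible shape for that bookkeeping, not the project's actual code. `ResourceRegistry`, `track`, `release`, and `releaseAll` are hypothetical names introduced here for illustration. Cleanup callbacks are registered per file ID and run together when the file is removed.

```typescript
// Hypothetical sketch: none of these names come from FileContext.tsx.
type Cleanup = () => void;

class ResourceRegistry {
  // Cleanup callbacks grouped by the ID of the file that owns the resource.
  private readonly cleanups = new Map<string, Cleanup[]>();
  // Errors thrown by cleanup callbacks, kept for later inspection.
  readonly errors: unknown[] = [];

  // Record how to release one resource owned by `fileId`.
  track(fileId: string, cleanup: Cleanup): void {
    const list = this.cleanups.get(fileId) ?? [];
    list.push(cleanup);
    this.cleanups.set(fileId, list);
  }

  // Run and forget every cleanup for one file; returns how many were run.
  release(fileId: string): number {
    const list = this.cleanups.get(fileId) ?? [];
    this.cleanups.delete(fileId);
    for (const cleanup of list) {
      try {
        cleanup();
      } catch (err) {
        // One failing cleanup must not prevent the remaining ones from running.
        this.errors.push(err);
      }
    }
    return list.length;
  }

  // Release every tracked file, e.g. when the context unmounts.
  releaseAll(): number {
    let total = 0;
    for (const fileId of Array.from(this.cleanups.keys())) {
      total += this.release(fileId);
    }
    return total;
  }

  // Number of files that still own at least one tracked resource.
  trackedFileCount(): number {
    return this.cleanups.size;
  }
}

// In a browser the callbacks would wrap the real APIs, for example:
//   registry.track(id, () => { void pdfDocument.destroy(); });   // PDF.js document
//   registry.track(id, () => URL.revokeObjectURL(blobUrl));      // blob URL
//   registry.track(id, () => worker.terminate());                // Web Worker
```

Keying cleanups by file ID means removing a file releases everything it produced across tools in one call, which matches the diff's requirement that all file operations go through a single owner.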