# Description of Changes
Have the Java side send the list of enabled endpoints to the AI engine,
so it can tell the user that a tool exists but is disabled on the
server and therefore can't actually run the operation. Previously the
engine sent the API call back regardless, and execution then failed
with a 503 error because the URL was disabled.
<img width="380" height="208" alt="image"
src="https://github.com/user-attachments/assets/5842fb2e-2e55-45a5-8205-25515636daae"
/>
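On the engine side, the check can be as simple as partitioning the known tools against the enabled-endpoint list before the model sees them. A minimal sketch (function and field names here are illustrative, not the actual implementation):

```python
# Illustrative sketch only: the real payload and field names may differ.
def classify_tools(all_tools: dict[str, str], enabled_endpoints: set[str]) -> dict:
    """Split known tools into runnable vs. disabled-on-this-server.

    all_tools maps tool name -> endpoint path; enabled_endpoints is the
    set the Java side now sends with each orchestration request.
    """
    runnable = {name: ep for name, ep in all_tools.items() if ep in enabled_endpoints}
    disabled = {name: ep for name, ep in all_tools.items() if ep not in enabled_endpoints}
    return {"runnable": runnable, "disabled": disabled}
```

With this split available up front, the engine can tell the model that a disabled tool exists but cannot run, instead of attempting the call and surfacing a 503.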
---------
Co-authored-by: EthanHealy01 <80844253+EthanHealy01@users.noreply.github.com>
# Description of Changes
Flesh out the RAG system and connect it to the PDF Question Agent so it
can answer questions about extremely large PDFs.
I'd expect lots more work will need to be done to finish off the RAG
system to really be what we need, but this should be a reasonable start
which will let us connect it to tools and have the ingestion mostly
handled automatically. I'm leaving file deletion and proper file ID
management to be done in a future PR. We also need to consider whether
all tools should retrieve content exclusively via RAG, or whether it's
beneficial to have tools sometimes fetch the direct content and other
times fetch it from RAG.
A diagram of the expected interaction is as follows:
```mermaid
sequenceDiagram
autonumber
actor U as User
participant FE as Frontend<br/>(ChatPanel)
participant J as Java<br/>(AiWorkflowService)
participant O as Engine:<br/>OrchestratorAgent
participant QA as Engine:<br/>PdfQuestionAgent
participant RAG as Engine:<br/>RagService + SqliteVecStore
participant V as VoyageAI<br/>(embeddings)
participant L as LLM<br/>(Claude / etc.)
U->>FE: types "Summarise this PDF"<br/>(PDF already uploaded)
FE->>J: POST /api/v1/ai/orchestrate/stream<br/>multipart: fileInputs[], userMessage
Note over J: ByteHashFileIdStrategy<br/>id = sha256(bytes)[:16]
J->>O: POST /api/v1/orchestrator<br/>{ files:[{id,name}], userMessage }
O->>L: route via fast model
L-->>O: delegate_pdf_question
O->>QA: PdfQuestionRequest
loop for each file
QA->>RAG: has_collection(file.id)
RAG-->>QA: false
end
QA-->>O: NeedIngestResponse(files_to_ingest)
O-->>J: { outcome:"need_ingest", filesToIngest:[...] }
Note over J: onNeedIngest
loop per file
J->>J: PDFBox: extract page text
J->>O: POST /api/v1/rag/documents<br/>(long-running timeout)
O->>RAG: chunk + stage documents
O->>V: embed_documents (batches of 256)
V-->>O: embeddings
O->>RAG: add_documents
O-->>J: { chunks_indexed: N }
end
Note over J: retry with resumeWith=pdf_question
J->>O: POST /api/v1/orchestrator
Note over O: fast-path to PdfQuestionAgent
O->>QA: PdfQuestionRequest
Note over QA: build RagCapability<br/>pinned to file IDs
QA->>L: run(prompt) with search_knowledge tool
loop up to max_searches
L->>QA: search_knowledge(query)
QA->>V: embed_query
V-->>QA: query vector
QA->>RAG: search(vector, collections=[file.id])
RAG-->>QA: top-k chunks
QA-->>L: formatted chunks
end
Note over QA: once budget spent,<br/>prepare() hides the tool
L-->>QA: PdfQuestionAnswerResponse
QA-->>O: answer
O-->>J: { outcome:"answer", answer, evidence }
J-->>FE: SSE "result"
FE->>U: assistant bubble
```
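The file-ID step in the diagram (`id = sha256(bytes)[:16]`) needs nothing beyond the standard library; a sketch of what `ByteHashFileIdStrategy` presumably boils down to:

```python
import hashlib

def file_id(data: bytes) -> str:
    """Derive a stable 16-hex-char ID from file bytes, as in the diagram:
    id = sha256(bytes)[:16]. Identical bytes always map to the same ID,
    so a re-upload of the same PDF resolves to the same RAG collection."""
    return hashlib.sha256(data).hexdigest()[:16]
```

Content addressing like this is what lets `has_collection(file.id)` short-circuit ingestion for files the engine has already indexed.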
# Description of Changes
Adds the ability for the Edit agent to request the content of the
document before it decides which parameters it needs. This enables it
to process requests like `Split the document after the page containing
the "My Section" section`, allowing document-context-based requests for
all[^1] tools.
I had to make a few changes elsewhere to make this work, including:
- Moved the requesting of content out of the Question Agent and into a
common location
- Added specific API docs for the Split param, because the generic ones
were not specific enough for the AI to reliably perform the correct
operation
- Fixed an issue in the tool models generator which caused the Redact
params to be only half-generated (causing Pydantic to crash when the AI
tried to run Redact)
- Added missing logging to a bunch of tools and hooked it up properly so
it prints to stderr
- Made the limits for the max pages/chars to extract from PDFs
configurable via env var
[^1]: Many of the tools can't yet do anything useful with the context.
Once the tool API is extended with features like page-specific
operations, they will automatically be able to handle context-aware
requests without further changes to the Edit agent itself.
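The configurable extraction limits mentioned above can be read the usual way, with env vars falling back to defaults. A sketch (the variable names and default values here are placeholders, not necessarily the real ones):

```python
import os

def extraction_limits() -> tuple[int, int]:
    """Read max pages / max chars for PDF text extraction from env vars,
    falling back to defaults when unset. Variable names and defaults
    below are illustrative; check the actual config for the real ones."""
    max_pages = int(os.environ.get("AI_MAX_EXTRACT_PAGES", "50"))
    max_chars = int(os.environ.get("AI_MAX_EXTRACT_CHARS", "100000"))
    return max_pages, max_chars
```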
# Description of Changes
We keep adding entries to `engine/config/.env.example` and having to
manually update `.env` to match, which is really clunky, especially
when working on multiple worktrees at once. This PR changes the setup
so that we commit a default `.env` file and use an `.env.local`
override for the actual private keys, which should make it a bit easier
to manage.
> [!WARNING]
>
> After this goes in, be very careful for a little while not to
> accidentally commit any keys that you've got inside your `.env` file!
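Conceptually, the precedence is "committed `.env` supplies defaults, `.env.local` wins where both define a key." A pure-stdlib sketch of that merge (the engine presumably uses python-dotenv or similar, which also handles quoting, `export` prefixes, etc.):

```python
from pathlib import Path

def load_env_files(*paths: str) -> dict[str, str]:
    """Merge simple KEY=VALUE files in order; later files win.
    Blank lines, comments, and malformed lines are skipped."""
    merged: dict[str, str] = {}
    for path in paths:
        p = Path(path)
        if not p.exists():
            continue  # .env.local is optional
        for line in p.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            merged[key.strip()] = value.strip()
    return merged
```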
# Description of Changes
Add an extra parameter to every agent so it receives the conversation
history in addition to the current message. This makes it possible for
the AI to answer follow-up questions without the user having to restate
the full context in each message.
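Shape-wise, the change amounts to each agent request carrying prior turns alongside the current message. A hypothetical sketch of the request shape (all names here are illustrative, not the actual models):

```python
from dataclasses import dataclass, field

@dataclass
class ChatTurn:
    role: str      # "user" or "assistant"
    content: str

@dataclass
class AgentRequest:
    user_message: str
    # New: prior turns, so follow-up questions resolve without the
    # user restating full context each time.
    history: list[ChatTurn] = field(default_factory=list)

def build_prompt(req: AgentRequest) -> str:
    """Flatten history plus the current message into one prompt string."""
    lines = [f"{t.role}: {t.content}" for t in req.history]
    lines.append(f"user: {req.user_message}")
    return "\n".join(lines)
```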
# Description of Changes
Redesign the AI engine so that it autogenerates the `tool_models.py`
file from the OpenAPI spec, giving the Python side access to the Java
API parameters and the full list of Java tools it can run. CI ensures
that whenever someone modifies a tool endpoint, the AI engine tool
models get updated as well (the dev is told to run `task
engine:tool-models`).
There are lots of advantages to having the Java side be the one that
actually executes the tools, rather than the frontend (as the previous
setup theoretically intended):
- The AI gets much better descriptions of the params from the API docs
- It'll be usable headless in the future, so a Java daemon could run to
execute ops on files in a folder without the UI running
- The Java side already has all the logic it needs to execute the tools
- We don't need to parse the TypeScript to find the API (which is hard
because the TS wasn't designed to be machine-read to extract the API)
I've also hooked up the prototype frontend to confirm it's working
properly, and built it in a way that lets all the tool names be
translated properly, which was always an issue with previous prototypes
of this.
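The generator's core idea is mechanical: walk the OpenAPI schema for each tool endpoint and emit a typed model per operation. The real generator emits Pydantic models into `tool_models.py`; the stdlib sketch below illustrates the same mapping with dataclasses (type coverage and names are simplified assumptions):

```python
from dataclasses import make_dataclass, field

# Minimal OpenAPI-type -> Python-type mapping; the real generator covers
# far more (enums, arrays, nested objects, descriptions, validation).
_TYPE_MAP = {"string": str, "integer": int, "number": float, "boolean": bool}

def model_from_schema(name: str, schema: dict):
    """Build a dataclass from an OpenAPI object-schema fragment."""
    required = set(schema.get("required", []))
    fields = []
    for prop, spec in schema.get("properties", {}).items():
        py_type = _TYPE_MAP.get(spec.get("type", "string"), str)
        if prop in required:
            fields.append((prop, py_type))
        else:
            fields.append((prop, py_type, field(default=spec.get("default"))))
    # Required fields must precede optional ones in a dataclass;
    # stable sort keeps the schema's relative order within each group.
    fields.sort(key=lambda f: len(f) > 2)
    return make_dataclass(name, fields)
```

This is also why half-generated params (as in the Redact bug above) crash at runtime: the model class ends up missing fields the AI then tries to pass.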
---------
Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
Co-authored-by: EthanHealy01 <80844253+EthanHealy01@users.noreply.github.com>
## Add Taskfile for unified dev workflow
### Summary
- Introduces [Taskfile](https://taskfile.dev/) as the single CLI entry
point for all development workflows across backend, frontend, engine,
Docker, and desktop
- ~80 tasks organized into 6 namespaces: `backend:`, `frontend:`,
`engine:`, `docker:`, `desktop:`, plus root-level composites
- All CI workflows migrated to use Task
- Deletes `engine/Makefile` and `scripts/build-tauri-jlink.{sh,bat}` —
replaced by Task equivalents
- Removes redundant npm scripts (`dev`, `build`, `prep`, `lint`, `test`,
`typecheck:all`) from `package.json`
- Smart dependency caching: `sources`/`status`/`generates`
fingerprinting, CI-aware `npm ci` vs `npm install`, `run: once` for
parallel dep deduplication
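For readers unfamiliar with Task's fingerprinting: a task re-runs only when its `sources` have changed relative to its `generates`, and `run: once` deduplicates a shared dependency across parallel callers. A hypothetical fragment showing the pattern (not copied from this PR's actual Taskfile):

```yaml
# Illustrative Taskfile.yml fragment; real task names and paths differ.
tasks:
  frontend:install:
    run: once                  # dedup when several tasks depend on this
    sources:
      - frontend/package-lock.json
    generates:
      - frontend/node_modules/.package-lock.json
    cmds:
      - npm install --prefix frontend   # CI swaps this for `npm ci`
```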
### What this does NOT do
- Does not replace Gradle, npm, or Docker — Taskfile is a thin
orchestration wrapper
- Does not change application code or behavior
### Install
```shell
npm install -g @go-task/cli # or: brew install go-task, winget install Task.Task
```
### Quick start
```shell
task --list # discover all tasks
task install # install all deps
task dev # start backend + frontend
task dev:all # also start AI engine
task test # run all tests
task check # quick quality gate (local dev)
task check:all # full CI quality gate
```
### Test plan
- [ ] Install `task` CLI and run `task --list` — verify all tasks
display
- [ ] Run `task install` — verify frontend + engine deps install
- [ ] Run `task dev` — verify backend + frontend start, Ctrl+C exits
cleanly
- [ ] Run `task frontend:check` — verify typecheck + lint + test pass
- [ ] Run `task desktop:dev` — verify jlink builds are cached on second
run
- [ ] Verify CI passes on all workflows
---------
Co-authored-by: James Brunton <jbrunton96@gmail.com>
Upgrade fastmcp, aiohttp, cryptography, and anthropic to fix critical
SSRF/path traversal, header injection, OAuth confused deputy, and DoS
vulnerabilities.
<details>
<summary>✅ 16 CVEs resolved by this upgrade, including 2 critical 🚨
CVEs</summary>
<br>
This PR will resolve the following CVEs:
| Issue | Severity | Description |
| --- | --- | --- |
| <pre>[CVE-2026-32871](https://app.aikido.dev/issues/25944204/detail?groupId=70007#CVE-2026-32871)</pre> | <pre>🚨 CRITICAL</pre> | [fastmcp] Path traversal vulnerability in URL construction allows attackers to bypass API prefix restrictions and access arbitrary backend endpoints using unencoded path parameters, enabling authenticated SSRF attacks. |
| <pre>[CVE-2026-27124](https://app.aikido.dev/issues/25944204/detail?groupId=70007#CVE-2026-27124)</pre> | <pre>HIGH</pre> | [fastmcp] OAuthProxy fails to validate user consent when receiving authorization codes from GitHub, allowing attackers to exploit GitHub's consent-skipping behavior to gain unauthorized access to FastMCP servers through a Confused Deputy attack. |
| <pre>[CVE-2025-64340](https://app.aikido.dev/issues/25944204/detail?groupId=70007#CVE-2025-64340)</pre> | <pre>MEDIUM</pre> | [fastmcp] Server names with shell metacharacters can cause command injection on Windows when passed to install commands, allowing arbitrary code execution through cmd.exe interpretation of .cmd wrapper files. |
| <pre>[CVE-2026-34520](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34520)</pre> | <pre>🚨 CRITICAL</pre> | [aiohttp] Prior to version 3.13.4, the C parser (the default for most installs) accepted null bytes and control characters in response headers. Patched in 3.13.4. |
| <pre>[CVE-2026-34516](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34516)</pre> | <pre>HIGH</pre> | [aiohttp] A response with an excessive number of multipart headers can consume more memory than intended, leading to a denial of service (DoS) through resource exhaustion. |
| <pre>[CVE-2026-22815](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-22815)</pre> | <pre>MEDIUM</pre> | [aiohttp] Prior to version 3.13.4, insufficient restrictions in header/trailer handling could cause uncapped memory usage. Patched in 3.13.4. |
| <pre>[CVE-2026-34515](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34515)</pre> | <pre>MEDIUM</pre> | [aiohttp] Prior to version 3.13.4, on Windows the static resource handler could expose information about an NTLMv2 remote path. Patched in 3.13.4. |
| <pre>[CVE-2026-34525](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34525)</pre> | <pre>MEDIUM</pre> | [aiohttp] Prior to version 3.13.4, multiple Host headers were allowed. Patched in 3.13.4. |
| <pre>[CVE-2026-34513](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34513)</pre> | <pre>LOW</pre> | [aiohttp] Prior to version 3.13.4, an unbounded DNS cache could result in excessive memory usage, possibly resulting in a DoS. Patched in 3.13.4. |
| <pre>[CVE-2026-34514](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34514)</pre> | <pre>LOW</pre> | [aiohttp] Prior to version 3.13.4, an attacker who controls the content_type parameter could use it to inject extra headers or similar exploits. Patched in 3.13.4. |
| <pre>[CVE-2026-34517](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34517)</pre> | <pre>LOW</pre> | [aiohttp] Prior to version 3.13.4, for some multipart form fields, aiohttp read the entire field into memory before checking client_max_size. Patched in 3.13.4. |
| <pre>[CVE-2026-34518](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34518)</pre> | <pre>LOW</pre> | [aiohttp] When following redirects to a different origin, the framework fails to drop the Cookie and Proxy-Authorization headers alongside the Authorization header, potentially leaking sensitive authentication credentials to untrusted domains. |
| <pre>[CVE-2026-34519](https://app.aikido.dev/issues/25944198/detail?groupId=70007#CVE-2026-34519)</pre> | <pre>LOW</pre> | [aiohttp] Prior to version 3.13.4, an attacker who controls the reason parameter when creating a Response may be able to inject extra headers or similar exploits. Patched in 3.13.4. |
| <pre>[CVE-2026-39892](https://app.aikido.dev/issues/25637201/detail?groupId=70007#CVE-2026-39892)</pre> | <pre>MEDIUM</pre> | [cryptography] Non-contiguous buffers passed to cryptographic APIs can cause buffer overflows, potentially leading to memory corruption and arbitrary code execution. |
| <pre>[CVE-2026-34452](https://app.aikido.dev/issues/25944200/detail?groupId=70007#CVE-2026-34452)</pre> | <pre>MEDIUM</pre> | [anthropic] A time-of-check-time-of-use (TOCTOU) vulnerability in the async filesystem memory tool allows local attackers to escape the sandbox directory via symlink manipulation, enabling arbitrary file read/write operations outside the intended memory directory. |
| <pre>[CVE-2026-34450](https://app.aikido.dev/issues/25944200/detail?groupId=70007#CVE-2026-34450)</pre> | <pre>MEDIUM</pre> | [anthropic] The local filesystem memory tool created world-readable and potentially world-writable files, allowing local attackers to read persisted agent state or modify memory files to influence model behavior. |
</details>
Co-authored-by: aikido-autofix[bot] <119856028+aikido-autofix[bot]@users.noreply.github.com>
# Description of Changes
Add a Java orchestration layer which can connect to the AI engine and
go back and forth with it to get results for the user. The expectation
is that the AI engine will not be publicly available: this Java layer
will always sit in front of it to manage sessions, auth, and so on.
# Description of Changes
Redesign the Python AI engine to be properly agentic, using
`pydantic-ai` instead of `langchain` for correctness and ergonomics.
This should be a good foundation for us to build our AI engine on going
forward.
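The core of the agentic pattern is a loop: the model proposes a tool call, the runtime executes it and feeds the result back, and this repeats until the model produces a final answer. `pydantic-ai` manages that loop (with typed outputs, retries, etc.) for us; the toy sketch below illustrates the pattern in plain Python and is not actual `pydantic-ai` API:

```python
def run_agent(model, tools: dict, user_message: str, max_steps: int = 10):
    """Minimal tool-calling loop. `model` is any callable that takes the
    transcript and returns either {"tool": name, "args": {...}} to
    request a tool call, or {"answer": text} to finish."""
    transcript = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        decision = model(transcript)
        if "answer" in decision:
            return decision["answer"]
        # Execute the requested tool and feed the result back to the model.
        result = tools[decision["tool"]](**decision["args"])
        transcript.append({"role": "tool", "content": str(result)})
    raise RuntimeError("agent exceeded max_steps without answering")
```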