From 975f1352170ea400cfea599fa7ff9bf82b962da1 Mon Sep 17 00:00:00 2001
From: James Brunton
Date: Wed, 22 Apr 2026 10:32:03 +0100
Subject: [PATCH] Move engine/AGENTS.md into root AGENTS.md because Claude doesn't bother to read it (#6151)

# Description of Changes

Move `engine/AGENTS.md` into root `AGENTS.md` because Claude doesn't bother to read it half the time.
---
 AGENTS.md        | 87 +++++++++++++++++++++++++++++++++++++++++-----
 engine/AGENTS.md | 90 ------------------------------------------------
 2 files changed, 79 insertions(+), 98 deletions(-)
 delete mode 100644 engine/AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
index 6fe3cacc99..5ed391821e 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -43,15 +43,86 @@ After modifying any files in the project, you must run the relevant `task check`
 ### Security Mode Development
 Set `DOCKER_ENABLE_SECURITY=true` environment variable to enable security features during development. This is required for testing the full version locally.
 
-### Python Development
-Development for the AI engine happens in the `engine/` folder. The frontend calls the Python via Java as a proxy.
+### Python Development (AI Engine)
 
-- Follow the engine-specific guidance in [engine/AGENTS.md](engine/AGENTS.md) for Python architecture, code style, and AI usage.
-- Use Task commands from the repo root:
-  - `task engine:check` — lint, type-check, test
-  - `task engine:fix` — auto-fix linting and formatting
-  - `task engine:install` — install dependencies
-- The project structure is defined in `engine/pyproject.toml`. Any new dependencies should be listed there, followed by running `task engine:install`.
+The engine is a Python reasoning service for Stirling: it plans and interprets work, but it does not own durable state, and it does not execute Stirling PDF operations directly. Keep the service narrow: typed contracts in, typed contracts out, with AI only where it adds reasoning value. The frontend calls the Python engine via Java as a proxy.
+
+#### Python Commands
+All engine commands run from the repo root using Task:
+- `task engine:check` — run all checks (typecheck + lint + format-check + test)
+- `task engine:fix` — auto-fix lint + formatting
+- `task engine:install` — install Python dependencies via uv
+- `task engine:dev` — start FastAPI with hot reload (localhost:5001)
+- `task engine:test` — run pytest
+- `task engine:lint` — run ruff linting
+- `task engine:typecheck` — run pyright
+- `task engine:format` — format code with ruff
+- `task engine:tool-models` — generate `tool_models.py` from the Java OpenAPI spec
+
+The project structure is defined in `engine/pyproject.toml`. Any new dependencies should be listed there, followed by running `task engine:install`.
+
+#### Python Code Style
+- Keep `task engine:check` passing.
+- Use modern Python when it improves clarity.
+- Prefer explicit names to cleverness.
+- Avoid nested functions and nested classes unless the language construct requires them.
+- Prefer composition to inheritance when combining concepts.
+- Avoid speculative abstractions. Add a layer only when it removes real duplication or clarifies lifecycle.
+- Add comments sparingly and only when they explain non-obvious intent.
+
+#### Python Typing and Models
+- Deserialize into Pydantic models as early as possible.
+- Serialize from Pydantic models as late as possible.
+- Do not pass raw `dict[str, Any]` or `dict[str, object]` across important boundaries when a typed model can exist instead.
+- Avoid `Any` wherever possible.
+- Avoid `cast()` wherever possible (reconsider the structure first).
+- All shared models should subclass `stirling.models.ApiModel` so the service behaves consistently.
+- Do not use string literals for any type annotations, including `cast()`.
+
+#### Python Configuration
+- Keep application-owned configuration in `stirling.config`.
+- Only add `STIRLING_*` environment variables that the engine itself truly owns.
+- Do not mirror third-party provider environment variables unless the engine is actually interpreting them.
+- Let `pydantic-ai` own provider authentication configuration when possible.
+
+#### Python Architecture
+
+**Package roles:**
+- `stirling.contracts`: request/response models and shared typed workflow contracts. If a shape crosses a module or service boundary, it probably belongs here.
+- `stirling.models`: shared model primitives and generated tool models.
+- `stirling.agents`: reasoning modules for individual capabilities.
+- `stirling.api`: HTTP layer, dependency access, and app startup wiring.
+- `stirling.services`: shared runtime and non-AI infrastructure.
+- `stirling.config`: application-owned settings.
+
+**Source of truth:**
+- `stirling.models.tool_models` is the source of truth for operation IDs and parameter models.
+- Do not duplicate operation lists if they can be derived from `tool_models.OPERATIONS`.
+- Do not hand-maintain parallel parameter schemas when the generated tool models already define them.
+- If a tool ID must match a parameter model, validate that relationship explicitly in code.
+
+**Boundaries:**
+- Keep the API layer thin. Route modules should bind requests, resolve dependencies, and call agents or services. They should not contain business logic.
+- Keep agents focused on one reasoning domain. They should not own FastAPI routing, persistence, or execution of Stirling operations.
+- Build long-lived runtime objects centrally at startup when possible rather than reconstructing heavy AI objects per request.
+- If an agent delegates to another agent, the delegated agent should remain the source of truth for its own domain output.
+
+#### Python AI Usage
+- The system must work with any AI, including self-hosted models. We require that the models support structured outputs, but should minimise model-specific code beyond that.
+- Use AI for reasoning-heavy outputs, not deterministic glue.
+- Do not ask the model to invent data that Python can derive safely.
+- Do not fabricate fallback user-facing copy in code to hide incomplete model output.
+- AI output schemas should be impossible to instantiate incorrectly.
+  - Do not require the model to keep separate structures in sync. For example, instead of generating two lists which must be the same length, generate one list of a model containing the same data.
+  - Prefer Python to derive deterministic follow-up structure from a valid AI result.
+- Use `NativeOutput(...)` for structured model outputs.
+- Use `ToolOutput(...)` when the model should select and call delegate functions.
+
+#### Python Testing
+- Test contracts directly.
+- Test agents directly where behaviour matters.
+- Test API routes as thin integration points.
+- Prefer dependency overrides or startup-state seams to monkeypatching random globals.
 
 ### Frontend Development
 - **Frontend dev server**: `task frontend:dev` — requires backend on localhost:8080
diff --git a/engine/AGENTS.md b/engine/AGENTS.md
deleted file mode 100644
index 8e45662740..0000000000
--- a/engine/AGENTS.md
+++ /dev/null
@@ -1,90 +0,0 @@
-# Stirling AI Engine Guide
-
-This file is for AI agents working in `engine/`.
-
-The engine is a Python reasoning service for Stirling. It plans and interprets work, but it does not own durable state, and it does not execute Stirling PDF operations directly. Keep the service narrow: typed contracts in, typed contracts out, with AI only where it adds reasoning value.
-
-## Commands
-
-All engine commands can be run from the repository root using Task:
-
-- `task engine:check` — run all checks (typecheck + lint + format-check + test)
-- `task engine:fix` — auto-fix lint + formatting
-- `task engine:install` — install Python dependencies via uv
-- `task engine:dev` — start FastAPI with hot reload (localhost:5001)
-- `task engine:test` — run pytest
-- `task engine:lint` — run ruff linting
-- `task engine:typecheck` — run pyright
-- `task engine:format` — format code with ruff
-- `task engine:tool-models` — generate tool_models.py from Java OpenAPI spec
-
-## Code Style
-
-- Keep `task engine:check` passing.
-- Use modern Python when it improves clarity.
-- Prefer explicit names to cleverness.
-- Avoid nested functions and nested classes unless the language construct requires them.
-- Prefer composition to inheritance when combining concepts.
-- Avoid speculative abstractions. Add a layer only when it removes real duplication or clarifies lifecycle.
-- Add comments sparingly and only when they explain non-obvious intent.
-
-### Typing and Models
-
-- Deserialize into Pydantic models as early as possible.
-- Serialize from Pydantic models as late as possible.
-- Do not pass raw `dict[str, Any]` or `dict[str, object]` across important boundaries when a typed model can exist instead.
-- Avoid `Any` wherever possible.
-- Avoid `cast()` wherever possible (reconsider the structure first).
-- All shared models should subclass `stirling.models.ApiModel` so the service behaves consistently.
-- Do not use string literals for any type annotations, including `cast()`.
-
-### Configuration
-
-- Keep application-owned configuration in `stirling.config`.
-- Only add `STIRLING_*` environment variables that the engine itself truly owns.
-- Do not mirror third-party provider environment variables unless the engine is actually interpreting them.
-- Let `pydantic-ai` own provider authentication configuration when possible.
-
-## Architecture
-
-### Package Roles
-
-- `stirling.contracts`: request/response models and shared typed workflow contracts. If a shape crosses a module or service boundary, it probably belongs here.
-- `stirling.models`: shared model primitives and generated tool models.
-- `stirling.agents`: reasoning modules for individual capabilities.
-- `stirling.api`: HTTP layer, dependency access, and app startup wiring.
-- `stirling.services`: shared runtime and non-AI infrastructure.
-- `stirling.config`: application-owned settings.
-
-### Source Of Truth
-
-- `stirling.models.tool_models` is the source of truth for operation IDs and parameter models.
-- Do not duplicate operation lists if they can be derived from `tool_models.OPERATIONS`.
-- Do not hand-maintain parallel parameter schemas when the generated tool models already define them.
-- If a tool ID must match a parameter model, validate that relationship explicitly in code.
-
-### Boundaries
-
-- Keep the API layer thin. Route modules should bind requests, resolve dependencies, and call agents or services. They should not contain business logic.
-- Keep agents focused on one reasoning domain. They should not own FastAPI routing, persistence, or execution of Stirling operations.
-- Build long-lived runtime objects centrally at startup when possible rather than reconstructing heavy AI objects per request.
-- If an agent delegates to another agent, the delegated agent should remain the source of truth for its own domain output.
-
-## AI Usage
-
-- The system must work with any AI, including self-hosted models. We require that the models support structured outputs, but should minimise model-specific code beyond that.
-- Use AI for reasoning-heavy outputs, not deterministic glue.
-- Do not ask the model to invent data that Python can derive safely.
-- Do not fabricate fallback user-facing copy in code to hide incomplete model output.
-- AI output schemas should be impossible to instantiate incorrectly.
-  - Do not require the model to keep separate structures in sync. For example, instead of generating two lists which must be the same length, generate one list of a model containing the same data.
-  - Prefer Python to derive deterministic follow-up structure from a valid AI result.
-- Use `NativeOutput(...)` for structured model outputs.
-- Use `ToolOutput(...)` when the model should select and call delegate functions.
-
-## Testing
-
-- Test contracts directly.
-- Test agents directly where behaviour matters.
-- Test API routes as thin integration points.
-- Prefer dependency overrides or startup-state seams to monkeypatching random globals.
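
Note on the "AI output schemas should be impossible to instantiate incorrectly" guidance: a minimal sketch of the one-list-instead-of-two idea might look like the following. Every name here (`PageRotation`, `RotationPlan`, `pages_in_plan`) is hypothetical and does not come from the engine, and stdlib dataclasses stand in for the Pydantic models the guide asks for, so treat it as an illustration of the shape rather than engine code.

```python
from dataclasses import dataclass


# Hypothetical output item: stdlib dataclasses stand in for Pydantic models.
@dataclass(frozen=True)
class PageRotation:
    page: int
    degrees: int


# One list of paired items instead of separate `pages` and `degrees` lists,
# so the model cannot return structures of mismatched length.
@dataclass(frozen=True)
class RotationPlan:
    rotations: tuple[PageRotation, ...]


def pages_in_plan(plan: RotationPlan) -> list[int]:
    # Deterministic follow-up structure is derived in Python, not by the model.
    return sorted({rotation.page for rotation in plan.rotations})


plan = RotationPlan(rotations=(PageRotation(page=3, degrees=90), PageRotation(page=1, degrees=180)))
affected_pages = pages_in_plan(plan)
```

Because each rotation carries its own page number, there is no length invariant left for the model to violate, and the list of affected pages never needs to be requested from the model at all.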
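
Note on "If a tool ID must match a parameter model, validate that relationship explicitly in code": one possible shape for that check is sketched below. The real layout of `tool_models.OPERATIONS` is not shown in this patch, so `OperationId`, `RotateParams`, `CompressParams`, `PARAMS_BY_OPERATION`, and `validate_tool_call` are all assumed names, and stdlib dataclasses again stand in for the generated Pydantic models.

```python
from dataclasses import dataclass
from enum import Enum


# Hypothetical operation IDs; the real ones would come from the generated tool models.
class OperationId(Enum):
    ROTATE = "rotate"
    COMPRESS = "compress"


@dataclass(frozen=True)
class RotateParams:
    angle: int


@dataclass(frozen=True)
class CompressParams:
    level: int


# Single mapping from operation ID to its parameter model (assumed structure).
PARAMS_BY_OPERATION: dict[OperationId, type] = {
    OperationId.ROTATE: RotateParams,
    OperationId.COMPRESS: CompressParams,
}


def validate_tool_call(operation: OperationId, params: object) -> None:
    # Fail loudly when the parameter model does not belong to the operation.
    expected = PARAMS_BY_OPERATION[operation]
    if not isinstance(params, expected):
        raise TypeError(
            f"{operation.value} expects {expected.__name__}, got {type(params).__name__}"
        )
```

Calling `validate_tool_call(OperationId.ROTATE, CompressParams(level=5))` raises `TypeError`, which surfaces a mismatched pairing at the boundary instead of letting it travel further into the service.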