QodeAssist/docs/target-architecture.md
2026-06-28 17:38:08 +02:00
# QodeAssist — Target Architecture (v1.0)
Status: design baseline, derived from the fixed use-case inventory below.
Scope: the complete plugin, designed "from scratch" — what the architecture
should be if nothing legacy constrained it. The current code (see
`architecture.md`) already converges on this; §10 lists the remaining deltas.
---
## 1. Use-case inventory (requirements baseline)
Every architectural decision below is justified by one of these. Features not
on this list (Rules system, legacy provider/model/template pickers, Stack A)
are intentionally out of scope.
| # | Use case | What the user gets |
|---|----------|--------------------|
| U1 | **Code completion** | Inline FIM/instruct suggestions via LSP; auto + manual trigger, multiline, smart-context suppression, accept full / word-by-word |
| U2 | **Chat assistant** | 4 placements (sidebar, bottom pane, editor tab, floating window); streaming text + thinking blocks + tool blocks + file-edit blocks (apply/undo); attachments, linked files, @-mentions, open-files sync; token counter; persisted history; one-click summarization; runtime agent picker |
| U3 | **Quick refactor** | Selection + instruction by hotkey; custom-instructions library; separate agent; optional tools; streamed result inserted into the editor |
| U4 | **Tools** | read/create/edit file, search, find, list, build, diagnostics, terminal, todo, load_skill; per-tool enable |
| U5 | **Skills** | discovery from `.qodeassist/skills`, `.claude/skills`, `~/.claude/skills`; auto-injection, explicit `/` picker, always-on |
| U6 | **MCP** | server mode (expose plugin tools, HTTP/SSE + stdio bridge) and client hub (consume external tools in chat/refactor) |
| U7 | **Providers** | 13 `client_api` types over one GenericProvider; secrets store; local-server autostart; model listing |
| U8 | **Agents** | TOML profiles: abstract wire-base + thin concrete via `extends`, `[body]` table 1:1 with the wire request (message serialization inlined per base), `match` rules (completion routing), `cache_breakpoints`, per-agent model override, per-pipeline agent selection |
| U9 | **Personas** | persona = the agent's `system_prompt`; shared text lives in plain files pulled in via `read_file` — bundled defaults under `:/roles/…`, or any file the user points at under `${PROJECT_DIR}` / `${CONFIG_DIR}` (your QodeAssist user directory); `read_file` reads the literal path given (no override/fallback resolution); switching persona = switching agent (no separate Roles subsystem) |
| U10 | **Configuration UI** | settings pages for everything above; per-project settings; updater + status widget |
---
## 2. Design principles
1. **One stack.** Every LLM byte — completion, chat, compression, refactor —
flows through the same `Session` pipeline. No parallel legacy path.
2. **Hexagonal core.** The runtime (agents, sessions, providers, templates,
prompt rendering) has zero Qt Creator dependencies. The IDE host composes
that core; IDE-specific facts enter only through ports (document reading,
project scanning, secrets, tool hosting).
3. **Configuration is declarative, code is mechanism.** What is sent (request
`[body]`, system prompt, endpoint, model) lives in TOML/JSON/Jinja and is
user-overridable; *how* it is sent (streaming, retries, tool loop, event
routing) lives in C++ and is identical for all providers.
4. **Agent-driven behavior.** The agent's TOML declares what a conversation
uses (`enable_tools`, `enable_thinking`); features and UI adapt to the
agent config instead of switching on provider names or provider-declared
capability flags.
5. **Single source of truth for conversation state.** `ConversationHistory`
owns the messages; `ChatModel` and persistence are projections of it, never
independent copies.
6. **Per-feature composition roots, no singletons.** Each feature constructs
and owns its dependencies (`new` + parent); shared services are passed
explicitly (constructor/setter, QML context properties for the chat).
7. **Streaming-first event model.** One typed `ResponseEvent` stream is the
only contract between the core and every consumer. Deltas exist for live
UI (chat); one-shot pipelines (completion, refactor) ignore them,
wait for `finished`, and read the final assistant message from history.
8. **Fail at load, not mid-conversation.** Agent profiles are validated when
loaded (partials resolve, assembled body parses as JSON against a synthetic
context), so a config error never surfaces as a silent runtime drop.
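Principle 7 can be sketched without Qt. The event names below come from the `ResponseEvent` list in §4; the payload fields and the two consumer types (`StreamingConsumer`, `OneShotConsumer`) are assumptions made for illustration, not the plugin's actual types.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical payloads: only the event names appear in this document.
struct TextDelta     { std::string text; };
struct ThinkingDelta { std::string text; };
struct Usage         { int inputTokens = 0; int outputTokens = 0; };
struct MessageStop   { std::string stopReason; };

// One typed stream shared by every consumer.
using ResponseEvent = std::variant<TextDelta, ThinkingDelta, Usage, MessageStop>;

// Live-UI consumer (chat): accumulates visible text delta by delta.
struct StreamingConsumer {
    std::string visibleText;
    void onEvent(const ResponseEvent &ev) {
        if (const auto *delta = std::get_if<TextDelta>(&ev))
            visibleText += delta->text;
    }
};

// One-shot consumer (completion, refactor): ignores deltas and only
// notices that the message has stopped.
struct OneShotConsumer {
    bool finished = false;
    void onEvent(const ResponseEvent &ev) {
        if (std::holds_alternative<MessageStop>(ev))
            finished = true;
    }
};
```

Both consumers receive the same stream; they differ only in which alternatives they look at, which is the point of having a single contract.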
---
## 3. Layered model
```mermaid
flowchart TB
subgraph HOSTS["Hosts — composition roots"]
PLUGIN["Qt Creator plugin<br/>qodeassist.cpp"]
end
subgraph L5["L5 · Presentation"]
LSP["LSP bridge<br/>inline suggestions"]
QMLUI["ChatView QML<br/>4 placements"]
RW["Refactor widgets"]
SUI["Settings pages"]
end
subgraph L4["L4 · Features"]
FCOMP["CompletionFeature"]
FCHAT["ChatFeature"]
FREF["RefactorFeature"]
end
subgraph L3["L3 · Capabilities"]
CTX["ContextEngine<br/>ports + QtC adapters"]
TOOLS["ToolKit"]
SKILLS["SkillsEngine"]
MCPH["McpHub<br/>client + server"]
end
subgraph L2["L2 · Core runtime — IDE-independent"]
SM["SessionManager"]
SESS["Session"]
AGF["AgentFactory + AgentRouter"]
AG["Agent"]
PROV["GenericProvider"]
TPL["JsonPromptTemplate"]
end
subgraph L1["L1 · Declarative config"]
PCONF["providers/*.toml"]
ACONF["agents/*.toml + partials/*.jinja"]
ROST["rosters / pipelines"]
PERS["personas/*.md"]
SKCONF["skills/*.md"]
SEC["SecretsStore"]
end
subgraph L0["L0 · Wire — LLMQore"]
CLIENTS["*Client — SSE streaming"]
TOOLFW["Tool framework"]
MCPT["MCP transports"]
end
PLUGIN --> L4
PLUGIN --> SUI
LSP --> FCOMP
QMLUI --> FCHAT
RW --> FREF
FCOMP --> SM
FCHAT --> SM
FREF --> SM
FCOMP --> CTX
FCHAT --> CTX
FREF --> CTX
FCHAT --> SKILLS
FCHAT --> TOOLS
FREF --> TOOLS
TOOLS --> TOOLFW
MCPH --> MCPT
SM --> SESS
SESS --> AG
AGF --> AG
AG --> PROV
AG --> TPL
AGF --> ACONF
AGF --> PCONF
AGF --> SEC
AGF --> ROST
TPL --> PERS
PROV --> CLIENTS
SKILLS --> SKCONF
```
### Layer contracts
| Layer | Contains | May depend on | Must NOT depend on |
|-------|----------|---------------|--------------------|
| **L0 Wire** | LLMQore clients (one per wire protocol: Claude, OpenAI Chat, OpenAI Responses, Google, Ollama, Mistral, llama.cpp), tool framework, MCP transports | Qt Network | anything above |
| **L1 Config** | `ProviderInstance`, `AgentProfile` (+ loader/validator), rosters, personas, skills, secrets port | toml++, inja | Qt Creator, L2+ |
| **L2 Core** | `Agent`, `AgentFactory`, `AgentRouter`, `Provider`/`GenericProvider`, `JsonPromptTemplate`, `Session`, `SessionManager`, `ConversationHistory`, `SystemPromptBuilder`, `ResponseRouter`, `ToolContributorRegistry` | L0, L1 | Qt Creator, QML, features |
| **L3 Capabilities** | `ContextEngine` (ports + QtC adapters), `ToolKit` (built-in tools), `SkillsEngine`, `McpHub` | L0–L2, QtC APIs *only in adapters* | features, UI |
| **L4 Features** | `CompletionFeature`, `ChatFeature` (send/stream, compression, token counting, file edits), `RefactorFeature` | L2, L3 | each other |
| **L5 Presentation** | LSP bridge, ChatView QML, refactor widgets, settings pages | its feature | core internals |
| **Hosts** | plugin shell | everything (composition only) | — |
The hard rule that makes testability free: **L0–L2 build into
targets with no Qt Creator linkage.** Tests link L0–L2 directly;
the plugin adds L3 adapters, L4, L5.
---
## 4. Core domain model
Rendered copy: [core-class-diagram.svg](core-class-diagram.svg) (regenerate
when the diagram below changes).
```mermaid
classDiagram
direction TB
class SessionManager {
+acquire(agentName) Session
+release(session)
+toolContributors() ToolContributorRegistry
}
class Session {
+send(blocks)
+cancel()
+history() ConversationHistory
+systemPrompt() SystemPromptBuilder
+event(ResponseEvent)
+finished(id, stopReason)
+failed(id, ErrorInfo)
+cancelled(id)
}
class ConversationHistory {
+messages() vector~Message~
+lastAssistantText() string
+append(Message)
+reset(vector~Message~)
}
class Message {
+role Role
+blocks vector~ContentBlock~
}
class SystemPromptBuilder {
+setLayer(id, text, priority)
+removeLayer(id)
+compose() string
}
class ResponseRouter {
+attach(BaseClient)
+event(ResponseEvent)
}
class Agent {
+config() AgentConfig
+provider() Provider
+promptTemplate() PromptTemplate
}
class AgentFactory {
+create(name) Agent
+configByName(name) AgentConfig
+effectiveModel(name) string
}
class AgentRouter {
+pickAgent(roster, fileCtx) string
}
class Provider {
<<interface>>
+prepareRequest(request, ctx)
+sendRequest(json) RequestID
+cancelRequest(RequestID)
}
class GenericProvider {
-client BaseClient
}
class PromptTemplate {
<<interface>>
+buildFullRequest(request, ctx)
}
class JsonPromptTemplate {
-bodySpec QJsonObject
-env InjaEnvironment
}
class ToolContributorRegistry {
+registerContributor(fn)
+applyTo(ToolsManager)
}
SessionManager o-- Session : pools
SessionManager --> AgentFactory : builds via
SessionManager --> ToolContributorRegistry
Session *-- ConversationHistory
Session *-- SystemPromptBuilder
Session *-- ResponseRouter
Session --> Agent
ConversationHistory o-- Message
Agent *-- Provider
Agent *-- PromptTemplate
AgentFactory ..> Agent : creates
AgentFactory --> AgentRouter
GenericProvider --|> Provider
JsonPromptTemplate --|> PromptTemplate
```
Responsibilities, one line each:
- **Agent** — immutable bundle of *what to call*: resolved config + provider +
compiled prompt template. No request state.
- **Session** — one conversation's runtime: owns history, system-prompt
layers, pinned context providers, response routing, the in-flight request,
and the content of each dispatched request (tool continuations replay it
inside LLMQore; see `context-architecture.md` §4.3).
`send(blocks)` is the *only* entry point: every pipeline appends a user
message and dispatches; there are no per-pipeline send variants. What
differs between completion, chat, and refactor is the agent's template and
the consumption mode (deltas vs final message), never the Session API.
- **SessionManager** — creates/pools sessions per agent; the single place
features go to get one. Pooling (instead of per-message construction) avoids
the latency of building a fresh agent and provider and re-reading secrets on
every request. Only those expensive parts (agent, provider, compiled
template, secrets read) are reused: `acquire` hands out a session with
cleared history and system-prompt layers, so one-shot pipelines never see a
previous exchange.
- **AgentRouter** — the agent picker for *auto-routed* pipelines. Only code
completion routes by context: `pickAgent(roster.codeCompletion, {file,
project})` walks the ordered roster and returns the first agent whose match
rules fit. Chat is user-driven (the picker filters to the `chatAssistant`
allow-list; the user chooses); compression and quick refactor each use a
single configured agent. No feature-local routing logic beyond these.
- **GenericProvider** — one class for all 13 client APIs; varies only by
LLMQore client factory + metadata. Request *shape* belongs to the template,
never to the provider.
- **JsonPromptTemplate** — compiles the agent's `[body]` table; renders
Jinja-bearing string values, splices raw JSON, drops empty keys; validated
at load time.
- **SystemPromptBuilder** — ordered named layers (`agent.system`,
`chat.context`, `refactor`, `compression`); features mutate only their own
layer.
- **ResponseRouter / ResponseEvent** — adapts LLMQore client signals into one
typed stream: `TextDelta`, `ThinkingDelta`, `ToolCallStart/End`,
`ToolResult`, `Usage`, `Error`, `MessageStop`.
- **ToolContributorRegistry** — contributors (built-in ToolKit, SkillTool,
McpHub) register once; `SessionManager` applies them to every new session's
`ToolsManager`. This is how MCP tools reach chat *and* refactor (U6) without
feature code knowing about MCP.
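The layered system prompt described above can be sketched in plain C++. The method names follow the class diagram; the ordering rule (ascending priority, ties in insertion order) and the blank-line separator are assumptions, since the document does not specify them.

```cpp
#include <algorithm>
#include <string>
#include <vector>

class SystemPromptBuilder {
    struct Layer { std::string id; std::string text; int priority; };
    std::vector<Layer> m_layers;

public:
    // Adds a layer, or replaces the existing layer with the same id.
    void setLayer(const std::string &id, const std::string &text, int priority) {
        removeLayer(id);
        m_layers.push_back({id, text, priority});
    }

    void removeLayer(const std::string &id) {
        m_layers.erase(std::remove_if(m_layers.begin(), m_layers.end(),
                                      [&](const Layer &l) { return l.id == id; }),
                       m_layers.end());
    }

    // Joins non-empty layers in ascending priority (assumed ordering);
    // stable_sort keeps insertion order for equal priorities.
    std::string compose() const {
        std::vector<Layer> sorted = m_layers;
        std::stable_sort(sorted.begin(), sorted.end(),
                         [](const Layer &a, const Layer &b) { return a.priority < b.priority; });
        std::string out;
        for (const Layer &l : sorted) {
            if (l.text.empty())
                continue;
            if (!out.empty())
                out += "\n\n";
            out += l.text;
        }
        return out;
    }
};
```

Because each feature addresses its layer by id, replacing `chat.context` never disturbs `agent.system`, which matches the "features mutate only their own layer" rule.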
---
## 5. Runtime flows
### 5.1 Chat (U2) — the richest path
```mermaid
sequenceDiagram
autonumber
actor U as User
participant V as ChatView QML
participant F as ChatFeature
participant SM as SessionManager
participant S as Session
participant T as JsonPromptTemplate
participant P as GenericProvider
participant C as LLMQore Client
participant R as ResponseRouter
U->>V: message + attachments
V->>F: sendMessage(text, files, images)
F->>SM: acquire(activeAgent)
SM-->>F: Session (pooled)
F->>S: systemPrompt().setLayer("chat.context", project + skills + linked files)
F->>S: send(userBlocks)
S->>T: buildFullRequest(history, system, ctx)
T-->>S: request JSON (body is 1:1 with the API)
S->>P: sendRequest(json)
P->>C: HTTP POST, SSE stream
loop streaming
C-->>R: chunk / thinking / tool_use / usage
R-->>S: ResponseEvent
S-->>F: event(ResponseEvent)
F-->>V: ChatModel projection update
end
opt tool call requested
S->>S: execute tool via ToolsManager
S->>P: continue with tool_result
end
C-->>R: finalized
R-->>S: MessageStop + Usage
S-->>F: finished()
F->>SM: release(session)
```
State ownership in chat: `Session.history()` is the truth. `ChatModel` is a
QML projection built from history events (`messageAdded`, `messageUpdated`);
`ChatSerializer`/`ChatHistoryStore` persist *history*, and restoring a chat
seeds a new session's history — never the other way around. File-edit blocks,
apply/undo, and the token counter are ChatFeature concerns layered on the
event stream.
### 5.2 Completion (U1)
```
LSP getCompletionsCycling
→ CompletionFeature
agent = AgentRouter.pickAgent(roster.codeCompletion, {file, project})
session = SessionManager.acquire(agent)
ctx = ContextEngine: prefix/suffix + open-files context (policy from
CodeCompletionSettings — editor policy, not agent config)
session.send(blocks{completion context})
on finished → history().lastAssistantText()
→ CodeHandler (output-mode post-processing) → LSP items
```
No special Session method: the completion context travels as the content of
an ordinary user message (a structured block carrying prefix/suffix + file
context), and the template context exposes it as `ctx.prefix` / `ctx.suffix`.
FIM vs instruct is *agent config* (template + body), not feature code: a FIM
agent's body renders `prefix`/`suffix` into FIM fields; an instruct agent's
body renders the same exchange as a chat-shaped request. The feature is
identical for both — and since completion has no incremental UI, it never
touches the delta stream: it waits for `finished` and reads the last message.
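The first-match routing used for completion can be illustrated with a small standalone sketch. The `RosterEntry` type and the suffix-only match rule are hypothetical simplifications; the real `match` rules cover file, path, and project patterns.

```cpp
#include <string>
#include <vector>

struct FileContext {
    std::string filePath;
    std::string projectName;
};

// Hypothetical roster entry: an agent name plus a simplified match rule.
struct RosterEntry {
    std::string agentName;
    std::vector<std::string> fileSuffixes; // empty = matches every file
};

bool endsWith(const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size()
        && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

bool matches(const RosterEntry &entry, const FileContext &ctx) {
    if (entry.fileSuffixes.empty())
        return true;
    for (const auto &suffix : entry.fileSuffixes)
        if (endsWith(ctx.filePath, suffix))
            return true;
    return false;
}

// Walks the ordered roster and returns the first agent whose rules fit;
// an empty string means no agent matched.
std::string pickAgent(const std::vector<RosterEntry> &roster, const FileContext &ctx) {
    for (const auto &entry : roster)
        if (matches(entry, ctx))
            return entry.agentName;
    return {};
}
```

Order carries the policy here: a catch-all entry placed last acts as the fallback agent, so no separate default setting is needed in this sketch.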
### 5.3 Quick refactor (U3)
```
Hotkey → RefactorFeature
agent = pipelines.quickRefactor (single configured agent)
session = SessionManager.acquire(agent)
session.systemPrompt().setLayer("refactor", tagged selection + output rules)
session.send(blocks{instruction})
on finished → history().lastAssistantText()
→ ResponseCleaner → RefactorResult → editor insert (accept/reject)
```
Same consumption mode as completion: the feature listens to
`Session::finished`/`failed` only (events at most drive a progress spinner
and cancel) and reads the result from history — it never connects to raw
client signals. Tool calls during refactor run inside the session's tool
loop; history's last assistant message is whatever the model produced after
the final tool round.
### 5.4 Compression (U2)
Compression is ChatFeature reusing the same path with the single
`pipelines.chatCompression` agent and a `"compression"` system layer; the
summary starts a new history.
---
## 6. Configuration model
```mermaid
erDiagram
AGENT_PROFILE ||--o| AGENT_PROFILE : extends
AGENT_PROFILE }o--|| PROVIDER_INSTANCE : provider_instance
AGENT_PROFILE }o--o{ PARTIAL : includes
AGENT_PROFILE }o--o{ PERSONA : read_file
ROSTER }o--o{ AGENT_PROFILE : ranks
MODEL_OVERRIDE |o--|| AGENT_PROFILE : overrides_model
PROVIDER_INSTANCE }o--|| CLIENT_API : client_api
PROVIDER_INSTANCE }o--o| SECRET : api_key_ref
PROVIDER_INSTANCE ||--o| LAUNCH_CONFIG : autostarts
AGENT_PROFILE {
string name
bool abstract
string system_prompt "jinja; inline text or read_file()"
json body "request body, 1:1 with API"
string endpoint "may contain MODEL placeholder"
string model "default; override wins"
bool enable_tools "capability hint"
bool enable_thinking "capability hint"
json match "file, path, project patterns"
}
PROVIDER_INSTANCE {
string name
string client_api
string url
string api_key_ref
}
PERSONA {
string path "plain markdown file"
}
ROSTER {
string pipeline "completion, chat, compression, refactor"
list agents "ordered candidates"
}
```
Rules of the config layer (full spec: `agent-templates-design.md`):
- `[body]` **is** the request body — field-by-field, deep-mergeable through
`extends`; Jinja-bearing strings render and splice as raw JSON, literals
pass through. No separate sampling/thinking merge machinery.
- Message serialization is inlined in each abstract **wire base**; there are no
bundled partials. `{% include %}` still resolves sandboxed roots (bundled
`:/agents/`, then the user agent's dir) for user-supplied partials; a missing
partial is a load-time error.
- Two-level hierarchy: an abstract **wire base** per provider (provider +
endpoint + serialization only — no model/persona/tags/sampling) and a thin
concrete agent carrying all policy.
- Per-agent model override lives in `agent_models.json` and is applied by
`AgentFactory`; `${MODEL}` in `endpoint` covers URL-model providers.
- Personas are not a subsystem: the profile's `system_prompt` is the persona.
Shared text lives in plain markdown under the sandboxed roots and is pulled
in with `{{ read_file(...) }}`; a persona-switch is an agent-switch — the
only system-prompt edit point is the profile.
- Secrets never appear in TOML; `api_key_ref` resolves through the
`SecretsStore` port (QtC keychain in the plugin).
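The model-override rule and the `${MODEL}` placeholder can be shown together in a short sketch. The `AgentProfile` struct here is a reduced stand-in, the override map stands in for `agent_models.json`, and the endpoint path in the usage note is invented for illustration.

```cpp
#include <map>
#include <string>

// Reduced stand-in for the real profile; only the fields this sketch needs.
struct AgentProfile {
    std::string name;
    std::string model;    // default model from the profile
    std::string endpoint; // may contain ${MODEL}
};

// A non-empty override wins over the profile's default model.
std::string effectiveModel(const AgentProfile &profile,
                           const std::map<std::string, std::string> &overrides) {
    const auto it = overrides.find(profile.name);
    if (it != overrides.end() && !it->second.empty())
        return it->second;
    return profile.model;
}

// Replaces every ${MODEL} occurrence in the endpoint with the model name.
std::string expandEndpoint(std::string endpoint, const std::string &model) {
    const std::string placeholder = "${MODEL}";
    for (std::size_t pos = endpoint.find(placeholder); pos != std::string::npos;
         pos = endpoint.find(placeholder, pos + model.size()))
        endpoint.replace(pos, placeholder.size(), model);
    return endpoint;
}
```

With a profile `{"chat", "model-a", "/models/${MODEL}/generate"}` and an override mapping `chat` to `model-b`, the expanded endpoint is `/models/model-b/generate`.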
---
## 7. Capabilities layer
**ContextEngine** replaces the monolithic ContextManager with three focused
services behind IDE-agnostic ports:
| Service | Port (L2-visible) | QtC adapter |
|---------|-------------------|-------------|
| `EditorContext` — current doc, selection, prefix/suffix | `IDocumentReader` | TextEditor API |
| `ProjectContext` — root, file listing, ignore filtering (`.qodeassistignore`), open files, changes | `IProjectScanner` | ProjectExplorer API |
| `TokenEstimator` — input estimates, calibrated by server usage | — (pure) | — |
**ToolKit** registers the built-in tools (U4) with the
`ToolContributorRegistry`; each tool declares a permission class (read /
write / execute) so per-tool enablement (settings) and confirmation policy
(terminal commands) live in one place.
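A minimal sketch of how permission classes could centralize both decisions follows. The three classes come from the text; `ToolPolicy`, `Decision`, and the rule that every execute-class tool asks for confirmation are assumptions.

```cpp
#include <set>
#include <string>

enum class Permission { Read, Write, Execute };
enum class Decision { Denied, Allowed, NeedsConfirmation };

// Hypothetical policy object: one place for enablement and confirmation.
struct ToolPolicy {
    std::set<std::string> disabledTools; // per-tool enablement from settings
    bool confirmExecute = true;          // assumption: execute-class tools prompt

    Decision decide(const std::string &tool, Permission permission) const {
        if (disabledTools.count(tool) > 0)
            return Decision::Denied;
        if (permission == Permission::Execute && confirmExecute)
            return Decision::NeedsConfirmation;
        return Decision::Allowed;
    }
};
```

The tool itself only declares its class; whether it runs, is blocked, or prompts the user is decided here rather than inside each tool.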
**SkillsEngine** (U5): discovery + watching of the three skill roots; exposes
`catalogText()` (names + descriptions for the system prompt),
`alwaysOnBodies()`, and the `load_skill` tool; the `/` picker injects a
skill's body into a single message.
**McpHub** (U6): client side connects configured servers and contributes
their tools through the same registry (tools reach every session uniformly);
server side exposes ToolKit over HTTP/SSE + stdio bridge.
---
## 8. Cross-cutting policies
Architecture is the rules as much as the boxes. These policies bind every
layer and are part of the contract:
### 8.1 Threading
The core runs on the GUI thread; concurrency is the Qt event loop plus async
network I/O — no shared-state threading anywhere in L1–L4. Work that can
block (project scans, token estimation over large trees) hides behind L3
ports; an adapter may use worker threads internally but delivers results as
queued signals. Core types are therefore deliberately not thread-safe.
### 8.2 Request lifecycle
A session has at most one in-flight request; `send()` while in flight cancels
the previous request first. Every request terminates in exactly one of three
states — `finished(stopReason)`, `failed(error)`, `cancelled()` — and
cancellation is *not* an error: no consumer may string-match a message to
tell them apart.
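The lifecycle rule can be sketched as a small state holder. `RequestSlot` and its callback are hypothetical names; the sketch only demonstrates the two invariants stated above: at most one request in flight, and exactly one terminal state per request.

```cpp
#include <functional>
#include <optional>

enum class Terminal { Finished, Failed, Cancelled };

// Hypothetical helper tracking the single in-flight request of a session.
class RequestSlot {
public:
    std::function<void(int id, Terminal state)> onTerminal;

    // Starting a new request first cancels the one still in flight.
    int send() {
        if (m_inFlight)
            terminate(Terminal::Cancelled);
        m_inFlight = ++m_nextId;
        return *m_inFlight;
    }

    void cancel() { terminate(Terminal::Cancelled); }

    // Late completions or failures of a superseded request are ignored.
    void complete(int id) { if (m_inFlight == id) terminate(Terminal::Finished); }
    void fail(int id)     { if (m_inFlight == id) terminate(Terminal::Failed); }

    bool busy() const { return m_inFlight.has_value(); }

private:
    void terminate(Terminal state) {
        if (!m_inFlight)
            return;
        const int id = *m_inFlight;
        m_inFlight.reset(); // clear first so each request reports exactly once
        if (onTerminal)
            onTerminal(id, state);
    }

    std::optional<int> m_inFlight;
    int m_nextId = 0;
};
```

Because `Cancelled` is its own enumerator, a consumer distinguishes cancellation from failure by type, never by inspecting a message string.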
### 8.3 Errors
Runtime errors are typed, not strings: `ErrorInfo { category, message,
providerDetail }` with categories `Config | Auth | Network | Provider |
Validation | Tool`. The category drives UI affordances (Auth → open provider
settings, Network → offer retry); free text is for logs only. Load-time
errors (principle 8) surface in the agents settings page, never as a failed
send.
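As a sketch, the typed error and the category-to-affordance mapping might look as follows. The `ErrorInfo` fields and categories are taken from the text; the `UiAction` enum and `affordanceFor` function are hypothetical names.

```cpp
#include <string>

enum class ErrorCategory { Config, Auth, Network, Provider, Validation, Tool };

struct ErrorInfo {
    ErrorCategory category;
    std::string message;        // free text, for logs only
    std::string providerDetail; // raw detail reported by the provider
};

// Hypothetical UI affordances; only the two named in the text are modeled.
enum class UiAction { None, OpenProviderSettings, OfferRetry };

// The category, not the message text, selects what the UI offers.
UiAction affordanceFor(const ErrorInfo &error) {
    switch (error.category) {
    case ErrorCategory::Auth:    return UiAction::OpenProviderSettings;
    case ErrorCategory::Network: return UiAction::OfferRetry;
    default:                     return UiAction::None;
    }
}
```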
### 8.4 Timeouts and retries
Transfer timeouts are per-pipeline policy (completion short, chat/refactor
from settings), applied by the feature — never baked into agent profiles. A
streaming request is never silently retried after the first byte; automatic
retry with capped backoff is allowed only for connection-phase failures.
Anything beyond that is an explicit user action.
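The retry rule above can be expressed as a pure decision function. All names and numeric defaults here (`RetryPolicy`, `AttemptState`, three attempts, 250 ms base, 2000 ms cap) are assumptions chosen for illustration; the document fixes only the rule itself.

```cpp
#include <algorithm>
#include <chrono>
#include <optional>

// Hypothetical description of the attempt that just failed.
struct AttemptState {
    int attempt = 0;                     // attempts made so far, including this one
    bool firstByteReceived = false;      // true once any response byte arrived
    bool connectionPhaseFailure = false; // failure happened before the stream opened
};

struct RetryPolicy {
    int maxAttempts = 3;
    std::chrono::milliseconds baseDelay{250};
    std::chrono::milliseconds maxDelay{2000};

    // Returns the delay before the next automatic attempt, or nullopt when
    // no automatic retry is allowed and the user must act explicitly.
    std::optional<std::chrono::milliseconds> nextDelay(const AttemptState &s) const {
        if (s.firstByteReceived || !s.connectionPhaseFailure)
            return std::nullopt; // never retry a stream that already started
        if (s.attempt >= maxAttempts)
            return std::nullopt;
        // Exponential backoff: base * 2^(attempt - 1), capped at maxDelay.
        const std::chrono::milliseconds delay = baseDelay * (1 << std::max(0, s.attempt - 1));
        return std::min(delay, maxDelay);
    }
};
```

Keeping the decision free of I/O makes it easy to unit-test in the core targets, independent of any network client.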
### 8.5 Observability
One `RequestID` correlates feature → session → provider → client → events →
logs. Each layer logs under its own category (`qodeassist.session`,
`qodeassist.provider`, `qodeassist.tools`, …); request bodies are logged only
at debug level, and secrets are redacted unconditionally. `Usage` events are
the single source feeding the token counter, `TokenEstimator` calibration,
and the performance log.
### 8.6 Config compatibility
Agent profiles carry a `schema_version`; the loader migrates old user
configs forward or rejects them with an actionable message — silent
reinterpretation is forbidden. Bundled profiles are read-only resources that
user profiles shadow by name. Persisted chat history is versioned the same
way.
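A sketch of the version gate, assuming integer schema versions: the constants `kCurrentSchema` and `kOldestMigratable`, the `LoadResult` type, and the message wording are all hypothetical. The only behavior taken from the text is that a profile is accepted, migrated forward, or rejected with an actionable message.

```cpp
#include <string>

constexpr int kCurrentSchema = 3;    // assumption: version this build writes
constexpr int kOldestMigratable = 2; // assumption: oldest version with a migration path

enum class LoadOutcome { Accepted, Migrated, Rejected };

struct LoadResult {
    LoadOutcome outcome;
    int effectiveVersion;
    std::string message; // empty when accepted as-is
};

// Never reinterprets silently: every non-current version either migrates
// with a note or is rejected with a message the user can act on.
LoadResult gateSchema(int declared) {
    const std::string v = std::to_string(declared);
    if (declared == kCurrentSchema)
        return {LoadOutcome::Accepted, declared, ""};
    if (declared >= kOldestMigratable && declared < kCurrentSchema)
        return {LoadOutcome::Migrated, kCurrentSchema, "migrated from schema_version " + v};
    if (declared > kCurrentSchema)
        return {LoadOutcome::Rejected, declared,
                "schema_version " + v + " is newer than this plugin supports; update the plugin"};
    return {LoadOutcome::Rejected, declared,
            "schema_version " + v + " is too old to migrate; recreate the profile from a bundled one"};
}
```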
### 8.7 Security
Secrets exist only behind the `SecretsStore` port; they never reach TOML,
logs, or persisted chats. Tool permission classes (read / write / execute)
centralize the confirmation policy. The MCP server is opt-in and binds
loopback by default; skill and partial roots are sandboxed — nothing resolves
outside its declared directory.
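The sandboxing rule for skill and partial roots can be illustrated with a lexical path check built on `std::filesystem`. The function name `resolveInSandbox` is hypothetical, and the check is purely lexical: it does not follow symlinks, which a real implementation would also have to consider.

```cpp
#include <algorithm>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Resolves `relative` under `root` without touching the disk. Returns
// nullopt when the path is absolute or when ".." segments escape the root.
std::optional<fs::path> resolveInSandbox(const fs::path &root, const fs::path &relative) {
    if (relative.is_absolute())
        return std::nullopt;

    fs::path base = root.lexically_normal();
    if (!base.has_filename())      // drop a trailing separator, e.g. "skills/"
        base = base.parent_path();

    const fs::path candidate = (base / relative).lexically_normal();

    // Every component of the root must be a prefix of the candidate.
    const auto diff = std::mismatch(base.begin(), base.end(),
                                    candidate.begin(), candidate.end());
    if (diff.first != base.end())
        return std::nullopt;
    return candidate;
}
```

Normalizing before comparing is what defeats inputs such as `a/../../secrets.txt`, which look like they start inside the root but resolve outside it.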
### 8.8 Testing
The test pyramid follows the layers:
| Layer | Strategy |
|-------|----------|
| L1 | loader/validator unit tests; golden-file snapshots of every bundled profile's rendered body against a synthetic context — the same check as load-time validation, run in CI |
| L2 | `Session` / `ResponseRouter` replay tests over recorded SSE fixtures per provider; fake `BaseClient`, no network |
| L3 | contract tests against the ports; QtC adapters covered only by plugin integration |
Layering is enforced mechanically, not by review: each layer is its own
CMake target, and the core targets do not link Qt Creator — a violating
include fails the build.
---
## 9. Module / target layout
```
core/ # no Qt Creator linkage — tests link this
config/ # L1: ProviderInstance, AgentProfile, loaders,
# validators, rosters, personas, secrets port
providers/ # L2: Provider, GenericProvider, ProviderFactory,
# ClaudeCacheControl
prompt/ # L2: JsonPromptTemplate, ContextRenderer, partials
agents/ # L2: Agent, AgentFactory, AgentRouter
session/ # L2: Session, SessionManager, ConversationHistory,
# SystemPromptBuilder, ResponseRouter, events
skills/ # L3 (IDE-free part): SkillsEngine, loaders
ide/ # Qt Creator adapters only
context/ # EditorContext, ProjectContext adapters, ignore
tools/ # built-in ToolKit (build, issues, editor edits…)
mcp/ # McpHub managers
features/
completion/ # LSP bridge + CompletionFeature + CodeHandler
chat/ # ChatFeature: ClientInterface, ChatModel(projection),
# Compressor, TokenCounter, FileEditController,
# serializer/store
refactor/ # RefactorFeature + custom instructions
ui/
ChatView qml/, widgets/, settings pages
hosts/
plugin/ # qodeassist.cpp — composition root, actions, panes
tests/
config/ # loader cases + golden rendered-body snapshots
session/ # SSE replay fixtures per provider, fake client
external/
llmqore/ inja/ tomlplusplus/
```
Dependencies point strictly downward through the layer table in §3;
`features/*` never include each other; `ui/*` talks only to its feature;
`hosts/*` is the only place allowed to know about everything.
---
## 10. Deltas from the current working tree
What "from scratch" changes relative to today's code — the migration
checklist to call the architecture done:
1. **Stack A physical teardown** — delete root `providers/*`,
`pluginllmcore/*`, `ConfigurationManager`, legacy provider/model/template
settings pages, and the Stack A registration + MCP loop in
`qodeassist.cpp`. Runtime already has no consumers.
2. **Single history owner** — make `ChatModel` a projection of
`Session::history()` (subscribe to history signals) instead of a parallel
message store with seed-on-send; `ChatCompressor` reads history, not the
model.
3. **Single send path** — delete `Session::sendCompletion(ContextData)`;
the completion context becomes user-message content sent through the one
`send()` (the completion handler already reads its result from history's
last message). Move `QuickRefactorHandler` off raw `BaseClient` signals
(`requestCompleted`/`requestFinalized`/`requestFailed`) onto
`Session::finished`/`failed` + `history().lastAssistantText()`.
4. **Three-state request lifecycle** — add `cancelled` to `Session`; today
`cancel()` emits `failed(id, "Cancelled by user")` and consumers must
string-match to tell cancellation from failure (§8.2).
5. **Typed errors** — replace `lastError` strings and the `failed(QString)`
payload with `ErrorInfo` categories (§8.3).
6. **Agent selection by pipeline shape** — completion is the only context-routed
pipeline (`AgentRouter.pickAgent(roster.codeCompletion, {file, project})`);
chat picker filters to the `chatAssistant` allow-list; quick refactor and
compression each read a single configured agent (no routing).
7. **MCP tools on session clients** — register MCP-contributed tools through
`ToolContributorRegistry` so chat/refactor sessions get them (today they
are registered only on dead Stack A providers).
8. **Session pooling** — `SessionManager.acquire/release` with a small pool
per agent, replacing per-message agent + provider + secrets construction.
9. **ContextManager split** — extract `EditorContext` / `ProjectContext` /
`TokenEstimator` behind ports; move QtC API use into `ide/context`.
10. **`[body]` model completion** — finish `agent-templates-design.md`
(body-table rendering, sandboxed `include`, load-time validation, model
override + `${MODEL}`, `schema_version` gate), delete sampling/thinking
merge machinery.
11. **Message type unification** — one `Message`/`ContentBlock` shape from
history to QML (roles, text, thinking, tool use/result, images); delete
the parallel `ChatModel::Message` struct.
12. **Test scaffolding** — golden rendered-body snapshots + SSE replay
fixtures (§8.8); CI builds the core targets without Qt Creator so a
layering violation fails the build.
13. **Stale docs cleanup** — `project-rules.md` describes the removed Rules
system; mark or delete.