doc: update architecture

Petr Mironychev
2026-06-11 15:28:37 +02:00
parent 69672deb45
commit 231a6a0215


@@ -1,35 +1,42 @@
# QodeAssist Architecture # QodeAssist Architecture
This document describes the runtime architecture of QodeAssist after the This document describes the **current** runtime architecture, after the §10
migration of all LLM runtime paths onto the agent / `Session` stack rework in `target-architecture.md` was completed. Every runtime LLM path —
("Stack B"). Every runtime LLM path — code completion, chat (send/stream + code completion, chat (send/stream + compression + token counting), and quick
compression + token counting), and quick refactor — now goes through agents, refactor — flows through one stack: agents, `Session`, and the
`Session`, and the `Providers::GenericProvider` layer. `Providers::GenericProvider` layer. There is no legacy parallel path; the old
"Stack A" (root `providers/*`, `pluginllmcore/*`, `ConfigurationManager`, the
provider/model/template settings pages) has been removed.
> Legend: ✅ = on Stack B (active runtime), 🔴 = legacy Stack A (isolated, no For the design rationale, layering contract, and cross-cutting policies, see
> runtime consumers left). [`target-architecture.md`](target-architecture.md). This file documents how the
code is wired today.
--- ---
## 1. Top level: ownership and dependency injection ## 1. Top level: ownership and dependency injection
The plugin (`qodeassist.cpp`) owns everything via `new` + parent (no plugin-wide The plugin (`qodeassist.cpp`) owns everything via `new` + parent — no
singletons; each feature receives its dependencies explicitly). plugin-wide singletons; each feature receives its dependencies explicitly.
``` ```
QodeAssistPlugin QodeAssistPlugin
Stack B infrastructure: • Providers::registerBuiltinProviders() — client_api → provider table
• Providers::registerBuiltinProviders() — registers 13 client_api types • ProviderInstanceFactory — provider instances from TOML
• ProviderInstanceFactory — 14 instances from TOML • ProviderSecretsStore — secrets behind a port
• ProviderSecretsStore • AgentFactory — agents from TOML + agent_models.json
• AgentFactory — agents from TOML • SessionManager(agentFactory) — owns the ToolContributorRegistry
• SessionManager(agentFactory) toolContributors().add(registerQodeAssistTools)
toolContributors().add(registerSkillTool)
toolContributors().add(McpClientsManager::registerToolsOn)
• m_engine (QQmlEngine) • m_engine (QQmlEngine)
rootContext: "agentFactory", "sessionManager" — DI for chat (QML) rootContext: "agentFactory", "sessionManager" — DI for chat (QML)
Wired into consumers: Wired into consumers:
• QodeAssistClient ← LLMClientInterface(*sessionManager, *agentFactory) • QodeAssistClient ← LLMClientInterface(generalSettings, completeSettings,
← setSessionManager / setAgentFactory (for quick refactor) agentFactory, sessionManager, documentReader,
performanceLogger)
← setSessionManager / setAgentFactory (quick refactor)
``` ```
Chat lives in QML (`ChatRootView` is a `QML_ELEMENT`), so `AgentFactory` and Chat lives in QML (`ChatRootView` is a `QML_ELEMENT`), so `AgentFactory` and
@@ -39,220 +46,262 @@ context** and resolved in `ChatRootView` via
--- ---
## 2. Stack B core (agent / Session) ## 2. Core (agent / Session)
``` ```
AgentFactory.create(name) AgentFactory.create(name)
configByName(name) → AgentConfig (TOML) configByName(name) → AgentConfig (TOML, [body] table; model override from
providerInstance, model, endpoint, role, messageFormat, agent_models.json applied here)
sampling, enableTools, enableThinking, match{filePatterns,...}
buildProviderForAgent: buildProviderForAgent:
instance = ProviderInstanceFactory.instanceByName(cfg.providerInstance) instance = ProviderInstanceFactory.instanceByName(cfg.providerInstance)
provider = ProviderFactory::create(instance.clientApi) ◄── keystone provider = ProviderFactory::create(instance.clientApi)
provider.setUrl(instance.url) provider.setUrl(instance.url)
provider.setApiKey(secrets.read(instance.apiKeyRef)) provider.setApiKey(secrets.read(instance.apiKeyRef))
Agent(config, provider) Agent(config, provider)
promptTemplate = JsonPromptTemplate::fromConfig(cfg.messageFormat) (inja) promptTemplate = JsonPromptTemplate::fromConfig(cfg) — compiles [body] (inja),
validated at load against a synthetic context
provider.setPromptCaching(cfg.cachePrompt, cfg.cacheTtl == "1h")
SessionManager.createSession(agentName) → Session(agent) SessionManager — two ways to obtain a Session:
├─ ConversationHistory — messages as ContentBlocks • createSession(agentName, externalHistory?) — chat: attaches a persistent,
├─ SystemPromptBuilder — layers: agent.role + caller layers externally-owned history
└─ ResponseRouter(client) — emits ResponseEvent • acquire(agentName) / release(session) — one-shot pipelines: a small
per-agent pool of internal-history
sessions; acquire hands out a
session with cleared history,
cleared system-prompt layers and
cleared client tools
Session(agent[, externalHistory])
├─ ConversationHistory — messages as polymorphic ContentBlocks
├─ SystemPromptBuilder — ordered named layers (priority-sorted)
└─ ResponseRouter(client) — adapts client signals → typed ResponseEvent
Session API: Session API:
• send(blocks, toolsOverride) — chat/refactor: append user msg + dispatch • send(blocks, toolsOverride) — the ONLY dispatch entry point: append a user
• sendCompletion(ContextData) — completion: FIM prefix/suffix message and dispatch. Completion/chat/refactor
• client() — agent's LLMQore::BaseClient (direct streaming) differ only in block content + template.
systemPrompt()->setLayer(...) — dynamic context layers cancel() — tears down in-flight; emits cancelled(id)
supportsImages() — provider Image capability history() / systemPrompt() / client() / supportsImages()
history() — for seeding from ChatModel setContentLoader(loader) — resolves Stored* attachment/image blocks
• lastError() → ErrorInfo — typed synchronous start-failure detail
Session signals (three-state, mutually exclusive per request):
• finished(id, stopReason)
• failed(id, ErrorInfo{category, message, providerDetail})
• cancelled(id)
+ event(ResponseEvent) — live delta stream for the chat UI
``` ```
`Session::sendCompletion` and `dispatch` compose `SystemPromptBuilder` layers `Session::dispatch` renders the agent's `system_prompt` into the `agent.system`
(`agent.role` + caller-provided) into the request system prompt. layer, composes all `SystemPromptBuilder` layers into the request system prompt,
and substitutes `${MODEL}` in the endpoint before sending.
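The composition step described for `Session::dispatch` can be sketched in plain C++. This is a minimal illustration under stated assumptions, not the project's code: `PromptLayer`, `LayeredPrompt`, and `substituteModel` are hypothetical names, and both the "lower priority renders first" ordering and the blank-line separator are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one system-prompt layer.
struct PromptLayer {
    std::string name;
    int priority;      // assumption: lower values render first
    std::string text;
};

// Hypothetical builder: keeps named layers and joins them in priority order.
class LayeredPrompt {
public:
    // Replace the layer with the same name, or add a new one.
    void setLayer(const std::string &name, int priority, const std::string &text) {
        assert(!name.empty());  // layers are addressed by name
        for (auto &layer : m_layers) {
            if (layer.name == name) {
                layer.priority = priority;
                layer.text = text;
                return;
            }
        }
        m_layers.push_back({name, priority, text});
    }

    // Join non-empty layers sorted by priority; insertion order breaks ties.
    std::string compose() const {
        std::vector<PromptLayer> sorted = m_layers;
        std::stable_sort(sorted.begin(), sorted.end(),
                         [](const PromptLayer &a, const PromptLayer &b) {
                             return a.priority < b.priority;
                         });
        std::string out;
        for (const auto &layer : sorted) {
            if (layer.text.empty())
                continue;
            if (!out.empty())
                out += "\n\n";
            out += layer.text;
        }
        return out;
    }

private:
    std::vector<PromptLayer> m_layers;
};

// Replace every "${MODEL}" placeholder in an endpoint with the model name.
std::string substituteModel(std::string endpoint, const std::string &model) {
    const std::string placeholder = "${MODEL}";
    std::size_t pos = 0;
    while ((pos = endpoint.find(placeholder, pos)) != std::string::npos) {
        endpoint.replace(pos, placeholder.size(), model);
        pos += model.size();  // skip past the inserted text
    }
    return endpoint;
}
```

A caller would set an `agent.system` layer and any feature layers (for example `chat.context`), call `compose()` once per request, and run the endpoint through `substituteModel` before sending.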
--- ---
## 3. Provider layer — the keystone (implemented during migration) ## 3. Provider layer
The Stack B provider layer previously existed only as an abstract base + One configuration-driven `GenericProvider` covers every API; it varies only by
empty factory (`registerType` was never called, no concrete providers). This the LLMQore client factory and metadata. Request *shape* belongs to the agent's
blocked every agent from obtaining a working provider. It is now implemented `JsonPromptTemplate` (the `[body]` table), never to the provider.
via a single configuration-driven `GenericProvider`.
``` ```
ProviderFactory (sources/providers, namespace functions) ProviderFactory (sources/providers, namespace functions)
registerType(name, fn) / create(name, parent) / knownNames() registerType(name, fn) / create(name, parent) / knownNames()
registerBuiltinProviders() — client_api → provider table
│ registerBuiltinProviders() — client_api → provider table
GenericProvider : Providers::Provider GenericProvider : Providers::Provider
• owns an LLMQore::BaseClient (created by a ClientFactory) • owns an LLMQore::BaseClient (created by a ClientFactory)
• prepareRequest — inherited from Provider base: • prepareRequest → PromptTemplate::buildFullRequest; injects tools when
delegates to PromptTemplate::buildFullRequest enable_tools; applies ClaudeCacheControl when prompt caching is on
• client() / providerID() / capabilities() / getInstalledModels() • client() / providerID() / capabilities() / getInstalledModels()
``` ```
### client_api → provider table ### client_api → provider table
| client_api | LLMQore client | ProviderID | capabilities | | client_api | LLMQore client | ProviderID | capabilities |
|--------------------------------|-------------------------|------------------|-------------------------| |------------------------------|-----------------------|------------------|-----------------------------------|
| Claude | ClaudeClient | Claude | Tools·Thinking·Image·ModelListing | | Claude | ClaudeClient | Claude | Tools·Thinking·Image·ModelListing |
| Google AI | GoogleAIClient | GoogleAI | Tools·Thinking·Image·ModelListing | | Google AI | GoogleAIClient | GoogleAI | Tools·Thinking·Image·ModelListing |
| llama.cpp | LlamaCppClient | LlamaCpp | Tools·Thinking·Image·ModelListing | | llama.cpp | LlamaCppClient | LlamaCpp | Tools·Thinking·Image·ModelListing |
| Mistral AI | MistralClient | MistralAI | Tools·Thinking·Image·ModelListing | | Mistral AI | MistralClient | MistralAI | Tools·Thinking·Image·ModelListing |
| Codestral | MistralClient | MistralAI | Tools·Image | | Codestral | MistralClient | MistralAI | Tools·Image |
| Ollama (Native) | OllamaClient | Ollama | Tools·Thinking·Image·ModelListing | | Ollama (Native) | OllamaClient | Ollama | Tools·Thinking·Image·ModelListing |
| Ollama (OpenAI-compatible) | OpenAIClient | OpenAICompatible | Tools·Thinking·Image·ModelListing | | Ollama (OpenAI-compatible) | OpenAIClient | OpenAICompatible | Tools·Thinking·Image·ModelListing |
| OpenAI (Chat Completions) | OpenAIClient | OpenAI | Tools·Thinking·Image·ModelListing | | OpenAI (Chat Completions) | OpenAIClient | OpenAI | Tools·Thinking·Image·ModelListing |
| OpenAI (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing | | OpenAI (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing |
| OpenAI Compatible | OpenAIClient | OpenAICompatible | Tools·Image·Thinking | | OpenAI Compatible | OpenAIClient | OpenAICompatible | Tools·Image·Thinking |
| OpenRouter | OpenAIClient | OpenRouter | Tools·Image·Thinking·ModelListing | | OpenRouter | OpenAIClient | OpenRouter | Tools·Image·Thinking·ModelListing |
| LM Studio (Chat Completions) | OpenAIClient | LMStudio | Tools·Thinking·Image·ModelListing | | LM Studio (Chat Completions) | OpenAIClient | LMStudio | Tools·Thinking·Image·ModelListing |
| LM Studio (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing | | LM Studio (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing |
Request *shape* comes from the agent's prompt template (jinja `messageFormat`),
so a single provider class covers every API by varying only the client factory
and metadata.
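The `registerType` / `create` / `knownNames` trio and the table above suggest a name-keyed registry of factory functions. The following is a rough sketch of that idea only: `SketchProvider`, `SketchProviderFactory`, and `registerSketchProviders` are hypothetical names, the real functions take different arguments (for instance a parent object), and returning `nullptr` for an unknown name is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified provider: only metadata that varies per client_api.
struct SketchProvider {
    std::string clientApi;
    std::string providerId;
    bool modelListing;
};

using ProviderMaker = std::function<std::unique_ptr<SketchProvider>()>;

// Registry keyed by client_api name (std::map keeps the names sorted).
class SketchProviderFactory {
public:
    void registerType(const std::string &name, ProviderMaker maker) {
        assert(maker);  // an empty std::function cannot create anything
        m_makers[name] = std::move(maker);
    }

    // Assumption: an unknown client_api yields nullptr.
    std::unique_ptr<SketchProvider> create(const std::string &name) const {
        const auto it = m_makers.find(name);
        if (it == m_makers.end())
            return nullptr;
        return it->second();
    }

    std::vector<std::string> knownNames() const {
        std::vector<std::string> names;
        for (const auto &entry : m_makers)
            names.push_back(entry.first);
        return names;
    }

private:
    std::map<std::string, ProviderMaker> m_makers;
};

// Registers two rows of the table: Claude (with ModelListing) and
// Codestral (MistralAI provider id, no ModelListing).
void registerSketchProviders(SketchProviderFactory &factory) {
    factory.registerType("Claude", [] {
        return std::make_unique<SketchProvider>(SketchProvider{"Claude", "Claude", true});
    });
    factory.registerType("Codestral", [] {
        return std::make_unique<SketchProvider>(SketchProvider{"Codestral", "MistralAI", false});
    });
}
```

Because each row differs only in the maker and its metadata, adding another API in this sketch is one more `registerType` call rather than a new class.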
--- ---
## 4. Runtime paths (all on Stack B) ## 4. Configuration model
### 4a. Code completion ✅
```
Qt Creator LSP (getCompletionsCycling)
LLMClientInterface
pickCompletionAgent: AgentRouter.pickAgent(roster.codeCompletion, {file, project})
session = sessionManager.createSession(agent)
ctx = Templates::ContextData{ prefix, suffix,
systemPrompt = fileContext + openFiles }
session.sendCompletion(ctx)
▼ stream from session.client():
requestCompleted → sendCompletionToClient → CodeHandler → LSP
system prompt = agent.role; FIM template renders prefix/suffix
```
### 4b. Chat ✅
```
ChatRootView (QML)
resolve agentFactory()/sessionManager() = qmlEngine(this)->rootContext()
ChatAgentController: agent list (configNames), active agent (persisted),
supportsThinking/Tools
QML agent picker (TopBar.agentSelector) — replaced provider/model/template combos
▼ dispatchSend
ClientInterface
session = sessionManager.createSession(currentChatAgent)
registerQodeAssistTools(session.client().tools()) + registerSkillTool
systemPrompt layer "chat.context" = project info + skills + linked files
seedHistory(session.history() ← ChatModel: user/assistant/tool-call+result)
session.send(userBlocks{text + images}, useTools)
▼ stream from session.client() → existing handlers → ChatModel:
chunk→addMessage thinking→addThinkingBlock
tool→addToolExecutionStatus / updateToolResult
finalized→usage completed→messageReceivedCompletely → removeSession
ChatCompressor → createSession(agent) → seed history → layer "compression" → send(prompt)
InputTokenCounter → estimate without provider (calibrated by server usage)
```
### 4c. Quick refactor ✅
```
QodeAssistClient.requestQuickRefactor → QuickRefactorHandler (setSessionManager/setAgentFactory)
pickRefactorAgent: AgentRouter.pickAgent(roster.quickRefactor, {file, project})
session = createSession(agent)
if useTools: registerQodeAssistTools(session.client().tools())
systemPrompt layer "refactor" = buildSystemPrompt(tagged content +
output requirements + indentation rules)
session.send(blocks{instructions}, useTools)
▼ stream from session.client():
requestCompleted → ResponseCleaner → RefactorResult → insert into editor
```
---
## 5. Configuration sources
``` ```
~/.config/.../qodeassist/config/ ~/.config/.../qodeassist/config/
providers/*.toml → ProviderInstance { name, client_api, url, api_key_ref } providers/*.toml → ProviderInstance { name, client_api, url, api_key_ref }
agents/*.toml → AgentConfig { providerInstance, model, endpoint, role, agents/*.toml → AgentConfig { schema_version, providerInstance, model,
messageFormat, sampling, match, enable* } endpoint, system_prompt, [body], match,
enable_tools, enable_thinking, cache_prompt,
extends, abstract, hidden, tags }
agent_models.json → per-agent model override (applied by AgentFactory)
agent_roles/*.json → role text, pulled into system_prompt via {{ agent_role(id) }}
pipelines rosters → codeCompletion / chatAssistant / chatCompression / quickRefactor pipelines rosters → codeCompletion / chatAssistant / chatCompression / quickRefactor
consumed by AgentRouter.pickAgent(roster, {filePath, projectName}) consumed by AgentRouter.pickAgent(roster, {filePath, projectName})
Editor policy (NOT agent config): Editor policy (NOT agent config):
CodeCompletionSettings — triggers, modelOutputHandler, context extraction, CodeCompletionSettings — triggers, modelOutputHandler, context extraction,
useOpenFilesContext useOpenFilesContext
(sampling / prompt-generation fields removed)
```
`[body]` **is** the request body (deep-mergeable through `extends`; Jinja-bearing
string values render and splice as raw JSON, literals pass through, empty renders
drop the key). `include` resolves only sandboxed partial roots. Profiles validate
at load: a referenced partial must resolve and the assembled body must parse as
JSON against a synthetic context — config errors surface in the agents settings
page, never as a silent runtime drop. Full spec:
[`agent-templates-design.md`](agent-templates-design.md).
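The three `[body]` value rules (templated strings splice as raw JSON, literals pass through, empty renders drop the key) can be illustrated with a toy renderer. Everything here is a simplified assumption: `renderValue` and `assembleBody` are hypothetical, the body is flattened to one level, and the `{{ key }}` lookup is only a stand-in for the real template engine, which is far richer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

using StringMap = std::map<std::string, std::string>;

// Toy stand-in for the template engine: replaces each "{{ key }}" with
// ctx[key]; a missing key renders as empty text.
std::string renderValue(const std::string &tmpl, const StringMap &ctx) {
    std::string out;
    std::size_t pos = 0;
    while (true) {
        const std::size_t open = tmpl.find("{{", pos);
        const std::size_t close =
            open == std::string::npos ? std::string::npos : tmpl.find("}}", open + 2);
        if (close == std::string::npos) {
            out += tmpl.substr(pos);  // no further complete placeholder
            break;
        }
        out += tmpl.substr(pos, open - pos);
        std::string key = tmpl.substr(open + 2, close - open - 2);
        const std::size_t first = key.find_first_not_of(' ');
        const std::size_t last = key.find_last_not_of(' ');
        key = first == std::string::npos ? std::string() : key.substr(first, last - first + 1);
        const auto it = ctx.find(key);
        if (it != ctx.end())
            out += it->second;
        pos = close + 2;
    }
    return out;
}

// body maps a key to its raw value text. Values containing "{{" are
// templates whose rendered text is spliced in as raw JSON; other values
// are literal JSON and pass through unchanged.
std::string assembleBody(const StringMap &body, const StringMap &ctx) {
    std::string json = "{";
    bool first = true;
    for (const auto &[key, value] : body) {
        assert(key.find('"') == std::string::npos);  // this sketch does not escape keys
        const bool isTemplate = value.find("{{") != std::string::npos;
        const std::string rendered = isTemplate ? renderValue(value, ctx) : value;
        if (isTemplate && rendered.empty())
            continue;  // an empty render drops the key
        if (!first)
            json += ",";
        json += "\"" + key + "\":" + rendered;
        first = false;
    }
    return json + "}";
}
```

With a body of `max_tokens = 256`, `messages = "{{ messages }}"`, and `stop = "{{ stop }}"`, and a context that defines only `messages`, the assembled object keeps `max_tokens` and `messages` and omits `stop`.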
---
## 5. Runtime paths
`AgentRouter.pickAgent(roster, {file, project})` is the only agent picker; every
pipeline resolves its agent through a roster.
### 5a. Code completion
```
Qt Creator LSP (getCompletionsCycling)
LLMClientInterface
agent = AgentRouter.pickAgent(roster.codeCompletion, {file, project})
session = sessionManager.acquire(agent) — pooled
systemPrompt layer "completion.context" = fileContext + open-files context
session.send( blocks{ CompletionContent(prefix, suffix) }, tools=off )
▼ on Session::finished:
history().lastAssistantText() → CodeHandler (output-mode) → LSP items
→ sessionManager.release(session)
```
The completion context travels as a `CompletionContent` block; the template
exposes it as `ctx.prefix` / `ctx.suffix`. FIM vs instruct is purely agent
config (the body), not feature code. Completion never touches the delta stream —
it waits for `finished` and reads the last message.
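The `acquire` / `release` pairing used by this pipeline can be sketched as a per-agent pool that resets state on hand-out. This is only an illustration: `PooledSession` and `SketchSessionPool` are hypothetical, strings stand in for messages and layers, and the sketch puts no cap on pool size, whereas the source describes the real pool only as "small".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical one-shot session: only the state that acquire() resets.
struct PooledSession {
    std::string agentName;
    std::vector<std::string> history;       // stand-in for conversation messages
    std::vector<std::string> promptLayers;  // stand-in for system-prompt layers
    bool inUse = false;
};

class SketchSessionPool {
public:
    // Hand out an idle session for the agent with cleared state, or create one.
    PooledSession *acquire(const std::string &agentName) {
        auto &sessions = m_pool[agentName];
        for (auto &session : sessions) {
            if (!session->inUse) {
                session->history.clear();
                session->promptLayers.clear();
                session->inUse = true;
                return session.get();
            }
        }
        sessions.push_back(std::make_unique<PooledSession>());
        PooledSession *fresh = sessions.back().get();
        fresh->agentName = agentName;
        fresh->inUse = true;
        return fresh;
    }

    // Return a session to the pool; it stays owned by the pool.
    void release(PooledSession *session) {
        assert(session != nullptr && session->inUse);
        session->inUse = false;
    }

    std::size_t size(const std::string &agentName) const {
        const auto it = m_pool.find(agentName);
        return it == m_pool.end() ? 0 : it->second.size();
    }

private:
    std::map<std::string, std::vector<std::unique_ptr<PooledSession>>> m_pool;
};
```

Clearing on `acquire` rather than on `release` means a consumer can still read the last assistant message after the request finishes and before it releases the session, which matches the order shown in the diagram.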
### 5b. Chat
`ChatRootView` owns one persistent `ConversationHistory` for the whole chat view
and injects it into every collaborator. **History is the single source of truth.**
```
ChatRootView (QML) — owns ConversationHistory m_history
ChatModel.setHistory(m_history) — ChatModel is a PROJECTION:
subscribes to messageAdded/Updated/cleared/reset, flattens blocks→rows,
overlays file-edit status from ChangesManager, holds a per-message usage map
ChatAgentController — agent list filtered to the
chatAssistant roster; active agent persisted
▼ dispatchSend
ClientInterface
session = sessionManager.createSession(activeAgent, m_history)
sessionManager.toolContributors().contribute(client.tools()) — builtin+skills+MCP
session.setContentLoader(ChatSerializer::loadContentFromStorage)
systemPrompt layer "chat.context" = project info + skills + linked files
session.send( blocks{ TextContent + StoredAttachmentContent + StoredImageContent } )
▼ consumes Session signals (NOT raw client signals):
event(Usage) → ChatModel.setMessageUsage + token-counter calibration
finished(id) → ChangesManager.applyPendingEditsForRequest + persist;
removeSession (the persistent history survives)
failed(id, ErrorInfo) → surface error; removeSession
ChatCompressor → acquire(chatCompression-roster agent) → seed history from the
chat's messages → "compression" layer → send → read summary from
the compression session's own history → release
InputTokenCounter → estimates over ConversationHistory (calibrated by Usage events)
ChatSerializer → persists ConversationHistory via MessageSerializer (v0.3);
imports legacy v0.1/v0.2 files
```
`ChatModel`'s QML role surface (roleType / content / attachments / images /
isRedacted / token roles) is unchanged, so the QML delegates were untouched. The
projection's incremental updates avoid model resets on the streaming hot path.
### 5c. Quick refactor
```
QodeAssistClient.requestQuickRefactor → QuickRefactorHandler
agent = AgentRouter.pickAgent(roster.quickRefactor, {file, project})
session = sessionManager.acquire(agent)
if useTools: sessionManager.toolContributors().contribute(client.tools())
systemPrompt layer "refactor" = tagged selection + output + indentation rules
session.send(blocks{instructions}, useTools)
▼ on Session::finished:
history().lastAssistantText() → ResponseCleaner → RefactorResult → editor insert
→ sessionManager.release(session)
on Session::failed(ErrorInfo) → RefactorResult{error}
``` ```
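The `toolContributors().add(...)` and `toolContributors().contribute(...)` calls shown in these pipelines follow a register-once, apply-per-session pattern. A minimal sketch of that pattern, assuming a hypothetical `ToolSet` as the session client's tool container and hypothetical tool names:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a session client's tool set.
struct ToolSet {
    std::vector<std::string> names;
};

// A contributor adds its tools to whichever tool set it is handed.
using ToolContributor = std::function<void(ToolSet &)>;

class SketchToolRegistry {
public:
    // Called once at plugin start-up for each tool source.
    void add(ToolContributor contributor) {
        assert(contributor);  // reject empty callables early
        m_contributors.push_back(std::move(contributor));
    }

    // Called per session: every registered source contributes, in order.
    void contribute(ToolSet &tools) const {
        for (const auto &contributor : m_contributors)
            contributor(tools);
    }

private:
    std::vector<ToolContributor> m_contributors;
};
```

In this shape, chat and quick refactor receive the same tools because both call `contribute` on one shared registry; a newly added tool source needs no change in either consumer.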
--- ---
## 6. Remaining Stack A (runtime does NOT depend on it) ## 6. Context layer
The context services sit behind IDE-agnostic ports; Qt Creator API use lives in
the adapters.
``` ```
🔴 Settings UI: provider/model/template selection pages EditorContext — IDocumentReader (port) ← DocumentReaderQtCreator (TextEditor API)
(ccProvider / caProvider / qrProvider) + ConfigurationManager ProjectContext — IProjectScanner (port) ← ProjectScannerQtCreator (ProjectExplorer
→ use ProvidersManager + Core::DocumentModel + the IgnoreManager for .qodeassistignore)
🔴 root providers/* (PluginLLMCore::Provider, 14 classes) TokenEstimator — TokenUtils (pure) ← InputTokenCounter (thin UI consumer)
→ read only chat/quick-refactor sampling settings
🔴 pluginllmcore/* (ProvidersManager, PromptTemplateManager, ResponseCleaner,
PromptProviderChat/Fim, ContextData)
🔴 qodeassist.cpp:144-146 registerProviders() / registerTemplates() (Stack A registration)
🔴 qodeassist.cpp:185 MCP skill-tool loop on Stack A providers (effectively dead)
🔴 ChatAssistantSettings / QuickRefactorSettings — sampling fields (read only by root providers)
ResponseCleaner (pluginllmcore) is still used by QuickRefactorHandler as a text
utility — orthogonal to the provider stack.
``` ```
### Removed during the migration `ContextManager` is now Qt-Creator-free: it delegates open-file enumeration and
ignore filtering to an injected `IProjectScanner` (defaulting to the QtC adapter),
- Rules subsystem (`RulesLoader` + chat "active rules" UI + QuickRefactor rules block) and keeps only filesystem reads + formatting. `ContextManager::shouldIgnore(path)`
- `ChatConfigurationController`, `AgentRoleController` (chat config/role presets) replaced the previously exposed `ignoreManager()`.
- `m_promptProvider` (`PromptProviderFim`) in the plugin
- `RequestType::CodeCompletion` branch in all 14 root providers
- Sampling / prompt-generation fields in `CodeCompletionSettings`
- ChatView no longer links `PluginLLMCore`
--- ---
## 7. Dependency summary ## 7. Cross-cutting
``` - **Request lifecycle** — a session has at most one in-flight request; `send()`
┌──────────────── Stack B (active runtime) ────────────────┐ while in flight cancels the previous. Every request ends in exactly one of
LLMClientInterface ─┐ │ `finished` / `failed` / `cancelled`. Cancellation is not an error; no consumer
ClientInterface ────┼─► SessionManager ─► Session ─► Agent ─► GenericProvider ─► LLMQore::*Client string-matches a message to tell them apart.
QuickRefactorHandler─┘ │ │ │ │ - **Typed errors** — `ErrorInfo { category ∈ {Config, Auth, Network, Provider,
ChatCompressor ──────────────┘ │ AgentFactory ProviderFactory Validation, Tool}, message, providerDetail }`. `ResponseRouter` categorizes wire
AgentRouter (rosters) │ │ errors (best-effort) at the boundary; `Session::failed` carries the typed value.
ProviderInstanceFactory (TOML) - **Tools** — `SessionManager` owns a `ToolContributorRegistry`; built-in ToolKit,
└──────────────────────────────────────────────────────────┘ the skill tool, and MCP client tools register once and are contributed to chat
and quick-refactor session clients uniformly.
Stack A (settings UI + ConfigurationManager + MCP loop) — isolated, - **Threading** — the core runs on the GUI thread; concurrency is the Qt event
no runtime consumers remain. loop plus async network I/O. Blocking work hides behind L3 ports.
```
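The request-lifecycle and typed-error bullets above amount to a three-way sum type. One way to model it in standard C++ is a `std::variant`; this is a sketch of the idea, not the project's Qt signal API. The `ErrorInfo` fields and categories come from the text, while `RequestOutcome`, `OutcomeLog`, `describe`, the integer request id, and the sample stop reason are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <variant>

enum class ErrorCategory { Config, Auth, Network, Provider, Validation, Tool };

struct ErrorInfo {
    ErrorCategory category;
    std::string message;
    std::string providerDetail;
};

// One alternative per terminal state of a request.
struct Finished { std::string stopReason; };
struct Failed { ErrorInfo error; };
struct Cancelled {};

using RequestOutcome = std::variant<Finished, Failed, Cancelled>;

// Hypothetical bookkeeping: each request id gets exactly one outcome.
class OutcomeLog {
public:
    void record(int requestId, RequestOutcome outcome) {
        assert(m_outcomes.count(requestId) == 0);  // a second outcome is a logic error
        m_outcomes.emplace(requestId, std::move(outcome));
    }

    // Only Failed counts as an error; Cancelled does not.
    bool isError(int requestId) const {
        const auto it = m_outcomes.find(requestId);
        return it != m_outcomes.end() && std::holds_alternative<Failed>(it->second);
    }

private:
    std::map<int, RequestOutcome> m_outcomes;
};

// Consumers branch on the alternative's type, never on message text.
std::string describe(const RequestOutcome &outcome) {
    if (const auto *done = std::get_if<Finished>(&outcome))
        return "finished: " + done->stopReason;
    if (const auto *failed = std::get_if<Failed>(&outcome))
        return "failed: " + failed->error.message;
    return "cancelled";
}
```

Keeping `Cancelled` as its own alternative is what lets a consumer distinguish a user abort from a failure without inspecting strings.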
--- ---
## 8. Open follow-ups (optional) ## 8. Tests
1. **Chat picker filtering** — show only `chatAssistant`-roster agents (currently `test/` (GTest + Qt::Test) covers the two engines most affected by the rework:
lists all non-hidden agents; the auto-default may land on a FIM agent).
Requires wiring ChatView to `PipelinesConfig` (watch for OBJECT-library - `JsonPromptTemplateTest` — the `[body]` engine: jinja render + JSON splice,
symbol duplication). literal passthrough, empty-render key drop, nested literals, and load-time
2. **MCP tools on agent clients** — MCP skill tools are registered only on Stack A rejection of bodies that render invalid JSON.
providers; to expose MCP tools to chat agents, register them on the session - `ResponseRouterTest` — a fake `BaseClient` replays a recorded provider stream;
client alongside `registerQodeAssistTools`. asserts the assistant message is stamped with the request id, history is built
3. **Physical Stack A teardown** — remove the provider/model/template settings UI, correctly (thinking + text + tool use/result), the typed event stream is emitted,
`ConfigurationManager`, root `providers/*`, `pluginllmcore/*`, and the and wire errors are categorized.
registration + MCP loop in `qodeassist.cpp`. Runtime no longer depends on them.
4. **Per-message session cost** — chat/refactor create a fresh agent/provider/client ---
(and read secrets) per request; a session pool could reduce latency.
## 9. Remaining follow-ups (optional)
1. **Qt-Creator-free core build + CI** — `AgentFactory` / `ContextRenderer` still
call `Core::ICore::userResourcePath`, so the core targets link `QtCreator::Core`.
A `ResourcePaths` port + adapter would let the core build without Qt Creator and
enable a CI job that fails on a layering-violating include, plus golden
rendered-body snapshots over the bundled agents loaded through the real loader.
2. **§9 target module layout** — the `core/ ide/ features/ hosts/` physical target
split in `target-architecture.md` is not yet reflected in the directory layout.