Files
QodeAssist/docs/architecture.md
Petr Mironychev f499be278d test: Update tests
2026-06-11 15:38:12 +02:00

16 KiB

QodeAssist Architecture

This document describes the current runtime architecture, after the §10 rework in target-architecture.md was completed. Every runtime LLM path — code completion, chat (send/stream + compression + token counting), and quick refactor — flows through one stack: agents, Session, and the Providers::GenericProvider layer. There is no legacy parallel path; the old "Stack A" (root providers/*, pluginllmcore/*, ConfigurationManager, the provider/model/template settings pages) has been removed.

For the design rationale, layering contract, and cross-cutting policies, see target-architecture.md. This file documents how the code is wired today.


1. Top level: ownership and dependency injection

The plugin (qodeassist.cpp) owns everything via new + parent — no plugin-wide singletons; each feature receives its dependencies explicitly.

QodeAssistPlugin
    • Providers::registerBuiltinProviders()   — client_api → provider table
    • ProviderInstanceFactory                 — provider instances from TOML
    • ProviderSecretsStore                    — secrets behind a port
    • AgentFactory                            — agents from TOML + agent_models.json
    • SessionManager(agentFactory)            — owns the ToolContributorRegistry
        toolContributors().add(registerQodeAssistTools)
        toolContributors().add(registerSkillTool)
        toolContributors().add(McpClientsManager::registerToolsOn)
    • m_engine (QQmlEngine)
        rootContext: "agentFactory", "sessionManager"   — DI for chat (QML)

  Wired into consumers:
    • QodeAssistClient ← LLMClientInterface(generalSettings, completeSettings,
                            agentFactory, sessionManager, documentReader,
                            performanceLogger)
                       ← setSessionManager / setAgentFactory   (quick refactor)

Chat lives in QML (ChatRootView is a QML_ELEMENT), so AgentFactory and SessionManager are exposed as context properties on the engine's root context and resolved in ChatRootView via qmlEngine(this)->rootContext()->contextProperty(...).


2. Core (agent / Session)

AgentFactory.create(name)
  configByName(name) → AgentConfig (TOML, [body] table; model override from
                       agent_models.json applied here)
  buildProviderForAgent:
     instance = ProviderInstanceFactory.instanceByName(cfg.providerInstance)
     provider = ProviderFactory::create(instance.clientApi)
     provider.setUrl(instance.url)
     provider.setApiKey(secrets.read(instance.apiKeyRef))
  ▼
Agent(config, provider)
  promptTemplate = JsonPromptTemplate::fromConfig(cfg)   — compiles [body] (inja),
                   validated at load against a synthetic context
  provider.setPromptCaching(cfg.cachePrompt, cfg.cacheTtl == "1h")
  ▼
SessionManager — two ways to obtain a Session:
  • createSession(agentName, externalHistory?)  — chat: attaches a persistent,
                                                  externally-owned history
  • acquire(agentName) / release(session)       — one-shot pipelines: a small
                                                  per-agent pool of internal-history
                                                  sessions; acquire hands out a
                                                  session with cleared history,
                                                  cleared system-prompt layers and
                                                  cleared client tools
  ▼
Session(agent[, externalHistory])
  ├─ ConversationHistory     — messages as polymorphic ContentBlocks
  ├─ SystemPromptBuilder     — ordered named layers (priority-sorted)
  └─ ResponseRouter(client)  — adapts client signals → typed ResponseEvent

Session API:
  • send(blocks, toolsOverride)   — the ONLY dispatch entry point: append a user
                                    message and dispatch. Completion/chat/refactor
                                    differ only in block content + template.
  • cancel()                      — tears down in-flight; emits cancelled(id)
  • history() / systemPrompt() / client() / supportsImages()
  • setContentLoader(loader)      — resolves Stored* attachment/image blocks
  • lastError() → ErrorInfo       — typed synchronous start-failure detail

Session signals (three-state, mutually exclusive per request):
  • finished(id, stopReason)
  • failed(id, ErrorInfo{category, message, providerDetail})
  • cancelled(id)
  + event(ResponseEvent)          — live delta stream for the chat UI

Session::dispatch renders the agent's system_prompt into the agent.system layer, composes all SystemPromptBuilder layers into the request system prompt, and substitutes ${MODEL} in the endpoint before sending.


3. Provider layer

One configuration-driven GenericProvider covers every API; it varies only by the LLMQore client factory and metadata. Request shape belongs to the agent's JsonPromptTemplate (the [body] table), never to the provider.

ProviderFactory  (sources/providers, namespace functions)
   registerType(name, fn) / create(name, parent) / knownNames()
        ▲  registerBuiltinProviders()   — client_api → provider table
GenericProvider : Providers::Provider
   • owns an LLMQore::BaseClient (created by a ClientFactory)
   • prepareRequest → PromptTemplate::buildFullRequest; injects tools when
     enable_tools; applies ClaudeCacheControl when prompt caching is on
   • client() / providerID() / capabilities() / getInstalledModels()

client_api → provider table

client_api LLMQore client ProviderID capabilities
Claude ClaudeClient Claude Tools·Thinking·Image·ModelListing
Google AI GoogleAIClient GoogleAI Tools·Thinking·Image·ModelListing
llama.cpp LlamaCppClient LlamaCpp Tools·Thinking·Image·ModelListing
Mistral AI MistralClient MistralAI Tools·Thinking·Image·ModelListing
Codestral MistralClient MistralAI Tools·Image
Ollama (Native) OllamaClient Ollama Tools·Thinking·Image·ModelListing
Ollama (OpenAI-compatible) OpenAIClient OpenAICompatible Tools·Thinking·Image·ModelListing
OpenAI (Chat Completions) OpenAIClient OpenAI Tools·Thinking·Image·ModelListing
OpenAI (Responses API) OpenAIResponsesClient OpenAIResponses Tools·Thinking·Image·ModelListing
OpenAI Compatible OpenAIClient OpenAICompatible Tools·Image·Thinking
OpenRouter OpenAIClient OpenRouter Tools·Image·Thinking·ModelListing
LM Studio (Chat Completions) OpenAIClient LMStudio Tools·Thinking·Image·ModelListing
LM Studio (Responses API) OpenAIResponsesClient OpenAIResponses Tools·Thinking·Image·ModelListing

4. Configuration model

~/.config/.../qodeassist/config/
  providers/*.toml   → ProviderInstance { name, client_api, url, api_key_ref }
  agents/*.toml      → AgentConfig { schema_version, providerInstance, model,
                                     endpoint, system_prompt, [body], match,
                                     enable_tools, enable_thinking, cache_prompt,
                                     extends, abstract, hidden, tags }
  agent_models.json  → per-agent model override (applied by AgentFactory)
  agent_roles/*.json → role text, pulled into system_prompt via {{ agent_role(id) }}
  pipelines rosters  → codeCompletion / chatAssistant / chatCompression / quickRefactor
                       consumed by AgentRouter.pickAgent(roster, {filePath, projectName})

Editor policy (NOT agent config):
  CodeCompletionSettings — triggers, modelOutputHandler, context extraction,
                           useOpenFilesContext

[body] is the request body (deep-mergeable through extends; Jinja-bearing string values render and splice as raw JSON, literals pass through, empty renders drop the key). include resolves only sandboxed partial roots. Profiles validate at load: a referenced partial must resolve and the assembled body must parse as JSON against a synthetic context — config errors surface in the agents settings page, never as a silent runtime drop. Full spec: agent-templates-design.md.


5. Runtime paths

AgentRouter.pickAgent(roster, {file, project}) is the only agent picker; every pipeline resolves its agent through a roster.

5a. Code completion

Qt Creator LSP (getCompletionsCycling)
  ▼
LLMClientInterface
  agent   = AgentRouter.pickAgent(roster.codeCompletion, {file, project})
  session = sessionManager.acquire(agent)                 — pooled
  systemPrompt layer "completion.context" = fileContext + open-files context
  session.send( blocks{ CompletionContent(prefix, suffix) }, tools=off )
     ▼ on Session::finished:
  history().lastAssistantText() → CodeHandler (output-mode) → LSP items
     → sessionManager.release(session)

The completion context travels as a CompletionContent block; the template exposes it as ctx.prefix / ctx.suffix. FIM vs instruct is purely agent config (the body), not feature code. Completion never touches the delta stream — it waits for finished and reads the last message.

5b. Chat

ChatRootView owns one persistent ConversationHistory for the whole chat view and injects it into every collaborator. History is the single source of truth.

ChatRootView (QML)  — owns ConversationHistory m_history
  ChatModel.setHistory(m_history)          — ChatModel is a PROJECTION:
        subscribes to messageAdded/Updated/cleared/reset, flattens blocks→rows,
        overlays file-edit status from ChangesManager, holds a per-message usage map
  ChatAgentController                       — agent list filtered to the
        chatAssistant roster; active agent persisted
  ▼ dispatchSend
ClientInterface
  session = sessionManager.createSession(activeAgent, m_history)
  sessionManager.toolContributors().contribute(client.tools())   — builtin+skills+MCP
  session.setContentLoader(ChatSerializer::loadContentFromStorage)
  systemPrompt layer "chat.context" = project info + skills + linked files
  session.send( blocks{ TextContent + StoredAttachmentContent + StoredImageContent } )
     ▼ consumes Session signals (NOT raw client signals):
  event(Usage)        → ChatModel.setMessageUsage + token-counter calibration
  finished(id)        → ChangesManager.applyPendingEditsForRequest + persist;
                        removeSession (the persistent history survives)
  failed(id, ErrorInfo) → surface error; removeSession

ChatCompressor    → acquire(chatCompression-roster agent) → seed history from the
                    chat's messages → "compression" layer → send → read summary from
                    the compression session's own history → release
InputTokenCounter → estimates over ConversationHistory (calibrated by Usage events)
ChatSerializer    → persists ConversationHistory via MessageSerializer (v0.3);
                    imports legacy v0.1/v0.2 files

ChatModel's QML role surface (roleType / content / attachments / images / isRedacted / token roles) is unchanged, so the QML delegates were untouched. The projection's incremental updates avoid model resets on the streaming hot path.

5c. Quick refactor

QodeAssistClient.requestQuickRefactor → QuickRefactorHandler
  agent   = AgentRouter.pickAgent(roster.quickRefactor, {file, project})
  session = sessionManager.acquire(agent)
  if useTools: sessionManager.toolContributors().contribute(client.tools())
  systemPrompt layer "refactor" = tagged selection + output + indentation rules
  session.send(blocks{instructions}, useTools)
     ▼ on Session::finished:
  history().lastAssistantText() → ResponseCleaner → RefactorResult → editor insert
     → sessionManager.release(session)
  on Session::failed(ErrorInfo) → RefactorResult{error}

6. Context layer

The context services sit behind IDE-agnostic ports; Qt Creator API use lives in the adapters.

EditorContext   — IDocumentReader (port)  ← DocumentReaderQtCreator (TextEditor API)
ProjectContext  — IProjectScanner (port)  ← ProjectScannerQtCreator (ProjectExplorer
                  + Core::DocumentModel + the IgnoreManager for .qodeassistignore)
TokenEstimator  — TokenUtils (pure)       ← InputTokenCounter (thin UI consumer)

ContextManager is now Qt-Creator-free: it delegates open-file enumeration and ignore filtering to an injected IProjectScanner (defaulting to the QtC adapter), and keeps only filesystem reads + formatting. ContextManager::shouldIgnore(path) replaced the previously exposed ignoreManager().


7. Cross-cutting

  • Request lifecycle — a session has at most one in-flight request; send() while in flight cancels the previous. Every request ends in exactly one of finished / failed / cancelled. Cancellation is not an error; no consumer string-matches a message to tell them apart.
  • Typed errorsErrorInfo { category ∈ {Config, Auth, Network, Provider, Validation, Tool}, message, providerDetail }. ResponseRouter categorizes wire errors (best-effort) at the boundary; Session::failed carries the typed value.
  • ToolsSessionManager owns a ToolContributorRegistry; built-in ToolKit, the skill tool, and MCP client tools register once and are contributed to chat and quick-refactor session clients uniformly.
  • Threading — the core runs on the GUI thread; concurrency is the Qt event loop plus async network I/O. Blocking work hides behind L3 ports.

8. Tests

test/ (GTest + Qt::Test) covers the two engines most affected by the rework:

  • JsonPromptTemplateTest — the [body] engine: jinja render + JSON splice, literal passthrough, empty-render key drop, nested literals, and load-time rejection of bodies that render invalid JSON.
  • ResponseRouterTest — a fake BaseClient replays a recorded provider stream; asserts the assistant message is stamped with the request id, history is built correctly (thinking + text + tool use/result), the typed event stream is emitted, and wire errors are categorized.
  • BundledAgentsTest — loads every bundled agent through the real loader (extends
    • partials resolved from the qrc) and renders each [body] against the synthetic validation context. This is the load-time validation guarantee run in CI: a broken bundled body, partial, or extends chain fails the test instead of surfacing as a silent runtime drop.

9. Remaining follow-ups (optional)

  1. Qt-Creator-free core build + CIAgentFactory / ContextRenderer still call Core::ICore::userResourcePath, so the core targets link QtCreator::Core. A ResourcePaths port + adapter would let the core build without Qt Creator and enable a CI job that fails on a layering-violating include. (The bundled-agent render check already runs in the QtC-linked test binary — see §8.)
  2. §9 target module layout — the core/ ide/ features/ hosts/ physical target split in target-architecture.md is not yet reflected in the directory layout.