QodeAssist/docs/architecture.md
2026-06-09 08:46:45 +02:00

QodeAssist Architecture

This document describes the runtime architecture of QodeAssist after the migration of all LLM runtime paths onto the agent / Session stack ("Stack B"). Every runtime LLM path — code completion, chat (send/stream + compression + token counting), and quick refactor — now goes through agents, Session, and the Providers::GenericProvider layer.

Legend: 🔴 = legacy Stack A (isolated, no runtime consumers left). Everything not marked 🔴 is on Stack B (active runtime).


1. Top level: ownership and dependency injection

The plugin (qodeassist.cpp) owns every component, creating each one with new and a parent. There are no plugin-wide singletons; each feature receives its dependencies explicitly.

QodeAssistPlugin
  Stack B infrastructure:
    • Providers::registerBuiltinProviders()   — registers 13 client_api types
    • ProviderInstanceFactory                 — 14 instances from TOML
    • ProviderSecretsStore
    • AgentFactory                            — agents from TOML
    • SessionManager(agentFactory)
    • m_engine (QQmlEngine)
        rootContext: "agentFactory", "sessionManager"   — DI for chat (QML)

  Wired into consumers:
    • QodeAssistClient ← LLMClientInterface(*sessionManager, *agentFactory)
                       ← setSessionManager / setAgentFactory   (for quick refactor)

Chat lives in QML (ChatRootView is a QML_ELEMENT), so AgentFactory and SessionManager are exposed as context properties on the engine's root context and resolved in ChatRootView via qmlEngine(this)->rootContext()->contextProperty(...).
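The ownership and injection pattern described in this section can be sketched in plain C++. This is a minimal illustration, not the plugin's code: the class bodies are hypothetical stand-ins, CompletionClient and sharesFactoryWithSessions are invented names, and std::unique_ptr is swapped in for the new + parent ownership the plugin actually uses.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the real classes; only the wiring matters here.
struct AgentFactory {};

class SessionManager {
public:
    explicit SessionManager(AgentFactory &factory) : m_factory(factory) {}
    AgentFactory &factory() const { return m_factory; }
private:
    AgentFactory &m_factory;
};

// A consumer receives its dependencies through the constructor instead of
// reaching for a global accessor.
class CompletionClient {
public:
    CompletionClient(SessionManager &sessions, AgentFactory &agents)
        : m_sessions(sessions), m_agents(agents) {}
    // True when both injected objects refer to the same factory instance.
    bool sharesFactoryWithSessions() const { return &m_sessions.factory() == &m_agents; }
private:
    SessionManager &m_sessions;
    AgentFactory &m_agents;
};

// The owner creates every service once. Members are destroyed in reverse
// declaration order, so consumers go away before the services they use.
class Plugin {
public:
    Plugin()
        : m_agents(std::make_unique<AgentFactory>())
        , m_sessions(std::make_unique<SessionManager>(*m_agents))
        , m_client(std::make_unique<CompletionClient>(*m_sessions, *m_agents)) {
        assert(m_client->sharesFactoryWithSessions());
    }
    CompletionClient &client() { return *m_client; }
private:
    std::unique_ptr<AgentFactory> m_agents;
    std::unique_ptr<SessionManager> m_sessions;
    std::unique_ptr<CompletionClient> m_client;
};
```

Because nothing is looked up globally, a test can construct a consumer with its own AgentFactory and SessionManager without touching the plugin.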


2. Stack B core (agent / Session)

AgentFactory.create(name)
  configByName(name) → AgentConfig (TOML)
     providerInstance, model, endpoint, role, messageFormat,
     sampling, enableTools, enableThinking, match{filePatterns,...}
  buildProviderForAgent:
     instance = ProviderInstanceFactory.instanceByName(cfg.providerInstance)
     provider = ProviderFactory::create(instance.clientApi)        ◄── keystone
     provider.setUrl(instance.url)
     provider.setApiKey(secrets.read(instance.apiKeyRef))
  ▼
Agent(config, provider)
  promptTemplate = JsonPromptTemplate::fromConfig(cfg.messageFormat)   (inja)
  ▼
SessionManager.createSession(agentName) → Session(agent)
  ├─ ConversationHistory     — messages as ContentBlocks
  ├─ SystemPromptBuilder     — layers: agent.role + caller layers
  └─ ResponseRouter(client)  — emits ResponseEvent

Session API:
  • send(blocks, toolsOverride)   — chat/refactor: append user msg + dispatch
  • sendCompletion(ContextData)   — completion: FIM prefix/suffix
  • client()                      — agent's LLMQore::BaseClient (direct streaming)
  • systemPrompt()->setLayer(...) — dynamic context layers
  • supportsImages()              — provider Image capability
  • history()                     — for seeding from ChatModel

Both Session::sendCompletion and dispatch compose the SystemPromptBuilder layers (agent.role plus any caller-provided layers) into the system prompt of the request.
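The layer composition can be pictured with a small sketch. LayeredPrompt is a hypothetical class, and the ordering rule (insertion order), the blank-line separator, and the skipping of empty layers are assumptions made for illustration; the real SystemPromptBuilder may differ on all three.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical layered prompt builder: named layers kept in insertion order.
class LayeredPrompt {
public:
    // Insert a new layer, or overwrite an existing one without moving it.
    void setLayer(const std::string &name, const std::string &text) {
        assert(!name.empty());
        for (auto &layer : m_layers) {
            if (layer.first == name) {
                layer.second = text;
                return;
            }
        }
        m_layers.emplace_back(name, text);
    }

    // Join all non-empty layers, separated by a blank line.
    std::string compose() const {
        std::string out;
        for (const auto &layer : m_layers) {
            if (layer.second.empty())
                continue;
            if (!out.empty())
                out += "\n\n";
            out += layer.second;
        }
        return out;
    }

private:
    std::vector<std::pair<std::string, std::string>> m_layers;
};
```

In this sketch a caller would set "agent.role" first and then a caller layer such as "chat.context"; overwriting a layer with an empty string drops it from the composed prompt.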


3. Provider layer — the keystone (implemented during migration)

The Stack B provider layer previously existed only as an abstract base + empty factory (registerType was never called, no concrete providers). This blocked every agent from obtaining a working provider. It is now implemented via a single configuration-driven GenericProvider.

ProviderFactory  (sources/providers, namespace functions)
   registerType(name, fn) / create(name, parent) / knownNames()
        ▲
        │ registerBuiltinProviders()   — client_api → provider table
        │
GenericProvider : Providers::Provider
   • owns an LLMQore::BaseClient (created by a ClientFactory)
   • prepareRequest — inherited from Provider base:
        delegates to PromptTemplate::buildFullRequest
   • client() / providerID() / capabilities() / getInstalledModels()

client_api → provider table

| client_api | LLMQore client | ProviderID | capabilities |
|---|---|---|---|
| Claude | ClaudeClient | Claude | Tools·Thinking·Image·ModelListing |
| Google AI | GoogleAIClient | GoogleAI | Tools·Thinking·Image·ModelListing |
| llama.cpp | LlamaCppClient | LlamaCpp | Tools·Thinking·Image·ModelListing |
| Mistral AI | MistralClient | MistralAI | Tools·Thinking·Image·ModelListing |
| Codestral | MistralClient | MistralAI | Tools·Image |
| Ollama (Native) | OllamaClient | Ollama | Tools·Thinking·Image·ModelListing |
| Ollama (OpenAI-compatible) | OpenAIClient | OpenAICompatible | Tools·Thinking·Image·ModelListing |
| OpenAI (Chat Completions) | OpenAIClient | OpenAI | Tools·Thinking·Image·ModelListing |
| OpenAI (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing |
| OpenAI Compatible | OpenAIClient | OpenAICompatible | Tools·Thinking·Image |
| OpenRouter | OpenAIClient | OpenRouter | Tools·Thinking·Image·ModelListing |
| LM Studio (Chat Completions) | OpenAIClient | LMStudio | Tools·Thinking·Image·ModelListing |
| LM Studio (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing |

The request shape comes from the agent's prompt template (the Jinja-style messageFormat, rendered with inja), so a single provider class covers every API by varying only the client factory and the metadata.
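The registerType / create / knownNames trio is a name-keyed factory registry. The sketch below shows that pattern with the standard library only. ProviderRegistry, the simplified Provider struct, and the ProviderCreator alias are hypothetical; the real factory is a set of namespace functions that also takes a parent and builds an LLMQore client, both of which are left out here. The two registered entries reuse rows from the table above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified provider: metadata only, no network client.
struct Provider {
    std::string providerId;
    std::vector<std::string> capabilities;
};

using ProviderCreator = std::function<std::unique_ptr<Provider>()>;

class ProviderRegistry {
public:
    // Associate a client_api name with a creator; a later call replaces it.
    void registerType(const std::string &clientApi, ProviderCreator fn) {
        assert(static_cast<bool>(fn));
        m_creators[clientApi] = std::move(fn);
    }

    // Build a provider for the name, or return nullptr if it is unknown.
    std::unique_ptr<Provider> create(const std::string &clientApi) const {
        const auto it = m_creators.find(clientApi);
        return it == m_creators.end() ? nullptr : it->second();
    }

    // Registered names in sorted order (std::map iteration order).
    std::vector<std::string> knownNames() const {
        std::vector<std::string> names;
        for (const auto &entry : m_creators)
            names.push_back(entry.first);
        return names;
    }

private:
    std::map<std::string, ProviderCreator> m_creators;
};

// Two rows of the client_api table expressed as registrations.
inline void registerBuiltinProviders(ProviderRegistry &registry) {
    registry.registerType("Claude", [] {
        return std::make_unique<Provider>(
            Provider{"Claude", {"Tools", "Thinking", "Image", "ModelListing"}});
    });
    registry.registerType("Codestral", [] {
        return std::make_unique<Provider>(Provider{"MistralAI", {"Tools", "Image"}});
    });
}
```

The failure mode described at the top of this section corresponds to a registry on which registerType is never called: create then returns nullptr for every name.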


4. Runtime paths (all on Stack B)

4a. Code completion

Qt Creator LSP (getCompletionsCycling)
  ▼
LLMClientInterface
  pickCompletionAgent: AgentRouter.pickAgent(roster.codeCompletion, {file, project})
  session = sessionManager.createSession(agent)
  ctx = Templates::ContextData{ prefix, suffix,
                                systemPrompt = fileContext + openFiles }
  session.sendCompletion(ctx)
     ▼ stream from session.client():
  requestCompleted → sendCompletionToClient → CodeHandler → LSP
  system prompt = agent.role; FIM template renders prefix/suffix
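To make "FIM template renders prefix/suffix" concrete, here is a sketch of prefix-suffix-middle rendering. It is an assumption-laden illustration: in QodeAssist the rendering is done by the agent's messageFormat template, not by a function like renderFim, and the sentinel strings are model-specific, so they are passed in rather than hard-coded. The ContextData struct only mirrors the three fields named in the diagram.

```cpp
#include <cassert>
#include <string>

// Mirrors the fields named in the diagram; not the real Templates::ContextData.
struct ContextData {
    std::string prefix;       // code before the cursor
    std::string suffix;       // code after the cursor
    std::string systemPrompt; // sent separately, not part of the FIM string
};

// Model-specific sentinel strings (hypothetical; supplied by the caller).
struct FimTokens {
    std::string prefix;
    std::string suffix;
    std::string middle;
};

// Prefix-suffix-middle layout: the model is asked to continue after the
// middle sentinel with the text that belongs between prefix and suffix.
inline std::string renderFim(const FimTokens &tokens, const ContextData &ctx) {
    assert(!tokens.prefix.empty() && !tokens.suffix.empty() && !tokens.middle.empty());
    return tokens.prefix + ctx.prefix + tokens.suffix + ctx.suffix + tokens.middle;
}
```

With placeholder sentinels "<PRE>", "<SUF>", "<MID>", a prefix of "int x = " and a suffix of ";" render to "<PRE>int x = <SUF>;<MID>".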

4b. Chat

ChatRootView (QML)
  resolve agentFactory()/sessionManager() = qmlEngine(this)->rootContext()
  ChatAgentController: agent list (configNames), active agent (persisted),
                       supportsThinking/Tools
  QML agent picker (TopBar.agentSelector) — replaced provider/model/template combos
  ▼ dispatchSend
ClientInterface
  session = sessionManager.createSession(currentChatAgent)
  registerQodeAssistTools(session.client().tools()) + registerSkillTool
  systemPrompt layer "chat.context" = project info + skills + linked files
  seedHistory(session.history() ← ChatModel: user/assistant/tool-call+result)
  session.send(userBlocks{text + images}, useTools)
     ▼ stream from session.client() → existing handlers → ChatModel:
  chunk→addMessage  thinking→addThinkingBlock
  tool→addToolExecutionStatus / updateToolResult
  finalized→usage   completed→messageReceivedCompletely → removeSession

ChatCompressor    → createSession(agent) → seed history → layer "compression" → send(prompt)
InputTokenCounter → estimate without provider (calibrated by server usage)
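One way to read "estimate without provider (calibrated by server usage)" is a local heuristic whose ratio is corrected whenever the server reports real usage. The sketch below assumes exactly that. The TokenEstimator class, the characters-per-token heuristic, the starting ratio of 4.0, and the 0.5 blend factor are all illustrative assumptions, not InputTokenCounter's actual algorithm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical estimator: tokens ~ characters / ratio, with the ratio
// nudged toward what the server actually reported.
class TokenEstimator {
public:
    // Estimate the token count locally, rounding up.
    int estimate(const std::string &text) const {
        assert(m_charsPerToken > 0.0);
        return static_cast<int>(std::ceil(text.size() / m_charsPerToken));
    }

    // Blend the observed chars-per-token ratio into the current one.
    // Invalid observations are ignored.
    void calibrate(std::size_t sentChars, int reportedTokens) {
        if (sentChars == 0 || reportedTokens <= 0)
            return;
        const double observed = static_cast<double>(sentChars) / reportedTokens;
        m_charsPerToken = (1.0 - kBlend) * m_charsPerToken + kBlend * observed;
    }

private:
    static constexpr double kBlend = 0.5; // weight given to the new observation
    double m_charsPerToken = 4.0;         // assumed starting ratio
};
```

The point of the design is that the estimate stays available offline and before any request is sent, while each completed request improves the next estimate.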

4c. Quick refactor

QodeAssistClient.requestQuickRefactor → QuickRefactorHandler (setSessionManager/setAgentFactory)
  pickRefactorAgent: AgentRouter.pickAgent(roster.quickRefactor, {file, project})
  session = createSession(agent)
  if useTools: registerQodeAssistTools(session.client().tools())
  systemPrompt layer "refactor" = buildSystemPrompt(tagged content +
                                  output requirements + indentation rules)
  session.send(blocks{instructions}, useTools)
     ▼ stream from session.client():
  requestCompleted → ResponseCleaner → RefactorResult → insert into editor

5. Configuration sources

~/.config/.../qodeassist/config/
  providers/*.toml   → ProviderInstance { name, client_api, url, api_key_ref }
  agents/*.toml      → AgentConfig { providerInstance, model, endpoint, role,
                                     messageFormat, sampling, match, enable* }
  pipelines rosters  → codeCompletion / chatAssistant / chatCompression / quickRefactor
                       consumed by AgentRouter.pickAgent(roster, {filePath, projectName})

Editor policy (NOT agent config):
  CodeCompletionSettings — triggers, modelOutputHandler, context extraction,
                           useOpenFilesContext
                           (sampling / prompt-generation fields removed)
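Roster-based routing, as in AgentRouter.pickAgent(roster, {filePath, projectName}), can be sketched as a first-match search over file patterns. The matching semantics are assumptions: this sketch supports only '*' and '?' globs, ignores the project name, treats an entry without patterns as a catch-all, and lets the first matching entry win. AgentRule and globMatch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal glob: '*' matches any run of characters, '?' matches exactly one.
inline bool globMatch(const std::string &pattern, const std::string &text) {
    const std::size_t none = std::string::npos;
    std::size_t p = 0, t = 0, star = none, mark = 0;
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++; // remember the star and try matching zero characters
            mark = t;
        } else if (p < pattern.size() && (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (star != none) {
            p = star + 1; // backtrack: let the last star absorb one more char
            t = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*')
        ++p;
    return p == pattern.size();
}

// Hypothetical roster entry: an agent name plus its match.filePatterns.
struct AgentRule {
    std::string agentName;
    std::vector<std::string> filePatterns;
};

// Return the first entry that matches the file path, or nullptr if none does.
inline const AgentRule *pickAgent(const std::vector<AgentRule> &roster,
                                  const std::string &filePath) {
    for (const auto &rule : roster) {
        if (rule.filePatterns.empty())
            return &rule; // no patterns: catch-all entry
        for (const auto &pattern : rule.filePatterns) {
            if (globMatch(pattern, filePath))
                return &rule;
        }
    }
    return nullptr;
}
```

Under these assumptions, a roster of {"cpp-fim", {"*.cpp", "*.h"}} followed by {"default", {}} routes src/main.cpp to cpp-fim and README.md to default.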

6. Remaining Stack A (runtime does NOT depend on it)

🔴 Settings UI: provider/model/template selection pages
                (ccProvider / caProvider / qrProvider) + ConfigurationManager
                → use ProvidersManager
🔴 root providers/*  (PluginLLMCore::Provider, 14 classes)
                → read only chat/quick-refactor sampling settings
🔴 pluginllmcore/*   (ProvidersManager, PromptTemplateManager, ResponseCleaner,
                      PromptProviderChat/Fim, ContextData)
🔴 qodeassist.cpp:144-146  registerProviders() / registerTemplates()  (Stack A registration)
🔴 qodeassist.cpp:185      MCP skill-tool loop on Stack A providers  (effectively dead)
🔴 ChatAssistantSettings / QuickRefactorSettings — sampling fields (read only by root providers)

ResponseCleaner (pluginllmcore) is still used by QuickRefactorHandler as a text
utility — orthogonal to the provider stack.

Removed during the migration

  • Rules subsystem (RulesLoader + chat "active rules" UI + QuickRefactor rules block)
  • ChatConfigurationController, AgentRoleController (chat config/role presets)
  • m_promptProvider (PromptProviderFim) in the plugin
  • RequestType::CodeCompletion branch in all 14 root providers
  • Sampling / prompt-generation fields in CodeCompletionSettings
  • ChatView no longer links PluginLLMCore

7. Dependency summary

                 ┌──────────────── Stack B (active runtime) ────────────────┐
LLMClientInterface ─┐                                                        │
ClientInterface ────┼─► SessionManager ─► Session ─► Agent ─► GenericProvider ─► LLMQore::*Client
QuickRefactorHandler─┘        │              │         │            │
ChatCompressor ──────────────┘              │      AgentFactory  ProviderFactory
                                  AgentRouter (rosters)  │            │
                                                ProviderInstanceFactory (TOML)
                 └──────────────────────────────────────────────────────────┘

   Stack A (settings UI + ConfigurationManager + MCP loop) — isolated,
   no runtime consumers remain.

8. Open follow-ups (optional)

  1. Chat picker filtering — show only chatAssistant-roster agents (currently lists all non-hidden agents; the auto-default may land on a FIM agent). Requires wiring ChatView to PipelinesConfig (watch for OBJECT-library symbol duplication).
  2. MCP tools on agent clients — MCP skill tools are registered only on Stack A providers; to expose MCP tools to chat agents, register them on the session client alongside registerQodeAssistTools.
  3. Physical Stack A teardown — remove the provider/model/template settings UI, ConfigurationManager, root providers/*, pluginllmcore/*, and the registration + MCP loop in qodeassist.cpp. Runtime no longer depends on them.
  4. Per-message session cost — chat/refactor create a fresh agent/provider/client (and read secrets) per request; a session pool could reduce latency.
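For follow-up 4, a session pool could look roughly like the sketch below. It is a design illustration only: SessionPool and the reduced Session struct are hypothetical, and a real pool would also have to reset conversation history and system prompt layers on reuse and handle changed secrets or configuration, none of which is modelled here.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical session: just enough state to show reuse.
struct Session {
    std::string agentName;
    int id;
};

class SessionPool {
public:
    // Reuse an idle session for the agent if one exists, otherwise build one.
    std::unique_ptr<Session> acquire(const std::string &agentName) {
        auto &idle = m_idle[agentName];
        if (!idle.empty()) {
            auto session = std::move(idle.back());
            idle.pop_back();
            return session;
        }
        return std::make_unique<Session>(Session{agentName, m_created++});
    }

    // Hand a session back so a later request for the same agent can reuse it.
    void release(std::unique_ptr<Session> session) {
        assert(session != nullptr);
        const std::string key = session->agentName;
        m_idle[key].push_back(std::move(session));
    }

    // Number of sessions constructed so far (reused ones are not counted).
    int created() const { return m_created; }

private:
    std::map<std::string, std::vector<std::unique_ptr<Session>>> m_idle;
    int m_created = 0;
};
```

The saving comes from the acquire path that skips construction: the expensive steps named in the follow-up (agent, provider, client creation and the secrets read) would happen only when no idle session exists for that agent.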