mirror of
https://github.com/Palm1r/QodeAssist.git
synced 2026-06-14 02:09:22 -04:00
refactor: Move to agent-session architecture
This commit is contained in:
258
docs/architecture.md
Normal file
258
docs/architecture.md
Normal file
@@ -0,0 +1,258 @@
|
||||
# QodeAssist Architecture
|
||||
|
||||
This document describes the runtime architecture of QodeAssist after the
|
||||
migration of all LLM runtime paths onto the agent / `Session` stack
|
||||
("Stack B"). Every runtime LLM path — code completion, chat (send/stream +
|
||||
compression + token counting), and quick refactor — now goes through agents,
|
||||
`Session`, and the `Providers::GenericProvider` layer.
|
||||
|
||||
> Legend: ✅ = on Stack B (active runtime), 🔴 = legacy Stack A (isolated, no
|
||||
> runtime consumers left).
|
||||
|
||||
---
|
||||
|
||||
## 1. Top level: ownership and dependency injection
|
||||
|
||||
The plugin (`qodeassist.cpp`) owns everything via `new` + parent (no plugin-wide
|
||||
singletons; each feature receives its dependencies explicitly).
|
||||
|
||||
```
|
||||
QodeAssistPlugin
|
||||
Stack B infrastructure:
|
||||
• Providers::registerBuiltinProviders() — registers 13 client_api types
|
||||
• ProviderInstanceFactory — 14 instances from TOML
|
||||
• ProviderSecretsStore
|
||||
• AgentFactory — agents from TOML
|
||||
• SessionManager(agentFactory)
|
||||
• m_engine (QQmlEngine)
|
||||
rootContext: "agentFactory", "sessionManager" — DI for chat (QML)
|
||||
|
||||
Wired into consumers:
|
||||
• QodeAssistClient ← LLMClientInterface(*sessionManager, *agentFactory)
|
||||
← setSessionManager / setAgentFactory (for quick refactor)
|
||||
```
|
||||
|
||||
Chat lives in QML (`ChatRootView` is a `QML_ELEMENT`), so `AgentFactory` and
|
||||
`SessionManager` are exposed as **context properties on the engine's root
|
||||
context** and resolved in `ChatRootView` via
|
||||
`qmlEngine(this)->rootContext()->contextProperty(...)`.
|
||||
|
||||
---
|
||||
|
||||
## 2. Stack B core (agent / Session)
|
||||
|
||||
```
|
||||
AgentFactory.create(name)
|
||||
configByName(name) → AgentConfig (TOML)
|
||||
providerInstance, model, endpoint, role, messageFormat,
|
||||
sampling, enableTools, enableThinking, match{filePatterns,...}
|
||||
buildProviderForAgent:
|
||||
instance = ProviderInstanceFactory.instanceByName(cfg.providerInstance)
|
||||
provider = ProviderFactory::create(instance.clientApi) ◄── keystone
|
||||
provider.setUrl(instance.url)
|
||||
provider.setApiKey(secrets.read(instance.apiKeyRef))
|
||||
▼
|
||||
Agent(config, provider)
|
||||
promptTemplate = JsonPromptTemplate::fromConfig(cfg.messageFormat) (inja)
|
||||
▼
|
||||
SessionManager.createSession(agentName) → Session(agent)
|
||||
├─ ConversationHistory — messages as ContentBlocks
|
||||
├─ SystemPromptBuilder — layers: agent.role + caller layers
|
||||
└─ ResponseRouter(client) — emits ResponseEvent
|
||||
|
||||
Session API:
|
||||
• send(blocks, toolsOverride) — chat/refactor: append user msg + dispatch
|
||||
• sendCompletion(ContextData) — completion: FIM prefix/suffix
|
||||
• client() — agent's LLMQore::BaseClient (direct streaming)
|
||||
• systemPrompt()->setLayer(...) — dynamic context layers
|
||||
• supportsImages() — provider Image capability
|
||||
• history() — for seeding from ChatModel
|
||||
```
|
||||
|
||||
`Session::sendCompletion` and `dispatch` compose `SystemPromptBuilder` layers
|
||||
(`agent.role` + caller-provided) into the request system prompt.
|
||||
|
||||
---
|
||||
|
||||
## 3. Provider layer — the keystone (implemented during migration)
|
||||
|
||||
The Stack B provider layer previously existed only as an abstract base +
|
||||
empty factory (`registerType` was never called, no concrete providers). This
|
||||
blocked every agent from obtaining a working provider. It is now implemented
|
||||
via a single configuration-driven `GenericProvider`.
|
||||
|
||||
```
|
||||
ProviderFactory (sources/providers, namespace functions)
|
||||
registerType(name, fn) / create(name, parent) / knownNames()
|
||||
▲
|
||||
│ registerBuiltinProviders() — client_api → provider table
|
||||
│
|
||||
GenericProvider : Providers::Provider
|
||||
• owns an LLMQore::BaseClient (created by a ClientFactory)
|
||||
• prepareRequest — inherited from Provider base:
|
||||
delegates to PromptTemplate::buildFullRequest
|
||||
• client() / providerID() / capabilities() / getInstalledModels()
|
||||
```
|
||||
|
||||
### client_api → provider table
|
||||
|
||||
| client_api | LLMQore client | ProviderID | capabilities |
|
||||
|--------------------------------|-------------------------|------------------|-------------------------|
|
||||
| Claude | ClaudeClient | Claude | Tools·Thinking·Image·ModelListing |
|
||||
| Google AI | GoogleAIClient | GoogleAI | Tools·Thinking·Image·ModelListing |
|
||||
| llama.cpp | LlamaCppClient | LlamaCpp | Tools·Thinking·Image·ModelListing |
|
||||
| Mistral AI | MistralClient | MistralAI | Tools·Thinking·Image·ModelListing |
|
||||
| Codestral | MistralClient | MistralAI | Tools·Image |
|
||||
| Ollama (Native) | OllamaClient | Ollama | Tools·Thinking·Image·ModelListing |
|
||||
| Ollama (OpenAI-compatible) | OpenAIClient | OpenAICompatible | Tools·Thinking·Image·ModelListing |
|
||||
| OpenAI (Chat Completions) | OpenAIClient | OpenAI | Tools·Thinking·Image·ModelListing |
|
||||
| OpenAI (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing |
|
||||
| OpenAI Compatible | OpenAIClient | OpenAICompatible | Tools·Image·Thinking |
|
||||
| OpenRouter | OpenAIClient | OpenRouter | Tools·Image·Thinking·ModelListing |
|
||||
| LM Studio (Chat Completions) | OpenAIClient | LMStudio | Tools·Thinking·Image·ModelListing |
|
||||
| LM Studio (Responses API) | OpenAIResponsesClient | OpenAIResponses | Tools·Thinking·Image·ModelListing |
|
||||
|
||||
Request *shape* comes from the agent's prompt template (jinja `messageFormat`),
|
||||
so a single provider class covers every API by varying only the client factory
|
||||
and metadata.
|
||||
|
||||
---
|
||||
|
||||
## 4. Runtime paths (all on Stack B)
|
||||
|
||||
### 4a. Code completion ✅
|
||||
|
||||
```
|
||||
Qt Creator LSP (getCompletionsCycling)
|
||||
▼
|
||||
LLMClientInterface
|
||||
pickCompletionAgent: AgentRouter.pickAgent(roster.codeCompletion, {file, project})
|
||||
session = sessionManager.createSession(agent)
|
||||
ctx = Templates::ContextData{ prefix, suffix,
|
||||
systemPrompt = fileContext + openFiles }
|
||||
session.sendCompletion(ctx)
|
||||
▼ stream from session.client():
|
||||
requestCompleted → sendCompletionToClient → CodeHandler → LSP
|
||||
system prompt = agent.role; FIM template renders prefix/suffix
|
||||
```
|
||||
|
||||
### 4b. Chat ✅
|
||||
|
||||
```
|
||||
ChatRootView (QML)
|
||||
resolve agentFactory()/sessionManager() = qmlEngine(this)->rootContext()
|
||||
ChatAgentController: agent list (configNames), active agent (persisted),
|
||||
supportsThinking/Tools
|
||||
QML agent picker (TopBar.agentSelector) — replaced provider/model/template combos
|
||||
▼ dispatchSend
|
||||
ClientInterface
|
||||
session = sessionManager.createSession(currentChatAgent)
|
||||
registerQodeAssistTools(session.client().tools()) + registerSkillTool
|
||||
systemPrompt layer "chat.context" = project info + skills + linked files
|
||||
seedHistory(session.history() ← ChatModel: user/assistant/tool-call+result)
|
||||
session.send(userBlocks{text + images}, useTools)
|
||||
▼ stream from session.client() → existing handlers → ChatModel:
|
||||
chunk→addMessage thinking→addThinkingBlock
|
||||
tool→addToolExecutionStatus / updateToolResult
|
||||
finalized→usage completed→messageReceivedCompletely → removeSession
|
||||
|
||||
ChatCompressor → createSession(agent) → seed history → layer "compression" → send(prompt)
|
||||
InputTokenCounter → estimate without provider (calibrated by server usage)
|
||||
```
|
||||
|
||||
### 4c. Quick refactor ✅
|
||||
|
||||
```
|
||||
QodeAssistClient.requestQuickRefactor → QuickRefactorHandler (setSessionManager/setAgentFactory)
|
||||
pickRefactorAgent: AgentRouter.pickAgent(roster.quickRefactor, {file, project})
|
||||
session = createSession(agent)
|
||||
if useTools: registerQodeAssistTools(session.client().tools())
|
||||
systemPrompt layer "refactor" = buildSystemPrompt(tagged content +
|
||||
output requirements + indentation rules)
|
||||
session.send(blocks{instructions}, useTools)
|
||||
▼ stream from session.client():
|
||||
requestCompleted → ResponseCleaner → RefactorResult → insert into editor
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Configuration sources
|
||||
|
||||
```
|
||||
~/.config/.../qodeassist/config/
|
||||
providers/*.toml → ProviderInstance { name, client_api, url, api_key_ref }
|
||||
agents/*.toml → AgentConfig { providerInstance, model, endpoint, role,
|
||||
messageFormat, sampling, match, enable* }
|
||||
pipelines rosters → codeCompletion / chatAssistant / chatCompression / quickRefactor
|
||||
consumed by AgentRouter.pickAgent(roster, {filePath, projectName})
|
||||
|
||||
Editor policy (NOT agent config):
|
||||
CodeCompletionSettings — triggers, modelOutputHandler, context extraction,
|
||||
useOpenFilesContext
|
||||
(sampling / prompt-generation fields removed)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Remaining Stack A (runtime does NOT depend on it)
|
||||
|
||||
```
|
||||
🔴 Settings UI: provider/model/template selection pages
|
||||
(ccProvider / caProvider / qrProvider) + ConfigurationManager
|
||||
→ use ProvidersManager
|
||||
🔴 root providers/* (PluginLLMCore::Provider, 14 classes)
|
||||
→ read only chat/quick-refactor sampling settings
|
||||
🔴 pluginllmcore/* (ProvidersManager, PromptTemplateManager, ResponseCleaner,
|
||||
PromptProviderChat/Fim, ContextData)
|
||||
🔴 qodeassist.cpp:144-146 registerProviders() / registerTemplates() (Stack A registration)
|
||||
🔴 qodeassist.cpp:185 MCP skill-tool loop on Stack A providers (effectively dead)
|
||||
🔴 ChatAssistantSettings / QuickRefactorSettings — sampling fields (read only by root providers)
|
||||
|
||||
ResponseCleaner (pluginllmcore) is still used by QuickRefactorHandler as a text
|
||||
utility — orthogonal to the provider stack.
|
||||
```
|
||||
|
||||
### Removed during the migration
|
||||
|
||||
- Rules subsystem (`RulesLoader` + chat "active rules" UI + QuickRefactor rules block)
|
||||
- `ChatConfigurationController`, `AgentRoleController` (chat config/role presets)
|
||||
- `m_promptProvider` (`PromptProviderFim`) in the plugin
|
||||
- `RequestType::CodeCompletion` branch in all 14 root providers
|
||||
- Sampling / prompt-generation fields in `CodeCompletionSettings`
|
||||
- ChatView no longer links `PluginLLMCore`
|
||||
|
||||
---
|
||||
|
||||
## 7. Dependency summary
|
||||
|
||||
```
|
||||
┌──────────────── Stack B (active runtime) ────────────────┐
|
||||
LLMClientInterface ─┐ │
|
||||
ClientInterface ────┼─► SessionManager ─► Session ─► Agent ─► GenericProvider ─► LLMQore::*Client
|
||||
QuickRefactorHandler─┘ │ │ │ │
|
||||
ChatCompressor ──────────────┘ │ AgentFactory ProviderFactory
|
||||
AgentRouter (rosters) │ │
|
||||
ProviderInstanceFactory (TOML)
|
||||
└──────────────────────────────────────────────────────────┘
|
||||
|
||||
Stack A (settings UI + ConfigurationManager + MCP loop) — isolated,
|
||||
no runtime consumers remain.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Open follow-ups (optional)
|
||||
|
||||
1. **Chat picker filtering** — show only `chatAssistant`-roster agents (currently
|
||||
lists all non-hidden agents; the auto-default may land on a FIM agent).
|
||||
Requires wiring ChatView to `PipelinesConfig` (watch for OBJECT-library
|
||||
symbol duplication).
|
||||
2. **MCP tools on agent clients** — MCP skill tools are registered only on Stack A
|
||||
providers; to expose MCP tools to chat agents, register them on the session
|
||||
client alongside `registerQodeAssistTools`.
|
||||
3. **Physical Stack A teardown** — remove the provider/model/template settings UI,
|
||||
`ConfigurationManager`, root `providers/*`, `pluginllmcore/*`, and the
|
||||
registration + MCP loop in `qodeassist.cpp`. Runtime no longer depends on them.
|
||||
4. **Per-message session cost** — chat/refactor create a fresh agent/provider/client
|
||||
(and read secrets) per request; a session pool could reduce latency.
|
||||
```
|
||||
Reference in New Issue
Block a user