mirror of
https://github.com/Palm1r/QodeAssist.git
synced 2026-06-14 02:09:22 -04:00
387 lines
15 KiB
Markdown
387 lines
15 KiB
Markdown
# Agent Templates — Design Note (body model, include, extends)
|
|
|
|
Status: agreed design / ready to implement. Dev-facing (not end-user docs).
|
|
Scope: how agent TOML profiles describe the request and share structure.
|
|
|
|
## Problem this replaces
|
|
|
|
The shipped model has each agent embed a `[template].message_format` jinja string
|
|
that hand-builds the **whole** request body as text, plus `[template.sampling]` and
|
|
`[template.thinking.*]` blocks merged in by `applySampling`. Pains:
|
|
|
|
- Massive copy-paste: 9 OpenAI-compatible agents share a byte-identical ~50-line
|
|
`message_format`; 4 Claude agents share another; `role` + README `context` are
|
|
identical across 18 files.
|
|
- `[template.sampling]` / `[template.thinking.overrides]` /
|
|
`[template.thinking.request_block.*]` describe **merge machinery**, not the request
|
|
body — they don't look like the actual API call. The `overrides` vs `request_block`
|
|
split is meaningless (both are deep-merged into the request identically).
|
|
- Manual JSON-by-string-concatenation: trailing-comma bookkeeping
|
|
(`{% if not loop.is_last %},{% endif %}`) everywhere; a missing comma fails
|
|
silently at runtime (`renderBody` returns nullopt, only a `qWarning`).
|
|
- `include` is hard-disabled, so there is no way to share a sub-fragment.
|
|
|
|
## Agreed model
|
|
|
|
### 1. `[body]` is a deep-mergeable table = the request body, 1:1 with the API
|
|
|
|
Replace the `message_format` string and the `sampling`/`thinking` blocks with a
|
|
single `[body]` TOML table whose keys are the **literal request-body fields**.
|
|
Because it is a table (not a string), `extends` / `deepMerge` can override it
|
|
field-by-field — variants become a 2-line delta instead of a copied body.
|
|
|
|
Field-value rules at build time (per key in `[body]`, applied recursively):
|
|
- **string containing jinja** (`{{` or `{%`) → render through inja, splice the
|
|
output as **raw JSON** (array / object / string). Empty render → key omitted.
|
|
- **string without jinja** (e.g. `"high"`) → literal JSON string, as-is.
|
|
- **number / bool / inline-table** → as-is.
|
|
|
|
So `messages` / `contents` and `system` / `system_instruction` are just **string
|
|
fields holding jinja**; everything else (`max_tokens`, `temperature`, `stream`,
|
|
`thinking`, `output_config`, `generationConfig`, …) is a literal value that reads
|
|
exactly like the curl body.
|
|
|
|
No runtime toggles: thinking / tools / streaming are **fixed per agent**. A thinking
|
|
agent literally carries the `thinking` fields; a non-thinking variant is a separate
|
|
file. There is no `{% if thinking %}` in the body. `system` uses
|
|
`{% if existsIn(ctx, "system_prompt") %}` only because that is about *presence of
|
|
data*, not a mode toggle. `enable_thinking` / `enable_tools` are **capability hints**
|
|
(used for UI badges and to decide tool-definition injection) — the body is the source
|
|
of truth for what is actually sent, so a thinking agent's body must carry the thinking
|
|
fields regardless of the flag.
|
|
|
|
Outside the body:
|
|
- `model` — the TOML `model` is the **default**; a per-agent override chosen in
|
|
QodeAssist settings wins. Overrides are stored in `agent_models.json`
|
|
(agentName → model) and applied by `AgentFactory` when it builds the agent
|
|
(`AgentFactory::effectiveModel`/`setModelOverride`); `Session` still seeds the
|
|
payload `model` from the resolved `cfg.model`. URL-model providers (Google) put a
|
|
`${MODEL}` placeholder in `endpoint`; `Session` substitutes the resolved model into
|
|
the endpoint before sending (same substitution style as `${PROJECT_DIR}`/`${HOME}`),
|
|
so the override drives the URL too.
|
|
- `tools` — injected by the **provider** when `enable_tools` is set (tool
|
|
definitions are dynamic, from `ToolsManager`; they can't be authored in TOML).
|
|
- `stream` — always on. Literal `"stream": true` in the body for OpenAI / Claude /
|
|
Mistral / Responses / Ollama; encoded in the `endpoint` URL for Google.
|
|
|
|
### 2. `include` re-enabled as whitelisted partials
|
|
|
|
The message-array rendering (the complex, comma-heavy part) lives in
|
|
`sources/agents/partials/*.jinja`, shared via `{% include %}`. The throwing include
|
|
callback is replaced by a sandboxed resolver that:
|
|
- rejects names containing `..`, a leading `/`, or a scheme/drive;
|
|
- resolves only against known roots: bundled `:/agents/partials/` then the user
|
|
`partials/` dir;
|
|
- parses/caches the partial in the same `inja::Environment`.
|
|
|
|
A missing/typo'd partial is a **load-time** error.
|
|
|
|
### 3. `extends` shares config down a hierarchy
|
|
|
|
`extends` already exists (`resolveExtends` + `deepMerge` + `abstract`/`hidden`); it
|
|
keeps doing what it does, now over the structured `[body]` too. A flat 2 levels — each
|
|
provider base sets `system_prompt = """{{ agent_role("developer") }}"""` (the role text
|
|
comes from the role JSON via the `agent_role` callback; see below). No shared root base:
|
|
|
|
```
|
|
openai_base (abstract) → system_prompt + provider/endpoint/enable_tools + [body]
|
|
├─ openai_chat → name
|
|
├─ mistral_chat → name, provider, endpoint
|
|
└─ mistral_reasoning → + enable_thinking
|
|
anthropic_base (abstract) → system_prompt + provider/endpoint + [body]
|
|
├─ claude_chat → name
|
|
└─ claude_sonnet → + [body] thinking / output_config
|
|
google_base (abstract) → system_prompt + provider + [body]
|
|
└─ gemini_chat → endpoint (${MODEL}) + [body.generationConfig] thinkingConfig
|
|
```
|
|
|
|
Notes:
|
|
- `[body]` is shared whole when identical (the 8 OpenAI-compatible providers); a
|
|
variant overrides only the differing field — no duplicated body.
|
|
- Arrays (`tags`) are **replaced** on override, not appended (`deepMerge` recurses
|
|
objects only). A child that wants base tags + extras restates the full list.
|
|
- Division of labour: **include** shares the message-rendering fragment across
|
|
unrelated families; **extends** shares config (system_prompt / endpoint / body)
|
|
down one inheritance chain.
|
|
- With `model` gone, per-model files collapse: agents that previously differed only
|
|
by `model` become one agent (the client picks the model). A separate file is only
|
|
needed when the body genuinely differs (effort, no-thinking, …).
|
|
|
|
### System prompt — a composable template with building blocks
|
|
|
|
The old `role` (static text) and `context` (jinja) layers collapse into one
|
|
`agent.system` layer in `Session`, rendered through `ContextRenderer`. The agent's
|
|
`system_prompt` field IS that template, and the user edits it (in the profile) to
|
|
compose the prompt from building-block callbacks:
|
|
|
|
- `{{ agent_role("<id>") }}` — insert a role's text (Developer/Reviewer/Researcher…).
|
|
Implemented as a `ContextRenderer` callback (`registerAgentRole`) that reads
|
|
`userResourcePath("qodeassist/agent_roles/<id>.json")["systemPrompt"]`. Returns "" if
|
|
missing. Lives in `sources/agents` (no dependency on `settings/`), so it works in the
|
|
plugin and bench. The role text lives once in the role JSON (managed by the settings
|
|
Roles UI); the chat bases just carry `system_prompt = """{{ agent_role("developer") }}"""`.
|
|
- `{{ read_file("...") }}` / `file_exists` / `${PROJECT_DIR}` / `${HOME}` — existing
|
|
`ContextRenderer` helpers, composable in the same template.
|
|
|
|
So a profile can do `system_prompt = """{{ agent_role("developer") }}\n\n{{ read_file("…") }}"""`.
|
|
`qodeassist.cpp` calls `AgentRolesManager::ensureDefaultRoles()` at startup so the default
|
|
role JSONs exist before agents load. There is NO per-agent settings override — the edit
|
|
point is the profile's `system_prompt`. Code-completion/FIM agents set no `system_prompt`.
|
|
|
|
## Worked examples
|
|
|
|
OpenAI base:
|
|
```toml
|
|
abstract = true
|
|
system_prompt = """{{ agent_role("developer") }}"""
|
|
provider_instance = "OpenAI (Chat Completions)"
|
|
endpoint = "/chat/completions"
|
|
enable_tools = true
|
|
|
|
[body]
|
|
max_tokens = 8192
|
|
temperature = 0.7
|
|
stream = true
|
|
messages = """
|
|
[ {% include "partials/openai_messages.jinja" %} ]
|
|
"""
|
|
```
|
|
|
|
Mistral reasoning child (delta only):
|
|
```toml
|
|
extends = "OpenAI Base Chat"
|
|
name = "Mistral Reasoning Chat"
|
|
provider_instance = "Mistral AI"
|
|
endpoint = "/v1/chat/completions"
|
|
enable_thinking = true
|
|
|
|
[body]
|
|
reasoning_effort = "medium"
|
|
```
|
|
|
|
Claude base (literally the curl body):
|
|
```toml
|
|
abstract = true
|
|
system_prompt = """{{ agent_role("developer") }}"""
|
|
provider_instance = "Claude"
|
|
endpoint = "/v1/messages"
|
|
enable_thinking = true
|
|
enable_tools = true
|
|
|
|
[body]
|
|
max_tokens = 16000
|
|
temperature = 1
|
|
stream = true
|
|
thinking = { type = "adaptive", display = "summarized" }
|
|
output_config = { effort = "high" }
|
|
system = """{% if existsIn(ctx, "system_prompt") %}{{ tojson(ctx.system_prompt) }}{% endif %}"""
|
|
messages = """
|
|
[ {% include "partials/anthropic_messages.jinja" %} ]
|
|
"""
|
|
```
|
|
|
|
Sonnet child (delta only):
|
|
```toml
|
|
extends = "Anthropic Base Chat"
|
|
name = "Claude Sonnet"
|
|
|
|
[body.output_config]
|
|
effort = "medium"
|
|
```
|
|
|
|
Google base (`${MODEL}` in endpoint; streaming in the URL):
|
|
```toml
|
|
abstract = true
|
|
system_prompt = """{{ agent_role("developer") }}"""
|
|
provider_instance = "Google AI"
|
|
endpoint = "/models/${MODEL}:streamGenerateContent?alt=sse"
|
|
enable_thinking = true
|
|
enable_tools = true
|
|
|
|
[body]
|
|
system_instruction = """{% if existsIn(ctx, "system_prompt") %}{ "parts": [ { "text": {{ tojson(ctx.system_prompt) }} } ] }{% endif %}"""
|
|
contents = """
|
|
[ {% include "partials/google_contents.jinja" %} ]
|
|
"""
|
|
|
|
[body.generationConfig]
|
|
maxOutputTokens = 16000
|
|
temperature = 1
|
|
thinkingConfig = { includeThoughts = true, thinkingBudget = 8192 }
|
|
```
|
|
|
|
### Partials
|
|
|
|
`partials/openai_messages.jinja` dispatches per message:
|
|
```jinja
|
|
{% if existsIn(ctx, "system_prompt") %}
|
|
{ "role": "system", "content": {{ tojson(ctx.system_prompt) }} },
|
|
{% endif %}
|
|
{% for msg in ctx.history %}
|
|
{% if msg.role == "assistant" %}{% include "partials/openai_assistant.jinja" %}
|
|
{% else if length(filter_by_type(msg.content_blocks, "tool_result")) > 0 %}{% include "partials/openai_tool_results.jinja" %}
|
|
{% else %}{% include "partials/openai_user.jinja" %}
|
|
{% endif %}
|
|
{% endfor %}
|
|
```
|
|
|
|
`partials/openai_assistant.jinja`:
|
|
```jinja
|
|
{% set tcalls = filter_by_type(msg.content_blocks, "tool_use") %}
|
|
{
|
|
"role": "assistant",
|
|
"content": {{ tojson(msg.content) }}
|
|
{% if length(tcalls) > 0 %}
|
|
, "tool_calls": [
|
|
{% for b in tcalls %}
|
|
{ "id": {{ tojson(b.id) }}, "type": "function",
|
|
"function": { "name": {{ tojson(b.name) }}, "arguments": {{ tojson(tojson(b.input)) }} } },
|
|
{% endfor %}
|
|
]
|
|
{% endif %}
|
|
},
|
|
```
|
|
|
|
`partials/openai_tool_results.jinja`:
|
|
```jinja
|
|
{% for b in filter_by_type(msg.content_blocks, "tool_result") %}
|
|
{ "role": "tool", "tool_call_id": {{ tojson(b.tool_use_id) }}, "content": {{ tojson(b.content) }} },
|
|
{% endfor %}
|
|
```
|
|
|
|
`partials/openai_user.jinja`:
|
|
```jinja
|
|
{% if existsIn(msg, "images") %}
|
|
{ "role": "user", "content": {% include "partials/openai_image_content.jinja" %} },
|
|
{% else %}
|
|
{ "role": "user", "content": {{ tojson(msg.content) }} },
|
|
{% endif %}
|
|
```
|
|
|
|
`partials/openai_image_content.jinja`:
|
|
```jinja
|
|
[
|
|
{ "type": "text", "text": {{ tojson(msg.content) }} }
|
|
{% for img in msg.images %}
|
|
,
|
|
{% if img.is_url %}
|
|
{ "type": "image_url", "image_url": { "url": {{ tojson(img.data) }} } }
|
|
{% else %}
|
|
{ "type": "image_url", "image_url": { "url": "data:{{ img.media_type }};base64,{{ img.data }}" } }
|
|
{% endif %}
|
|
{% endfor %}
|
|
]
|
|
```
|
|
|
|
`partials/anthropic_messages.jinja`:
|
|
```jinja
|
|
{% for msg in ctx.history %}
|
|
{
|
|
"role": {{ tojson(msg.role) }},
|
|
"content": [
|
|
{% for b in msg.content_blocks %}
|
|
{% if b.type == "image" %}{% include "partials/anthropic_image.jinja" %}
|
|
{% else %}{{ tojson(b) }},
|
|
{% endif %}
|
|
{% endfor %}
|
|
]
|
|
},
|
|
{% endfor %}
|
|
```
|
|
|
|
`partials/anthropic_image.jinja`:
|
|
```jinja
|
|
{
|
|
"type": "image",
|
|
"source":
|
|
{% if b.is_url %}
|
|
{ "type": "url", "url": {{ tojson(b.data) }} }
|
|
{% else %}
|
|
{ "type": "base64", "media_type": {{ tojson(b.media_type) }}, "data": {{ tojson(b.data) }} }
|
|
{% endif %}
|
|
},
|
|
```
|
|
|
|
`partials/google_contents.jinja`:
|
|
```jinja
|
|
{% for msg in ctx.history %}
|
|
{
|
|
"role": {% if msg.role == "assistant" %}"model"{% else %}"user"{% endif %},
|
|
"parts": [ {% for b in msg.content_blocks %}{% include "partials/google_part.jinja" %}{% endfor %} ]
|
|
},
|
|
{% endfor %}
|
|
```
|
|
|
|
`partials/google_part.jinja`:
|
|
```jinja
|
|
{% if b.type == "text" %}
|
|
{ "text": {{ tojson(b.text) }} },
|
|
{% else if b.type == "thinking" %}
|
|
{ "text": {{ tojson(b.thinking) }}, "thought": true, "thoughtSignature": {{ tojson(b.signature) }} },
|
|
{% else if b.type == "tool_use" %}
|
|
{ "functionCall": { "name": {{ tojson(b.name) }}, "args": {{ tojson(b.input) }} } },
|
|
{% else if b.type == "tool_result" %}
|
|
{ "functionResponse": { "name": {{ tojson(b.name) }}, "response": { "result": {{ tojson(b.content) }} } } },
|
|
{% else if b.type == "image" %}
|
|
{% if b.is_url %}
|
|
{ "file_data": { "mime_type": {{ tojson(b.media_type) }}, "file_uri": {{ tojson(b.data) }} } },
|
|
{% else %}
|
|
{ "inline_data": { "mime_type": {{ tojson(b.media_type) }}, "data": {{ tojson(b.data) }} } },
|
|
{% endif %}
|
|
{% else %}
|
|
{ "text": "" },
|
|
{% endif %}
|
|
```
|
|
|
|
## C++ work
|
|
|
|
In `JsonPromptTemplate`:
|
|
- Parse `[body]` as a `QJsonObject` (not a string). Walk it recursively and build the
|
|
request: render jinja-bearing string values via inja and splice the parsed JSON;
|
|
pass literal strings / scalars / inline-tables through; drop keys whose render is
|
|
empty.
|
|
- **Delete** `m_sampling`, `m_thinking`, and `applySampling` entirely — the body is
|
|
the request; there is no separate sampling/thinking merge.
|
|
- Drop the `thinkingEnabled` parameter from `buildFullRequest` /
|
|
`Provider::prepareRequest` / `Session` — it no longer affects rendering.
|
|
- Add a **JSON-aware** trailing-comma stripper before `QJsonDocument::fromJson`
|
|
(tracks string/escape state so `,}` / `,]` inside string values are not touched).
|
|
This is what lets partials emit an unconditional `,` after every element and drop
|
|
all `loop.is_last` bookkeeping.
|
|
|
|
In `AgentConfig` / `AgentLoader`:
|
|
- Replace `messageFormat` (string) with `body` (`QJsonObject`); merge `role` +
|
|
`context` into `system_prompt`. `[template].sampling` / `[template].thinking` are
|
|
removed.
|
|
- `extends` / `deepMerge` are unchanged; they now also merge `[body]`.
|
|
- Validate at load: a referenced partial must resolve; the assembled body must parse
|
|
as JSON (render once against a synthetic context with tool_use / tool_result /
|
|
image). Catches breakage at startup, not mid-conversation.
|
|
|
|
Model selection (per-agent override):
|
|
- `AgentFactory` owns an agentName → model map loaded from `agent_models.json`
|
|
(`loadModelOverrides`/`saveModelOverrides`). `create()`/`createFromFile()` apply the
|
|
override into the built `AgentConfig`; `effectiveModel()` exposes the resolved value;
|
|
`setModelOverride()` persists. The settings UI (`AgentDetailPane`) edits it via an
|
|
editable Model field; list/roster widgets display `effectiveModel`.
|
|
- `Session` substitutes `${MODEL}` in `cfg.endpoint` with the resolved model before
|
|
`sendRequest` (covers Google, whose model lives in the URL), and still seeds the
|
|
payload `model` from `cfg.model`. The provider keeps injecting `tools` when
|
|
`enable_tools` is set.
|
|
|
|
In `Session`:
|
|
- Collapse the `agent.role` + `agent.context` system-prompt layers into one rendered
|
|
`system_prompt` layer.
|
|
|
|
## Implementation order
|
|
|
|
1. JSON-aware trailing-comma stripper + whitelisted `include` resolver (enables
|
|
readable partials).
|
|
2. `[body]`-table model in `JsonPromptTemplate` + loader; delete
|
|
sampling/thinking/`applySampling`; drop `thinkingEnabled`.
|
|
3. `system_prompt` merge in loader + `Session`.
|
|
4. Per-agent model override in `AgentFactory` (`agent_models.json`) + `${MODEL}`
|
|
endpoint substitution in `Session`; editable Model field in settings; convert
|
|
bundled agents to the base/partials/`extends` layout.
|
|
5. Load-time validation (partial resolves, body parses).
|