QodeAssist/docs/ollama-configuration.md
2026-07-02 22:26:32 +02:00

Configure for Ollama

  1. Install Ollama. Make sure to review the system requirements before installation.
  2. Install a language model in Ollama from the terminal. For example, run one of the following, depending on how much RAM you have:

For standard computers (minimum 8GB RAM):

ollama run qwen2.5-coder:7b

For better performance (16GB+ RAM):

ollama run qwen2.5-coder:14b

For high-end systems (32GB+ RAM):

ollama run qwen2.5-coder:32b

  3. Open Qt Creator settings (Edit > Preferences on Linux/Windows, Qt Creator > Preferences on macOS)
  4. Navigate to QodeAssist > Providers and verify that the bundled Ollama (Native) provider points at http://localhost:11434 (no API key needed)
  5. Navigate to QodeAssist > General > Agent Pipelines. The Ollama agents are assigned by default:
    • Code completion: Ollama Completion — FIM (needs a base/FIM model; use Ollama Completion — Chat-style for instruct models)
    • Chat assistant: Ollama Chat — Simple / Thinking / Gemma 4
    • Chat compression: Ollama Compression — 8 GB (or the 16/32 GB tier for your machine)
    • Quick refactor: Ollama Quick Refactor — Simple
  6. Point the agents at models you actually have (see the next section)

You're all set! QodeAssist is now ready to use in Qt Creator.
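
To double-check the setup from outside Qt Creator, a minimal sketch like the one below picks a qwen2.5-coder tag from the RAM tiers listed earlier and asks the local Ollama server which models it has. It assumes Ollama's `/api/tags` endpoint at the default http://localhost:11434. The function names are illustrative and are not part of QodeAssist.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from the provider settings


def pick_model_tag(ram_gb: float) -> str:
    """Map installed RAM to the qwen2.5-coder tag suggested in the install step."""
    if ram_gb >= 32:
        return "qwen2.5-coder:32b"
    if ram_gb >= 16:
        return "qwen2.5-coder:14b"
    return "qwen2.5-coder:7b"  # the 8 GB tier is the smallest one listed


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask a running Ollama server which models are installed locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))
```

With Ollama running, `pick_model_tag(16) in list_local_models()` tells you whether the suggested model has already been pulled.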

Which models do I actually need?

You do not need a separate model for every agent. Each bundled Ollama agent names a default model only as an example — you can point any agent at a model you already have via its settings → Change… (a per-agent override; it does not edit the bundled agent). Seeing a model name on an agent is not a reason to download it.

The defaults cluster into a tiny set, so one or two pulls cover everyday use:

| Pull this | Unlocks |
|---|---|
| qwen2.5-coder:7b | Ollama Chat — Simple · Ollama Completion — FIM · Ollama Completion — Chat-style · Ollama Quick Refactor |
| qwen3.5:9b (or :4b on ~8 GB) | Ollama Chat — Thinking · Ollama Compression — 16/32 GB (:4b → Compression — 8 GB) |

Optional specialists — pull only if you want that capability:

| Pull this | For |
|---|---|
| gemma4:12b | Ollama Chat — Gemma 4 — agentic chat with vision + native reasoning |
| theqtcompany/codellama-7b-qml | Ollama Completion — QML (Qt) — Qt's QML-specific completion model |

Rule of thumb: pick the agent for the job, then either pull its named model or swap it (Change…) for one you already have.
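
The rule of thumb can be sketched as a small lookup. The mapping below is transcribed from the pull lists in this section, and the helper names are hypothetical. It only illustrates the reasoning and is not how QodeAssist resolves models internally.

```python
# Default model per bundled agent, as listed in this section.
AGENT_DEFAULTS = {
    "Ollama Chat — Simple": "qwen2.5-coder:7b",
    "Ollama Completion — FIM": "qwen2.5-coder:7b",
    "Ollama Completion — Chat-style": "qwen2.5-coder:7b",
    "Ollama Quick Refactor": "qwen2.5-coder:7b",
    "Ollama Chat — Thinking": "qwen3.5:9b",
    "Ollama Compression — 16/32 GB": "qwen3.5:9b",
    "Ollama Compression — 8 GB": "qwen3.5:4b",
    "Ollama Chat — Gemma 4": "gemma4:12b",
    "Ollama Completion — QML (Qt)": "theqtcompany/codellama-7b-qml",
}


def unlocked_agents(installed: set[str]) -> list[str]:
    """Agents whose default model is already installed."""
    return sorted(a for a, m in AGENT_DEFAULTS.items() if m in installed)


def models_to_pull(wanted_agents: list[str], installed: set[str]) -> set[str]:
    """Default models still missing for the agents you want to use as-is."""
    return {AGENT_DEFAULTS[a] for a in wanted_agents} - installed
```

An agent that shows up in `models_to_pull` does not force a download: you can still swap its model (Change…) for one you already have.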

Extended Thinking Mode

Ollama supports native reasoning for models trained for it (Qwen3.5, Gemma 4, DeepSeek-R1, QwQ, …). Reasoning is streamed into collapsible "Thinking" blocks in the chat.
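
As a rough illustration of how reasoning can be kept apart from the answer, the sketch below accumulates both from streamed chat chunks. It assumes the newline-delimited JSON shape of Ollama's `/api/chat` stream, where each chunk carries `message.thinking` and `message.content`. How QodeAssist itself parses the stream is not covered in this guide, and `split_stream` is a hypothetical helper.

```python
import json


def split_stream(ndjson_lines):
    """Accumulate reasoning and answer text from streamed chat chunks."""
    thinking, answer = [], []
    for line in ndjson_lines:
        if not line.strip():
            continue  # skip blank keep-alive lines
        message = json.loads(line).get("message", {})
        # Either field may be absent or empty in a given chunk.
        thinking.append(message.get("thinking") or "")
        answer.append(message.get("content") or "")
    return "".join(thinking), "".join(answer)
```

The first returned string is what would go into a collapsible "Thinking" block, and the second is the visible reply.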

Thinking is a property of the agent, not a global switch:

  • For chat, pick a thinking agent in the chat panel: Ollama Chat — Thinking or Ollama Chat — Gemma 4
  • For quick refactor, assign Ollama Quick Refactor — Qwen3.5 or — Gemma 4 in Agent Pipelines
  • In your own agents, set think = true in the [body] table (top level, not under [body.options])

Use a reasoning-capable model with these agents — a non-reasoning model simply ignores the flag.