4.0 KiB
Configure for Ollama
- Install Ollama. Make sure to review the system requirements before installation.
- Install a language models in Ollama via terminal. For example, you can run:
For standard computers (minimum 8GB RAM):
ollama run qwen2.5-coder:7b
For better performance (16GB+ RAM):
ollama run qwen2.5-coder:14b
For high-end systems (32GB+ RAM):
ollama run qwen2.5-coder:32b
- Open Qt Creator settings (Edit > Preferences on Linux/Windows, Qt Creator > Preferences on macOS)
- Navigate to the "QodeAssist" tab
- On the "General" page, verify:
- Ollama is selected as your LLM provider
- The URL is set to http://localhost:11434
- Your installed model appears in the model selection
- The prompt template is Ollama Auto FIM or Ollama Auto Chat for chat assistance. You can specify template if it is not work correct
- Disable using tools if your model doesn't support tooling
- Click Apply if you made any changes
You're all set! QodeAssist is now ready to use in Qt Creator.
Which models do I actually need?
You do not need a separate model for every agent. Each bundled Ollama agent names a default model only as an example — you can point any agent at a model you already have via its settings → Change… (a per-agent override; it does not edit the bundled agent). Seeing a model name on an agent is not a reason to download it.
The defaults cluster into a tiny set, so one or two pulls cover everyday use:
| Pull this | Unlocks |
|---|---|
qwen2.5-coder:7b |
Ollama Chat — Simple · Ollama Completion — FIM · Ollama Completion — Chat-style · Ollama Quick Refactor |
qwen3.5:9b (or :4b on ~8 GB) |
Ollama Chat — Thinking · Ollama Compression — 16/32 GB (:4b → Compression — 8 GB) |
Optional specialists — pull only if you want that capability:
| Pull this | For |
|---|---|
gemma4:12b |
Ollama Chat — Gemma 4 — agentic chat with vision + native reasoning |
theqtcompany/codellama-7b-qml |
Ollama Completion — QML (Qt) — Qt's QML-specific completion model |
Rule of thumb: pick the agent for the job, then either pull its named model or swap it (Change…) for one you already have.
Extended Thinking Mode
Ollama supports extended thinking mode for models that are capable of deep reasoning (such as DeepSeek-R1, QwQ, and similar reasoning models). This mode allows the model to show its step-by-step reasoning process before providing the final answer.
How to Enable
For Chat Assistant:
- Navigate to Qt Creator > Preferences > QodeAssist > Chat Assistant
- In the "Extended Thinking (Claude, Ollama)" section, check "Enable extended thinking mode"
- Select a reasoning-capable model (e.g., deepseek-r1:8b, qwq:32b)
- Click Apply
For Quick Refactoring:
- Navigate to Qt Creator > Preferences > QodeAssist > Quick Refactor
- Check "Enable Thinking Mode"
- Configure thinking budget and max tokens as needed
- Click Apply
Supported Models
Thinking mode works best with models specifically designed for reasoning:
- DeepSeek-R1 series (deepseek-r1:8b, deepseek-r1:14b, deepseek-r1:32b)
- QwQ series (qwq:32b)
- Other models trained for chain-of-thought reasoning
How It Works
When thinking mode is enabled:
- The model generates internal reasoning (visible in the chat as "Thinking" blocks)
- After reasoning, it provides the final answer
- You can collapse/expand thinking blocks to focus on the final answer
- Temperature is automatically set to 1.0 for optimal reasoning performance
Technical Details:
- Thinking mode adds the
enable_thinking: trueparameter to requests sent to Ollama - This is natively supported by the Ollama API for compatible models
- Works in both Chat Assistant and Quick Refactoring contexts