Ollama

ValorIDE can use Ollama as a local model provider for private, offline, or low-latency workflows. The best setup is usually a fast instruction-tuned model with streaming enabled and a request timeout long enough to cover model loading and the first token.

Quick Start

Install Ollama, start the server, then pull a model. `ollama serve` runs in the foreground, so run the pull from a second terminal:

ollama serve
ollama pull mistral

Configure ValorIDE with the local Ollama endpoint:

{
  "apiProvider": "ollama",
  "ollamaBaseUrl": "http://localhost:11434",
  "ollamaModelId": "mistral",
  "ollamaRequestTimeout": "120000",
  "ollamaKeepAlive": "10m"
}

Model Selection

Start with a small or mid-sized model before moving to larger models.

Model	Typical Use	Resource Profile
`mistral`	General coding and chat	Fast on most developer machines
`neural-chat`	Lightweight instruction following	Good for lower-memory machines
`phi`	Small tasks and constrained hardware	Very small, lower capability ceiling
Larger Llama-family models	Higher-quality reasoning when hardware allows	Requires significantly more memory

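The first response is usually slower because Ollama loads the model into memory on the first request. The keep-alive setting (`ollamaKeepAlive`, `10m` in the Quick Start configuration) keeps the model loaded between requests, so subsequent responses start faster.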
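To illustrate how keep-alive interacts with model loading, the sketch below preloads a model by calling Ollama's `/api/generate` endpoint without a prompt, which asks the server to load the model and hold it for the `keep_alive` duration. This is a standalone Python sketch, not ValorIDE's own code; the helper names `build_warmup_payload` and `warm_up` are hypothetical, and the base URL and model name are taken from the Quick Start example.

```python
import json
import urllib.request


def build_warmup_payload(model: str, keep_alive: str = "10m") -> dict:
    """Build a generate request with no prompt, which asks Ollama to load the model."""
    # "stream": False requests a single JSON object instead of a chunked stream.
    return {"model": model, "keep_alive": keep_alive, "stream": False}


def warm_up(base_url: str, model: str, keep_alive: str = "10m",
            timeout_s: float = 120.0) -> dict:
    """POST the warm-up payload to /api/generate and return the decoded reply."""
    body = json.dumps(build_warmup_payload(model, keep_alive)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The timeout must cover the time the model needs to load from disk.
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a running Ollama server):
# warm_up("http://localhost:11434", "mistral", keep_alive="10m")
```

Running a warm-up like this before the first ValorIDE request moves the load delay out of the interactive session.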

Timeout Strategy

ValorIDE streams Ollama responses and tolerates pauses between chunks. Use a larger request timeout when:

the model is large
the machine is memory constrained
the first token takes longer than expected
the prompt includes a large project context

Example (180 seconds, assuming the value is in milliseconds):

{
  "ollamaRequestTimeout": "180000"
}
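The following Python sketch shows one way a client could apply such a timeout while streaming. It assumes the setting is a millisecond string, as in the examples here, and reads Ollama's newline-delimited JSON stream from `/api/generate`. The function names are hypothetical, and ValorIDE's internal handling may differ. In Python, the `urlopen` timeout applies to each blocking socket operation (connect and each read), so in this sketch it bounds the wait for the first token and the gap between chunks rather than the total response time.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def timeout_seconds(raw_ms: str) -> float:
    """Convert a millisecond setting such as "180000" into seconds."""
    return int(raw_ms) / 1000.0


def iter_response_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a newline-delimited JSON stream."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        chunk = json.loads(line)
        if chunk.get("response"):
            yield chunk["response"]
        if chunk.get("done"):
            break  # the final chunk marks the end of the stream


def stream_generate(base_url: str, model: str, prompt: str,
                    raw_timeout_ms: str = "180000") -> Iterator[str]:
    """Stream a completion from Ollama, yielding text as it arrives."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_seconds(raw_timeout_ms)) as response:
        # The response object iterates line by line as chunks arrive.
        yield from iter_response_text(response)


# Example call (requires a running Ollama server):
# for text in stream_generate("http://localhost:11434", "mistral", "Say hello"):
#     print(text, end="", flush=True)
```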

Troubleshooting

Connection Refused

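ValorIDE cannot reach the endpoint set in `ollamaBaseUrl`. Confirm that the URL and port are correct (Ollama listens on port 11434 by default), then start the Ollama server: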

ollama serve
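To confirm that something is listening on the configured endpoint before retrying in ValorIDE, a small TCP probe can help. The Python sketch below is a generic connectivity check, not part of ValorIDE or Ollama; the helper names `endpoint_from_url` and `is_reachable` are hypothetical.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def endpoint_from_url(base_url: str) -> Tuple[str, int]:
    """Extract (host, port) from a base URL, falling back to the scheme's default port."""
    parts = urlsplit(base_url)
    host = parts.hostname or "localhost"
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return host, port


def is_reachable(base_url: str, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint succeeds within the timeout."""
    host, port = endpoint_from_url(base_url)
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call:
# is_reachable("http://localhost:11434")
```

A `False` result means nothing is accepting connections at that address, which points to the server not running or a wrong host or port in the configuration.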

Model Not Found

Pull the model before selecting it in ValorIDE:

ollama pull mistral
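To check which models the server already has, Ollama's `/api/tags` endpoint returns the locally installed models. The Python sketch below compares that list with a configured model ID. The helper names are hypothetical, and treating an ID without a tag as `:latest` is an assumption based on how Ollama lists model names; it is not taken from ValorIDE.

```python
import json
import urllib.request
from typing import List


def installed_model_names(tags_payload: dict) -> List[str]:
    """Return model names (for example "mistral:latest") from an /api/tags reply."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def has_model(tags_payload: dict, model_id: str) -> bool:
    """True if model_id is installed, treating a missing tag as ":latest"."""
    wanted = model_id if ":" in model_id else model_id + ":latest"
    return wanted in installed_model_names(tags_payload)


def fetch_tags(base_url: str, timeout_s: float = 10.0) -> dict:
    """GET /api/tags from the Ollama server and decode the JSON reply."""
    url = base_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a running Ollama server):
# has_model(fetch_tags("http://localhost:11434"), "mistral")
```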

Slow First Response

The model may still be loading into memory. Retry once the model has finished loading, use a smaller model, or increase the request timeout.

Stream Pauses

Short pauses are expected with local inference. If pauses become frequent, reduce context size, choose a smaller model, or close other memory-heavy applications.
