ollama_dart 2.0.0
Dart client for the Ollama API to run LLMs locally (OpenAI gpt-oss, DeepSeek-R1, Gemma 3, Llama 4, and more).
# Ollama Dart Client
Dart client for the Ollama API to run local and self-hosted models — chat, streaming, tool calling, embeddings, and model management. It gives Dart and Flutter applications a pure Dart, type-safe client across iOS, Android, macOS, Windows, Linux, Web, and server-side Dart.
> **Tip:** Coding agents: start with `llms.txt`. It links to the package docs, examples, and optional references in a compact format.
## Features

### Generation and streaming
- Chat completions with context memory and multimodal inputs
- Text generation for prompt-style completions
- Embeddings for semantic search and retrieval
- NDJSON streaming for chat and completions
- Tool calling, thinking mode, and structured output
### Local model operations
- Pull, push, copy, create, delete, and inspect models
- List running models and query server version
- Connect to local or remote Ollama instances with optional auth
### Why choose this client?
- Pure Dart with no Flutter dependency — works in mobile apps, backends, and CLIs.
- Type-safe request and response models with minimal dependencies (`http`, `logging`, `meta`).
- Streaming, retries, interceptors, and error handling built into the client.
- Mirrors the Ollama API closely, including model management endpoints most wrappers skip.
## Quickstart

Add the dependency to your `pubspec.yaml`:

```yaml
dependencies:
  ollama_dart: ^2.0.0
```

```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    final response = await client.chat.create(
      request: ChatRequest(
        model: 'gpt-oss',
        messages: [ChatMessage.user('Explain what Dart isolates do.')],
      ),
    );
    print(response.message?.content);
  } finally {
    client.close();
  }
}
```
## Configuration

Configure local hosts, remote servers, and retries.

Use `OllamaClient()` for the default local daemon at `http://localhost:11434`, or `OllamaClient.fromEnvironment()` to read `OLLAMA_HOST`. Use `OllamaConfig` when you need a remote host, bearer auth, or a different timeout policy.
```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient(
    config: OllamaConfig(
      baseUrl: 'http://localhost:11434',
      timeout: const Duration(minutes: 5),
      retryPolicy: RetryPolicy(
        maxRetries: 3,
        initialDelay: Duration(seconds: 1),
      ),
    ),
  );
  client.close();
}
```
Environment variable: `OLLAMA_HOST`
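The environment-driven constructor keeps deployment details out of code. A short sketch using `OllamaClient.fromEnvironment()` as described above (whether it falls back to the default local daemon when `OLLAMA_HOST` is unset is an assumption worth verifying against the API docs):

```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  // Reads the server address from the OLLAMA_HOST environment variable.
  final client = OllamaClient.fromEnvironment();
  try {
    final version = await client.version.get();
    print('Connected to Ollama $version');
  } finally {
    client.close();
  }
}
```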
Use `BearerTokenProvider` when the Ollama server is exposed behind an authenticated reverse proxy or remote deployment.
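A minimal sketch of remote auth. The constructor signature and the `authProvider` parameter name below are assumptions about how `BearerTokenProvider` plugs into `OllamaConfig`, so check the generated API before relying on them; the host and token are placeholders:

```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient(
    config: OllamaConfig(
      baseUrl: 'https://ollama.example.com', // hypothetical remote host
      // Assumed parameter name; attaches "Authorization: Bearer <token>".
      authProvider: BearerTokenProvider('my-secret-token'),
    ),
  );
  client.close();
}
```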
## Usage

### How do I run a chat completion?
Use `client.chat.create(...)` for conversational flows. The chat response exposes `message?.content`, which keeps simple completions ergonomic in Dart and Flutter UIs.
```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    final response = await client.chat.create(
      request: ChatRequest(
        model: 'gpt-oss',
        messages: [
          ChatMessage.system('You are a concise assistant.'),
          ChatMessage.user('What is hot reload?'),
        ],
      ),
    );
    print(response.message?.content);
  } finally {
    client.close();
  }
}
```
For structured output, set `format` to constrain the response to valid JSON:

```dart
final response = await client.chat.create(
  request: ChatRequest(
    model: 'gpt-oss',
    messages: [ChatMessage.user('List 3 colors as JSON')],
    format: ResponseFormat.json,
  ),
);
```
### How do I stream local model output?
Streaming uses Ollama's NDJSON response format and works well for terminals and live Flutter widgets. This is the fastest way to surface partial output from a local model.
```dart
import 'dart:io';
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    final stream = client.chat.createStream(
      request: ChatRequest(
        model: 'gpt-oss',
        messages: [ChatMessage.user('Write a haiku about local models.')],
      ),
    );
    await for (final chunk in stream) {
      stdout.write(chunk.message?.content ?? '');
    }
  } finally {
    client.close();
  }
}
```
### How do I use tool calling?
Tool calling is declared on the request with typed `ToolDefinition` objects. This makes local agent-style workflows possible without switching to another API format.
```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    final response = await client.chat.create(
      request: ChatRequest(
        model: 'gpt-oss',
        messages: [ChatMessage.user('What is the weather in Paris?')],
        tools: [
          ToolDefinition(
            type: ToolType.function,
            function: ToolFunction(
              name: 'get_weather',
              description: 'Get the current weather for a location',
              parameters: {
                'type': 'object',
                'properties': {
                  'location': {'type': 'string'},
                },
                'required': ['location'],
              },
            ),
          ),
        ],
      ),
    );
    print(response.message?.toolCalls?.length ?? 0);
  } finally {
    client.close();
  }
}
```
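A follow-up step most tool-calling flows need is dispatching the returned call to real Dart code. This is a sketch only: the property names below (`function`, `name`, `arguments`) are assumptions about the generated tool-call models, so verify them against the package's API reference:

```dart
import 'package:ollama_dart/ollama_dart.dart';

/// Hypothetical dispatcher that routes tool calls from a chat response
/// to local handlers. Field names on the tool-call model are assumed.
void dispatchToolCalls(ChatResponse response) {
  for (final call in response.message?.toolCalls ?? const []) {
    final name = call.function?.name;
    if (name == 'get_weather') {
      print('Weather requested with args: ${call.function?.arguments}');
      // Run the real lookup here, then send the result back to the
      // model in a follow-up chat turn so it can produce a final answer.
    } else {
      print('Unhandled tool call: $name');
    }
  }
}
```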
### How do I generate plain text?
Use the completions resource when you want prompt-style generation instead of chat messages. This is useful for legacy templates, code infill helpers, or smaller server utilities.
```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    final result = await client.completions.generate(
      request: GenerateRequest(
        model: 'gpt-oss',
        prompt: 'Complete this sentence: Dart is great for',
      ),
    );
    print(result.response);
  } finally {
    client.close();
  }
}
```
### How do I create embeddings?
Embeddings are exposed as a first-class resource, so semantic search or retrieval code can stay inside the same Ollama client. This is useful for local RAG pipelines in Dart.
```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    final response = await client.embeddings.create(
      request: const EmbedRequest(
        model: 'nomic-embed-text',
        input: EmbedInput.list(['Dart', 'Flutter']),
      ),
    );
    print(response.embeddings.length);
  } finally {
    client.close();
  }
}
```
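Once two embeddings are in hand, comparing them is plain vector math; no extra package is required. A minimal cosine-similarity helper (the vectors below are tiny illustrative stand-ins for real embedding output):

```dart
import 'dart:math';

/// Cosine similarity between two equal-length vectors: 1.0 means the
/// directions match exactly, 0.0 means they are orthogonal.
double cosineSimilarity(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (sqrt(normA) * sqrt(normB));
}

void main() {
  // Real embeddings have hundreds of dimensions; 2D keeps this readable.
  print(cosineSimilarity([1.0, 0.0], [1.0, 0.0])); // 1.0
  print(cosineSimilarity([1.0, 0.0], [0.0, 1.0])); // 0.0
}
```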
### How do I manage local models?
Model management is part of the same client, which means pull, inspect, and runtime checks do not require a separate admin tool. That is useful for installers, desktop apps, and local dev tooling.
```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    final models = await client.models.list();
    print(models.models.length);
  } finally {
    client.close();
  }
}
```
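Pulling a model from the same client keeps installers and dev tooling self-contained. This is a sketch only: the method and request names below (`pull`, `PullModelRequest`) are assumptions about this package's model-management surface, so check the generated API before relying on them:

```dart
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    // Assumed method and request names for the pull endpoint.
    await client.models.pull(
      request: PullModelRequest(model: 'gpt-oss'),
    );
    final models = await client.models.list();
    print('Installed models: ${models.models.length}');
  } finally {
    client.close();
  }
}
```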
## Error Handling
Handle local daemon failures, retries, and streaming issues
`ollama_dart` throws typed exceptions so you can distinguish between API failures, timeouts, aborts, and streaming problems. Catch `ApiException` first for HTTP errors, then fall back to `OllamaException` for everything else.
```dart
import 'dart:io';
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  final client = OllamaClient();
  try {
    await client.version.get();
  } on ApiException catch (error) {
    stderr.writeln('Ollama API error ${error.statusCode}: ${error.message}');
  } on OllamaException catch (error) {
    stderr.writeln('Ollama client error: $error');
  } finally {
    client.close();
  }
}
```
## Examples

See the `example/` directory for complete examples:
| Example | Description |
|---|---|
| `chat_example.dart` | Chat completions |
| `streaming_example.dart` | Streaming responses |
| `tool_calling_example.dart` | Tool calling |
| `completions_example.dart` | Plain text generation |
| `embeddings_example.dart` | Text embeddings |
| `models_example.dart` | Model management |
| `version_example.dart` | Server version |
| `error_handling_example.dart` | Exception handling patterns |
| `ollama_dart_example.dart` | Quick-start overview |
## API Coverage
| API | Status |
|---|---|
| Chat | ✅ Full |
| Completions | ✅ Full |
| Embeddings | ✅ Full |
| Models | ✅ Full |
| Version | ✅ Full |
## Official Documentation
## Sponsor
If these packages are useful to you or your company, please consider sponsoring the project. Development and maintenance are provided to the community for free, but integration tests against real APIs and the tooling required to build and verify releases still have real costs. Your support, at any level, helps keep these packages maintained and free for the Dart & Flutter community.
## License
This package is licensed under the MIT License.
This is a community-maintained package and is not affiliated with or endorsed by Ollama.