Ollama Dart Client

Unofficial Dart client for the Ollama API.

Features

Fully type-safe, documented and tested
All platforms supported (including streaming on web)
Custom base URL, headers and query params support (e.g. HTTP proxies)
Custom HTTP client support (e.g. SOCKS5 proxies or advanced use cases)

Supported endpoints:

Completions (with streaming support)
Chat completions
Embeddings
Models
Blobs

Usage
Advanced Usage
Acknowledgements
License

Usage

Refer to the documentation for more information about the API.
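
The snippets in this section use a `client` variable without showing where it comes from. A minimal sketch of the setup, assuming the package is imported as `package:ollama_dart/ollama_dart.dart` (the import path is an assumption, not stated here) and that an Ollama server is listening on the default base URL:

```dart
// Assumed import path for this package.
import 'package:ollama_dart/ollama_dart.dart';

Future<void> main() async {
  // With no arguments, the client targets the default base URL
  // (http://localhost:11434/api).
  final client = OllamaClient();

  // Quick check that the server is reachable: list the local models.
  final res = await client.listModels();
  print(res.models);
}
```

The same `client` instance can be reused for every call shown below.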

Completions

Given a prompt, the model will generate a response.

Generate completion:

final generated = await client.generateCompletion(
  request: GenerateCompletionRequest(
    model: 'mistral:latest',
    prompt: 'Why is the sky blue?',
  ),
);
print(generated.response);
// The sky appears blue because of a phenomenon called Rayleigh scattering...

Stream completion:

final stream = client.generateCompletionStream(
  request: GenerateCompletionRequest(
    model: 'mistral:latest',
    prompt: 'Why is the sky blue?',
  ),
);
String text = '';
await for (final res in stream) {
  // Append each chunk as-is; trimming chunks would drop the spaces between words.
  text += res.response ?? '';
}
print(text);
// The sky appears blue because of a phenomenon called Rayleigh scattering...
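
When a streamed response is long, accumulating chunks in a `StringBuffer` avoids repeated string concatenation. The sketch below uses a hypothetical helper, `collectChunks`, and a stand-in stream of strings so that it runs without a server; it is not part of the client's API.

```dart
/// Hypothetical helper: joins streamed text chunks into one string.
/// Null chunks are skipped and whitespace is preserved.
Future<String> collectChunks(Stream<String?> chunks) async {
  final buffer = StringBuffer();
  await for (final chunk in chunks) {
    if (chunk != null) buffer.write(chunk);
  }
  return buffer.toString();
}

Future<void> main() async {
  // Stand-in for the text chunks of a streamed completion.
  final chunks = Stream<String?>.fromIterable(
    ['The sky ', 'appears ', null, 'blue'],
  );
  print(await collectChunks(chunks)); // The sky appears blue
}
```

With the stream from the example above, this could be called as `collectChunks(stream.map((res) => res.response))`.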

Chat completions

Given a list of messages, the model will generate the next message in the conversation.

Generate chat completion:

final res = await client.generateChatCompletion(
  request: GenerateChatCompletionRequest(
    model: 'mistral:latest',
    messages: [
      Message(
        role: MessageRole.system,
        content: 'You are a helpful assistant.',
      ),
      Message(
        role: MessageRole.user,
        content: 'List the numbers from 1 to 9 in order.',
      ),
    ],
    keepAlive: 1,
  ),
);
print(res.message);
// Message(role: MessageRole.assistant, content: 123456789)

Stream chat completion:

final stream = client.generateChatCompletionStream(
  request: GenerateChatCompletionRequest(
    model: 'mistral:latest',
    messages: [
      Message(
        role: MessageRole.system,
        content: 'You are a helpful assistant.',
      ),
      Message(
        role: MessageRole.user,
        content: 'List the numbers from 1 to 9 in order.',
      ),
    ],
    keepAlive: 1,
  ),
);

String text = '';
await for (final res in stream) {
  text += (res.message?.content ?? '').trim();
}
print(text);
// 123456789

Embeddings

Given a prompt, the model will generate an embedding representing the prompt.

Generate embedding:

final generated = await client.generateEmbedding(
  request: GenerateEmbeddingRequest(
    model: 'mistral:latest',
    prompt: 'Here is an article about llamas...',
  ),
);
print(generated.embedding);
// [8.566641807556152, 5.315540313720703, ...]
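
A common use of embeddings is comparing two texts by the cosine similarity of their vectors. The following is a minimal sketch in plain Dart; `cosineSimilarity` is a hypothetical helper, not part of the client, and it assumes both embeddings are non-null `List<double>` values of equal length.

```dart
import 'dart:math' as math;

/// Hypothetical helper: cosine similarity between two embedding vectors.
/// Returns 0.0 if either vector has zero magnitude.
double cosineSimilarity(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Embeddings must have the same length.');
  }
  var dot = 0.0;
  var normA = 0.0;
  var normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0 || normB == 0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

void main() {
  final a = [1.0, 0.0];
  final b = [0.0, 1.0];
  print(cosineSimilarity(a, a)); // 1.0
  print(cosineSimilarity(a, b)); // 0.0
}
```

In practice the two arguments would be the `embedding` fields of two `generateEmbedding` responses produced by the same model.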

Models

Create model

Creates a new local model using a modelfile.

await client.createModel(
  request: CreateModelRequest(
    model: 'mario',
    modelfile: 'FROM mistral:latest\nSYSTEM You are mario from Super Mario Bros.',
  ),
);

You can also stream the status of the model creation:

final stream = client.createModelStream(
  request: CreateModelRequest(
    model: 'mario',
    modelfile: 'FROM mistral:latest\nSYSTEM You are mario from Super Mario Bros.',
  ),
);
await for (final res in stream) {
  print(res.status);
}

List models

List models that are available locally.

final res = await client.listModels();
print(res.models);

Show Model Information

Show details about a model including modelfile, template, parameters, license, and system prompt.

final res = await client.showModelInfo(
  request: ModelInfoRequest(model: 'mistral:latest'),
);
print(res);

Pull a Model

Download a model from the Ollama library. Cancelled pulls resume from where they left off, and multiple calls share the same download progress.

final res = await client.pullModel(
  request: PullModelRequest(model: 'yarn-llama3:13b-128k-q4_1'),
);
print(res.status);

You can also stream the pulling status:

final stream = client.pullModelStream(
  request: PullModelRequest(model: 'yarn-llama3:13b-128k-q4_1'),
);
await for (final res in stream) {
  print(res.status);
}

Push a Model

Upload a model to a model library.

Requires registering for ollama.ai and adding a public key first.

final res = await client.pushModel(
  request: PushModelRequest(model: 'mattw/pygmalion:latest'),
);
print(res.status);

You can also stream the pushing status:

final stream = client.pushModelStream(
  request: PushModelRequest(model: 'mattw/pygmalion:latest'),
);
await for (final res in stream) {
  print(res.status);
}

Check if a Blob Exists

Check if a blob is known to the server.

await client.checkBlob(
  name: 'sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2',
);

If the blob doesn't exist, an OllamaClientException will be thrown.
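
Because a missing blob is reported through an exception, callers that only need a yes/no answer may want to wrap the call. A minimal sketch, assuming the import path `package:ollama_dart/ollama_dart.dart`; `blobExists` is a hypothetical helper and not part of the client:

```dart
// Assumed import path for this package.
import 'package:ollama_dart/ollama_dart.dart';

/// Hypothetical helper: returns true if the server knows the blob.
Future<bool> blobExists(OllamaClient client, String name) async {
  try {
    await client.checkBlob(name: name);
    return true;
  } on OllamaClientException catch (e) {
    // Assumption: every OllamaClientException is treated as "not found".
    // Other failures (e.g. an unreachable server) may surface the same way.
    print('Blob check failed: $e');
    return false;
  }
}

Future<void> main() async {
  final client = OllamaClient();
  final exists = await blobExists(
    client,
    'sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2',
  );
  print(exists);
}
```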

Advanced Usage

Default HTTP client

By default, the client uses http://localhost:11434/api as the baseUrl and the following implementations of http.Client:

Non-web: IOClient
Web: FetchClient (to support streaming on web)

Custom HTTP client

You can always provide your own implementation of http.Client for further customization:

final client = OllamaClient(
  client: MyHttpClient(),
);
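
`MyHttpClient` above is a placeholder. One way to write such a client is to extend `http.BaseClient` and delegate to an inner client. The sketch below is an illustration under that assumption: `LoggingClient` is a hypothetical class that records each request line, and the demo uses `MockClient` from `package:http/testing.dart` so that it runs without a server.

```dart
import 'package:http/http.dart' as http;
import 'package:http/testing.dart';

/// Hypothetical client that records every request before delegating it.
class LoggingClient extends http.BaseClient {
  LoggingClient(this._inner);

  final http.Client _inner;

  /// Request lines seen so far, e.g. "GET http://host/path".
  final List<String> log = [];

  @override
  Future<http.StreamedResponse> send(http.BaseRequest request) {
    log.add('${request.method} ${request.url}');
    return _inner.send(request);
  }

  @override
  void close() => _inner.close();
}

Future<void> main() async {
  // MockClient answers locally, so no network access is needed here.
  final inner = MockClient((request) async => http.Response('ok', 200));
  final client = LoggingClient(inner);

  final response =
      await client.get(Uri.parse('http://localhost:11434/api/tags'));
  print(response.statusCode); // 200
  print(client.log); // [GET http://localhost:11434/api/tags]
  client.close();
}
```

To use it with this package, wrap a real client instead of the mock, for example `OllamaClient(client: LoggingClient(http.Client()))`.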

Using a proxy

HTTP proxy

You can use your own HTTP proxy by overriding the baseUrl and providing your required headers:

final client = OllamaClient(
  baseUrl: 'https://my-proxy.com',
  headers: {
    'x-my-proxy-header': 'value',
  },
);

If you need further customization, you can always provide your own http.Client.

SOCKS5 proxy

To use a SOCKS5 proxy, you can use the socks5_proxy package:

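import 'dart:io';
import 'package:http/io_client.dart';
import 'package:socks5_proxy/socks_client.dart';

// Route all connections of the dart:io HttpClient through the SOCKS5 proxy.
final baseHttpClient = HttpClient();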
SocksTCPClient.assignToHttpClient(baseHttpClient, [
  ProxySettings(InternetAddress.loopbackIPv4, 1080),
]);
final httpClient = IOClient(baseHttpClient);

final client = OllamaClient(
  client: httpClient,
);

Acknowledgements

The generation of this client was made possible by the openapi_spec package.

License

Ollama Dart Client is licensed under the MIT License.