Ollama Dart Client

Unofficial Dart client for the Ollama API.

Features

  • Fully type-safe, documented and tested
  • All platforms supported (including streaming on web)
  • Custom base URL, headers and query params support (e.g. HTTP proxies)
  • Custom HTTP client support (e.g. SOCKS5 proxies or advanced use cases)

Supported endpoints:

  • Completions (with streaming support)
  • Chat completions
  • Embeddings
  • Models
  • Blobs

Usage

Refer to the documentation for more information about the API.
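
The snippets in this section call methods on a `client` object that they do not define. A minimal sketch of how it might be created, assuming the `OllamaClient` constructor and the default local server URL described under the advanced usage section of this README; `createLocalClient` is a hypothetical helper name, not part of the package:

```dart
import 'package:ollama_dart/ollama_dart.dart';

/// Hypothetical helper: builds a client for a locally running Ollama server.
/// The URL is the default baseUrl stated in this README, spelled out
/// explicitly here so it is easy to change.
OllamaClient createLocalClient() {
  return OllamaClient(baseUrl: 'http://localhost:11434/api');
}
```

With this in place, `final client = createLocalClient();` gives the `client` used by the examples that follow.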

Completions

Given a prompt, the model will generate a response.

Generate completion:

final generated = await client.generateCompletion(
  request: GenerateCompletionRequest(
    model: 'mistral:latest',
    prompt: 'Why is the sky blue?',
  ),
);
print(generated.response);
// The sky appears blue because of a phenomenon called Rayleigh scattering...

Stream completion:

final stream = client.generateCompletionStream(
  request: GenerateCompletionRequest(
    model: 'mistral:latest',
    prompt: 'Why is the sky blue?',
  ),
);
String text = '';
await for (final res in stream) {
  // Append each chunk as-is: trimming individual chunks would remove the
  // spaces between words.
  text += res.response ?? '';
}
print(text.trim());
// The sky appears blue because of a phenomenon called Rayleigh scattering...
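
The accumulation pattern can be tried without a running server. The sketch below uses a fake stream of text fragments in place of the real response stream; `collectText` and `fakeFragments` are hypothetical names, and the assumption that fragments carry their own leading spaces is for illustration only:

```dart
/// Concatenates streamed text fragments in arrival order.
/// Fragments are appended verbatim; only the final result is trimmed.
Future<String> collectText(Stream<String> fragments) async {
  final buffer = StringBuffer();
  await for (final fragment in fragments) {
    buffer.write(fragment);
  }
  return buffer.toString().trim();
}

/// Stand-in for a streamed response: emits a sentence in small pieces.
Stream<String> fakeFragments() => Stream.fromIterable(
      [' The', ' sky', ' appears', ' blue.'],
    );
```

Trimming once at the end keeps the spacing between fragments intact while still dropping stray whitespace at the edges.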

Chat completions

Given a prompt, the model will generate a response in a chat format.

Generate chat completion:

final res = await client.generateChatCompletion(
  request: GenerateChatCompletionRequest(
    model: 'mistral:latest',
    messages: [
      Message(
        role: MessageRole.system,
        content: 'You are a helpful assistant.',
      ),
      Message(
        role: MessageRole.user,
        content: 'List the numbers from 1 to 9 in order.',
      ),
    ],
    keepAlive: 1,
  ),
);
print(res.message);
// Message(role: MessageRole.assistant, content: 123456789)

Stream chat completion:

final stream = client.generateChatCompletionStream(
  request: GenerateChatCompletionRequest(
    model: 'mistral:latest',
    messages: [
      Message(
        role: MessageRole.system,
        content: 'You are a helpful assistant.',
      ),
      Message(
        role: MessageRole.user,
        content: 'List the numbers from 1 to 9 in order.',
      ),
    ],
    keepAlive: 1,
  ),
);

String text = '';
await for (final res in stream) {
  text += (res.message?.content ?? '').trim();
}
print(text);
// 123456789

Embeddings

Given a prompt, the model will generate an embedding representing the prompt.

Generate embedding:

final generated = await client.generateEmbedding(
  request: GenerateEmbeddingRequest(
    model: 'mistral:latest',
    prompt: 'Here is an article about llamas...',
  ),
);
print(generated.embedding);
// [8.566641807556152, 5.315540313720703, ...]
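
One common use of such vectors is comparing two prompts by the angle between their embeddings. The following is a sketch in plain Dart, assuming each embedding is available as a `List<double>`; `cosineSimilarity` is a hypothetical helper and not part of the client:

```dart
import 'dart:math' as math;

/// Cosine similarity of two equal-length vectors, in the range [-1, 1].
double cosineSimilarity(List<double> a, List<double> b) {
  if (a.length != b.length || a.isEmpty) {
    throw ArgumentError('Vectors must be non-empty and of equal length.');
  }
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention chosen for this sketch: a zero vector is similar to nothing.
  if (normA == 0 || normB == 0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}
```

Values near 1 suggest the two prompts point in a similar direction in embedding space; values near 0 suggest they are unrelated.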

Models

Create model

Creates a new local model using a modelfile.

await client.createModel(
  request: CreateModelRequest(
    model: 'mario',
    modelfile: 'FROM mistral:latest\nSYSTEM You are mario from Super Mario Bros.',
  ),
);

You can also stream the status of the model creation:

final stream = client.createModelStream(
  request: CreateModelRequest(
    model: 'mario',
    modelfile: 'FROM mistral:latest\nSYSTEM You are mario from Super Mario Bros.',
  ),
);
await for (final res in stream) {
  print(res.status);
}

List models

List models that are available locally.

final res = await client.listModels();
print(res.models);

Show Model Information

Show details about a model including modelfile, template, parameters, license, and system prompt.

final res = await client.showModelInfo(
  request: ModelInfoRequest(model: 'mistral:latest'),
);
print(res);

Pull a Model

Download a model from the Ollama library. Cancelled pulls are resumed from where they left off, and multiple calls will share the same download progress.

final res = await client.pullModel(
  request: PullModelRequest(model: 'yarn-llama3:13b-128k-q4_1'),
);
print(res.status);

You can also stream the pulling status:

final stream = client.pullModelStream(
  request: PullModelRequest(model: 'yarn-llama3:13b-128k-q4_1'),
);
await for (final res in stream) {
  print(res.status);
}

Push a Model

Upload a model to a model library.

Requires registering for ollama.ai and adding a public key first.

final res = await client.pushModel(
  request: PushModelRequest(model: 'mattw/pygmalion:latest'),
);
print(res.status);

You can also stream the pushing status:

final stream = client.pushModelStream(
  request: PushModelRequest(model: 'mattw/pygmalion:latest'),
);
await for (final res in stream) {
  print(res.status);
}

Check if a Blob Exists

Check if a blob is known to the server.

await client.checkBlob(
  name: 'sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2',
);

If the blob doesn't exist, an OllamaClientException will be thrown.
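
Because the result is signalled through an exception, callers that want a yes/no answer can wrap the call. A minimal sketch, assuming `OllamaClientException` is exported by `package:ollama_dart/ollama_dart.dart`; `blobExists` is a hypothetical helper name:

```dart
import 'package:ollama_dart/ollama_dart.dart';

/// Hypothetical helper: returns true if checkBlob completes normally and
/// false if it throws OllamaClientException.
///
/// Note: this sketch does not distinguish a missing blob from any other
/// failure that the client reports with the same exception type.
Future<bool> blobExists(OllamaClient client, String digest) async {
  try {
    await client.checkBlob(name: digest);
    return true;
  } on OllamaClientException {
    return false;
  }
}
```

Callers that need to tell a missing blob apart from other failures would have to inspect the caught exception rather than discard it.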

Advanced Usage

Default HTTP client

By default, the client uses http://localhost:11434/api as the baseUrl and the following implementations of http.Client:

  • Non-web: IOClient
  • Web: FetchClient (to support streaming on web)

Custom HTTP client

You can always provide your own implementation of http.Client for further customization:

final client = OllamaClient(
  client: MyHttpClient(),
);
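
`MyHttpClient` above is a placeholder for your own class. One way it might look, as a sketch that extends `BaseClient` from `package:http` and adds fixed headers to every request; the `inner` and `extraHeaders` parameters are assumptions made for this illustration:

```dart
import 'package:http/http.dart' as http;

/// Hypothetical custom client: adds fixed headers to every request and
/// delegates the actual network call to an inner client.
class MyHttpClient extends http.BaseClient {
  MyHttpClient({http.Client? inner, this.extraHeaders = const {}})
      : _inner = inner ?? http.Client();

  final http.Client _inner;
  final Map<String, String> extraHeaders;

  @override
  Future<http.StreamedResponse> send(http.BaseRequest request) {
    // Every convenience method (get, post, ...) funnels through send().
    request.headers.addAll(extraHeaders);
    return _inner.send(request);
  }

  @override
  void close() {
    _inner.close();
    super.close();
  }
}
```

Overriding only `send` is enough because `BaseClient` implements the other request methods on top of it.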

Using a proxy

HTTP proxy

You can use your own HTTP proxy by overriding the baseUrl and providing your required headers:

final client = OllamaClient(
  baseUrl: 'https://my-proxy.com',
  headers: {
      'x-my-proxy-header': 'value',
  },
);

If you need further customization, you can always provide your own http.Client.

SOCKS5 proxy

To use a SOCKS5 proxy, you can use the socks5_proxy package:

final baseHttpClient = HttpClient();
SocksTCPClient.assignToHttpClient(baseHttpClient, [
  ProxySettings(InternetAddress.loopbackIPv4, 1080),
]);
// Wrap the proxied dart:io client so it can be used as an http.Client.
final httpClient = IOClient(baseHttpClient);

final client = OllamaClient(
  client: httpClient,
);

Acknowledgements

The generation of this client was made possible by the openapi_spec package.

License

Ollama Dart Client is licensed under the MIT License.
