ChatOllamaOptions class
Options to pass into ChatOllama.
For a complete list of supported models and model variants, see the Ollama model library.
- Annotations
-
- @immutable
Constructors
- ChatOllamaOptions({String? model, OllamaResponseFormat? format, int? keepAlive, int? numKeep, int? seed, int? numPredict, int? topK, double? topP, double? minP, double? tfsZ, double? typicalP, int? repeatLastN, double? temperature, double? repeatPenalty, double? presencePenalty, double? frequencyPenalty, int? mirostat, double? mirostatTau, double? mirostatEta, bool? penalizeNewline, List<String>? stop, bool? numa, int? numCtx, int? numBatch, int? numGpu, int? mainGpu, bool? lowVram, bool? f16KV, bool? logitsAll, bool? vocabOnly, bool? useMmap, bool? useMlock, int? numThread, List<ToolSpec>? tools, ChatToolChoice? toolChoice, int concurrencyLimit = 1000})
-
Options to pass into ChatOllama.
const
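These options are typically passed as the defaultOptions of a ChatOllama instance. A minimal sketch (the model tag 'llama3.2' is an assumption; any locally pulled model works):

```dart
import 'package:langchain_ollama/langchain_ollama.dart';

Future<void> main() async {
  // Default options applied to every call made through this model instance.
  final chatModel = ChatOllama(
    defaultOptions: const ChatOllamaOptions(
      model: 'llama3.2', // assumed model tag
      temperature: 0.8,
      numCtx: 4096,
    ),
  );

  final res = await chatModel.invoke(
    PromptValue.string('Why is the sky blue?'),
  );
  print(res.output.content);
}
```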
Properties
- concurrencyLimit → int
-
The maximum number of concurrent calls that the runnable can make.
Defaults to 1000 (different Runnable types may have different defaults).
final, inherited
- f16KV → bool?
-
Enable f16 key/value.
(Default: true)
final
- format → OllamaResponseFormat?
-
The format to return a response in. Currently the only accepted value is
json.
final
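For example, to request a JSON response (a sketch; in practice it also helps to instruct the model in the prompt itself to produce JSON):

```dart
const options = ChatOllamaOptions(
  model: 'llama3.2', // assumed model tag
  format: OllamaResponseFormat.json,
);
```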
- frequencyPenalty → double?
-
Positive values penalize new tokens based on their existing frequency in
the text so far, decreasing the model's likelihood to repeat the same
line verbatim.
final
- hashCode → int
-
The hash code for this object.
no setter
- keepAlive → int?
-
How long (in minutes) to keep the model loaded in memory.
final
- logitsAll → bool?
-
Enable logits all.
(Default: false)
final
- lowVram → bool?
-
Enable low VRAM mode.
(Default: false)
final
- mainGpu → int?
-
The GPU to use for the main model.
(Default: 0)
final
- minP → double?
-
An alternative to topP that aims to ensure a balance of quality and
variety. minP represents the minimum probability for a token to be
considered, relative to the probability of the most likely token. For
example, with min_p=0.05 and the most likely token having a probability
of 0.9, logits with a value less than 0.05*0.9=0.045 are filtered out.
(Default: 0.0)
final
- mirostat → int?
-
Enable Mirostat sampling for controlling perplexity.
(Default: 0; 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
final
- mirostatEta → double?
-
Influences how quickly the algorithm responds to feedback from the
generated text. A lower learning rate will result in slower adjustments,
while a higher learning rate will make the algorithm more responsive.
(Default: 0.1)
final
- mirostatTau → double?
-
Controls the balance between coherence and diversity of the output. A
lower value will result in more focused and coherent text.
(Default: 5.0)
final
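The three Mirostat fields work together: mirostat turns the sampler on, while mirostatEta and mirostatTau tune its feedback loop. A sketch with illustrative values (not recommendations):

```dart
const options = ChatOllamaOptions(
  mirostat: 2,      // 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
  mirostatTau: 5.0, // lower = more focused, coherent text
  mirostatEta: 0.1, // lower = slower adjustments to feedback
);
```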
- model → String?
-
ID of the language model to use.
Check the provider's documentation for available models.
final, inherited
- numa → bool?
-
Enable NUMA support.
(Default: false)
final
- numBatch → int?
-
Sets the number of batches to use for generation.
(Default: 1)
final
- numCtx → int?
-
Sets the size of the context window used to generate the next token.
final
- numGpu → int?
-
The number of layers to send to the GPU(s). On macOS it defaults to 1 to
enable Metal support; set it to 0 to disable.
final
- numKeep → int?
-
Number of tokens to keep from the prompt.
(Default: 0)
final
- numPredict → int?
-
Maximum number of tokens to predict when generating text.
(Default: 128, -1 = infinite generation, -2 = fill context)
final
- numThread → int?
-
Sets the number of threads to use during computation. By default, Ollama
will detect this for optimal performance. It is recommended to set this
value to the number of physical CPU cores your system has (as opposed to
the logical number of cores).
final
- penalizeNewline → bool?
-
Penalize newlines in the output.
(Default: true)
final
- presencePenalty → double?
-
Positive values penalize new tokens based on whether they appear in the
text so far, increasing the model's likelihood to talk about new topics.
final
- repeatLastN → int?
-
Sets how far back the model looks to prevent repetition.
(Default: 64, 0 = disabled, -1 = num_ctx)
final
- repeatPenalty → double?
-
Sets how strongly to penalize repetitions. A higher value (e.g., 1.5)
will penalize repetitions more strongly, while a lower value (e.g., 0.9)
will be more lenient.
(Default: 1.1)
final
- runtimeType → Type
-
A representation of the runtime type of the object.
no setter, inherited
- seed → int?
-
Sets the random number seed to use for generation. Setting this to a
specific number will make the model generate the same text for the same
prompt.
(Default: 0)
final
- stop → List<String>?
-
Sequences where the API will stop generating further tokens. The returned
text will not contain the stop sequence.
final
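For example (sketch; the sequences shown are arbitrary):

```dart
// Generation halts before emitting either of these sequences; the stop
// sequence itself is not included in the returned text.
const options = ChatOllamaOptions(
  stop: ['Observation:', '\n\n'],
);
```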
- temperature → double?
-
The temperature of the model. Increasing the temperature will make the
model answer more creatively.
(Default: 0.8)
final
- tfsZ → double?
-
Tail free sampling is used to reduce the impact of less probable tokens
from the output. A higher value (e.g., 2.0) will reduce the impact more,
while a value of 1.0 disables this setting.
(Default: 1.0)
final
- toolChoice → ChatToolChoice?
-
Controls which (if any) tool is called by the model.
final, inherited
- tools → List<ToolSpec>?
-
A list of tools the model may call.
final, inherited
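A sketch of declaring a tool, assuming ToolSpec and the ChatToolChoice.auto factory from langchain_core; the weather tool itself is hypothetical:

```dart
// A hypothetical tool described by a JSON Schema.
const getWeatherTool = ToolSpec(
  name: 'get_current_weather',
  description: 'Returns the current weather for a given location.',
  inputJsonSchema: {
    'type': 'object',
    'properties': {
      'location': {'type': 'string', 'description': 'City and country'},
    },
    'required': ['location'],
  },
);

final options = ChatOllamaOptions(
  model: 'llama3.2', // assumed model tag with tool-calling support
  tools: const [getWeatherTool],
  toolChoice: ChatToolChoice.auto, // let the model decide whether to call it
);
```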
- topK → int?
-
Reduces the probability of generating nonsense. A higher value (e.g., 100)
will give more diverse answers, while a lower value (e.g., 10) will be
more conservative.
(Default: 40)
final
- topP → double?
-
Works together with topK. A higher value (e.g., 0.95) will lead to more
diverse text, while a lower value (e.g., 0.5) will generate more focused
and conservative text.
(Default: 0.9)
final
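seed, temperature, topK, and topP are commonly tuned together; fixing the seed makes the output reproducible for a given prompt. A sketch with illustrative values only:

```dart
// Focused, repeatable output: low temperature, tighter sampling, fixed seed.
const focused = ChatOllamaOptions(
  temperature: 0.2,
  topK: 10,
  topP: 0.5,
  seed: 42,
);

// More diverse, creative output.
const creative = ChatOllamaOptions(
  temperature: 1.0,
  topK: 100,
  topP: 0.95,
);
```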
- typicalP → double?
-
Typical p is used to reduce the impact of less probable tokens from the
output.
(Default: 1.0)
final
- useMlock → bool?
-
Enable mlock.
(Default: false)
final
- useMmap → bool?
-
Enable mmap.
(Default: false)
final
- vocabOnly → bool?
-
Enable vocab only.
(Default: false)
final
Methods
- copyWith({String? model, OllamaResponseFormat? format, int? keepAlive, int? numKeep, int? seed, int? numPredict, int? topK, double? topP, double? minP, double? tfsZ, double? typicalP, int? repeatLastN, double? temperature, double? repeatPenalty, double? presencePenalty, double? frequencyPenalty, int? mirostat, double? mirostatTau, double? mirostatEta, bool? penalizeNewline, List<String>? stop, bool? numa, int? numCtx, int? numBatch, int? numGpu, int? mainGpu, bool? lowVram, bool? f16KV, bool? logitsAll, bool? vocabOnly, bool? useMmap, bool? useMlock, int? numThread, List<ToolSpec>? tools, ChatToolChoice? toolChoice, int? concurrencyLimit}) → ChatOllamaOptions
-
Creates a copy of this RunnableOptions with the given fields replaced by the new values.
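For example (sketch):

```dart
const base = ChatOllamaOptions(model: 'llama3.2', temperature: 0.8);

// Identical to `base` except for the temperature; all other fields are kept.
final lowTemp = base.copyWith(temperature: 0.1);
```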
- merge(covariant ChatOllamaOptions? other) → ChatOllamaOptions
-
Merges this RunnableOptions with another RunnableOptions.
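A sketch of the typical merge pattern, assuming the usual semantics where non-null fields of the other options take precedence:

```dart
const defaults = ChatOllamaOptions(model: 'llama3.2', temperature: 0.8);
const overrides = ChatOllamaOptions(temperature: 0.2);

// Fields set on `overrides` win; unset (null) fields fall back to `defaults`.
final merged = defaults.merge(overrides); // model: 'llama3.2', temperature: 0.2
```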
- noSuchMethod(Invocation invocation) → dynamic
-
Invoked when a nonexistent method or property is accessed.
inherited
- toString() → String
-
A string representation of this object.
inherited
Operators
- operator ==(covariant ChatOllamaOptions other) → bool
-
The equality operator.