LlmInferenceOptions class

Configuration object for LLM inference.

This IO-friendly implementation is deliberately mutable so that it can track whether its native memory has been allocated and later released. All values that pkg:equatable uses for equality are nonetheless immutable.



LlmInferenceOptions.cpu({required String modelPath, required String cacheDir, required int maxTokens, required double temperature, required int topK, int? randomSeed})
LlmInferenceOptions.gpu({required String modelPath, required int sequenceBatchSize, required int maxTokens, required double temperature, required int topK, int decodeStepsPerSync = 3, int? randomSeed})
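The two named constructors above differ only in their hardware-specific parameters (cacheDir for CPU; sequenceBatchSize and decodeStepsPerSync for GPU). The following is a minimal sketch of calling each one. The import path, the file paths, and the helper names buildCpuOptions and buildGpuOptions are assumptions made for illustration; the numeric values are arbitrary examples, not recommended settings.

```dart
// Assumption: the class is exported from this library path.
import 'package:mediapipe_genai/mediapipe_genai.dart';

/// Hypothetical helper: options for a CPU model.
LlmInferenceOptions buildCpuOptions() => LlmInferenceOptions.cpu(
      modelPath: '/path/to/cpu-model.tflite', // hypothetical path
      cacheDir: '/path/to/writable/cache', // must be writable by the program
      maxTokens: 512,
      temperature: 0.8,
      topK: 40,
      // randomSeed is optional and omitted here.
    );

/// Hypothetical helper: options for a GPU model.
LlmInferenceOptions buildGpuOptions() => LlmInferenceOptions.gpu(
      modelPath: '/path/to/gpu-model.tflite', // hypothetical path
      sequenceBatchSize: 0, // 0 lets the batch size be chosen programmatically
      maxTokens: 512,
      temperature: 0.8,
      topK: 40,
      // decodeStepsPerSync is omitted, so it takes its default of 3.
      randomSeed: 42,
    );
```

Keeping construction in small helpers like these makes it easy to choose the CPU or GPU variant in one place.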


cacheDir String
Directory path for storing the model-related tokenizer and cached weights. The caller is responsible for providing a directory that the program can write to. Used by CPU only.
decodeStepsPerSync int
Number of decode steps per sync. Used by GPU only. The default value is 3.

hashCode int
The hash code for this object.
isClosed bool
Tracks whether dispose has been called.
no setter
loraPath String
Path to the LoRA tflite flatbuffer file. Optional (default is empty string). This is only compatible with GPU models.
maxTokens int
The total length of the KV cache, that is, the maximum combined number of input and output tokens the model can handle.
modelPath String
The path that points to the tflite model file to use for inference.
props List<Object?>
The list of properties that will be used to determine whether two instances are equal.
randomSeed int
Random seed for sampling tokens.

runtimeType Type
A representation of the runtime type of the object.
sequenceBatchSize int
Sequence batch size for encoding, i.e. the number of input tokens processed at a time. Used by GPU only. A value of 1 means encoding and decoding share the same graph with a sequence length of 1. A value of 0 means the batch size is chosen programmatically.

stringify bool?
If set to true, the toString method will be overridden to output this instance's props.
temperature double
Randomness when decoding the next token.
topK int
The number of highest-scoring tokens (the top K) to sample from at each decoding step.
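To make the roles of temperature, topK, and randomSeed concrete, here is a self-contained sketch of conventional top-K sampling with temperature. It is not the engine's actual sampler, only an illustration of what such parameters usually control. The function name sampleTopK and the choice to treat a temperature of 0 or below as greedy decoding are assumptions of this sketch.

```dart
import 'dart:math';

/// Hypothetical illustration: picks a token index from [logits] by keeping
/// the [topK] highest-scoring tokens and sampling among them with softmax
/// weights scaled by [temperature].
int sampleTopK(
  List<double> logits, {
  required int topK,
  required double temperature,
  int? randomSeed,
}) {
  if (logits.isEmpty) {
    throw ArgumentError('logits must not be empty');
  }
  // Rank token indices by logit, highest first.
  final ranked = List<int>.generate(logits.length, (i) => i)
    ..sort((a, b) => logits[b].compareTo(logits[a]));
  // Keep at least one and at most logits.length candidates.
  final k = max(1, min(topK, logits.length));
  final kept = ranked.take(k).toList();

  // Assumption of this sketch: non-positive temperature means greedy decoding.
  if (temperature <= 0) return kept.first;

  // Softmax weights over the kept tokens; subtracting the maximum logit
  // keeps exp() numerically stable.
  final maxLogit = logits[kept.first];
  final weights = [
    for (final i in kept) exp((logits[i] - maxLogit) / temperature),
  ];
  final total = weights.reduce((a, b) => a + b);

  // Draw a point in [0, total) and walk the cumulative weights.
  // A fixed seed makes the draw reproducible.
  var r = Random(randomSeed).nextDouble() * total;
  for (var j = 0; j < kept.length; j++) {
    r -= weights[j];
    if (r <= 0) return kept[j];
  }
  return kept.last;
}
```

In this sketch, a lower temperature concentrates the weights on the best-scoring token, while a larger topK widens the set of candidates that can be drawn.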


copyToNative() → Pointer<LlmSessionConfig>
Copies this options object into native memory for use by an engine.
dispose() → void
Releases the native memory behind this options object.
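Because the options object may hold native memory, it helps to guarantee that dispose runs even when an error occurs. The sketch below wraps that pattern in a helper. The helper name withOptions and the import path are assumptions for illustration. The isClosed guard is a precaution in case another owner, such as an engine, has already disposed the options; whether that happens in practice is not stated on this page.

```dart
// Assumption: the class is exported from this library path.
import 'package:mediapipe_genai/mediapipe_genai.dart';

/// Hypothetical helper: runs [body] with [options], then releases the
/// options' native memory whether or not [body] throws.
T withOptions<T>(
  LlmInferenceOptions options,
  T Function(LlmInferenceOptions options) body,
) {
  try {
    return body(options);
  } finally {
    // Skip dispose if something else has already released the options.
    if (!options.isClosed) {
      options.dispose();
    }
  }
}
```

This helper is synchronous: it disposes the options as soon as body returns. If body starts asynchronous work that still needs the options, an async variant that awaits that work before disposing would be required.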
