llama_model_quantize_params class (final)

Model quantization parameters.

Properties

address → Pointer<T>: Available on T, provided by the StructAddress extension
The memory address of the underlying data.
no setter
allow_requantize ↔ bool: allow quantizing non-f32/f16 tensors
getter/setter pair
ftype ↔ int: quantize to this llama_ftype
getter/setter pair
hashCode → int: The hash code for this object.
no setter, inherited
nthread ↔ int: number of threads to use for quantizing; if <= 0, std::thread::hardware_concurrency() is used
getter/setter pair
only_copy ↔ bool: only copy tensors; ftype, allow_requantize and quantize_output_tensor are ignored
getter/setter pair
pure ↔ bool: disable k-quant mixtures and quantize all tensors to the same type
getter/setter pair
quantize_output_tensor ↔ bool: quantize output.weight
getter/setter pair
runtimeType → Type: A representation of the runtime type of the object.
no setter, inherited

Methods

noSuchMethod(Invocation invocation) → dynamic: Invoked when a nonexistent method or property is accessed.
inherited
toString() → String: A string representation of this object.
inherited
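
The getter/setter pairs listed above could be used roughly as follows. This is only a sketch: QuantizeParamsSketch is a hypothetical stand-in for the generated class so that the example is self-contained, its field order and native widths are assumptions, assumedFtype is a placeholder for a llama_ftype value, and package:ffi is assumed to be available for allocation.

```dart
import 'dart:ffi';

import 'package:ffi/ffi.dart'; // assumption: package:ffi provides calloc

// Hypothetical stand-in for the generated binding class. Field order and
// native widths are assumptions; real code would use
// llama_model_quantize_params from the binding instead.
final class QuantizeParamsSketch extends Struct {
  @Int32()
  external int nthread;

  @Int32()
  external int ftype;

  @Bool()
  external bool allow_requantize;

  @Bool()
  external bool quantize_output_tensor;

  @Bool()
  external bool only_copy;

  @Bool()
  external bool pure;
}

// Placeholder for a llama_ftype value; check the enum in your binding
// before relying on any particular number.
const int assumedFtype = 7;

// Fills every documented field through its setter.
void configure(QuantizeParamsSketch params) {
  params
    ..nthread = 0 // <= 0: the native side picks the thread count
    ..ftype = assumedFtype
    ..allow_requantize = false
    ..quantize_output_tensor = true
    ..only_copy = false // if true, the three fields above are ignored
    ..pure = false; // keep k-quant mixtures enabled
}

void main() {
  // Zero-initialized native memory holding one struct.
  final ptr = calloc<QuantizeParamsSketch>();
  try {
    configure(ptr.ref);
    print('nthread=${ptr.ref.nthread} ftype=${ptr.ref.ftype}');
  } finally {
    calloc.free(ptr); // native memory is not garbage collected
  }
}
```

With the real binding, the same pointer would presumably be handed to whichever quantize entry point the package exposes; the address property listed above gives access to the underlying memory.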