llama_model_quantize_params class final

Parameters that control model quantization.

Properties

allow_requantize bool
Allow quantizing tensors that are not f32 or f16.
getter/setter pair
ftype int
The llama_ftype to quantize to.
getter/setter pair
hashCode int
The hash code for this object.
no setter, inherited
nthread int
Number of threads to use for quantizing. If the value is <= 0, std::thread::hardware_concurrency() is used instead.
getter/setter pair
only_copy bool
Only copy tensors. When set, ftype, allow_requantize and quantize_output_tensor are ignored.
getter/setter pair
pure bool
Disable k-quant mixtures and quantize all tensors to the same type.
getter/setter pair
quantize_output_tensor bool
Quantize the output.weight tensor.
getter/setter pair
runtimeType Type
A representation of the runtime type of the object.
no setter, inherited
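
The nthread and only_copy descriptions above interact with the other fields. The following is a minimal sketch of one way to read them, not code from the binding. QuantizeSettings, effectiveThreads and ignoredFields are hypothetical names introduced only for illustration. Platform.numberOfProcessors is an assumed stand-in for std::thread::hardware_concurrency(), which the native library evaluates itself.

```dart
import 'dart:io' show Platform;

/// Hypothetical plain-Dart mirror of the documented fields.
/// This is NOT the binding's llama_model_quantize_params class.
class QuantizeSettings {
  final int nthread;
  final int ftype;
  final bool allowRequantize;
  final bool quantizeOutputTensor;
  final bool onlyCopy;
  final bool pure;

  const QuantizeSettings({
    required this.nthread,
    required this.ftype,
    required this.allowRequantize,
    required this.quantizeOutputTensor,
    required this.onlyCopy,
    required this.pure,
  });
}

/// Thread count that would be used: values <= 0 fall back to the
/// detected processor count (assumed stand-in for hardware_concurrency).
int effectiveThreads(int nthread) =>
    nthread > 0 ? nthread : Platform.numberOfProcessors;

/// Documented field names that have no effect for the given settings.
List<String> ignoredFields(QuantizeSettings s) => s.onlyCopy
    ? const ['ftype', 'allow_requantize', 'quantize_output_tensor']
    : const [];

void main() {
  const s = QuantizeSettings(
    nthread: 0, // <= 0 requests the automatic thread count
    ftype: 0, // placeholder value, ignored here because onlyCopy is true
    allowRequantize: false,
    quantizeOutputTensor: true,
    onlyCopy: true,
    pure: false,
  );
  print('threads: ${effectiveThreads(s.nthread)}');
  print('ignored: ${ignoredFields(s)}');
}
```

A check like ignoredFields can be useful before calling into the native library, to warn when a caller sets ftype together with only_copy and expects both to take effect.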

Methods

noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited

Operators

operator ==(Object other) → bool
The equality operator.
inherited