llama_set_n_threads method

void llama_set_n_threads(
  Pointer<llama_context> ctx,
  int n_threads,
  int n_threads_batch
)

Set the number of threads used for decoding. n_threads is the number of threads used for generation (a single token at a time); n_threads_batch is the number of threads used for prompt and batch processing (multiple tokens at once).
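A minimal usage sketch. The import path and the way ctx is obtained are assumptions that depend on how your llama.cpp bindings were generated; only llama_set_n_threads itself comes from this page.

```dart
import 'dart:ffi' as ffi;
import 'dart:io';

// Hypothetical import of the generated bindings; the actual file name
// depends on your ffigen configuration.
import 'llama_bindings.dart';

void configureThreads(ffi.Pointer<llama_context> ctx) {
  // Use all cores for prompt/batch processing (many tokens per call),
  // but fewer threads for single-token generation, where per-token
  // synchronization overhead can outweigh the parallelism gains.
  final cores = Platform.numberOfProcessors;
  llama_set_n_threads(ctx, cores ~/ 2, cores);
}
```

Thread counts are a tuning knob rather than a fixed rule: the best values vary with hardware, model size, and batch size, so it is worth benchmarking a few settings.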

Implementation

void llama_set_n_threads(
  ffi.Pointer<llama_context> ctx,
  int n_threads,
  int n_threads_batch,
) {
  return _llama_set_n_threads(
    ctx,
    n_threads,
    n_threads_batch,
  );
}