llamacpp_tools 0.3.0
Tools to manage a local llama.cpp setup (detecting, downloading or building, and running).
Changelog
0.3.0
Breaking changes:
- LlamaserverConfig.flashAttention is now an enum, FlashAttention.
New features:
- LlamaserverSpec now supports ProcessSwitcher (see package:process_visor) and lookup from LlamaserverSpecRegistry (to implement a llama-swap alternative).
- Supports detecting optimal parameters for models that do not fit into VRAM (via the CLI or ModelDetector).
0.2.0
Breaking changes:
- GitHub methods are moved into the LlamacppGithub class and renamed.
- Docker-builder methods are moved into the LlamacppDocker class and renamed.
0.1.2
- Improved CUDA build (copying runtime libraries).
- Small improvements in LlamacppDir and the server process.
0.1.1
- Added support for running the llama-server process.
0.1.0
- Initial release with downloading from GitHub and building with CUDA support via Docker.