# dart_llama 0.2.0
A Dart package for interfacing with llama.cpp models using FFI
# Changelog

## 0.2.0 - 2025-01-21

### Added
- **Dart CLI Build Tool** (`dart_llama_tool`)
  - `global-install` - One-command global installation: builds, installs library, activates globally
  - `setup` - Complete setup: builds llama.cpp, wrapper, regenerates FFI bindings, downloads model, runs tests
  - `build` - Build both llama.cpp and wrapper libraries
  - `build-llama` - Build only llama.cpp library (supports `--static` flag)
  - `build-wrapper` - Build only the wrapper library (supports `--static` flag)
  - `compile` - Compile CLI tools to native executables with bundled libraries
  - `install-lib` - Install wrapper library to `~/.dart_llama/` for global CLI usage
  - `ffigen` - Regenerate FFI bindings
  - `download-model` - Download Gemma 3 1B model for testing
  - `clean` - Remove built libraries, static libraries, dist folder, and llama.cpp source
- **Static Linking Support**
  - `build-llama --static` builds llama.cpp as a static library
  - `build-wrapper --static` links the wrapper with llama.cpp statically
  - A single `libllama_wrapper.dylib` contains all code for easier distribution
  - No dependency on a separate `libllama.dylib` when using static linking
- **Native Executable Compilation**
  - `compile` command creates a `dist/` folder with native executables
  - Bundles executables with the required dynamic library
- **Globally Installable CLI Tools**
  - `ldcompletion` - Text completion CLI with streaming support
  - `ldchat` - Interactive Gemma chat CLI with streaming support
- **Context Management** (usage sketch below)
  - `clearContext()` method to reset the KV cache between chat turns
  - Fixes the context overflow error in long conversations
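
  A minimal sketch of a multi-turn loop using the new method. `clearContext()` is from this release; the `LlamaModel` constructor parameter, `generate`, `prompt`, and `response.text` are assumptions about the surrounding API, not confirmed signatures.

  ```dart
  import 'package:dart_llama/dart_llama.dart';

  void main() {
    // Hypothetical constructor parameter; consult the package docs.
    final model = LlamaModel(modelPath: 'models/gemma-3-1b.gguf');
    for (final turn in ['Hello!', 'Explain FFI in one line.']) {
      // Hypothetical call shape for a single chat turn.
      final response = model.generate(GenerationRequest(prompt: turn));
      print(response.text);
      // New in 0.2.0: reset the KV cache so long chats no longer overflow.
      model.clearContext();
    }
    model.dispose();
  }
  ```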
- **Typed Exception Hierarchy** (example below)
  - `LlamaException` - sealed base class for all llama-related errors
  - `ModelLoadException` - thrown when a model fails to load
  - `ContextCreationException` - thrown when context creation fails
  - `TokenizationException` - thrown when tokenization fails
  - `PromptTooLongException` - thrown when a prompt exceeds limits
  - `ContextOverflowException` - thrown when the context window fills up
  - `DecodeException` - thrown when decoding fails
  - Enables pattern matching for error handling instead of string matching
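
  A sketch of exhaustive error handling enabled by the sealed base class, assuming the subtypes listed above are the complete set and that `generate` and `GenerationResponse.text` look as shown:

  ```dart
  import 'package:dart_llama/dart_llama.dart';

  String? tryGenerate(LlamaModel model, String prompt) {
    try {
      // Hypothetical call shape; see the package docs for the real API.
      return model.generate(GenerationRequest(prompt: prompt)).text;
    } on LlamaException catch (e) {
      // Because LlamaException is sealed, the compiler can check that
      // this switch covers every subtype.
      switch (e) {
        case PromptTooLongException():
          return null; // caller should shorten the prompt
        case ContextOverflowException():
          model.clearContext(); // reset the KV cache; the caller may retry
          return null;
        case ModelLoadException():
        case ContextCreationException():
        case TokenizationException():
        case DecodeException():
          rethrow;
      }
    }
  }
  ```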
- **Comprehensive Test Suite** (test sketch below)
  - Unit tests for all data models and exception classes
  - Tests for `LlamaModel` lifecycle and error handling
  - Tests for the `clearContext()` method
  - Tests for stop sequences in both generate and stream modes
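
  A minimal example of the kind of test this release adds, written with `package:test`; the `modelPath` parameter name is hypothetical:

  ```dart
  import 'package:dart_llama/dart_llama.dart';
  import 'package:test/test.dart';

  void main() {
    test('loading a missing model throws ModelLoadException', () {
      expect(
        () => LlamaModel(modelPath: 'no_such_model.gguf'),
        throwsA(isA<ModelLoadException>()),
      );
    });
  }
  ```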
### Changed
- **Project Structure Reorganization**
  - Moved `llama_wrapper.c` and `llama_wrapper.h` to the `native/` directory
  - Updated build scripts and ffigen configuration for the new paths
- **llama.cpp Management**
  - Pinned llama.cpp to version b7783 for reproducible builds
  - llama.cpp is now fetched during build (not stored in the repo)
  - Uses a shallow clone for faster downloads
- **Build System**
  - Replaced bash scripts with a Dart CLI tool using the `args` package
  - Removed the `scripts/` directory (bash scripts)
## 0.1.2 - 2025-08-05

### Changed
- Renamed `example/completion.dart` to `example/main.dart` for better pub.dev scoring
- Added comprehensive documentation for the `GenerationResponse` class and all its properties
## 0.1.1 - 2025-08-04

### Documentation
- Added comprehensive documentation comments to all public API elements
- Added library-level documentation with examples and getting started guide
- Documented all `GenerationRequest` parameters with usage guidance
- Documented all `LlamaConfig` parameters with recommendations
- Improved overall API documentation coverage from 26.5% to 100%
## 0.1.0 - 2025-08-04

### Initial Release
- **Core Features**
  - FFI-based Dart bindings for llama.cpp
  - Low-level `LlamaModel` API for direct text generation control
  - Support for loading GGUF model files
  - Automatic memory management with proper cleanup
  - Real-time streaming support with token-by-token generation
  - Configurable stop sequences for controlling generation boundaries
- **API Features** (example below)
  - `LlamaModel` - Main class for model initialization and text generation
  - `GenerationRequest` - Configurable generation parameters
  - `GenerationResponse` - Detailed generation results with token counts
  - Streaming and non-streaming generation modes
  - Temperature, top-p, top-k, and repeat penalty sampling controls
  - Random seed support for reproducible generation
  - Token counting and generation time tracking
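
  A sketch of a non-streaming request exercising these controls. The three class names come from this release; the parameter and property names (`modelPath`, `temperature`, `topP`, `topK`, `repeatPenalty`, `seed`, `text`) are assumptions:

  ```dart
  import 'package:dart_llama/dart_llama.dart';

  void main() {
    final model = LlamaModel(modelPath: 'models/gemma-3-1b.gguf');
    final response = model.generate(
      GenerationRequest(
        prompt: 'Write a haiku about FFI.',
        temperature: 0.7, // higher values give more varied output
        topP: 0.9, // nucleus sampling cutoff
        topK: 40, // sample only from the 40 most likely tokens
        repeatPenalty: 1.1, // discourage verbatim repetition
        seed: 42, // fixed seed for reproducible generation
      ),
    );
    print(response.text);
    model.dispose();
  }
  ```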
- **Stop Sequence Support** (streaming example below)
  - Configurable stop sequences in `GenerationRequest`
  - Proper handling of stop sequences split across multiple tokens
  - Automatic trimming of stop sequences from output
  - Works correctly in both streaming and non-streaming modes
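
  A streaming sketch with a stop sequence; the `stopSequences` parameter name and the `stream` method are assumptions based on the feature list:

  ```dart
  import 'dart:io';

  import 'package:dart_llama/dart_llama.dart';

  Future<void> main() async {
    final model = LlamaModel(modelPath: 'models/gemma-3-1b.gguf');
    final request = GenerationRequest(
      prompt: 'User: What is a GGUF file?\nAssistant:',
      // Generation halts when this appears, even if it is split across
      // several tokens, and the sequence is trimmed from the output.
      stopSequences: ['User:'],
    );
    // Hypothetical streaming entry point yielding tokens as they decode.
    await for (final token in model.stream(request)) {
      stdout.write(token);
    }
    model.dispose();
  }
  ```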
- **Memory Management** (pattern shown below)
  - Fixed a double-free error in sampler disposal
  - Proper lifecycle management for all native resources
  - RAII pattern with reliable `dispose()` methods
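
  The pattern this implies: wrap use in `try`/`finally` so native resources are released even when generation throws. Constructor and call shapes are assumptions:

  ```dart
  import 'package:dart_llama/dart_llama.dart';

  void main() {
    final model = LlamaModel(modelPath: 'models/gemma-3-1b.gguf');
    try {
      final response = model.generate(GenerationRequest(prompt: 'Hi'));
      print(response.text);
    } finally {
      // Always release native resources, even on an exception path.
      model.dispose();
    }
  }
  ```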
- **Examples**
  - `example/completion.dart` - Simple text completion with streaming support
  - `example/gemma_chat.dart` - Full Gemma chat implementation with proper formatting
- **Developer Experience**
  - Automated FFI binding generation with ffigen
  - Comprehensive build scripts for llama.cpp compilation
  - Model download script for testing (Gemma 3 1B)
  - Unit and integration tests
  - Code quality enforcement with very_good_analysis
- **Platform Support**
  - macOS (ARM64 and x86_64) with Metal acceleration
  - Linux (x86_64)
  - Windows (x86_64) - untested