flutter_local_agent_kit 1.1.0
Offline-first AI framework for Flutter. Resident LLM inference, local RAG, and autonomous agents with a Material 3 Chat UI.
[Flutter Local Agent Kit Logo]
# Flutter Local Agent Kit
Flutter Local Agent Kit is the professional-grade toolkit for building high-performance, completely offline AI agents in Flutter. Orchestrate Local LLMs, Private RAG (Retrieval-Augmented Generation), and autonomous tool-calling loops with zero cloud dependency and total user privacy.
## 🏛️ Vision
In the era of privacy-conscious software, the Flutter Local Agent Kit empowers developers to move AI inference from the cloud to the edge. By combining industry-leading local inference engines with a developer-first API, this package makes it trivial to build production-ready AI features that work anywhere, anytime, without API keys or recurring costs.
## 🔑 Key Pillars

- 🚀 **Performance-First:** Native C++ backends (via `llamadart` and `mobile_rag_engine`) ensure desktop-class inference speeds on mobile hardware.
- 🛡️ **Privacy-Centric:** Data never leaves the device. Vector databases, model weights, and conversation history are all stored locally, with optional encryption.
- 🧠 **Autonomous Intelligence:** Built-in ReAct agent loops allow your AI not just to "chat" but to "act", calling Dart functions to interact with the device.
- 🎨 **UI Ready:** Ship instantly with the premium `AgentChatView`, a Material 3 component with markdown, streaming, and suggestion support.
## 📚 Table of Contents

- Quick Start
- Core Concepts
- Usage Guide
- Model Management
- Platform Support
- Roadmap
- License & Team
## 🚀 Quick Start
### 1. Add Dependency

```yaml
dependencies:
  flutter_local_agent_kit: ^1.1.0
```
### 2. Initialize the Engine

```dart
final kit = FlutterLocalAgentKit();
await kit.initialize(
  modelPath: '/path/to/llama-3.2-1b.gguf',
  template: Llama3Template(), // Or GemmaTemplate(), MistralTemplate(), etc.
);
```
### 3. Build the UI

```dart
AgentChatView(
  onMessage: (query) => kit.runAgent(query),
)
```
## 📖 Core Concepts
### LLM Engine
The heart of the kit is a highly optimized GGUF inference engine. It supports 4-bit and 8-bit quantized models, allowing flagship-level performance (45+ tokens/sec) on modern mobile NPUs/GPUs.
### RAG (Retrieval-Augmented Generation)
Connect the AI to your own private data. The RAG service indexes local files (.pdf, .txt, .json) into a high-performance vector database, providing the AI with "hidden context" before it answers.
### ReAct Agents
The kit implements the Reason + Act (ReAct) paradigm. When an agent receives a query, it follows a Thought -> Action -> Observation cycle, calling Dart "Tools" to retrieve real-time data from the device or OS.
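The cycle described above can be sketched in plain Dart. This is an illustrative, self-contained sketch of the ReAct pattern itself, not the package's internal implementation; the `llm` callback and the regex-based `Action: tool[input]` format are assumptions made for the example.

```dart
// Minimal sketch of a Thought -> Action -> Observation loop (ReAct pattern).
// NOT the package's internal code; all names here are hypothetical.
typedef Tool = Future<String> Function(String input);

Future<String> reactLoop(
  String query,
  Future<String> Function(String prompt) llm, // e.g. backed by the local model
  Map<String, Tool> tools, {
  int maxSteps = 5,
}) async {
  var scratchpad = 'Question: $query';
  for (var step = 0; step < maxSteps; step++) {
    final thought = await llm(scratchpad); // Thought
    // A real parser would be more robust; this expects "Action: name[input]".
    final match = RegExp(r'Action: (\w+)\[(.*)\]').firstMatch(thought);
    if (match == null) return thought; // No action requested: final answer.
    final tool = tools[match.group(1)];
    final observation = tool == null
        ? 'Unknown tool "${match.group(1)}".'
        : await tool(match.group(2)!); // Observation
    scratchpad += '\n$thought\nObservation: $observation';
  }
  return 'Stopped after $maxSteps steps without a final answer.';
}
```

The observation is appended to the scratchpad so the next "Thought" step can reason over the tool's result, which is what lets the agent chain multiple actions.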
## 🛠️ Usage Guide
### Basic Inference
For simple request-response chat without autonomous agents:
```dart
kit.askStream(
  "Hello, who are you?",
  maxTokens: 500, // Optional limit
).listen((token) {
  print(token);
});
```
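If you want the full reply as a single string rather than incremental tokens, the stream can be folded with `await for`. This is a small convenience sketch, assuming `askStream` yields `String` tokens as in the example above:

```dart
// Hypothetical helper: accumulate the streamed tokens into one reply.
Future<String> askOnce(FlutterLocalAgentKit kit, String prompt) async {
  final buffer = StringBuffer();
  await for (final token in kit.askStream(prompt, maxTokens: 500)) {
    buffer.write(token); // Append each token as it arrives.
  }
  return buffer.toString();
}
```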
### RAG Ingestion
Automatically parse and index complex files for local knowledge retrieval:
```dart
await kit.ingestFile('/path/to/document.pdf');
await kit.ingestFile('/path/to/data.json');
// Subsequent queries will now search these files for context automatically.
```
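For a batch of documents, ingestion can be wrapped in a simple loop that tolerates individual failures. `buildKnowledgeBase` is a hypothetical convenience helper, not part of the package API:

```dart
// Hypothetical batch-ingestion helper built on the ingestFile API above.
Future<void> buildKnowledgeBase(
  FlutterLocalAgentKit kit,
  List<String> paths,
) async {
  for (final path in paths) {
    try {
      await kit.ingestFile(path); // Parse and index into the local vector DB.
    } on Exception catch (e) {
      // A corrupt or unsupported file shouldn't abort the whole batch.
      print('Skipping $path: $e');
    }
  }
}
```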
### Autonomous Agents
Enable the AI to use local device capabilities through typed Tools:
```dart
class GpsTool extends BaseTool {
  @override
  String get name => 'get_location';

  @override
  String get description => 'Retrieves current GPS coordinates.';

  @override
  Future<String> execute(Map<String, dynamic> input) async {
    return "Lat: 40.7128, Long: -74.0060";
  }
}

kit.runAgent(
  "Where am I right now?",
  customTools: [GpsTool()],
  maxTokens: 1024,
);
```
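Tools can also validate structured input. The sketch below assumes the agent passes parsed arguments through the `input` map; the `{"minutes": <int>}` schema and the `TimerTool` class are illustrative assumptions, so check the `BaseTool` documentation for the exact contract:

```dart
// Hypothetical tool with a structured, validated input schema.
class TimerTool extends BaseTool {
  @override
  String get name => 'set_timer';

  @override
  String get description =>
      'Sets a countdown timer. Input: {"minutes": <int>}.';

  @override
  Future<String> execute(Map<String, dynamic> input) async {
    final minutes = input['minutes'];
    if (minutes is! int || minutes <= 0) {
      // Returning an error string lets the agent self-correct next step.
      return 'Error: "minutes" must be a positive integer.';
    }
    // ... schedule a platform timer here ...
    return 'Timer set for $minutes minutes.';
  }
}
```

Returning a readable error string instead of throwing keeps the failure inside the ReAct loop, where the model can observe it and retry with corrected arguments.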
### Session Persistence
Save and load conversations with a single call to maintain state across app restarts:
```dart
// Load previous history
final history = await kit.loadSession('user_chat_01');

// Display in UI and auto-save
AgentChatView(
  initialHistory: history,
  onHistoryChanged: (history) => kit.saveSession('user_chat_01', history),
)
```
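To restore the previous conversation when the app starts, the asynchronous load can be wired through a standard Flutter `FutureBuilder`. The `ChatMessage` element type below is an assumption for illustration; substitute whatever type `loadSession` actually returns:

```dart
// Sketch: restore persisted history before showing the chat view.
FutureBuilder<List<ChatMessage>>(
  future: kit.loadSession('user_chat_01'),
  builder: (context, snapshot) {
    if (!snapshot.hasData) {
      // Still loading the persisted history from disk.
      return const Center(child: CircularProgressIndicator());
    }
    return AgentChatView(
      initialHistory: snapshot.data!,
      onHistoryChanged: (h) => kit.saveSession('user_chat_01', h),
    );
  },
)
```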
## 🔧 Model Management
The `ModelManager` provides enterprise-grade tools for handling heavy model weights:
- **Checksum Verification:** Ensure GGUF files are bit-perfect using SHA-256.
- **Background Downloads:** Integrated `Dio` support with progress tracking.
- **Clean Storage:** Automated management of the `/models` directory.
```dart
final definition = kit.models.recommendedModels.first;
await kit.models.downloadModel(
  definition,
  onProgress: (p) => print('Download: ${p * 100}%'),
);
```
## 📱 Platform Support
| Feature | Android | iOS | macOS | Windows | Linux |
|---|---|---|---|---|---|
| LLM Inference | ✅ | ✅ | ✅ | ✅ | ✅ |
| RAG Retrieval | ✅ | ✅ | 🚧 | 🚧 | 🚧 |
| Autonomous Agents | ✅ | ✅ | ✅ | ✅ | ✅ |
| Material 3 UI | ✅ | ✅ | ✅ | ✅ | ✅ |
> **Note:** RAG support for desktop platforms is currently in development and will be released in an upcoming patch.
## 🗺️ Roadmap
- ✅ Native CoreML / NNAPI inference acceleration
- ✅ Multi-agent orchestration (AgentOrchestrator)
- ✅ Persistence & Session Management
- ✅ RAG File Parsing (.pdf, .json, .txt)
- [/] Multimodal support (foundations laid, UI started)
- ✅ Native PDF rendering in ChatView
## 📜 License & Team
Built with ❤️ by the Flutter community. Licensed under MIT.
This package is a community-driven initiative inspired by Google's commitment to high-performance edge AI.