memlocal_dart

local AI memory for Flutter apps. give your LLM a brain that remembers across sessions — facts, preferences, relationships, and context — all stored on-device with CozoDB.


what is memlocal?

memlocal is a local-first AI memory layer modelled on human cognitive architecture. it gives any Flutter + LLM app the ability to:

  • remember facts, preferences, experiences, and relationships across sessions
  • search memories with vector similarity, full-text BM25, graph traversal, or hybrid
  • inject context automatically into LLM prompts
  • extract memories from conversations using LLM-driven inference
  • build a knowledge graph that connects related memories

all data stays on-device in an embedded CozoDB database — no cloud sync required.

memory architecture

memlocal mirrors the three-tier human memory model:

| tier | types | TTL |
|---|---|---|
| sensory | sensory buffer | ~5 seconds |
| short-term | working memory, attention context, conversation buffer | ~5 minutes |
| long-term | episodic, semantic, factual, procedural, social, spatial, prospective, affective | permanent |

12 memory subtypes are automatically classified during extraction so you can filter, weight, and reason over different categories of knowledge.
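
as a rough illustration of how per-tier TTLs could drive expiry, here is a minimal sketch. `MemoryTier`, `ttlFor`, and `isExpired` are hypothetical names made up for this example and are not part of the memlocal API; only the durations come from the table above.

```dart
/// Hypothetical tiers for illustration; not memlocal's actual types.
enum MemoryTier { sensory, shortTerm, longTerm }

/// TTL per tier, taken from the table above. `null` means "never expires".
Duration? ttlFor(MemoryTier tier) {
  switch (tier) {
    case MemoryTier.sensory:
      return const Duration(seconds: 5);
    case MemoryTier.shortTerm:
      return const Duration(minutes: 5);
    case MemoryTier.longTerm:
      return null;
  }
}

/// Returns true when an item created at [createdAt] has outlived its tier's TTL.
bool isExpired(MemoryTier tier, DateTime createdAt, DateTime now) {
  final ttl = ttlFor(tier);
  if (ttl == null) return false; // long-term items are permanent
  return now.difference(createdAt) > ttl;
}
```

passing `now` explicitly (rather than calling `DateTime.now()` inside) keeps the check easy to test.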

features

  • LLM-driven extraction — automatically parse facts, preferences, events, and relationships from conversations
  • multi-modal search — semantic (HNSW vector), full-text (BM25), graph traversal, and hybrid search with re-ranking
  • knowledge graph — 10 relationship types (relatesTo, contradicts, supersedes, causedBy, partOf, prefersOver, follows, instanceOf, belongsTo, similarTo)
  • user profiles — automatically maintained from extracted memories (static facts + dynamic context)
  • prospective memory — reminders triggered by topic, time, user presence, or semantic match
  • context builder — assembles a ready-to-inject context block from relevant memories, profile, and attention items
  • tool calling — 9 native tools formatted for Claude, OpenAI, and Gemini APIs
  • deduplication — content-hash dedup with LLM-driven update/merge for conflicting facts
  • on-device — everything stored locally in CozoDB with SQLite persistence
  • multi-provider — works with Anthropic, OpenAI, and Google embedding + LLM APIs

supported providers

LLMs (for extraction)

| provider | models |
|---|---|
| Anthropic | Claude Sonnet 4.6, Claude Haiku 4.5 |
| OpenAI | GPT-5.2, GPT-5 Mini |
| Google | Gemini 3.1 Pro, Gemini 3 Flash |
| custom | any OpenAI-compatible endpoint |

embeddings

| provider | model | dimensions |
|---|---|---|
| OpenAI | text-embedding-3-small | 1536 |
| Google | gemini-embedding-001 | 768 |
| custom | any compatible endpoint | configurable |

note: Anthropic does not offer an embedding API. when using Anthropic LLMs, memlocal defaults to OpenAI embeddings — provide an OpenAI API key via embeddingApiKey.
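
to make the provider pairing concrete, the sketch below maps an LLM provider to an embedding provider and checks a vector's size against the dimensions listed above. the enums and function names are hypothetical stand-ins (the package's own types are `LlmProvider` and `EmbeddingProvider`), and only the Anthropic to OpenAI fallback is stated in this README; the other two pairings are assumptions.

```dart
/// Hypothetical enums for this sketch, not the package's own types.
enum Llm { anthropic, openAi, google }

enum Embedder { openAi, google }

/// Picks an embedding provider for a given LLM provider.
/// Only the Anthropic -> OpenAI fallback comes from the note above;
/// the other two pairings are assumptions.
Embedder defaultEmbedderFor(Llm llm) {
  switch (llm) {
    case Llm.anthropic: // no embedding API of its own
    case Llm.openAi:
      return Embedder.openAi;
    case Llm.google:
      return Embedder.google;
  }
}

/// Vector sizes from the embeddings table above.
int dimensionsFor(Embedder embedder) {
  switch (embedder) {
    case Embedder.openAi:
      return 1536;
    case Embedder.google:
      return 768;
  }
}

/// Throws if [vector] does not match the provider's expected size.
void checkDimensions(Embedder embedder, List<double> vector) {
  final expected = dimensionsFor(embedder);
  if (vector.length != expected) {
    throw ArgumentError('expected $expected dimensions, got ${vector.length}');
  }
}
```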

getting started

1. install

dependencies:
  memlocal_dart: ^1.0.0

flutter pub get

2. initialize

import 'package:memlocal_dart/memlocal_dart.dart';

final mem = await memlocal.init(
  memlocalConfig.withDefaults(
    provider: LlmProvider.claudeSonnet4_6,
    apiKey: 'sk-ant-...',
    embeddingApiKey: 'sk-openai-...', // Required for Anthropic
  ),
);

3. add memories

// automatic extraction — LLM parses facts from text
final result = await mem.add(
  'My name is Sirsho, I live in Bengaluru, and I love hiking.',
  userId: 'sirsho_01',
);
// result.items → [FactualMemory("name: Sirsho"), SpatialMemory("lives in Bengaluru"), ...]

// direct storage — no LLM inference
await mem.add(
  'Prefers dark mode',
  userId: 'sirsho_01',
  type: MemoryType.factual,
  infer: false,
);

4. search

// semantic search (vector similarity)
final results = await mem.search(
  'Where does the user live?',
  mode: SearchMode.semantic,
  userId: 'sirsho_01',
);

// full-text search (BM25)
final textResults = await mem.search(
  'Bengaluru',
  mode: SearchMode.text,
);

// hybrid (semantic + BM25, merged & re-ranked)
final hybrid = await mem.search(
  'outdoor hobbies near Bengaluru',
  mode: SearchMode.hybrid,
  userId: 'sirsho_01',
  k: 5,
);
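
for intuition about what "merged & re-ranked" can mean, here is a small sketch of reciprocal rank fusion, one common way to combine ranked lists. this README does not say which merging strategy memlocal uses, so treat this as an illustration of the general idea; `reciprocalRankFusion` is a hypothetical helper.

```dart
/// Merges several ranked lists of memory IDs with reciprocal rank fusion.
/// Each list is ordered best-first. [k] dampens the influence of top ranks;
/// 60 is a commonly used default.
List<String> reciprocalRankFusion(List<List<String>> rankings, {int k = 60}) {
  final scores = <String, double>{};
  for (final ranking in rankings) {
    for (var i = 0; i < ranking.length; i++) {
      final id = ranking[i];
      // Rank is 1-based, so the top hit contributes 1 / (k + 1).
      scores[id] = (scores[id] ?? 0.0) + 1 / (k + i + 1);
    }
  }
  // Highest fused score first.
  final ids = scores.keys.toList()
    ..sort((a, b) => scores[b]!.compareTo(scores[a]!));
  return ids;
}
```

an ID that appears in both the semantic and the BM25 list accumulates score from each, so it tends to rise above IDs found by only one mode.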

5. context injection

the fastest way to make your LLM context-aware:

// build a context block for the current query
final context = await mem.getContext(
  'What should we do this weekend?',
  userId: 'sirsho_01',
);

// inject into your system prompt
final systemPrompt = '''You are a helpful assistant.

${context.contextBlock}''';
// contextBlock contains relevant memories grouped by type, user profile, and attention items
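
to show roughly what "grouped by type" could look like, the sketch below renders a list of memories into a plain-text block. the `Memory` class, the `buildContextBlock` function, and the output layout are all assumptions for illustration; the actual format of `contextBlock` is not documented here.

```dart
/// Hypothetical memory record for this sketch.
class Memory {
  final String type;
  final String content;
  const Memory(this.type, this.content);
}

/// Groups memories by type (in first-seen order) and renders them as text,
/// optionally preceded by a profile summary.
String buildContextBlock(List<Memory> memories, {String? profileSummary}) {
  final byType = <String, List<String>>{};
  for (final m in memories) {
    byType.putIfAbsent(m.type, () => []).add(m.content);
  }
  final buffer = StringBuffer();
  if (profileSummary != null) {
    buffer.writeln('## user profile');
    buffer.writeln(profileSummary);
  }
  byType.forEach((type, contents) {
    buffer.writeln('## $type');
    for (final c in contents) {
      buffer.writeln('- $c');
    }
  });
  return buffer.toString().trimRight();
}
```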

6. conversation buffer

track the conversation for extraction:

await mem.addMessage(Message.user('I just ran a 10k in Forest Park!'));
await mem.addMessage(Message.assistant('That sounds great! How was the trail?'));

// retrieve recent messages
final messages = await mem.getConversation(limit: 20);
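
a bounded buffer like this can be sketched in a few lines. `BoundedBuffer` is a hypothetical class written for this example, not memlocal's implementation; it only shows the "keep the last N messages" behaviour implied by the `limit` argument and the `conversationBufferSize` setting.

```dart
/// Keeps only the most recent [capacity] entries, dropping the oldest first.
class BoundedBuffer<T> {
  BoundedBuffer(this.capacity) : assert(capacity > 0);

  final int capacity;
  final List<T> _items = [];

  void add(T item) {
    _items.add(item);
    if (_items.length > capacity) {
      _items.removeAt(0); // evict the oldest entry
    }
  }

  /// Returns up to [limit] of the newest entries, oldest first.
  /// [limit] is expected to be non-negative.
  List<T> recent({int? limit}) {
    final n = (limit == null || limit > _items.length) ? _items.length : limit;
    return _items.sublist(_items.length - n);
  }
}
```

`removeAt(0)` is linear in the buffer size, which is fine for a few dozen messages; a queue would be a better fit for large capacities.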

7. knowledge graph

build relationships between memories:

await mem.addEdge(MemoryEdge(
  fromId: runningMemoryId,
  toId: bengaluruMemoryId,
  relation: MemoryRelation.relatesTo,
  weight: 0.9,
));
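
to illustrate how weighted edges could be used at query time, here is a sketch of a breadth-first traversal over an in-memory edge list. `Edge` and `neighbours` are hypothetical names for this example; memlocal stores its graph in CozoDB, and its real traversal logic may differ.

```dart
/// Hypothetical weighted edge between two memory IDs.
class Edge {
  final String from;
  final String to;
  final double weight;
  const Edge(this.from, this.to, this.weight);
}

/// Breadth-first traversal from [start], following edges in both directions
/// up to [maxDepth] hops and ignoring edges lighter than [minWeight].
/// Returns reached IDs in discovery order, excluding [start].
List<String> neighbours(List<Edge> edges, String start,
    {int maxDepth = 2, double minWeight = 0.0}) {
  final adjacency = <String, List<String>>{};
  for (final e in edges) {
    if (e.weight < minWeight) continue;
    adjacency.putIfAbsent(e.from, () => []).add(e.to);
    adjacency.putIfAbsent(e.to, () => []).add(e.from);
  }
  final visited = <String>{start};
  var frontier = <String>[start];
  final reached = <String>[];
  for (var depth = 0; depth < maxDepth && frontier.isNotEmpty; depth++) {
    final next = <String>[];
    for (final node in frontier) {
      for (final other in adjacency[node] ?? const <String>[]) {
        // Set.add returns false for nodes that were already visited.
        if (visited.add(other)) {
          next.add(other);
          reached.add(other);
        }
      }
    }
    frontier = next;
  }
  return reached;
}
```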

8. tool calling

expose memory tools to your LLM for agentic flows:

// get tool definitions in provider-native format
final tools = mem.toolDefinitions;
// pass these to your LLM API call as the `tools` parameter

// when the LLM returns tool calls, execute them:
final results = await mem.executeTools(response.toolCalls);

available tools: add_memory, search_memory, get_memories, delete_memory, get_user_profile, add_relationship, get_relationships, add_reminder, get_context
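
if you want to see the shape of a tool-call loop without the library, the sketch below routes a tool name to a handler. `ToolHandler` and `dispatchTool` are hypothetical and synchronous for brevity; `executeTools` does this work for you, and its internals are not shown in this README.

```dart
/// Hypothetical handler signature: takes decoded JSON arguments and
/// returns a JSON-like result.
typedef ToolHandler = Map<String, Object?> Function(Map<String, Object?> args);

/// Routes a tool call to its handler. Unknown tools and handler failures
/// are reported as an error payload instead of being thrown, so the result
/// can be passed back to the LLM.
Map<String, Object?> dispatchTool(
  Map<String, ToolHandler> handlers,
  String name,
  Map<String, Object?> args,
) {
  final handler = handlers[name];
  if (handler == null) {
    return {'ok': false, 'error': 'unknown tool: $name'};
  }
  try {
    return {'ok': true, 'result': handler(args)};
  } catch (e) {
    return {'ok': false, 'error': e.toString()};
  }
}
```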

configuration

full configuration

final config = memlocalConfig(
  llm: LlmConfig(
    provider: LlmProvider.claudeSonnet4_6,
    apiKey: 'sk-ant-...',
    temperature: 0.1,          // low temp for reliable extraction
    maxTokens: 1024,
    modelOverride: null,       // use provider default
    baseUrlOverride: null,     // use provider default
  ),
  embedding: EmbeddingConfig(
    provider: EmbeddingProvider.openAi,
    apiKey: 'sk-openai-...',
    modelOverride: null,       // text-embedding-3-small
    dimensionsOverride: null,  // 1536
  ),
  storage: StorageConfig(
    inMemory: false,           // SQLite persistence
    dbPath: null,              // auto via path_provider
    hnswM: 16,                 // HNSW graph connectivity
    hnswEfConstruction: 100,   // HNSW construction quality
  ),
  conversationBufferSize: 20,  // keep last 20 messages
  sensoryBufferCapacity: 100,
  sensoryTtl: Duration(seconds: 5),
  customExtractionPrompt: null,  // override the extraction prompt
  customUpdatePrompt: null,      // override the update/merge prompt
);

final mem = await memlocal.init(config);

quick configuration

// auto-configures embedding provider based on LLM choice
final mem = await memlocal.init(
  memlocalConfig.withDefaults(
    provider: LlmProvider.claudeSonnet4_6,
    apiKey: 'sk-ant-...',
    embeddingApiKey: 'sk-openai-...',
    storage: StorageConfig(inMemory: false),
  ),
);

how extraction works

when you call mem.add(content, infer: true):

  1. LLM extraction — the content is sent to the configured LLM with a structured prompt that asks it to identify facts, preferences, events, and relationships
  2. candidate parsing — the LLM's JSON response is parsed into candidate MemoryItems with appropriate MemoryType classification
  3. deduplication — each candidate is checked against existing memories by content hash. duplicates are skipped; conflicts are sent back to the LLM for reconciliation
  4. embedding — new/updated items are batch-embedded via the embedding API
  5. storage — items are upserted into CozoDB with their vectors, and HNSW/FTS/LSH indices are automatically maintained

the entire pipeline runs locally except for the LLM and embedding API calls.
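
the deduplication step (3) can be sketched as follows. this is an illustration under assumptions, not the library's code: the normalisation rules, the FNV-1a-style hash, and the names `normalise`, `contentHash`, and `dedupe` are all made up for the example, and the README does not say which hash memlocal uses.

```dart
/// Normalises text so trivial differences (case, extra spaces) hash the same.
String normalise(String content) =>
    content.trim().toLowerCase().replaceAll(RegExp(r'\s+'), ' ');

/// 32-bit FNV-1a-style hash over the UTF-16 code units of the normalised
/// text (identical to byte-wise FNV-1a for ASCII input). Stands in for
/// whatever hash the library uses. Assumes 64-bit ints (native Dart VM).
int contentHash(String content) {
  var hash = 0x811c9dc5; // FNV offset basis
  for (final unit in normalise(content).codeUnits) {
    hash ^= unit;
    hash = (hash * 0x01000193) & 0xFFFFFFFF; // FNV prime, kept to 32 bits
  }
  return hash;
}

/// Returns only the candidates whose hash is not yet in [knownHashes],
/// recording each new hash as it goes.
List<String> dedupe(List<String> candidates, Set<int> knownHashes) {
  final fresh = <String>[];
  for (final c in candidates) {
    // Set.add returns false when the hash was already present.
    if (knownHashes.add(contentHash(c))) fresh.add(c);
  }
  return fresh;
}
```

a hash match only catches (near-)verbatim repeats; facts that conflict in meaning, such as a changed city, still need the LLM reconciliation described in step 3.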

advanced usage

access the memory manager

for fine-grained control over individual memory subsystems:

final manager = mem.manager;

// direct access to subsystems
await manager.semanticMemory.record(item, embedding);
await manager.episodicMemory.record(item, embedding);
await manager.factualMemory.record(item, embedding);
await manager.socialMemory.record(item, embedding);

// working memory control
manager.workingMemory.setRelevant(items);
manager.workingMemory.focus(importantItem);
final block = manager.workingMemory.toContextBlock();

// storage layer
final store = manager.store;
await store.putEdge(edge);
final ranks = await store.pageRank();
final communities = await store.communityDetection();

user profiles

// profiles are auto-maintained from extracted memories
final profile = await mem.getProfile('sirsho_01');
print(profile?.staticFacts);  // {name: Sirsho, location: Bengaluru, ...}
print(profile?.dynamicContext); // {recent_topic: trail running, ...}
print(profile?.toSummary());  // Formatted text for LLM context

// manual update
await mem.updateProfile(profile!.copyWith(
  staticFacts: {...profile.staticFacts, 'occupation': 'Engineer'},
));

prospective memory (reminders)

// create a reminder triggered by topic mention
await mem.manager.store.putProspective(ProspectiveItem(
  id: 'reminder_1',
  content: 'Ask about the Bengaluru marathon',
  triggerType: TriggerType.topicMention,
  triggerCondition: 'marathon OR running race',
  userId: 'sirsho_01',
));

// check pending reminders
final pending = await mem.manager.store.getPendingProspective(userId: 'sirsho_01');
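
one plausible way a topic-mention trigger could be evaluated is sketched below. it assumes that `triggerCondition` is a list of alternatives separated by the word `OR`, as in the example above, and that matching is a case-insensitive substring test; the library's real matching rules are not documented here and may differ. `topicTriggered` is a hypothetical helper.

```dart
/// Returns true when [message] mentions any alternative in [condition].
/// Assumes alternatives are separated by the literal word "OR".
bool topicTriggered(String condition, String message) {
  final haystack = message.toLowerCase();
  return condition
      .split(RegExp(r'\s+OR\s+'))
      .map((term) => term.trim().toLowerCase())
      .where((term) => term.isNotEmpty)
      .any((term) => haystack.contains(term));
}
```

with this reading, 'running race' is matched as a whole phrase, so a message that only says "running" would not fire the reminder.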

API reference

memlocal (main facade)

| method | description |
|---|---|
| memlocal.init(config) | initialize the memory system |
| add(content, ...) | add memory (with optional LLM inference) |
| addMessage(message) | add to conversation buffer |
| addEdge(edge) | add a knowledge graph edge |
| search(query, ...) | search memories (semantic / text / graph / hybrid) |
| getContext(query, ...) | build assembled context for LLM injection |
| getProfile(userId) | get auto-maintained user profile |
| updateProfile(profile) | update user profile |
| getConversation(...) | get recent conversation messages |
| clearConversation(...) | clear conversation buffer |
| toolDefinitions | get LLM tool definitions (provider-native format) |
| executeTools(toolCalls) | execute LLM tool calls |
| memoryCount | total stored memories |
| dispose() | clean up resources |

memory types

| type | category | description |
|---|---|---|
| episodic | long-term | personal experiences and events |
| semantic | long-term | general knowledge and concepts |
| factual | long-term | personal facts, preferences, profile data |
| procedural | long-term | how-to knowledge, workflows |
| social | long-term | relationships and social context |
| spatial | long-term | locations, places, spatial context |
| prospective | long-term | future intentions, reminders, goals |
| affective | long-term | emotions, sentiments, moods |
| workingMemory | short-term | active context for current query |
| attentionContext | short-term | explicitly focused items |
| conversationBuffer | short-term | recent conversation messages |
| sensoryBuffer | sensory | raw incoming signals (~5s TTL) |

search modes

| mode | strategy |
|---|---|
| SearchMode.semantic | HNSW vector similarity search |
| SearchMode.text | BM25 full-text search |
| SearchMode.graph | graph traversal from nearest match |
| SearchMode.hybrid | all modes merged and re-ranked |

LLM tools

| tool | description |
|---|---|
| add_memory | store a fact, preference, or experience |
| search_memory | search memories by query |
| get_memories | list memories with optional filters |
| delete_memory | delete a memory by ID |
| get_user_profile | get user profile summary |
| add_relationship | create a knowledge graph edge |
| get_relationships | get edges connected to a memory |
| add_reminder | create a prospective memory / reminder |
| get_context | get assembled memory context |

example app

the example/ directory contains a full-featured chat app demonstrating:

  • chat with memory-augmented LLM responses
  • inline memory event timeline (extraction, search, context building)
  • memory browser with semantic/text/hybrid search
  • settings page with provider configuration
  • .env-based configuration

to run the example:

cd example
cp .env.example .env
# Edit .env with your API keys
flutter run

platform support

platform status
iOS
Android
macOS
Linux
Windows

CozoDB's Rust core is compiled via flutter_rust_bridge through the cozo_dart package.

requirements

  • Flutter >=3.3.0
  • Dart SDK ^3.9.2
  • an API key from Anthropic, OpenAI, or Google (for LLM extraction + embeddings)

license

MIT — see LICENSE for details.
