# fire_rag

fire_rag 1.1.2: Firebase RAG for AI Agents
fire_rag is a Firebase-first ingestion pipeline for retrieval-augmented generation.
It is built for the case where you already have:
- a Firestore-backed application
- a Cloud Tasks queue
- a text source arriving in Cloud Storage
- an embedding model
- optionally, a chat model for recursive summarization
Instead of indexing everything in one request, fire_rag turns ingestion into resumable Cloud Tasks:
- download a source file
- split it into base chunks
- persist chunk records to Firestore
- embed those records
- optionally distill them into higher levels of detail
- embed the distilled records too
The result is a Firestore collection containing serialized agentic `Chunk` records, vectors, and chunk relationships that can later be queried with `rag`.
## Concept
This package is not a chat agent and it is not a vector query layer by itself.
It is the ingestion half of a Firebase RAG stack.
The core idea is:

- use `arcane_admin` to run resumable task work in Cloud Tasks
- use `agentic` to chunk text and call models
- store chunk documents in Firestore
- use `rag` later to retrieve those embedded records during answering
If you already have documents landing in Cloud Storage, fire_rag gives you a straightforward path from uploaded file to vectorized Firestore records.
## How It Works

fire_rag wires three task types into an `arcane_admin` `TaskManager`.
### 1. TaskChunk

`TaskChunk` is the entry task. It:

- downloads a source file from Cloud Storage to a local temp path
- uses `agentic.IChunker` to split the file into base `Chunk` models
- writes each chunk into Firestore using the `Chunk` model shape
- batches the chunk document IDs into `TaskEmbed` jobs
- optionally schedules `TaskDistill` if recursive summarization is enabled

Each base chunk document stores the normal `Chunk` fields, including `content`, `postContent`, `index`, `lod`, `charStart`, `charEnd`, `record`, and `metadata`.

`destinationMetadata` from `TaskChunk` is merged into `Chunk.metadata`, not written as extra top-level Firestore fields.
### 2. TaskEmbed

`TaskEmbed` reads stored chunk text from Firestore, reconstructs `Chunk.fullContent`, calls your connected embedding model, and writes the resulting vector back onto the same document.

By default it embeds `content + postContent` and stores the result in `vector`.
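To make that default concrete, here is a tiny self-contained sketch of the content-plus-postContent rule (`fakeEmbed` is a placeholder stand-in for your connected embedding model, not part of the package):

```dart
// Placeholder embedding function; in fire_rag this is your connected model.
List<double> fakeEmbed(String text) => [text.length.toDouble()];

void main() {
  const String content = 'Refunds are processed within 14 days.';
  const String postContent = ' Contact support for exceptions.';

  // TaskEmbed reconstructs the full text as content + postContent by default.
  final String input = content + postContent;
  final List<double> vector = fakeEmbed(input);

  // Both segments contribute to the embedded text.
  print(input.length); // 69
  print(vector.length); // 1
}
```

Because `postContent` carries overlap from the following chunk, neighboring chunks share boundary context in their embeddings, which tends to help retrieval across chunk boundaries.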
### 3. TaskDistill

`TaskDistill` is the recursive summarization stage. It:

- reads groups of `factor` chunks from one level of detail
- sends them to your connected chat model
- writes a distilled `Chunk` into the next `lod`
- links source and distilled chunks with top-level `down` and `up`
- schedules embedding for the newly created distilled chunks
- continues level-by-level until only one distilled output remains
This gives you a hierarchy of chunks:
- L0: original chunked source text
- L1: distilled groups of L0
- L2: distilled groups of L1
- and so on until a single higher-level summary remains
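The shape of that hierarchy is easy to compute. A minimal sketch in plain Dart (`levelSizes` is a hypothetical helper for intuition, not a fire_rag API), assuming each distill pass groups chunks by the distillation factor with ceiling division:

```dart
// Returns the chunk count at each level of detail, L0 upward, assuming each
// distill pass combines `factor` chunks into one (rounding up on remainders).
List<int> levelSizes(int baseChunks, int factor) {
  final List<int> sizes = [baseChunks];
  while (sizes.last > 1) {
    sizes.add((sizes.last + factor - 1) ~/ factor); // ceiling division
  }
  return sizes;
}

void main() {
  // 64 base chunks with a factor of 4 collapse in three distill passes:
  print(levelSizes(64, 4)); // [64, 16, 4, 1]
}
```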
## Package Surface

The public bootstrap is small: `FireRag.init(...)`.

It registers task executors for `TaskChunk`, `TaskDistill`, and `TaskEmbed`, and exposes the configured `TaskManager`, embedding model, and chat model through `FireRag.instance`.
## Installation

Add the package:

```
dart pub add fire_rag
```

Typical companion packages are:

```
dart pub add arcane_admin
dart pub add agentic
dart pub add rag
```

If you are developing this package or changing artifact-backed task models, keep generated code up to date:

```
dart run build_runner build --delete-conflicting-outputs
```
## Getting Started

At startup you usually do two things:

- initialize `ArcaneAdmin`
- initialize `FireRag`
```dart
import 'package:agentic/agentic.dart';
import 'package:arcane_admin/arcane_admin.dart';
import 'package:fire_rag/fire_rag.dart';

Future<void> main() async {
  await ArcaneAdmin.initialize(
    projectId: 'my-project-id',
    defaultStorageBucket: 'my-project-id.firebasestorage.app',
  );

  ConnectedEmbeddingModel embedder = OpenAIConnector(
    apiKey: const String.fromEnvironment('OPENAI_API_KEY'),
  ).asEmbedder('text-embedding-3-small');

  ConnectedChatModel llm = OpenAIConnector(
    apiKey: const String.fromEnvironment('OPENAI_API_KEY'),
  ).connect(ChatModel.openai4_1Mini);

  FireRag.init(
    embed: embedder,
    llm: llm,
    taskQueue: 'rag-ingest',
    endpointUrl: 'https://your-service.run.app/event/executeJob',
  );
}
```
## Usage
The usual deployment shape is:
- one endpoint that receives a Cloud Storage finalization event
- one endpoint that executes scheduled tasks
### Minimal Server Wiring
```dart
import 'package:arcane_admin/arcane_admin.dart';
import 'package:fire_rag/fire_rag.dart';
import 'package:fire_rag/task/task_chunk.dart';
import 'package:shelf/shelf.dart';
import 'package:shelf/shelf_io.dart' as io;
import 'package:shelf_router/shelf_router.dart';

Future<void> main() async {
  await ArcaneAdmin.initialize(
    projectId: 'my-project-id',
    defaultStorageBucket: 'my-project-id.firebasestorage.app',
  );

  FireRag.init(
    embed: /* your ConnectedEmbeddingModel */,
    llm: /* your ConnectedChatModel */,
    taskQueue: 'rag-ingest',
    endpointUrl: 'https://your-service.run.app/event/executeJob',
  );

  Router router = Router();
  router.taskManager(FireRag.instance.taskManager);

  router.post('/storageFinalized', (Request request) {
    return request.storageEvent((ArcaneStorageEvent event) async {
      if (!event.path.endsWith('.txt')) {
        return Response.ok('');
      }

      await FireRag.instance.taskManager.schedule(
        TaskChunk(
          taskId: 'ingest.${event.bucket}.${event.path}',
          sourceBucket: event.bucket,
          sourcePath: event.path,
          destinationCollection: 'rag_chunks',
          record: event.path,
          maxChunkSize: 500,
          maxPostOverlap: 100,
          embedBatchSize: 25,
          chunkBatchSize: 100,
          distillationFactor: 4,
          destinationMetadata: {
            'sourceBucket': event.bucket,
            'sourcePath': event.path,
          },
        ),
      );

      return Response.ok('');
    });
  });

  await io.serve(router.call, '0.0.0.0', 8080);
}
```
### What The Example Does

- `/storageFinalized` receives a storage event from Eventarc or your own webhook bridge
- a new `TaskChunk` is scheduled
- `/event/executeJob` is automatically handled by `router.taskManager(...)`
- the task manager keeps re-queuing work until the current task is complete
## Data Model

Chunk document IDs follow this pattern:

`{record}.{index}L{lod}`

Examples:

- `customer-handbook.pdf.0L0`
- `customer-handbook.pdf.1L0`
- `customer-handbook.pdf.0L1`
- `customer-handbook.pdf.0L2`
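The ID pattern composes with plain string interpolation; a quick sketch (`chunkId` is a hypothetical helper shown for illustration, not a package API):

```dart
// Builds a chunk document ID following the documented pattern:
// {record}.{index}L{lod}
String chunkId(String record, int index, int lod) => '$record.${index}L$lod';

void main() {
  print(chunkId('customer-handbook.pdf', 0, 0)); // customer-handbook.pdf.0L0
  print(chunkId('customer-handbook.pdf', 0, 1)); // customer-handbook.pdf.0L1
}
```

Because the `lod` suffix is part of the ID, base chunks and their distilled summaries for one record live side by side in the same collection.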
Useful stored fields include:

- `content`: the main body of the chunk
- `postContent`: overlap from the following chunk
- `index`: the chunk index within that level
- `lod`: level of detail
- `charStart` / `charEnd`: character offsets of the chunk within the source text
- `record`: logical record identifier
- `metadata`: extra application metadata such as source bucket or source path
- `vector`: embedding written by `TaskEmbed`
- `down`: child chunk indexes used to create a distilled chunk
- `up`: parent chunk index created from a source chunk
In practice, Firestore documents are stored in the same shape as agentic's `Chunk.toMap()`:

```json
{
  "index": 0,
  "content": "Chunk body",
  "postContent": " overlap from the next chunk",
  "charStart": 0,
  "charEnd": 532,
  "lod": 0,
  "record": "customer-handbook.txt",
  "metadata": {
    "sourceBucket": "docs",
    "sourcePath": "customer-handbook.txt"
  },
  "up": 0,
  "down": [0, 1, 2, 3],
  "vector": {
    "vector": [0.12, -0.04, 0.98]
  }
}
```
Notes about that shape:

- `up`, `down`, and `vector` are optional and appear only after later stages populate them
- `metadata` is the right place for custom application values
- top-level chunk fields remain available for app-side orchestration and debugging
## Choosing Distillation Settings

The most important knobs are:

- `maxChunkSize`: target size of each stored chunk
- `maxPostOverlap`: overlap appended from the next chunk
- `chunkBatchSize`: how many chunks are persisted before scheduling embed work
- `embedBatchSize`: how many document IDs are sent per embedding task
- `distillationFactor`: how many chunks are combined into one higher-LOD chunk

Rules of thumb:

- start with `maxChunkSize: 500` and `maxPostOverlap: 100`
- use `distillationFactor: 4` if you want a compact hierarchy
- omit `distillationFactor` if you only want base chunks plus embeddings
- increase `embedBatchSize` only if your embedding provider comfortably supports it
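For a back-of-the-envelope feel for how these knobs interact, here is a self-contained sketch in plain Dart (`totalChunkDocs` is hypothetical, not a package API; it assumes ceiling-division grouping at every level):

```dart
// Roughly estimates how many Firestore chunk documents one source produces,
// assuming chunks of maxChunkSize characters and recursive distillation that
// combines `distillationFactor` chunks per pass until one chunk remains.
int totalChunkDocs(int sourceChars, int maxChunkSize, int distillationFactor) {
  int level = (sourceChars + maxChunkSize - 1) ~/ maxChunkSize; // L0 count
  int total = level;
  while (level > 1) {
    level = (level + distillationFactor - 1) ~/ distillationFactor;
    total += level;
  }
  return total;
}

void main() {
  // ~100k characters at maxChunkSize: 500 gives 200 L0 chunks, which distill
  // as 200 -> 50 -> 13 -> 4 -> 1, i.e. 268 documents in total.
  print(totalChunkDocs(100000, 500, 4)); // 268
}
```

This ignores `maxPostOverlap`, which duplicates some boundary text inside each document but does not change how many documents are written.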
## Relationship To Other Packages

fire_rag is intentionally small because it leans on a few other packages:

- `agentic`: used for `IChunker`, chunk text splitting, connected chat models, and connected embedding models. `TaskChunk` uses the chunker; `TaskDistill` and `TaskEmbed` use the connected models.
- `arcane_admin`: used for Firebase admin initialization, Cloud Storage download access, Firestore access, Eventarc helpers, and the resumable `TaskManager`/`TaskExecutor` system that drives ingestion.
- `rag`: not used to ingest documents, but intended as the retrieval-side companion package once your Firestore chunk collection has vectors. The nested `metadata` map written by `fire_rag` lines up with `rag`'s Firestore vector-space metadata convention.
- `fire_api`: used indirectly through `arcane_admin` for Firestore and Storage abstractions, including document reads, writes, and `VectorValue`.
- `artifact`: used for serializable task objects so task state can be preserved between Cloud Task executions.

You can think of the stack like this:

- `agentic` handles model calls and chunking
- `arcane_admin` handles Firebase admin and task orchestration
- `fire_rag` turns those pieces into a resumable ingestion pipeline
- `rag` consumes the resulting embedded records for retrieval
## Typical Flow In Production

1. A text file is uploaded to Cloud Storage.
2. A storage event schedules `TaskChunk`.
3. `TaskChunk` writes L0 chunks and schedules `TaskEmbed`.
4. If enabled, `TaskChunk` schedules `TaskDistill`.
5. `TaskDistill` writes L1 chunks, schedules embeds for those chunks, and recursively schedules higher levels.
6. Firestore ends up containing both raw and distilled chunk records plus vectors.
7. Your retrieval layer queries that collection later with `rag`.
## Notes

- This package currently assumes text-file ingestion. If your upstream source is PDF, OCR, HTML, or something else, convert it to text before scheduling `TaskChunk`.
- `TaskDistill` requires a chat model. If you do not want summarization, leave `distillationFactor` unset.
- Query-time schema expectations are up to your retrieval layer. `fire_rag` focuses on ingestion and vectorization, not retrieval policy.
## Contributing

If you change task models or artifact-backed state fields, regenerate code before publishing:

```
dart run build_runner build --delete-conflicting-outputs
```