vocall_sdk 0.1.0
Flutter SDK for the Vocall Agent-Application Protocol (AAP). Let an AI agent see and control your app's UI — navigate screens, fill forms, click buttons, and interact via voice or text.
# Jarvis SDK for Flutter

Flutter SDK for the Jarvis Agent-Application Protocol (AAP) — let an AI agent see and control your app's UI through voice or text.
Unlike traditional chatbots that only answer questions, Jarvis understands your app's screens, form fields, and actions. It can navigate between pages, fill forms, click buttons, open modals, and talk to your users — all driven by natural language.
## What You Can Build
- Voice-controlled forms — "Fill the name with John Smith and set the email to john@example.com"
- Hands-free workflows — wake word detection, continuous listening, spoken responses
- AI-powered navigation — "Go to the settings page and enable notifications"
- Guided data entry — the agent walks users through multi-step forms via text or voice
## Features
- WebSocket client — automatic reconnection, dual-channel (commands + voice)
- Voice input — always-listening (wake word) and click-to-talk modes
- UI automation — agent-driven navigation, form filling, button clicks, modals, toasts
- Streaming chat — real-time token streaming with paginated message history
- Pre-built widgets — floating chat panel, voice visualizer, confirmation dialogs
- Field registry — declarative binding between Flutter `TextEditingController`s and the manifest
- Platform-aware — Web Audio API on web, stub on other platforms
## Requirements

- Dart SDK `^3.10.1`
- Flutter SDK
- A Jarvis Server URL and API key (get your key at jarvis.primoia.com.br)
## Installation

Add the package to your `pubspec.yaml`:

```yaml
dependencies:
  jarvis_sdk: ^0.1.0
```

Then fetch dependencies:

```shell
flutter pub get
```
## Quick Start

### 1. Create the client

```dart
import 'package:jarvis_sdk/jarvis_sdk.dart';

final jarvis = JarvisClient(
  serverUrl: 'wss://api.jarvis.primoia.com.br/connect',
  token: 'your-api-key',
);
```

Get your API key at jarvis.primoia.com.br.
### 2. Describe your app

The manifest tells Jarvis what screens, fields, and actions your app has:

```dart
final manifest = ManifestBuilder('my-app')
  .version('1.0.0')
  .currentScreen('contacts')
  .user(UserInfo(name: 'Jane Doe', email: 'jane@example.com'))
  .persona(Persona(
    name: 'Alex',
    role: 'assistant',
    instructions: 'Help users manage their contacts. Be concise.',
  ))
  .screen(ScreenDescriptor(
    id: 'contacts',
    label: 'Contacts',
    fields: [
      FieldDescriptor(id: 'name', type: FieldType.text, label: 'Full Name', required_: true),
      FieldDescriptor(id: 'email', type: FieldType.email, label: 'Email'),
      FieldDescriptor(id: 'phone', type: FieldType.phone, label: 'Phone'),
    ],
    actions: [
      ActionDescriptor(id: 'save', label: 'Save Contact', requiresConfirmation: true),
    ],
  ))
  .build();
```
### 3. Register your form fields

Bind your `TextEditingController`s so the agent can fill them:

```dart
jarvis.fieldRegistry.registerField('contacts', 'name', nameController);
jarvis.fieldRegistry.registerField('contacts', 'email', emailController);
jarvis.fieldRegistry.registerField('contacts', 'phone', phoneController);
jarvis.fieldRegistry.registerAction('contacts', 'save', () => saveContact());

jarvis.fieldRegistry.onNavigate = (screenId) => router.go('/$screenId');

jarvis.connect(manifest);
```
### 4. Add the overlay

```dart
@override
Widget build(BuildContext context) {
  return JarvisOverlay(
    client: jarvis,
    child: Scaffold(body: MyContactsPage()),
  );
}
```
That's it. Users can now type "Add John Smith, john@smith.com, phone 555-1234" or use voice, and the agent fills the form automatically.
## Architecture
```
┌──────────────────────────────────────────────────────────┐
│                     Your Flutter App                     │
│                                                          │
│  ┌──────────────┐  ┌────────────────┐  ┌──────────────┐  │
│  │JarvisOverlay │  │JarvisControll- │  │  Your Pages  │  │
│  │ (FAB + Chat) │  │  able mixin    │  │  (fields,    │  │
│  └──────┬───────┘  └──────┬─────────┘  │   buttons)   │  │
│         │                 │            └──────┬───────┘  │
│         │          ┌──────┴──────┐            │          │
│         │          │FieldRegistry│◄───────────┘          │
│         │          │  (fields +  │   registers           │
│         │          │   actions)  │   controllers         │
│         │          └──────┬──────┘                       │
│         ▼                 ▼                              │
│  ┌───────────────────────────────────┐                   │
│  │            JarvisClient           │                   │
│  │  ┌────────────┐  ┌────────────┐   │                   │
│  │  │  /connect  │  │ /ws/stream │   │                   │
│  │  │  commands  │  │   voice    │   │                   │
│  │  └─────┬──────┘  └─────┬──────┘   │                   │
│  └────────┼───────────────┼──────────┘                   │
└───────────┼───────────────┼──────────────────────────────┘
            │               │
            ▼               ▼
     ┌───────────────────────────────┐
     │     Jarvis Server (SaaS)      │
     └───────────────────────────────┘
```
## Documentation
| Document | Description |
|---|---|
| Getting Started | Authentication, setup, first integration |
| Integration Guide | Field registration patterns, navigation, voice modes, advanced usage |
| API Reference | All classes, methods, properties, enums, and types |
| Protocol Reference | WebSocket message format, events, voice stream contract |
| Error Codes | Error codes and troubleshooting |
| Changelog | Version history |
## Voice Modes

### Always-Listening

Streams audio continuously. The server detects a wake word, transcribes speech, generates a response, and speaks it back — hands-free.

```dart
await jarvis.startAlwaysListening();
// Listening → Wake word → Recording → Response → Speaking → Listening (loop)

jarvis.stopAlwaysListening();
```
### Click-to-Talk

Press to record, release to send. No wake word needed.

```dart
await jarvis.startRecording(); // Start capturing audio
jarvis.stopRecording();        // Finalize and get response
```
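A minimal press-and-hold wiring for this mode, assuming the `jarvis` client from the Quick Start — only `startRecording`/`stopRecording` come from the SDK; the widget itself is an illustrative sketch:

```dart
// Illustrative mic button: record while pressed, send on release.
GestureDetector(
  onTapDown: (_) => jarvis.startRecording(),
  onTapUp: (_) => jarvis.stopRecording(),
  onTapCancel: () => jarvis.stopRecording(), // don't leave the mic open
  child: const Icon(Icons.mic),
)
```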
### Interrupt

Stop the agent mid-response:

```dart
jarvis.interrupt();
```
## UI Commands

The agent can send these commands to control your app:

| Command | What it does |
|---|---|
| `navigate` | Switch to a screen |
| `fill` | Set a field value (with typewriter animation) |
| `clear` | Clear a field |
| `click` | Trigger a registered action |
| `focus` | Focus a field |
| `highlight` | Highlight a field |
| `ask_confirm` | Show a confirmation dialog |
| `show_toast` | Display a notification |
| `open_modal` | Open a modal |
| `close_modal` | Close the current modal |
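Most of these commands resolve against the bindings registered in step 3 of the Quick Start. As a conceptual sketch — this is not the SDK's internal code, and the map and function names here are illustrative — a `click` command simply looks up and invokes the callback you registered:

```dart
// Conceptual sketch: registerAction stores a callback keyed by screen and
// action id; an incoming `click` command invokes it. Names are illustrative.
final actions = <String, void Function()>{
  'contacts/save': saveContact,
};

void handleClick(String screenId, String actionId) {
  actions['$screenId/$actionId']?.call();
}
```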
## Client States

```
disconnected ─── connect ───► idle
                               │
              ┌────────────────┼──────────────┐
              ▼                ▼              ▼
          listening        recording       thinking
         (wake word)     (push-to-talk)   (processing)
              │                │              │
              ▼                ▼              ▼
          recording         thinking       speaking
              │                │
              ▼                ▼
           thinking         executing
              │                │
              ▼                ▼
           speaking           idle
              │
              ▼
          listening (loop)
```
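If you want your own UI to mirror these states, you can observe the client; the `stateStream` property and `JarvisState` enum below are assumptions for illustration — check the API Reference for the actual member names:

```dart
// Hypothetical: assumes the client exposes its state as a stream.
jarvis.stateStream.listen((state) {
  if (state == JarvisState.listening) {
    // e.g. show a pulsing mic icon
  } else if (state == JarvisState.speaking) {
    // e.g. show the voice visualizer
  }
});
```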
## Platform Support
| Platform | Chat | Voice | Status |
|---|---|---|---|
| Flutter Web | Yes | Yes | Stable |
| Android | Yes | Planned | Chat only |
| iOS | Yes | Planned | Chat only |
| macOS / Windows / Linux | Yes | Planned | Chat only |
Voice capture requires the Web Audio API (`getUserMedia`), currently available on web only. On other platforms, `voiceSupported` returns `false` and voice methods are no-ops. Chat and UI automation work on all platforms.
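Since voice methods are no-ops off the web, you can gate voice UI on `voiceSupported`. The property comes from the SDK as described above; the widget wiring is an illustrative sketch:

```dart
// Hide the mic button on platforms without voice capture.
Widget micButton() {
  if (!jarvis.voiceSupported) {
    return const SizedBox.shrink();
  }
  return IconButton(
    icon: const Icon(Icons.mic),
    onPressed: () => jarvis.startAlwaysListening(),
  );
}
```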
## License
Proprietary. Copyright PrimoIA. All rights reserved.
See your license agreement for usage terms.