in_app_mcp - Dart API docs

A policy-gated tool runtime for in-app LLM/agent tools. Per-tool auto / confirm / deny gate, ephemeral grants, preview, audit, undo — and a structured user-input primitive for handlers that need the user to fill in missing arguments mid-call. Provider-neutral; usable with any LLM.

ToolCall ──► preview ──► policy gate ──► validate ──► handler ──► audit ledger
               │            │                            │  │          │
               │            auto / confirm / deny        │  │          └── undo hook
               │            + ephemeral grants           │  │
               │                                         │  └── pendingInputs → render form
               │                                         │     → merge args → re-invoke
               │                                         └── structured ToolResult
               └── pure, no side effect

Not the MCP wire protocol. Despite the name, this package does not speak JSON-RPC / stdio / SSE — see mcp_server / mcp_client for that. in_app_mcp is a local, in-process runtime focused on the authorization boundary between a model's tool call and your app's side effects.

Install

dependencies:
  in_app_mcp: ^1.3.0

Quick start

import 'package:in_app_mcp/in_app_mcp.dart';

final mcp = InAppMcp(defaultPolicy: ToolPolicy.confirm);

mcp.registerTool(
  definition: const ToolDefinition(
    name: 'echo',
    description: 'Echo message back',
    argumentTypes: {'message': ToolArgType.string},
    requiredArguments: {'message'},
    allowAdditionalArguments: false,
  ),
  handler: (call) async =>
      ToolResult.ok('ok', data: {'echo': call.arguments['message']}),
);

final result = await mcp.handleToolCall(
  const ToolCall(id: '1', toolName: 'echo', arguments: {'message': 'hello'}),
  confirmed: true,
);

Runtime flow

1. Adapter produces ToolCall
2. Policy resolves for toolName (active EphemeralGrants are consumed)
3. InvocationInterceptor.onResolvePolicy can override
4. Deny → policy_denied; confirm-required without confirmed: true → confirmation_required
5. beforeExecute can veto (first-wins)
6. Registry validates arguments (types, required arguments)
7. Handler runs and returns ToolResult, including ToolResult.requiresInput(...) to pause for user input
8. afterExecute can rewrite
9. AuditLedger records; onAudit fans out
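
The ordering above can be sketched as a toy model in plain Dart. This is not the package's implementation: Gate, Outcome, and ToyRuntime are hypothetical stand-ins, interceptors and grants are left out, and recording every outcome (including gate rejections) in the ledger is an assumption. It only illustrates that the policy gate and validation both run before the handler can cause a side effect.

```dart
// Toy model of the gate ordering only; none of these names are in_app_mcp APIs.
enum Gate { auto, confirm, deny }

class Outcome {
  const Outcome(this.ok, this.code);
  final bool ok;
  final String code;
}

typedef Handler = Outcome Function(Map<String, Object?> args);

class ToyRuntime {
  ToyRuntime(this.gate, this.required, this.handler);
  final Gate gate;
  final Set<String> required;
  final Handler handler;
  final List<String> ledger = []; // stand-in for the audit ledger

  Outcome call(Map<String, Object?> args, {bool confirmed = false}) {
    final outcome = _run(args, confirmed);
    ledger.add(outcome.code); // assumption: every outcome is recorded
    return outcome;
  }

  Outcome _run(Map<String, Object?> args, bool confirmed) {
    // 1. The policy gate runs before anything touches the handler.
    if (gate == Gate.deny) return const Outcome(false, 'policy_denied');
    if (gate == Gate.confirm && !confirmed) {
      return const Outcome(false, 'confirmation_required');
    }
    // 2. Validation runs only after the gate allows the call.
    if (!required.every(args.containsKey)) {
      return const Outcome(false, 'invalid_arguments');
    }
    // 3. Only now does the side-effecting handler run.
    return handler(args);
  }
}

void main() {
  final rt = ToyRuntime(
    Gate.confirm,
    {'message'},
    (args) => const Outcome(true, 'ok'),
  );
  print(rt.call({'message': 'hi'}).code); // confirmation_required
  print(rt.call({}, confirmed: true).code); // invalid_arguments
  print(rt.call({'message': 'hi'}, confirmed: true).code); // ok
  print(rt.ledger.length); // 3
}
```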

Structured user input

Pause mid-call and ask the user for missing arguments without a side channel. Declare the optional args as non-required, then:

Future<ToolResult> execute(ToolCall call) async {
  if (call.arguments['destination'] == null) {
    return ToolResult.requiresInput(
      requests: const [
        UserInputRequest(
          id: 'destination',
          kind: 'text', // 'text' | 'single_choice' | 'photos' | 'location' | custom
          field: 'destination',
          prompt: 'Where are you travelling to?',
          parameters: {'placeholder': 'e.g. Tokyo'},
        ),
      ],
    );
  }
  return ToolResult.ok('Booked ${call.arguments['destination']}.');
}

The host reads result.pendingInputs, renders a widget per kind (apps register their own; the example ships text / single_choice / number), collects values keyed by field, merges them into ToolCall.arguments, and calls handleToolCall again. kind is free-form on purpose — add domain-specific kinds ('certificate', 'location_autocomplete') as needed. Pending-input rounds flow through afterExecute and into the audit ledger like any other outcome.

See example/lib/agent_tools/book_trip_tool.dart for a four-field round-trip.
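
A minimal sketch of the host-side loop described above, for orientation only. runWithInputs, CollectInput, and maxRounds are hypothetical names, not part of the package. The sketch also assumes that pendingInputs is a non-null list that is empty when nothing is pending, that UserInputRequest.field is a non-null String, and that ToolCall exposes id alongside toolName and arguments; check the API reference before relying on any of these.

```dart
import 'package:in_app_mcp/in_app_mcp.dart';

/// Hypothetical host callback: renders a widget for request.kind and
/// completes with the value the user entered.
typedef CollectInput = Future<Object?> Function(UserInputRequest request);

Future<ToolResult> runWithInputs(
  InAppMcp mcp,
  ToolCall call,
  CollectInput collect, {
  int maxRounds = 5, // guard against a handler that never stops asking
}) async {
  var current = call;
  // confirmed: true stands in for a confirmation the user already gave.
  var result = await mcp.handleToolCall(current, confirmed: true);

  for (var round = 0;
      round < maxRounds && result.pendingInputs.isNotEmpty;
      round++) {
    // Collect one value per request, keyed by the argument it fills.
    final collected = <String, Object?>{};
    for (final request in result.pendingInputs) {
      collected[request.field] = await collect(request);
    }
    // Merge the new values over the original arguments and re-invoke.
    current = ToolCall(
      id: current.id,
      toolName: current.toolName,
      arguments: {...current.arguments, ...collected},
    );
    result = await mcp.handleToolCall(current, confirmed: true);
  }
  return result;
}
```

The loop re-enters the full pipeline on every round, so policy, interceptors, and the audit ledger see each attempt. The round cap is a design choice of this sketch, not a package feature.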

Interceptors

Plug into the pipeline without implementing a whole PolicyStore / GrantStore / AuditLedger:

class RateLimiter extends InvocationInterceptor {
  // Time of the most recent attempt per tool, allowed or not.
  final Map<String, DateTime> _lastCall = {};

  @override
  Future<ToolResult?> beforeExecute(ToolCall call, ResolvedPolicy _) async {
    final now = DateTime.now();
    final last = _lastCall[call.toolName];
    // Recorded before the check, so a rejected attempt also restarts the window.
    _lastCall[call.toolName] = now;
    if (last != null && now.difference(last).inSeconds < 5) {
      return ToolResult.fail('rate_limited', 'Try again in 5 seconds.');
    }
    return null; // null means no veto; the call proceeds to validation
  }
}

final mcp = InAppMcp(interceptors: [RateLimiter()]);

Hook             Fires when                           Return non-null to…
onResolvePolicy  after PolicyEngine decides           override the decision (chain-through)
beforeExecute    after policy allows, before handler  veto with a failure (first-wins)
afterExecute     after handler returns                rewrite result, e.g. redact PII (chain-through)
onAudit          after ledger records                 fan-out telemetry; exceptions swallowed

onResolvePolicy / beforeExecute / afterExecute propagate exceptions and fail the call. Only onAudit swallows errors.
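
The two composition modes named in the table, first-wins and chain-through, can be illustrated with a plain-Dart sketch. This is one plausible reading of the terms, not the package's code: VetoHook, RewriteHook, firstWins, and chainThrough are hypothetical, and hooks are modelled as synchronous functions over strings to keep the example short.

```dart
// Hypothetical stand-ins: a hook is a function over a String payload here.
typedef VetoHook = String? Function(String call);
typedef RewriteHook = String? Function(String result);

/// First-wins: the first non-null return ends the scan; later hooks never run.
String? firstWins(List<VetoHook> hooks, String call) {
  for (final hook in hooks) {
    final veto = hook(call);
    if (veto != null) return veto;
  }
  return null; // nobody vetoed
}

/// Chain-through: each non-null return becomes the input of the next hook.
String chainThrough(List<RewriteHook> hooks, String result) {
  var current = result;
  for (final hook in hooks) {
    current = hook(current) ?? current; // null leaves the value untouched
  }
  return current;
}

void main() {
  final veto = firstWins([
    (c) => null,
    (c) => 'rate_limited',
    (c) => 'never reached',
  ], 'echo');
  print(veto); // rate_limited

  final out = chainThrough([
    (r) => r.replaceAll('alice@example.com', '[email]'),
    (r) => null,
    (r) => r.toUpperCase(),
  ], 'sent to alice@example.com');
  print(out); // SENT TO [EMAIL]
}
```

Under this reading, interceptor order matters in both modes: an earlier veto hides later ones, and a later rewrite sees the output of earlier rewrites.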

Showcase

All cards below are real frames from example/integration_test/ driving Gemma 4 E2B on-device on an iPhone simulator. Prompts are natural language — no tool names, no schemas.

Tool-call proposals

[Screenshot] "Wake me up at 6 AM every weekday."

[Screenshot] "Put a Team Sync meeting on my calendar tomorrow 10–11 AM at Main Office."

[Screenshot] "How do I drive to Tokyo?"

[Screenshot] "Draft an email to team@example.com saying hello."

[Screenshot] "Echo 'hello from showcase'." (codegen @McpTool)

Each card shows the tool icon, description, status chip, policy chip, and proposed arguments. Nothing executes until the user taps Run.

Why the policy gate matters: Gemma sometimes fills placeholders like startIso: "<tomorrow's date>T10:00:00" instead of a resolved timestamp. The card surfaces proposed arguments before the handler runs, so the user can catch it.
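
As a complement to the visual check, an interceptor could reject calls whose arguments still contain an angle-bracket placeholder. The sketch below is hypothetical and not shipped with the package: PlaceholderGuard and unresolvedArguments are illustrative names, reusing the invalid_arguments code is an assumption, and the pattern will also flag legitimate text containing angle brackets (an HTML email body, for example). It relies only on the beforeExecute signature and ToolResult.fail call shown in the Interceptors section.

```dart
import 'package:in_app_mcp/in_app_mcp.dart';

/// Matches angle-bracket placeholders such as <tomorrow's date>.
final _placeholder = RegExp(r'<[^<>]+>');

/// Returns the names of string arguments that still contain a placeholder.
List<String> unresolvedArguments(Map<String, Object?> arguments) => [
      for (final entry in arguments.entries)
        if (entry.value is String &&
            _placeholder.hasMatch(entry.value as String))
          entry.key,
    ];

/// Hypothetical interceptor: vetoes calls with unresolved placeholders.
class PlaceholderGuard extends InvocationInterceptor {
  @override
  Future<ToolResult?> beforeExecute(ToolCall call, ResolvedPolicy _) async {
    final bad = unresolvedArguments(call.arguments);
    if (bad.isEmpty) return null; // nothing suspicious; let the call proceed
    return ToolResult.fail(
      'invalid_arguments', // assumption: reuses an existing error code
      'Unresolved placeholder in: ${bad.join(', ')}',
    );
  }
}
```

It would be wired in like any other interceptor, for example InAppMcp(interceptors: [PlaceholderGuard()]). Because beforeExecute fires after the policy gate, this check backs up the user's review of the card and does not replace it.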

Consent Lifecycle (single prompt, end-to-end)

Driving "Echo back the phrase 'hello from showcase'" through the four lifecycle layers:

[Screenshot] Preview: @McpToolPreview summarises before Run.

[Screenshot] Grant submenu: once / 5 min / session.

[Screenshot] Succeeded + Undo: ledger entry attaches and reveals the Undo button.

[Screenshot] Undone: @McpToolUndo runs, ledger entry marked undone.

[Screenshot] Audit timeline: every outcome listed with per-entry undo.

On-device Gemma (iOS simulator)

cd example
./scripts/precache_gemma_e2b.sh         # one-time model cache

flutter run -d <booted-simulator-id> \
  --dart-define=LLM_ADAPTER=gemma \
  --dart-define=GEMMA_MODEL_PATH=$PWD/model_cache/gemma-4-E2B-it.litertlm

Integration tests under example/integration_test/ (screenshot-producing showcases + gemma_book_trip_flow_test.dart for the pending-input flow) use the same --dart-defines; capture_consent_showcase.sh also drives xcrun simctl io booted screenshot. The deterministic book-trip flow runs without a real model via --dart-define=E2E_MODE=true --dart-define=GEMMA_MODEL_PATH=fake.

Public API

Models: ToolCall, ToolDefinition + ToolArgType, ToolResult (incl. pendingInputs, requiresUserInput), ToolErrorCode, Preview + PreviewWarning, UserInputRequest.

Runtime: InAppMcp (facade), ToolPolicy / PolicyDecision / PolicySource / ResolvedPolicy, PolicyStore + InMemoryPolicyStore, GrantStore + InMemoryGrantStore + EphemeralGrant, AuditLedger + InMemoryAuditLedger + AuditEntry, ToolRegistry + RegisteredTool + ToolPreviewer + ToolUndoer, InvocationEngine + InvocationInterceptor.

Error codes: tool_not_found, invalid_arguments, policy_denied, confirmation_required, requires_user_input, audit_disabled, entry_not_found, already_undone, nothing_to_undo, undo_not_supported.

Testing

# package
flutter analyze && flutter test

# example app
cd example && flutter analyze && flutter test

Security notes

Don't hardcode API keys. Treat tool handlers as side-effect boundaries.
Keep risky tools behind confirm or deny.
OS-level permissions still apply where the platform requires them.
