🥔

Mr. Potato

The human-in-the-loop supervision layer for autonomous AI agents.

Your agents run on their own. Mr. Potato sits on your desktop, watches what they do, and raises his hand at high-stakes moments to ask for your approval before the action proceeds.

MEGATHON · Agents track · HFSP Labs · info@hfsp.xyz

Speaker

Bar-test one-liner: "It's the approve / deny button for your AI agents." Hold the title 5 seconds, then jump straight to the problem. Don't explain crypto yet — no one cares about chains until they care about the problem.

The problem

Agents are getting autonomy
faster than we're getting control.

Deploys

An agent pushes to main or ships to prod — no one signed off.

Spend

An agent pays for an API or moves funds with no budget gate.

Irreversible

Deletes, migrations, on-chain txns you can't take back.

Logs and dashboards are after-the-fact. By the time you read them, the action already happened.

Speaker

Make it visceral with a real story (Devin pushing a branch to main). The gap isn't observability; it's the missing decision point before the action runs. Objection prep: "Isn't this just a CI approval gate?" → No: it's transport-agnostic, real-time, and routes the human decision straight back into the agent's loop regardless of where the agent runs.

Why now

Two curves just crossed.

Agents went autonomous

Devin and peers now open PRs, deploy, and act with minimal oversight. Autonomy is the product.

Agents can now pay

x402, built on the HTTP 402 Payment Required status code, gives agents a native way to pay per call in USDC, with no human co-signer in the request path.

More autonomy + spending power = the human approval gate is now the missing safety primitive.

Speaker

The "why now" is the crossing of these two curves in 2025–26, not "crypto is growing." Push hard on: the moment agents can both act and pay unsupervised, you need a gate. Market figures are directional — present them as the trend, not precise stats.

The solution

A thin layer of human judgment,
in the agent's loop.

01

Agents run

Local IDE agents, remote LLMs, HTTP services, earn workers — all report in over agent transports.

→

02

Mr. Potato gates

At a high-stakes action it surfaces a confirm notification. The mascot raises his hand.

→

03

You decide

Approve and the decision routes straight back to the agent so it continues. Deny and it stops.

confirm payload → { message, confirmLabel, cancelLabel } ⇒ answer → { confirmed: boolean }
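The contract above can be sketched in a few lines. This is a minimal illustration, not the app's actual code: only the field names `message`, `confirmLabel`, `cancelLabel`, and `confirmed` come from the slide. The helper names and the default labels "Approve" and "Deny" are assumptions.

```javascript
// Hypothetical sketch of the confirm contract. Only the field names
// (message, confirmLabel, cancelLabel, confirmed) come from the deck.

/** Build and validate a confirm payload: { message, confirmLabel, cancelLabel }. */
function makeConfirmPayload(message, confirmLabel = "Approve", cancelLabel = "Deny") {
  const fields = { message, confirmLabel, cancelLabel };
  for (const [name, value] of Object.entries(fields)) {
    if (typeof value !== "string" || value.length === 0) {
      throw new TypeError(`${name} must be a non-empty string`);
    }
  }
  return fields;
}

/** Build the answer routed back to the agent: { confirmed: boolean }. */
function makeAnswer(confirmed) {
  if (typeof confirmed !== "boolean") {
    throw new TypeError("confirmed must be a boolean");
  }
  return { confirmed };
}

/** Agent-side handling: continue only on an explicit approval. */
function nextStep(answer) {
  return answer.confirmed === true ? "continue" : "stop";
}

const payload = makeConfirmPayload("Deploy to production?");
console.log(payload);
console.log(nextStep(makeAnswer(true)), nextStep(makeAnswer(false)));
```

Treating anything other than `confirmed === true` as a stop keeps the gate fail-closed: a malformed or missing answer never lets the action proceed.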

Speaker

Emphasize "in the loop, not after it." The confirm notification kind is fully wired end-to-end today (transport → notification → human → agent). This is the one slide that defines the category: AgentOps / a supervision layer.

Demo · it's real

A desktop companion that asks before it acts.

Idle — watching

Hand raised — needs you

Resolved — done

Speaker

Switch to the live app here if possible — judges reward a working demo. The widget is transparent, always-on-top, pixel-art. State machine: idle → hand-raised (pending approval) → done.
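If a judge asks how the widget states connect, a small transition table is enough to explain it. This is a sketch under assumptions: the three states come from the slide, while the event names and the reset from done back to idle are hypothetical.

```javascript
// Hypothetical widget state machine. The states idle, hand-raised, and done
// come from the deck; event names and the done -> idle reset are assumptions.
const TRANSITIONS = {
  idle: { actionPending: "hand-raised" },
  "hand-raised": { approved: "done", denied: "done" },
  done: { reset: "idle" },
};

/** Return the next state, or throw if the event is not allowed in this state. */
function transition(state, event) {
  const table = TRANSITIONS[state];
  if (!table || !Object.hasOwn(table, event)) {
    throw new Error(`invalid transition: ${state} + ${event}`);
  }
  return table[event];
}

let state = "idle";
state = transition(state, "actionPending"); // now "hand-raised"
state = transition(state, "approved");      // now "done"
console.log(state);
```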

The flagship flow

Devin acts · Cala verifies · x402 meters · you approve.

Human approval gate — "Deploy to production?"

x402 — authorize 0.05 USDC for verified data

Devin (the engine) prepares a change → Cala returns a sourced, verified fact → x402 settles the data call in USDC → Mr. Potato pauses for your approve / deny before anything ships.

Speaker

This is the centerpiece — the Devin-track story. Devin is the autonomous engine; Mr. Potato is the human gate; Cala enriches the decision with a verified fact; x402 is just the billing rail. Run examples/flagship-demo.js live if time allows.

Try it live · in this page

Click a scenario — watch Mr. Potato raise his hand.


Speaker

This is a real, clickable simulation of the supervision loop running in the browser (the production app is Electron). Click "Devin finished a task" → Approve to show the confirm → resolve → decision-history flow; then "Cala query (x402)" for the micro-payment authorize. Clicks inside the demo don't advance the deck.

Why crypto — the necessity test

Rip out the chain. What breaks?

Agent-to-agent payments — agents pay per-call with no bank account, no human co-signer (x402 / USDC).
Trustless settlement — pay-then-prove: the server verifies the on-chain tx and rejects double-spends.
Permissionless metering — any data provider (Cala) can charge any agent, instantly, globally.

Without the chain, autonomous agents can't transact with each other — they fall back to humans and API keys. The payment rail is the part that can't be faked.

Speaker

Honest framing: crypto is a capability, not the headline. The supervision layer is the product; x402 is what makes agent-to-agent commerce actually work. Our proxy verifies each payment on-chain and rejects double-spends (pay-then-prove) rather than trusting a header.
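For a technical follow-up, pay-then-prove with replay protection can be sketched roughly as below. Every name here is hypothetical, and `lookupTx` is a stub standing in for a real on-chain query; the sketch shows the shape of the check, not the proxy's actual implementation. The 0.05 USDC price is the figure from the flagship slide.

```javascript
// Sketch of pay-then-prove with double-spend protection. All names are
// hypothetical; lookupTx is a stub standing in for a real on-chain lookup.
function createPaymentVerifier({ lookupTx, payTo, minAmount }) {
  const seen = new Set(); // transaction signatures already redeemed

  return function verify(signature) {
    if (seen.has(signature)) return { ok: false, reason: "double-spend" };
    const tx = lookupTx(signature);
    if (!tx || !tx.confirmed) return { ok: false, reason: "not-found" };
    if (tx.to !== payTo) return { ok: false, reason: "wrong-recipient" };
    if (tx.amount < minAmount) return { ok: false, reason: "underpaid" };
    seen.add(signature); // mark as spent only after every check passes
    return { ok: true };
  };
}

// Stubbed ledger. Amounts are in USDC base units (6 decimals),
// so 50000 corresponds to 0.05 USDC.
const ledger = new Map([
  ["sig-1", { to: "merchant", amount: 50000, confirmed: true }],
]);
const verify = createPaymentVerifier({
  lookupTx: (sig) => ledger.get(sig),
  payTo: "merchant",
  minAmount: 50000,
});

const first = verify("sig-1");  // accepted
const second = verify("sig-1"); // same proof replayed, so rejected
console.log(first, second);
```

A production version would persist the set of redeemed signatures, since an in-memory `Set` forgets them on restart.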

Traction · what's real today

Not a mockup. A working app.

100%

confirm loop wired end-to-end

119

tests passing (node --test)

3

agent adapters merged: Devin · Cala · x402

5

transports: inbox · HTTP · bridge · earn · x402

Electron app (widget + dashboard) · WebSocket bridge for remote agents · on-chain x402 verify on Solana devnet · MCP + VSCode + CLI surfaces. Users / pilots: TBD.

Speaker

Be blunt: no external users yet. The proof is a complete, tested, working system, not vanity metrics. If asked about traction, pivot to "the product exists and the hard part (the wired human-in-the-loop gate + real on-chain settlement) is done." Cheapest next experiment: 5 design partners running their agents through the gate.

Under the hood

Transport-agnostic by design.

Agent connections — Devin adapter card

Cala — verified facts, x402-metered

Any agent — local, remote, or service — plugs in via an adapter and emits a pending action. One confirm contract; infinite agent kinds.

Speaker

The moat is the abstraction: a single human-decision primitive that any agent transport can call. Adding a new agent = one adapter subclass. This is what makes it a platform, not a feature.
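To make "one adapter subclass" concrete, a sketch along these lines may help. The class names, method names, and the synchronous gate callback are all assumptions for illustration; the real adapter interface may differ, and a real gate would wait on a human asynchronously.

```javascript
// Hypothetical adapter shape. AgentAdapter, toConfirmPayload, handle, and
// DeployAdapter are illustrative names, not the project's actual API.
class AgentAdapter {
  constructor(name, onPending) {
    this.name = name;
    this.onPending = onPending; // gate callback: (pendingAction) => { confirmed }
  }

  /** Subclasses translate an agent-specific event into a confirm payload. */
  toConfirmPayload(event) {
    throw new Error("toConfirmPayload must be implemented by a subclass");
  }

  /** Shared path: emit the pending action and return the human's answer. */
  handle(event) {
    const payload = this.toConfirmPayload(event);
    return this.onPending({ agent: this.name, payload });
  }
}

// Adding a new agent kind means overriding one method.
class DeployAdapter extends AgentAdapter {
  toConfirmPayload(event) {
    return {
      message: `Deploy ${event.branch} to production?`,
      confirmLabel: "Approve",
      cancelLabel: "Deny",
    };
  }
}

// Stand-in for the human decision: deny anything that targets main.
const gate = (pending) => ({ confirmed: !pending.payload.message.includes("main") });
const adapter = new DeployAdapter("deploy-bot", gate);

const featureAnswer = adapter.handle({ branch: "feature-x" });
const mainAnswer = adapter.handle({ branch: "main" });
console.log(featureAnswer, mainAnswer);
```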

Market — cited, as of Jun 2026

A big wave, and a missing gate.

~$8B

AI-agents market (2025), ~45% CAGR → ~$50B by 2030

100M+

x402 agentic payments on Base (through Q1 2026)

$33T

stablecoin settlement in 2025, +72% YoY

$75B

USDC market cap — the agent currency

x402 is now a Linux Foundation project backed by Coinbase, Google, Stripe, AWS, Visa, Circle & the Solana Foundation. Wedge: developers running coding agents today → any team deploying agents that spend or ship.

Sources: MarketsandMarkets · BCC · Chainalysis · Artemis · CoinMarketCap · DefiLlama · Linux Foundation (full citations in research/MARKET.md)

Speaker

Every figure is cross-referenced against ≥2 sources (see presentation/research/MARKET.md), stamped as-of 2026-06-21 — re-pull before presenting. The story: agent adoption + a real agent-payment rail (x402, now Linux Foundation) + massive stablecoin settlement = the human approval gate is the missing primitive. TAM/SAM/SOM arithmetic is in the research doc.
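A quick sanity check on the headline projection, assuming the ~45% CAGR compounds annually over the five years from 2025 to 2030 (the compounding convention is an assumption; the inputs are the slide's rounded figures):

```javascript
// Compound-growth check for the slide's market figure.
// Assumes annual compounding over 5 years (2025 -> 2030).
function project(baseBillions, cagr, years) {
  return baseBillions * Math.pow(1 + cagr, years);
}

const projected2030 = project(8, 0.45, 5);
console.log(projected2030.toFixed(1)); // "51.3", consistent with the slide's ~$50B
```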

The ask

Back the supervision layer
for the agent era.

MEGATHON judges — recognize Mr. Potato in the Agents track.
Design partners — teams running agents who want a human gate. 5 pilots.
Ecosystem — adopt the confirm + x402 pattern as the standard agent approval primitive.

Mr. Potato — agents do the work; humans stay in control. · info@hfsp.xyz · hfsp.xyz · github.com/lpsmurf/mr-potato

Speaker

Make the ask specific and close with the one-liner. If this becomes an investor deck, swap to: amount + instrument + use of funds + 6-month milestone (e.g. "$X SAFE to reach N design partners and ship the hosted gate").
