🥔
Mr. Potato
The human-in-the-loop supervision layer for autonomous AI agents.
Your agents run on their own. Mr. Potato sits on your desktop, watches what they do,
and raises his hand at high-stakes moments to ask for your approval before the action proceeds.
MEGATHON · Agents track
HFSP Labs · info@hfsp.xyz
Speaker
Bar-test one-liner: "It's the approve / deny button for your AI agents." Hold the title 5 seconds, then jump straight to the problem. Don't explain crypto yet: no one cares about chains until they care about the problem.
The problem
Agents are getting autonomy
faster than we're getting control.
Deploys
An agent pushes to main or ships to prod, and no one signed off.
Spend
An agent pays for an API or moves funds with no budget gate.
Irreversible
Deletes, migrations, on-chain txns you can't take back.
Logs and dashboards are after-the-fact. By the time you read them, the action already happened.
Speaker
Make it visceral with a real story (Devin pushing a branch to main). The gap isn't observability; it's a decision point. Objection prep: "Isn't this just a CI approval gate?" → No: it's transport-agnostic, real-time, and routes the human decision straight back into the agent's loop regardless of where the agent runs.
Why now
Two curves just crossed.
Agents went autonomous
Devin and peers now open PRs, deploy, and act with minimal oversight. Autonomy is the product.
Agents can now pay
x402 (HTTP 402) gives agents a native way to pay per-call in USDC, with no human co-signer in the request path.
More autonomy + spending power = the human approval gate is now the missing safety primitive.
Speaker
The "why now" is the crossing of these two curves in 2025–26, not "crypto is growing." Push hard on: the moment agents can both act and pay unsupervised, you need a gate. Market figures are directional; present them as the trend, not precise stats.
The solution
A thin layer of human judgment,
in the agent's loop.
01
Agents run
Local IDE agents, remote LLMs, HTTP services, earn workers: all report in over agent transports.
→
02
Mr. Potato gates
At a high-stakes action it surfaces a confirm notification. The mascot raises his hand.
→
03
You decide
Approve and the decision routes straight back to the agent so it continues. Deny and it stops.
confirm payload: { message, confirmLabel, cancelLabel } → answer: { confirmed: boolean }
Speaker
Emphasize "in the loop, not after it." The confirm kind is fully wired end-to-end today (transport → notification → human → agent). This is the one slide that defines the category: AgentOps / a supervision layer.
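The confirm contract on this slide can be sketched in a few lines. This is a minimal illustration, not the app's actual code: only the payload shape `{ message, confirmLabel, cancelLabel }` and the answer shape `{ confirmed: boolean }` come from the slide. The helper names `makeConfirmPayload` and `gate`, the default labels, and the synchronous `askHuman` callback are assumptions made to keep the example self-contained.

```javascript
// Hypothetical helper names; only the payload and answer shapes come from the slide.

// Build a confirm payload: { message, confirmLabel, cancelLabel }.
function makeConfirmPayload(message, confirmLabel = "Approve", cancelLabel = "Deny") {
  if (typeof message !== "string" || message.length === 0) {
    throw new TypeError("message must be a non-empty string");
  }
  return { message, confirmLabel, cancelLabel };
}

// Run `action` only if the human's answer is { confirmed: true }.
// `askHuman` stands in for the transport -> notification -> human round trip.
function gate(payload, askHuman, action) {
  const answer = askHuman(payload);
  if (typeof answer?.confirmed !== "boolean") {
    throw new TypeError("answer must be { confirmed: boolean }");
  }
  return answer.confirmed
    ? { confirmed: true, result: action() }
    : { confirmed: false, result: null };
}

const payload = makeConfirmPayload("Deploy to production?");
const approved = gate(payload, () => ({ confirmed: true }), () => "deployed");
const denied = gate(payload, () => ({ confirmed: false }), () => "deployed");
console.log(approved, denied);
```

In a real deployment the human's answer would arrive asynchronously; the synchronous callback here only keeps the control flow easy to follow.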
Demo ยท it's real
A desktop companion that asks before it acts.

Idle: watching

Hand raised: needs you

Resolved: done
Speaker
Switch to the live app here if possible; judges reward a working demo. The widget is transparent, always-on-top, pixel-art. State machine: idle → hand-raised (pending approval) → done.
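The widget's state machine can be sketched as a small transition table. The three states come from the note above; the event names (`actionPending`, `decided`, `reset`) and the `done → idle` reset are assumptions added for illustration, not taken from the app.

```javascript
// Hypothetical event names; the states idle, hand-raised, and done come from the note.
const TRANSITIONS = {
  idle: { actionPending: "hand-raised" },
  "hand-raised": { decided: "done" },
  done: { reset: "idle" }, // assumed: the widget returns to idle after resolving
};

// Return the next state, or throw if the event is not valid in the current state.
function nextState(state, event) {
  const next = TRANSITIONS[state]?.[event];
  if (!next) {
    throw new Error(`invalid transition: ${state} + ${event}`);
  }
  return next;
}

let widgetState = "idle";
widgetState = nextState(widgetState, "actionPending"); // hand raised, waiting on you
widgetState = nextState(widgetState, "decided");       // approve or deny received
console.log(widgetState);
```

Rejecting unknown transitions means the widget cannot jump from idle to done without a human decision in between.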
The flagship flow
Devin acts · Cala verifies · x402 meters · you approve.

Human approval gate: "Deploy to production?"

x402: authorize 0.05 USDC for verified data
Devin (the engine) prepares a change → Cala returns a sourced, verified fact → x402 settles the data call in USDC → Mr. Potato pauses for your approve / deny before anything ships.
Speaker
This is the centerpiece: the Devin-track story. Devin is the autonomous engine; Mr. Potato is the human gate; Cala enriches the decision with a verified fact; x402 is just the billing rail. Run examples/flagship-demo.js live if time allows.
Try it live · in this page
Click a scenario, then watch Mr. Potato raise his hand.
interactive
Speaker
This is a real, clickable simulation of the supervision loop running in the browser (the production app is Electron). Click "Devin finished a task" → Approve to show the confirm → resolve → decision-history flow; then "Cala query (x402)" for the micro-payment authorize. Clicks inside the demo don't advance the deck.
Why crypto: the necessity test
Rip out the chain. What breaks?
- Agent-to-agent payments: agents pay per-call with no bank account, no human co-signer (x402 / USDC).
- Trustless settlement (pay-then-prove): the server verifies the on-chain tx and rejects double-spends.
- Permissionless metering: any data provider (Cala) can charge any agent, instantly, globally.
Without the chain, autonomous agents can't transact with each other; they fall back to humans and API keys. The payment rail is the part that can't be faked.
Speaker
Honest framing: crypto is a capability, not the headline. The supervision layer is the product; x402 is what makes agent-to-agent commerce actually work. Our proxy does real on-chain verify + double-spend protection (pay-then-prove), not a trust-me header.
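The pay-then-prove idea can be sketched as follows. This is an illustration under stated assumptions, not the proxy's real code: `createPaymentVerifier`, the `lookupTx` callback (a stand-in for an on-chain transaction lookup), the in-memory `Set` of seen transaction IDs, and the wallet name are all hypothetical. Amounts are in USDC base units, where 0.05 USDC is 50,000 units because USDC uses 6 decimals.

```javascript
// Hypothetical sketch: verify a payment proof and reject replays.
function createPaymentVerifier({ recipient, minAmount, lookupTx }) {
  const seen = new Set(); // transaction IDs already redeemed (in-memory for the sketch)

  return function verify(txId) {
    if (seen.has(txId)) return { ok: false, reason: "double-spend" };
    const tx = lookupTx(txId); // assumed to return { recipient, amount } or null
    if (!tx) return { ok: false, reason: "not-found" };
    if (tx.recipient !== recipient) return { ok: false, reason: "wrong-recipient" };
    if (tx.amount < minAmount) return { ok: false, reason: "underpaid" };
    seen.add(txId); // mark as redeemed only after every check passes
    return { ok: true };
  };
}

// Stub ledger standing in for the chain.
const ledger = new Map([["tx-1", { recipient: "cala-wallet", amount: 50_000 }]]);
const verify = createPaymentVerifier({
  recipient: "cala-wallet",
  minAmount: 50_000, // 0.05 USDC in base units
  lookupTx: (txId) => ledger.get(txId) ?? null,
});

const first = verify("tx-1");
const replay = verify("tx-1");
const missing = verify("tx-404");
console.log(first, replay, missing);
```

A production verifier would persist the seen-transaction set and check confirmation status on-chain; the sketch only shows why a replayed proof is rejected.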
Traction · what's real today
Not a mockup. A working app.
100%
confirm loop wired end-to-end
119
tests passing (node --test)
3
agent adapters merged: Devin · Cala · x402
5
transports: inbox · HTTP · bridge · earn · x402
Electron app (widget + dashboard) · WebSocket bridge for remote agents · on-chain x402 verify on Solana devnet · MCP + VSCode + CLI surfaces. Users / pilots: TBD.
Speaker
Be blunt: no external users yet. The proof is a complete, tested, working system, not vanity metrics. If asked about traction, pivot to "the product exists and the hard part (the wired human-in-loop + real on-chain settlement) is done." Cheapest next experiment: 5 design partners running their agents through the gate.
Under the hood
Transport-agnostic by design.

Agent connections: Devin adapter card

Cala: verified facts, x402-metered
Any agent (local, remote, or service) plugs in via an adapter and emits a pending action. One confirm contract; infinite agent kinds.
Speaker
The moat is the abstraction: a single human-decision primitive that any agent transport can call. Adding a new agent = one adapter subclass. This is what makes it a platform, not a feature.
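The "one adapter subclass" idea might look like the sketch below. The class names `AgentAdapter` and `DeployAgentAdapter`, the `toConfirmPayload` and `handle` methods, and the event shape are hypothetical and are not taken from the repository; only the confirm payload fields come from the deck.

```javascript
// Hypothetical base class: each adapter turns a transport-specific event
// into the shared confirm payload and hands it to one pending-action sink.
class AgentAdapter {
  constructor(name, onPending) {
    this.name = name;
    this.onPending = onPending; // callback that receives every pending action
  }

  // Subclasses override this to map their own events to a confirm payload.
  toConfirmPayload(_event) {
    throw new Error("toConfirmPayload must be implemented by a subclass");
  }

  handle(event) {
    const payload = this.toConfirmPayload(event);
    return this.onPending({ agent: this.name, ...payload });
  }
}

// Adding a new agent kind is one subclass.
class DeployAgentAdapter extends AgentAdapter {
  toConfirmPayload(event) {
    return {
      message: `Deploy ${event.branch} to production?`,
      confirmLabel: "Approve",
      cancelLabel: "Deny",
    };
  }
}

const pendingQueue = [];
const adapter = new DeployAgentAdapter("deploy-agent", (action) => pendingQueue.push(action));
adapter.handle({ branch: "main" });
console.log(pendingQueue);
```

Because every adapter emits the same payload shape, the notification and decision code never needs to know which transport the agent used.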
Market: cited, as of Jun 2026
A big wave, and a missing gate.
~$8B
AI-agents market (2025), ~45% CAGR → ~$50B by 2030
100M+
x402 agentic payments on Base (through Q1 2026)
$33T
stablecoin settlement in 2025, +72% YoY
$75B
USDC market cap: the agent currency
x402 is now a Linux Foundation project backed by Coinbase, Google, Stripe, AWS, Visa, Circle & the Solana Foundation. Wedge: developers running coding agents today → any team deploying agents that spend or ship.
Sources: MarketsandMarkets · BCC · Chainalysis · Artemis · CoinMarketCap · DefiLlama · Linux Foundation (full citations in research/MARKET.md)
Speaker
Every figure is cross-referenced against ≥2 sources (see presentation/research/MARKET.md), stamped as-of 2026-06-21; re-pull before presenting. The story: agent adoption + a real agent-payment rail (x402, now Linux Foundation) + massive stablecoin settlement = the human approval gate is the missing primitive. TAM/SAM/SOM arithmetic is in the research doc.
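As a quick sanity check on the headline projection, the compounding can be reproduced directly. This assumes the ~$8B figure is the 2025 base and that ~45% CAGR compounds over the five years to 2030; the function name `projectMarket` is illustrative. The rounded inputs give roughly $51B, in line with the "~$50B" on the slide.

```javascript
// Compound a base value at a constant annual growth rate.
function projectMarket(baseBillions, cagr, years) {
  return baseBillions * (1 + cagr) ** years;
}

// Assumed inputs: ~$8B in 2025, ~45% CAGR, 5 years to 2030.
const projected2030 = projectMarket(8, 0.45, 5);
console.log(projected2030.toFixed(1)); // 51.3 (billions USD)
```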
The ask
Back the supervision layer
for the agent era.
- MEGATHON judges: recognize Mr. Potato in the Agents track.
- Design partners: teams running agents who want a human gate. 5 pilots.
- Ecosystem: adopt the confirm + x402 pattern as the standard agent approval primitive.
Mr. Potato: agents do the work; humans stay in control.
info@hfsp.xyz · hfsp.xyz · github.com/lpsmurf/mr-potato
Speaker
Make the ask specific and close with the one-liner. If this becomes an investor deck, swap to: amount + instrument + use of funds + 6-month milestone (e.g. "$X SAFE to reach N design partners and ship the hosted gate").