By A. Watts, Founder
Why we built Execution Protocol
Most agent platforms answer the wrong question.
The question they answer is "how do I let a model call my tools?" The answer they give — function calling, tool schemas, OpenAPI bindings — is fine for a demo. Past the demo, in production, with real money or real systems on the other end, it stops being fine.
The right question is "how do I let a model call my tools safely, with proof, and within limits?" That's a much harder problem. It's also a different layer.
The layer that says no
When we started looking at what was missing, we kept landing on the same gap. Every framework had a way to expose tools. None had a way to not expose tools. None had a way to express "this agent can do these things, up to this dollar amount, until this time, and only with these humans signing off when it gets close to the limit." None had a way to emit proof that an action committed — proof that a third party could verify offline, without trusting us.
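As a concrete sketch of what such a grant might express — the field names and decision logic here are hypothetical illustrations, not the protocol's actual schema:

```python
import time
from dataclasses import dataclass

# Hypothetical shape of a delegation grant: the actions an agent may take,
# a spend ceiling, an expiry, and a threshold past which a human must sign off.
@dataclass(frozen=True)
class Delegation:
    allowed_actions: frozenset   # e.g. {"refund", "invoice"}
    max_spend_usd: float         # hard dollar ceiling
    expires_at: float            # unix timestamp
    approval_above_usd: float    # human sign-off required above this amount

    def decide(self, action: str, amount_usd: float, now: float) -> str:
        """Deterministic decision: allow, require approval, or deny."""
        if now >= self.expires_at:
            return "deny:expired"
        if action not in self.allowed_actions:
            return "deny:action_not_granted"
        if amount_usd > self.max_spend_usd:
            return "deny:over_limit"
        if amount_usd > self.approval_above_usd:
            return "needs_human_approval"
        return "allow"

grant = Delegation(
    allowed_actions=frozenset({"refund"}),
    max_spend_usd=500.0,
    expires_at=time.time() + 3600,   # valid for one hour
    approval_above_usd=100.0,
)

print(grant.decide("refund", 40.0, time.time()))    # allow
print(grant.decide("refund", 250.0, time.time()))   # needs_human_approval
print(grant.decide("delete_db", 0.0, time.time()))  # deny:action_not_granted
```

The point is that every clause in the sentence above becomes a checkable field, and the decision is a pure function of the grant and the request.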
That's the layer that says no. The Authorization Boundary. The delegation token. The signed receipt. They're all the same thing from different angles: explicit, deterministic, verifiable controls between a model and the systems it touches. Other tools filter what the model says. Execution Protocol controls what the model is allowed to do.
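The signed-receipt idea can be sketched in a few lines. This uses HMAC-SHA256 purely so the example is stdlib-only; a scheme matching the post's "verify with a published key" would use an asymmetric signature (e.g. Ed25519), so verifiers never hold the signing secret. Everything here — key, field names, canonicalization — is an illustrative assumption, not the protocol's actual format:

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-signing-key"  # stand-in for the gateway's secret

def sign_receipt(action: dict) -> dict:
    # Canonicalize the action record so the signed bytes are deterministic.
    body = json.dumps(action, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify_receipt(receipt: dict, key: bytes) -> bool:
    # Self-contained check: recompute the signature over the embedded body.
    expected = hmac.new(key, receipt["body"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])

r = sign_receipt({"action": "refund", "amount_usd": 40.0, "status": "committed"})
print(verify_receipt(r, SIGNING_KEY))           # True
r["body"] = r["body"].replace("40.0", "900.0")  # any tampering breaks it
print(verify_receipt(r, SIGNING_KEY))           # False
```

The receipt carries everything a verifier needs; no call back to the issuer, which is what makes offline verification possible.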
What boring buys you
We took inspiration from the technologies that became the default in their categories. Stripe in payments. PostgreSQL in databases. AWS S3 in storage. None of them are exciting. All of them are boring in exactly the same way: predictable, reliable, well-documented, no surprises, no drama.
Boring is what enterprises buy. Boring is what compliance officers reference in audits. Boring is what an SRE wants on the other end of an API call at 3am. The opposite of boring — clever, novel, full of surprises — is what kills production systems.
So we set out to be boring. The protocol is a deterministic pipeline, in order, every time. The error responses include a remediation array. The receipts are self-contained and verifiable with a published key. The threat model is public. None of this is novel; all of it is missing from the agent infrastructure landscape.
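"Deterministic pipeline" and "remediation array" can be made concrete with a small sketch. The check names, error codes, and remediation strings below are invented for illustration; only the shape — a fixed ordered list of checks, first failure wins, every denial carries machine-readable next steps — reflects what the post describes:

```python
# A fixed, ordered list of checks runs the same way on every request.
def check_token(req):
    if not req.get("token"):
        return {"code": "MISSING_TOKEN",
                "remediation": ["Attach a delegation token to the request."]}

def check_action(req):
    if req["action"] not in req["token"]["allowed_actions"]:
        return {"code": "ACTION_NOT_GRANTED",
                "remediation": ["Request a grant covering this action."]}

def check_limit(req):
    if req["amount_usd"] > req["token"]["max_spend_usd"]:
        return {"code": "SPEND_LIMIT_EXCEEDED",
                "remediation": ["Lower the requested amount.",
                                "Ask the grantor for a higher ceiling."]}

PIPELINE = [check_token, check_action, check_limit]  # same order, every time

def authorize(req):
    for check in PIPELINE:
        error = check(req)
        if error:
            return {"status": "denied", **error}
    return {"status": "allowed"}

token = {"allowed_actions": {"refund"}, "max_spend_usd": 100.0}
print(authorize({"token": token, "action": "refund", "amount_usd": 40.0})["status"])   # allowed
print(authorize({"token": token, "action": "refund", "amount_usd": 400.0})["code"])    # SPEND_LIMIT_EXCEEDED
```

No model output, no heuristics, no nondeterminism: the same request gets the same answer, and a denial always tells the caller what to do next.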
What's not here
We're not building a framework. There are good ones already (LangChain, AutoGPT, Claude SDK, OpenAI Assistants). The protocol is framework-agnostic — wrap your tool calls in a delegation token, and the Authorization Boundary does the rest.
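One way to picture "wrap your tool calls in a delegation token," independent of any framework: a decorator that consults the boundary before the tool runs. `boundary_check` here is a hypothetical stand-in for a call to the Authorization Boundary, not the protocol's API:

```python
import functools

def boundary_check(token: dict, tool_name: str) -> bool:
    # Stand-in for the real Authorization Boundary call.
    return tool_name in token["allowed_actions"]

def guarded(token: dict):
    """Wrap any framework's tool function so the boundary decides first."""
    def wrap(tool):
        @functools.wraps(tool)
        def inner(*args, **kwargs):
            if not boundary_check(token, tool.__name__):
                raise PermissionError(f"{tool.__name__} not granted")
            return tool(*args, **kwargs)
        return inner
    return wrap

token = {"allowed_actions": {"send_invoice"}}

@guarded(token)
def send_invoice(customer_id: str, amount_usd: float) -> str:
    return f"invoice sent to {customer_id} for ${amount_usd:.2f}"

print(send_invoice("cus_123", 20.0))  # runs: the action is granted
```

The tool body is untouched, which is the framework-agnostic point: the same wrapper works whether the function is registered with LangChain, an Assistants tool schema, or called by hand.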
We're not building a model. The model layer is its own thing. Models generate intentions; the protocol commits actions. Different concerns, different layers.
We're not selling a vision of AI safety. We're shipping a specific layer that handles a specific problem: turning agent intent into committed action with proof and policy. The vision is for the next person to articulate. We're heads-down on the layer.
What's next
The sandbox is live at /demos/sandbox — try a successful action and a blocked action without signing up, and verify the resulting receipt offline. The open specification is at /docs/spec. The hosted production gateway is in private beta with a small group of design partners; /signup is the waitlist.
If you're planning to run agents in production and the question of "what happens when one of them does something it shouldn't" keeps you up at night, we built this for you.