MCPWorks

The Token Efficiency Manifesto

The problem

Every time an AI agent invokes a tool through the Model Context Protocol, it loads the tool's full JSON schema into its context window. A single MCP server with 20 tools can consume 40,000 tokens before the agent does any real work. Those tokens cost money. They consume compute. And they reduce the space available for actual reasoning.

The industry has normalized this waste. Agent frameworks ship with default configurations that load every tool schema on every call. Developers pay for it in their LLM bills. The models pay for it in reduced context for the task at hand. Nobody talks about it because the cost is distributed across millions of API calls, invisible until the monthly invoice arrives.

The cost

At current pricing across major LLM providers, 40,000 wasted tokens per tool call costs $0.01-$0.12 depending on the model. That sounds small. Run 1,000 agent executions per day and the waste is $10-$120 daily — $300-$3,600 per month — on tokens that contribute nothing to the output. For teams running production agents at scale, token waste is a line item that rivals infrastructure costs.
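The arithmetic above can be sketched in a few lines. The per-million-token prices below are illustrative input-token rates chosen to reproduce the document's $0.01-$0.12 per-call range, not quotes from any specific provider:

```python
# Back-of-the-envelope cost of schema overhead per tool call.
WASTED_TOKENS_PER_CALL = 40_000  # schema tokens that do no useful work

def monthly_waste(price_per_million: float, calls_per_day: int, days: int = 30) -> float:
    """Dollars per month spent on wasted schema tokens."""
    per_call = WASTED_TOKENS_PER_CALL / 1_000_000 * price_per_million
    return per_call * calls_per_day * days

# A budget model (~$0.25/M input tokens) vs. a frontier model (~$3/M),
# at 1,000 agent executions per day.
print(f"${monthly_waste(0.25, 1_000):,.2f}/month")  # low end:  $300.00/month
print(f"${monthly_waste(3.00, 1_000):,.2f}/month")  # high end: $3,600.00/month
```

At 40,000 tokens per call, the waste scales linearly with both call volume and model price, which is why it stays invisible at prototype scale and becomes a line item in production.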

What we believe

Token efficiency is not an optimization. It is an obligation to the developers paying the bills and to the compute infrastructure sustaining AI. Every token should earn its place in the context window. If a model can accomplish the same task with 1,200 tokens instead of 41,000, the 39,800-token difference is not a "nice to have" — it is waste that should not exist.


How we fix it

MCPWorks uses code-mode execution. Instead of loading verbose tool schemas into the context window and asking the model to format structured JSON calls, the AI writes Python or TypeScript code that executes directly in a secure nsjail sandbox. The model already knows how to write code. It does not need a schema to tell it how.

The result: 70-98% fewer context tokens for the same operations. The model spends its context budget on reasoning, not on parsing tool definitions it has seen thousands of times during training.

Why we open-sourced it

Token waste is an industry problem, not a single-company problem. Gating the solution behind a paywall would help our customers but leave the ecosystem unchanged. We released the MCPWorks platform under BSL 1.1 because the tools to fix this should be available to everyone who builds with AI agents. Self-host it, modify it, run it in production. The license converts to Apache 2.0 after 4 years.

For teams that want managed hosting without the operational burden, MCPWorks Cloud runs the same open-source code with namespaces, subdomains, and SLA guarantees. That is how we sustain the project. But the code is yours.

The commitment

We will measure and publish the token efficiency of every feature we ship. We will not add bloat to the context window to sell a premium tier. We will treat token waste the way the industry should have been treating it from the start: as a defect, not a feature.