What is Code-Mode Execution in MCP?

Code-mode execution is an architecture for the Model Context Protocol (MCP) where the AI writes code that runs in a secure sandbox instead of loading tool schemas into its context window. According to Anthropic's Code Execution MCP research (January 2026), code-mode execution achieves 70-98% token savings compared to traditional MCP tool loading.

This approach was pioneered by MCPWorks as the foundation of its namespace-based function hosting platform for AI assistants.

How traditional MCP tool loading works

In the standard MCP architecture, every tool connected to an AI assistant must have its complete schema loaded into the AI's context window. This schema includes the tool's name, description, input parameters, output format, and usage examples.
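To make that overhead concrete, here is a sketch of what one tool definition might look like. The `name`, `description`, and `inputSchema` fields follow the MCP tool definition format. The `query_orders` tool itself and the four-characters-per-token estimate are illustrative assumptions, not measurements.

```python
import json

# Hypothetical tool definition; only the field names come from MCP.
query_orders_tool = {
    "name": "query_orders",
    "description": "Return customer orders filtered by status and date range.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "status": {"type": "string", "enum": ["open", "shipped", "cancelled"]},
            "since": {"type": "string", "description": "ISO 8601 start date"},
            "limit": {"type": "integer", "minimum": 1, "maximum": 500},
        },
        "required": ["status"],
    },
}


def estimate_tokens(obj) -> int:
    """Rough heuristic: about 4 characters per token. Real tokenizers differ."""
    return len(json.dumps(obj)) // 4


print(estimate_tokens(query_orders_tool))
```

Even this small definition costs a measurable number of tokens, and the cost repeats for every tool on every connected server.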

For a single MCP server with 15 tools, this might consume 10,000-15,000 tokens. But developers rarely use just one server. A typical development setup might include:

A database server (10-20 tools)
A file system server (5-10 tools)
A web search server (3-5 tools)
A project management server (10-15 tools)
A deployment server (5-10 tools)

Scale that to 10 servers averaging 15 tools each, and the AI's context window can absorb up to roughly 150,000 tokens of tool definitions before the user asks a single question. Because tool definitions count as input tokens on each request, this overhead adds cost to every interaction.
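The arithmetic behind that estimate can be checked with a short sketch. The function name is hypothetical, and the only inputs are the per-server figures quoted above.

```python
# Figures taken from the text: one 15-tool server costs 10,000-15,000 tokens.
TOKENS_PER_SERVER = (10_000, 15_000)


def schema_overhead(servers: int) -> tuple[int, int]:
    """Return the (low, high) token cost of loading every server's schemas."""
    low, high = TOKENS_PER_SERVER
    return servers * low, servers * high


low, high = schema_overhead(10)
print(f"{low:,} to {high:,} tokens")  # 100,000 to 150,000 tokens
```

The 150,000-token figure corresponds to the upper end of the per-server range.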

How code-mode execution works

Code-mode execution inverts this model. Instead of loading full tool schemas, the AI receives only a list of available function names and a single tool: a code execution sandbox.

The process works in three steps:

1. Function discovery: The AI receives a lightweight list of available functions (names and one-line descriptions). This consumes approximately 2,000 tokens instead of 150,000.
2. Code generation: When the AI needs to accomplish a task, it writes Python or TypeScript code that calls the available functions. The AI already knows how to write code, so it doesn't need a full schema to construct function calls.
3. Sandbox execution: The generated code runs inside an isolated sandbox. Intermediate data (API responses, database results, file contents) stays within the sandbox. Only the final result returns to the AI's context.

This third step is what makes code-mode execution architecturally distinct: intermediate data never enters the AI's context window.
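The three steps above can be sketched in a few lines of Python. Everything named here (`list_invoices`, `FUNCTIONS`, `run_in_sandbox`) is hypothetical, and `exec()` only stands in for the sandbox so the data flow is visible. It provides no isolation and should not be read as how any real platform executes code.

```python
# Hypothetical hosted function; in a real deployment it lives in the sandbox.
def list_invoices(customer_id: str) -> list[dict]:
    """List invoices for a customer."""
    return [{"id": "inv-1", "total": 120.0}, {"id": "inv-2", "total": 80.0}]


FUNCTIONS = {"list_invoices": list_invoices}


# Step 1: function discovery. The model sees names and one-line descriptions only.
def discovery_listing() -> str:
    return "\n".join(f"{name}: {fn.__doc__}" for name, fn in FUNCTIONS.items())


# Step 2: code generation. This string stands in for code the model would write.
generated_code = """
invoices = list_invoices("cust-42")
result = f"{len(invoices)} invoices, total {sum(i['total'] for i in invoices):.2f}"
"""


# Step 3: sandbox execution. exec() is NOT a security boundary; it is a
# placeholder for the isolated sandbox described in the text.
def run_in_sandbox(code: str) -> str:
    scope = dict(FUNCTIONS)
    exec(code, scope)
    return str(scope["result"])  # only the final result leaves the sandbox


print(discovery_listing())               # list_invoices: List invoices for a customer.
print(run_in_sandbox(generated_code))    # 2 invoices, total 200.00
```

The invoice records exist only inside `scope`. The caller, standing in for the AI's context, sees a one-line string.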

Why intermediate data isolation matters

Consider a task where an AI processes 50 customer orders. In the traditional MCP model:

The AI calls a "get orders" tool
All 50 order objects (perhaps 25,000 tokens) flow into the AI's context
The AI reasons over them and produces a summary
Total context consumed: tool schemas + 25,000 tokens of order data + prompt + response

In code-mode execution:

The AI writes a Python script that fetches and processes the 50 orders
The script runs in the sandbox
Only the summary ("Processed 50 orders, 3 flagged for review") returns to the AI
Total context consumed: function names + generated code + summary

This is both a cost optimization and a privacy feature. Sensitive data — customer records, financial figures, authentication tokens — stays in the sandbox rather than flowing through the AI's context where it might be logged or retained.
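As a rough illustration of the code-mode path, the script the AI writes for the order task might look like the sketch below. The `get_orders` function, the order fields, and the review rule are all invented for the example. A real script would call whatever functions the sandbox exposes.

```python
# Hypothetical sandbox-side function; a real one would call an orders API.
def get_orders() -> list[dict]:
    return [{"id": n, "total": 40.0 * n} for n in range(1, 51)]


def summarize_orders() -> str:
    orders = get_orders()  # all 50 records stay inside the sandbox
    # Illustrative review rule: flag any order with a total above 1,900.
    flagged = [o for o in orders if o["total"] > 1900]
    return f"Processed {len(orders)} orders, {len(flagged)} flagged for review"


print(summarize_orders())  # Processed 50 orders, 3 flagged for review
```

Only the returned string reaches the AI's context. The order objects themselves are never serialized into the conversation.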

The 70-98% savings range

The savings range depends on the workload. Anthropic's research found:

70% savings on simple tasks where the AI needs to call a few functions with small responses
90%+ savings on data processing tasks where large datasets would otherwise enter the context
98% savings on batch operations where hundreds of API calls would each return data to the context

The primary variable is how much intermediate data the traditional approach would push into the AI's context. Tasks that involve large datasets or many sequential API calls see the highest savings.
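The savings figure is the fraction of tokens that no longer enter the context. A minimal sketch, using a hypothetical helper name and the discovery figures quoted earlier in this article as inputs:

```python
def token_savings(traditional_tokens: int, code_mode_tokens: int) -> float:
    """Fraction of context tokens saved by code mode for one task."""
    return 1 - code_mode_tokens / traditional_tokens


# 150,000 tokens of schemas versus a 2,000-token function listing.
print(f"{token_savings(150_000, 2_000):.1%}")  # 98.7%
```

The smaller the traditional total, for example a few tools with small responses, the lower the ratio.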

Code-mode execution and MCPWorks

MCPWorks implements code-mode execution as the core of its function hosting platform. Developers write Python or TypeScript functions. MCPWorks hosts them in nsjail sandboxes isolated with Linux namespaces, cgroups, and seccomp filtering, and any MCP-compatible AI client can invoke them directly over HTTPS.

The sandbox isolation is critical for code-mode execution to work safely. AI-generated code runs in an environment with:

Process isolation via Linux namespaces
Resource limits via cgroups (CPU, memory, network)
System call filtering via seccomp (preventing dangerous operations)
Network isolation with controlled external access

Together, these layers let the AI write and execute code while containing its effects, which greatly reduces the risk to the host system and to other tenants.
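To give a feel for what resource limiting means in practice, the sketch below caps CPU time and memory for a child Python process. It deliberately uses POSIX rlimits from the standard library rather than nsjail, namespaces, cgroups, or seccomp, so it is a much weaker mechanism than the one described above and is not how MCPWorks is implemented. It assumes a Linux host, and the function names and limit values are hypothetical.

```python
import resource
import subprocess
import sys


def _cap(kind: int, wanted: int) -> None:
    """Lower one rlimit, never asking for more than the current hard limit."""
    _, hard = resource.getrlimit(kind)
    value = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    resource.setrlimit(kind, (value, value))


def _apply_limits() -> None:
    # Runs in the child process just before the untrusted code starts.
    _cap(resource.RLIMIT_CPU, 2)        # seconds of CPU time
    _cap(resource.RLIMIT_AS, 1 << 30)   # 1 GiB of address space


def run_limited(code: str) -> str:
    """Run code in a separate interpreter with CPU and memory caps."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=10,
        preexec_fn=_apply_limits,
    )
    return proc.stdout.strip()


print(run_limited("print(sum(range(10)))"))
```

A script that exceeds the CPU cap is killed by the kernel instead of stalling the host. Filesystem, network, and system-call restrictions would still need the namespace and seccomp layers listed above.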

When to use code-mode execution

Code-mode execution is most effective when:

You have many tools — The more tools connected, the greater the context savings
Tasks involve data processing — Batch operations, data transformation, and aggregation benefit most
Sensitive data is involved — Data stays in the sandbox rather than flowing through the AI's context
Cost is a concern — Token savings directly reduce AI API costs at scale

Traditional tool loading remains appropriate for simple setups with few tools and small data volumes, where the schema overhead is negligible.