When MCP is worth the token cost, when it's not, and why the separation between reasoning and action is the only argument that matters.
MCP's value is one thing: an explicit boundary between the model reasoning and the system acting.
Everything on the left is thinking. Everything on the right is doing. They're in separate processes with separate permissions, separate logs, separate failure modes. Without MCP, thinking and doing happen in the same undifferentiated stream.
MCP is not inherently token-efficient. One measured setup: 4 servers, 58 tools, 55,000 tokens consumed before a single prompt. That's half the context window gone.
With Tool Search, the reported reduction is 85%.
Tool Search helps. But the real token savings from MCP aren't at the schema layer — they're at the result layer. A tool returning pending: 142 (5 tokens) vs psql dumping a formatted ASCII table (500+ tokens) for the same data.
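To make the result-layer difference concrete, here is a minimal Python sketch comparing a compact structured result with a CLI-style ASCII table of the underlying rows. The row data, the column names, and the 4-characters-per-token estimate are all assumptions made up for illustration; a real tokenizer will give different numbers, but the order-of-magnitude gap is the point.

```python
import json

def approx_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption, not a real tokenizer)."""
    return max(1, len(text) // 4)

def compact_result(pending: int) -> str:
    """What a tool can return when the counting happens server-side."""
    return json.dumps({"pending": pending})

def ascii_table(rows) -> str:
    """Render rows the way a CLI client might: borders, padding, and a row-count footer."""
    header = ("id", "status", "created_at")  # hypothetical columns
    widths = [max(len(str(r[i])) for r in [header, *rows]) for i in range(len(header))]
    sep = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def fmt(row) -> str:
        return "| " + " | ".join(str(c).ljust(w) for c, w in zip(row, widths)) + " |"

    lines = [sep, fmt(header), sep, *[fmt(r) for r in rows], sep, f"({len(rows)} rows)"]
    return "\n".join(lines)

# Made-up data: 142 pending rows.
rows = [(i, "pending", "2024-01-01 00:00:00") for i in range(1, 143)]

small = compact_result(len(rows))
big = ascii_table(rows)

print("structured result:", approx_tokens(small), "tokens (approx.)")
print("ASCII table:      ", approx_tokens(big), "tokens (approx.)")
```

The model learns the same fact either way; only one of the two spends context on borders and padding.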
Short version: MCP is worth the cost when you need the boundary. The boundary gives you auditability, credential isolation, and failure isolation. If you don't need those, a Bash command or a skill is simpler and cheaper.
| Scenario | Right tool | Why |
|---|---|---|
| Query a production database | MCP | Credentials stay server-side. Every query logged. |
| Explain your codebase conventions | Skill / CLAUDE.md | Needs to be IN context to affect reasoning. |
| Run git status | Bash | No credentials, no audit need, no state. |
| Handle payment API keys | MCP | Structural security: model never sees sk_live_... |
| Format a markdown table | Skill | String manipulation doesn't need IPC. |
| Search a 10K-doc knowledge base | MCP | Persistent index in server memory. Structured results. |
| Deep multi-step research | Agent | Needs its own context window for extended reasoning. |
In regulated environments, "what did the AI do?" needs to be answerable from structured logs, not by reading a conversation transcript.
| Compliance need | MCP gives you | Skills give you |
|---|---|---|
| "What actions did the AI take?" | Structured event log | Parse a transcript |
| "Did it access credentials?" | Opaque handles — it never had them | Hope the model didn't print them |
| "Can we replay what happened?" | Same params = same result | Context-dependent, non-deterministic |
| "Can we restrict what it can do?" | Server-side permission checks | Prompt-level "please don't" |
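The MCP column of that table can be sketched in a few lines of Python. This is not the MCP SDK or any real server: `ToolBoundary`, `register_secret`, and the `charge` tool are hypothetical names invented for illustration. The sketch only shows the shape of the idea: secrets live behind opaque handles, permissions are checked on the server side, and every call leaves a structured event.

```python
import json
import time
import uuid

class ToolBoundary:
    """Hypothetical server-side wrapper: holds secrets, checks permissions, logs every call."""

    def __init__(self, allowed_tools):
        self._allowed = set(allowed_tools)
        self._secrets = {}    # handle -> real credential; never returned to the caller
        self.audit_log = []   # structured events, one per call

    def register_secret(self, value: str) -> str:
        """Store a credential and hand back an opaque handle in its place."""
        handle = f"cred_{uuid.uuid4().hex[:8]}"
        self._secrets[handle] = value
        return handle

    def call(self, tool: str, params: dict, handler):
        # The event records the params as the caller sent them (handles, not secrets).
        event = {"ts": time.time(), "tool": tool, "params": dict(params)}
        if tool not in self._allowed:
            event["outcome"] = "denied"
            self.audit_log.append(event)
            raise PermissionError(f"tool not permitted: {tool}")
        # Handles are swapped for real credentials only here, on the server side.
        resolved = {
            k: self._secrets.get(v, v) if isinstance(v, str) else v
            for k, v in params.items()
        }
        result = handler(**resolved)
        event["outcome"] = "ok"
        self.audit_log.append(event)
        return result

# Usage with a made-up tool and a placeholder key.
boundary = ToolBoundary(allowed_tools={"charge"})
handle = boundary.register_secret("sk_test_example_not_real")
receipt = boundary.call(
    "charge",
    {"api_key": handle, "amount": 5},
    lambda api_key, amount: f"charged {amount}",
)
print(json.dumps(boundary.audit_log, indent=2))
```

Answering "what did the AI do?" then means reading `audit_log`, and the credential never appears in it because the caller only ever held the handle.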
Four dimensions for evaluating whether a tool respects the model's limited attention budget:
If you stop using this tool tomorrow, what do you keep?
Can you cat the data it produces? Can you open it in another app? If the answer is no, you haven't built a tool. You've built a dependency.
Claude Code spawns MCP servers without sourcing your shell profile. python3 resolves to system Python. npx isn't found. Every environment variable from .zshrc is missing.
# WRONG (will fail):
{ "command": "python3", "args": ["server.py"] }
# RIGHT:
{ "command": "/home/you/.conda/envs/myenv/bin/python3",
"args": ["/absolute/path/to/server.py"],
"env": { "PYTHONPATH": "/absolute/path/to/project" } }
# For npx — always add -y (no interactive prompt in stdio mode):
{ "command": "/home/you/.nvm/versions/node/v22/bin/npx",
"args": ["-y", "@some/mcp-server@latest"] }
To test a config, launch the server from an empty environment: env -i /absolute/path/to/python3 /absolute/path/to/server.py. If it works in a bare shell with zero environment, it works in Claude Code. If it doesn't, it won't.
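The same env -i check can be scripted. The sketch below is one possible helper, not part of Claude Code or MCP; `check_bare_launch` is a hypothetical name. It launches a command with only the environment you pass in (empty by default), rejects non-absolute commands up front, and treats "still running after the timeout" as a successful start, on the assumption that a stdio server normally sits waiting for input.

```python
import os
import subprocess
import sys

def check_bare_launch(command, args, env=None, timeout=3.0):
    """Launch command with ONLY the given env (like `env -i`). Returns (ok, detail)."""
    if not os.path.isabs(command):
        # A bare name depends on PATH, which is exactly what the launcher may not have.
        return False, f"command is not an absolute path: {command}"
    try:
        proc = subprocess.Popen(
            [command, *args],
            env=dict(env or {}),          # nothing inherited from the current shell
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError as exc:                # e.g. file not found, not executable
        return False, str(exc)
    try:
        _, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Still alive: assume the server started and is waiting for requests.
        proc.kill()
        proc.communicate()
        return True, "still running after timeout"
    return proc.returncode == 0, err.decode(errors="replace").strip()

if __name__ == "__main__":
    # Example: the current interpreter starts fine with an empty environment.
    ok, detail = check_bare_launch(sys.executable, ["-c", "pass"])
    print("bare launch ok:", ok, detail)
```

Pass the same command, args, and env values you put in the server config; anything the server needs beyond those will show up as a failure here instead of inside Claude Code.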
MCP's value isn't token savings. It's the explicit boundary between reasoning and action. If you'd put a code review gate between "the AI decided to do X" and "X actually happened" — that gate is an MCP server. For everything else, there's a simpler option. Use it.