
The problem

Coding agents waste most of their compute on things that aren’t reasoning. Cognition measured this: their agent spent 60% of turns searching for code. Anthropic found that multi-agent architectures improved task completion by over 90% versus a single-agent baseline when mechanical work runs in specialized subagents. And if you’ve built an agent that edits files, you’ve seen the failure mode: your model rewrites a 500-line file to change 3 lines, burns tokens, and introduces drift. These aren’t reasoning problems. They’re mechanical problems, and they have mechanical solutions.
| Without Morph | With Morph |
| --- | --- |
| Agent rewrites a 500-line file to change 3 lines. Costs ~$0.12, takes 8 seconds. | Agent sends a 10-line edit snippet. Merged in under a second at 10,500 tok/s with 98% accuracy. |
| Agent spends 60% of turns searching; results fill the context window. | WarpGrep searches in a separate context and finds code in ~3.8 steps. Main context stays clean. |
| After 50 turns, chat history is 80% filler and the model starts forgetting. | Compact removes irrelevant lines at 33,000 tok/s for a 50-70% reduction. Every surviving line is verbatim. |

Get running in 30 seconds

Install the MCP server. One command, and edit_file + codebase_search appear in your editor.
```shell
npx -y @morphllm/morph-setup --morph-api-key YOUR_API_KEY
```
This auto-detects Claude Code, Cursor, Codex, and VS Code, then configures them all.
No API key yet? Grab one from your dashboard. Free tier included.

Full MCP setup guide

Per-client configuration, CLAUDE.md prompts, and troubleshooting

Building an agent? Use the SDK.

OpenAI-compatible API. Point any OpenAI SDK at https://api.morphllm.com/v1, or install the TypeScript SDK:

```shell
npm install @morphllm/morphsdk
```

```typescript
import { MorphClient } from '@morphllm/morphsdk';

const morph = new MorphClient({ apiKey: process.env.MORPH_API_KEY });

// Edit a file: 10,500 tok/s, 98% accuracy
const edit = await morph.fastApply.execute({
  target_filepath: 'src/auth.ts',
  instructions: 'Add null check before session creation',
  code_edit: '// ... existing code ...\nif (!user) throw new Error("Not found");\n// ... existing code ...'
});

// Search a codebase: ~3.8 steps, 8 parallel tool calls per turn
const search = await morph.warpGrep.execute({
  searchTerm: 'Find authentication middleware',
  repoRoot: '.'
});

// Compress context: 33,000 tok/s, 50-70% reduction
const compact = await morph.compact({
  input: chatHistory,
  query: 'JWT token validation'
});
```
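If you'd rather not take on the SDK dependency, you can hit the OpenAI-compatible endpoint directly. This sketch only builds the request payload; the model id (`morph-v3-large`) and the `<instruction>`/`<code>`/`<update>` message format are assumptions here, so check the quickstart for the exact contract before relying on them.

```typescript
// Sketch of a raw OpenAI-compatible request to Fast Apply, without the SDK.
// Model id and message format are assumptions -- verify against the quickstart.
interface ApplyRequest {
  url: string;
  body: {
    model: string;
    messages: { role: 'user'; content: string }[];
  };
}

function buildApplyRequest(
  instructions: string,
  originalCode: string,
  codeEdit: string
): ApplyRequest {
  return {
    url: 'https://api.morphllm.com/v1/chat/completions',
    body: {
      model: 'morph-v3-large', // assumed model id
      messages: [
        {
          role: 'user',
          content:
            `<instruction>${instructions}</instruction>\n` +
            `<code>${originalCode}</code>\n` +
            `<update>${codeEdit}</update>`,
        },
      ],
    },
  };
}

// POST req.body as JSON with your Morph API key as a Bearer token.
const req = buildApplyRequest(
  'Add null check before session creation',
  'function createSession(user) { return db.sessions.create(user); }',
  '// ... existing code ...\nif (!user) throw new Error("Not found");\n// ... existing code ...'
);
```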

Products

| Product | What it does | Speed | Key metric |
| --- | --- | --- | --- |
| Fast Apply | Merges edit snippets into files | 10,500 tok/s | 98% accuracy |
| WarpGrep | Searches code in an isolated context window | ~3.8 steps | #1 SWE-Bench Pro |
| Compact | Removes irrelevant lines from chat history | 33,000 tok/s | 50-70% reduction, verbatim |
| Router | Routes prompts to the right model tier | ~430ms | $0.001/request |
Your agent describes a change as a lazy edit snippet (just the changed lines, with // ... existing code ... markers). Fast Apply merges that snippet into the original file and returns the result: 98% accuracy, sub-second latency on typical files. This is the same approach Cursor uses.
If your agent omits // ... existing code ... markers, Fast Apply treats missing sections as deletions. Make sure your agent prompt includes the marker format. See the quickstart for prompt templates.
Full guide →
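For concreteness, here is what a lazy edit snippet looks like for a small change. The file contents are hypothetical; the point is the shape: only the changed lines, with markers standing in for everything Fast Apply should keep from the original file.

```typescript
// A hypothetical lazy edit snippet for Fast Apply's code_edit parameter.
const codeEdit = [
  '// ... existing code ...',
  'if (!user) {',
  '  throw new Error("User not found");',
  '}',
  '// ... existing code ...',
].join('\n');

// Every unchanged region must be covered by a marker; per the warning above,
// a region with no marker is treated as a deletion.
const hasMarkers = codeEdit.includes('// ... existing code ...');
```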
WarpGrep is a separate LLM that searches your codebase in its own context window. It takes a natural-language query, issues 8 parallel tool calls per turn, and returns file/line-range spans in ~3.8 steps (under 6 seconds on most repos).

The key detail: it runs in isolation. Your main agent’s context stays clean, with no 200-file grep dumps polluting the conversation.

Paired with Opus, Codex, or MiniMax, WarpGrep reaches #1 on SWE-Bench Pro, 15.6% cheaper and 28% faster than single-model approaches.
WarpGrep also searches public GitHub repos without cloning. Pass a GitHub URL instead of a local path.
Full guide →
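Since the same parameter can point at either a local checkout or a public GitHub repo, a small guard helps decide which mode you're in. This is a sketch under the assumption (from the tip above) that `repoRoot` accepts either form; the helper name is mine, not part of the SDK.

```typescript
// Decide whether a WarpGrep target is a public GitHub URL (searched in the
// cloud, no clone needed) or a local path (searched with local tooling).
function isGithubTarget(repoRoot: string): boolean {
  try {
    const u = new URL(repoRoot);
    return u.hostname === 'github.com';
  } catch {
    return false; // not a URL at all -> treat as a local path
  }
}
```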
Compact shrinks chat history and code context before sending it to your LLM. 100K tokens compress in under 2 seconds with a 50-70% reduction, and every surviving line is byte-for-byte identical to the original.

The optional query parameter makes compression much better: it tells the model what the user is about to ask, so query="auth middleware" keeps auth code and drops DB setup.

With a 1M-token context window, you can compress entire repositories in a single call.

Full guide →
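The quoted 50-70% reduction translates directly into input-token savings on every subsequent call. A back-of-envelope sketch, using a placeholder input price ($3 per million tokens; substitute your downstream model's actual rate):

```typescript
// Back-of-envelope savings from Compact's quoted 50-70% reduction.
// The dollar rate is a placeholder, not a real price.
function compactSavings(
  contextTokens: number,
  reductionRate: number, // 0.5 to 0.7 per the figures above
  dollarsPerMillionTokens: number
): { tokensRemoved: number; dollarsSavedPerCall: number } {
  const tokensRemoved = Math.round(contextTokens * reductionRate);
  return {
    tokensRemoved,
    dollarsSavedPerCall: (tokensRemoved / 1_000_000) * dollarsPerMillionTokens,
  };
}

// A 100K-token history at a 60% reduction and $3/M input tokens
// drops 60,000 tokens from every later request.
const s = compactSavings(100_000, 0.6, 3);
```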

Common gotchas

Fast Apply only helps if your agent outputs partial edits. You need to update your agent’s system prompt to use // ... existing code ... markers. Without this, your agent generates full-file rewrites and there’s nothing for Fast Apply to merge. See the prompt templates.
WarpGrep needs ripgrep installed locally for codebase search. If ripgrep isn’t on PATH, searches will fail silently. GitHub search runs on the cloud and doesn’t need ripgrep.
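Because the failure is silent, it's worth checking for ripgrep at startup rather than discovering empty search results later. A minimal Node.js sketch (the warning message is mine; GitHub search skips this check entirely):

```typescript
// Guard against the silent-failure mode above: verify ripgrep is on PATH
// before wiring up local codebase search.
import { spawnSync } from 'node:child_process';

function hasRipgrep(): boolean {
  const result = spawnSync('rg', ['--version'], { stdio: 'ignore' });
  return result.error === undefined && result.status === 0;
}

if (!hasRipgrep()) {
  console.warn('ripgrep (rg) not found on PATH; local WarpGrep searches will fail.');
}
```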
Always pass Compact’s query parameter. Without it, Compact makes generic compression decisions; with a specific query like "database connection pooling", it keeps the relevant lines and drops the rest.
The Morph API is OpenAI-compatible. Use the OpenAI Python SDK, point it at https://api.morphllm.com/v1, and pass your Morph API key. See the quickstart for Python examples. WarpGrep has a dedicated Python guide.

If you’re coming from…

Install the MCP server. edit_file and codebase_search appear as tools automatically. No code changes. MCP quickstart →

Next steps

Fast Apply Quickstart

Prompt templates, code examples, verification

WarpGrep Guide

Codebase search, GitHub search, streaming

Compact Guide

Query-conditioned compression, keepContext tags

MCP Integration

Claude Code, Cursor, Codex, VS Code

SDK Reference

Full TypeScript SDK documentation

API Playground

Test with live examples

Enterprise

Dedicated instances, self-hosted deployments, zero data retention. 99.9% uptime SLA, SOC2, SSO.

Talk to Sales

Custom deployments and volume pricing