# Introduction

> Specialized models and subagents for AI coding agents

## The problem

Coding agents waste most of their compute on things that aren't reasoning. Cognition [measured 60% of turns spent searching](https://www.cognition.ai/blog/under-the-hood-how-devin-finds-the-right-code). Anthropic found specialized subprocesses [improve task completion by 90%](https://www.anthropic.com/engineering/swe-bench-sonnet). And every file-editing agent hits the same failure: it rewrites a 500-line file to change 3 lines, burning tokens and introducing drift.

These aren't reasoning problems. They're mechanical problems, and they have mechanical solutions.

| Without Morph | With Morph |
| --- | --- |
| Agent rewrites a 500-line file to change 3 lines. Costs \~\$0.12, takes 8 seconds. | Agent sends a 10-line edit snippet. Merged in under a second at 10,500 tok/s, 98% accuracy. |
| `str_replace` fails when whitespace doesn't match. Agent re-reads the file, retries. | Fast Apply takes a lazy snippet. No re-reads, no exact-string matching. |
| Agent spends 60% of turns searching. Results fill the context window. | WarpGrep searches in a separate context. Finds code in 3.8 steps. Main context stays clean. |
| After 50 turns, chat history is 80% filler. Model starts forgetting. | Compact removes irrelevant lines at 33,000 tok/s. 50-70% reduction. Every surviving line is verbatim. |

## Get running in 30 seconds

One command installs the MCP server and adds `edit_file` + `codebase_search` to your editor. It auto-detects Claude Code, Cursor, Codex, and VS Code, then configures them all.
```bash Terminal
npx -y @morphllm/morph-setup --morph-api-key YOUR_API_KEY
```

**Logged in?** Your API key auto-fills above. Otherwise, grab one from your [dashboard](https://morphllm.com/dashboard/api-keys).

See the [MCP quickstart](/mcpquickstart) for per-client configuration, CLAUDE.md prompts, and troubleshooting.

## Building an agent? Use the SDK.

The API is OpenAI-compatible: point any OpenAI SDK at `https://api.morphllm.com/v1`.

```typescript TypeScript
import { MorphClient } from '@morphllm/morphsdk';

const morph = new MorphClient({ apiKey: process.env.MORPH_API_KEY });

// Edit a file: 10,500 tok/s, 98% accuracy
const edit = await morph.fastApply.execute({
  target_filepath: 'src/auth.ts',
  instructions: 'Add null check before session creation',
  code_edit: '// ... existing code ...\nif (!user) throw new Error("Not found");\n// ... existing code ...'
});

// Search a codebase: 3.8 steps, 8 parallel tool calls per turn
const search = await morph.warpGrep.execute({
  searchTerm: 'Find authentication middleware',
  repoRoot: '.'
});

// Compress context: 33,000 tok/s, 50-70% reduction
const compact = await morph.compact({
  input: chatHistory,
  query: 'JWT token validation'
});
```

```python Python
from pathlib import Path

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.morphllm.com/v1",
)

# Example inputs, mirroring the TypeScript snippet above
instructions = "Add null check before session creation"
original_code = Path("src/auth.ts").read_text()
code_edit = '// ... existing code ...\nif (!user) throw new Error("Not found");\n// ... existing code ...'

# Merge an edit snippet into a file
response = client.chat.completions.create(
    model="morph-v3-fast",
    messages=[{
        "role": "user",
        "content": f"{instructions}\n{original_code}\n{code_edit}"
    }],
)

# The merged file comes back as the message content
merged_code = response.choices[0].message.content
```

Install the TypeScript SDK with npm:

```bash
npm install @morphllm/morphsdk
```

## Products

| Product | What it does | Speed | Key metric |
| --- | --- | --- | --- |
| **[Fast Apply](/quickstart)** | Merges edit snippets into files | 10,500 tok/s | 98% accuracy |
| **[WarpGrep](/sdk/components/warp-grep/index)** | Searches code in an isolated context window | \~3.8 steps | [#1 SWE-Bench Pro](https://morphllm.com/blog/warpgrep-v2) |
| **[Compact](/sdk/components/compact)** | Removes irrelevant lines from chat history | 33,000 tok/s | 50-70% reduction, verbatim |
| **[Router](/sdk/components/router)** | Routes prompts to the right model tier | \~50ms | \$0.005/request |
| **[Reflexes](/sdk/components/reflexes)** | Classifies text for guardrails and routing | \~90ms | \$0.001/event |

Your agent describes a change as a lazy edit snippet (just the changed lines, with `// ... existing code ...` markers). Fast Apply merges that snippet into the original file and returns the result. 98% accuracy. Sub-second latency on typical files. This is the same approach [Cursor uses](https://web.archive.org/web/20240823050616/https://www.cursor.com/blog/instant-apply).

Unlike `str_replace`, the agent never re-reads the file or reproduces source code verbatim.

Edit format is one of the highest-leverage variables in agent performance.
[Can Boluk's 15-LLM benchmark](https://blog.can.ac/2026/02/12/the-harness-problem/) found Grok Code jumped from 6.7% to 68.3% just by changing how edits were expressed, with no retraining.

If your agent omits `// ... existing code ...` markers, Fast Apply treats missing sections as deletions. Make sure your agent prompt includes the marker format. See the [quickstart](/quickstart) for prompt templates.

[Full guide →](/quickstart)

WarpGrep is a separate LLM that searches your codebase in its own context window. It takes a natural language query, issues 8 parallel tool calls per turn, and returns file/line-range spans in \~3.8 steps (under 6 seconds on most repos).

The key detail: it runs in isolation. Your main agent's context stays clean. No 200-file grep dumps polluting the conversation.

Paired with Opus, Codex, or MiniMax, WarpGrep reaches [#1 on SWE-Bench Pro](https://morphllm.com/blog/warpgrep-v2), 15.6% cheaper and 28% faster than single-model approaches.

WarpGrep also searches public GitHub repos without cloning. Pass a GitHub URL instead of a local path.

[Full guide →](/sdk/components/warp-grep/index)

Compact shrinks chat history and code context before you send it to your LLM. 100K tokens compress in under 2 seconds, with a 50-70% reduction. Every surviving line is byte-for-byte identical to the original.

The optional `query` parameter makes compression more selective. It tells the model what the user is about to ask, so `query="auth middleware"` keeps auth code and drops DB setup.

Compact has a 1M-token context window, so you can compress entire repositories in a single call.

[Full guide →](/sdk/components/compact)

Not every prompt needs a frontier model. The Router classifies a prompt's difficulty, ambiguity, and domain in \~50ms and tells you which model to call. It is trained on millions of coding prompts.

Send the prompt, get back a recommended model, then make your real call. \$0.005/request, up to 65,536 tokens of input.
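The route-then-call flow described above can be sketched as follows. This is a minimal illustration, not the Router's API: `route_prompt`, `call_model`, and the tier names `small-model` and `frontier-model` are hypothetical stand-ins, and the length-based stub only takes the place of the real classifier so the example runs offline. The full guide has the actual request and response shapes.

```python
from typing import Callable


def answer(
    prompt: str,
    route_prompt: Callable[[str], str],
    call_model: Callable[[str, str], str],
) -> str:
    """Ask the router which model to use, then make the real call with it."""
    model = route_prompt(prompt)      # step 1: get a recommended model
    return call_model(model, prompt)  # step 2: call that model


# Hypothetical stand-ins so the sketch runs without network access.
def stub_router(prompt: str) -> str:
    # Placeholder heuristic only; the real Router classifies
    # difficulty, ambiguity, and domain rather than length.
    return "frontier-model" if len(prompt) > 80 else "small-model"


def stub_model(model: str, prompt: str) -> str:
    return f"[{model}] response to: {prompt}"


print(answer("Rename variable x to count", stub_router, stub_model))
```

Keeping the router and the model call as injected callables means the routing step can be swapped for the real service, or bypassed in tests, without touching the calling code.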
[Full guide →](/sdk/components/router)

A Reflex is a small text classifier that returns a label in \~90ms, with no model to train or host. Eight ship ready to use: jailbreak and NSFW guardrails, leaked-thinking and stuck-in-a-loop detectors, plus difficulty and domain labels for routing.

POST text to `/v1/reflex/predict` and get a score per class back. \$0.001/event, or train your own from labeled examples.

[Full guide →](/sdk/components/reflexes)

## Common gotchas

Fast Apply only helps if your agent outputs partial edits. You need to update your agent's system prompt to use `// ... existing code ...` markers. Without this, your agent generates full-file rewrites and there's nothing for Fast Apply to merge. See the [prompt templates](/quickstart).

WarpGrep needs [ripgrep](https://github.com/BurntSushi/ripgrep) installed locally for codebase search. If ripgrep isn't on PATH, searches will fail silently. GitHub search runs in the cloud and doesn't need ripgrep.

Use the `query` parameter. Without it, Compact makes generic compression decisions. With a specific query like `"database connection pooling"`, it keeps the relevant lines and drops the rest.

The Morph API is OpenAI-compatible. Use the OpenAI Python SDK, point it at `https://api.morphllm.com/v1`, and pass your Morph API key. See the [quickstart](/quickstart) for Python examples. WarpGrep has a dedicated [Python guide](/guides/warp-grep-python).

## If you're coming from...

Install the MCP server. `edit_file` and `codebase_search` appear as tools automatically. No code changes. [MCP quickstart →](/mcpquickstart)

Cursor's apply feature [uses the same approach](https://web.archive.org/web/20240823050616/https://www.cursor.com/blog/instant-apply). Morph exposes it as an API for your own agents, CI pipelines, or any tool that edits code.

Fast Apply replaces search-and-replace blocks. Your agent outputs a lazy edit snippet instead of reproducing exact strings. No re-reads, no "String to replace not found" errors.
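To see why exact-string edits are brittle, here is a small self-contained sketch of the failure mode. The `str_replace_edit` helper is a hypothetical imitation of a search-and-replace edit tool, written only for this illustration; it is not part of Morph or of any agent SDK.

```python
def str_replace_edit(source: str, old: str, new: str) -> str:
    """Exact-match edit: succeeds only if `old` occurs exactly once in `source`."""
    count = source.count(old)
    if count != 1:
        raise ValueError(f"String to replace not found exactly once (found {count})")
    return source.replace(old, new)


# The file on disk is indented with tabs.
source = "def login(user):\n\tsession = create_session(user)\n\treturn session\n"

# The agent reproduces the target line with four spaces instead of a tab.
old = "    session = create_session(user)"
new = (
    "    if user is None:\n"
    "        raise ValueError('Not found')\n"
    "    session = create_session(user)"
)

try:
    str_replace_edit(source, old, new)
    outcome = "applied"
except ValueError as exc:
    outcome = f"failed: {exc}"

print(outcome)  # failed: String to replace not found exactly once (found 0)
```

The edit is semantically fine, but a single whitespace difference makes the match fail, which is what forces the re-read and retry loop. A lazy edit snippet avoids this because the agent never has to reproduce the original text exactly.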

Register three tools: `edit_file` (Fast Apply), `codebase_search` (WarpGrep), and context compression (Compact). All OpenAI-compatible. The [quickstart](/quickstart) has tool definitions you can copy directly.

## Next steps

- Prompt templates, code examples, verification
- Codebase search, GitHub search, streaming
- Query-conditioned compression, keepContext tags
- Claude Code, Cursor, Codex, VS Code
- Full TypeScript SDK documentation
- Test with live examples

## Enterprise

Dedicated instances, self-hosted deployments, zero data retention. 99.9% uptime SLA, SOC2, SSO.

Custom deployments and volume pricing