# Introduction

> Specialized models and subagents for AI coding agents

## The problem

Coding agents waste most of their compute on things that aren't reasoning.

Cognition [measured this](https://www.cognition.ai/blog/under-the-hood-how-devin-finds-the-right-code): their agent spent 60% of turns searching for code. Anthropic found multi-agent architectures [improve task completion by 90%](https://www.anthropic.com/engineering/swe-bench-sonnet) when mechanical work runs in specialized subprocesses. And if you've built an agent that edits files, you've seen the failure mode: your model rewrites a 500-line file to change 3 lines, burns tokens, and introduces drift.

These aren't reasoning problems. They're mechanical problems, and they have mechanical solutions.

| Without Morph                                                                        | With Morph                                                                                            |
| ------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------- |
| Agent rewrites a 500-line file to change 3 lines. Costs \~\$0.12, takes 8 seconds.   | Agent sends a 10-line edit snippet. Merged in under a second at 10,500 tok/s, 98% accuracy.           |
| `str_replace` fails when whitespace doesn't match. Agent re-reads the file, retries. | Fast Apply takes a lazy snippet. No re-reads, no exact-string matching.                               |
| Agent spends 60% of turns searching. Results fill the context window.                | WarpGrep searches in a separate context. Finds code in 3.8 steps. Main context stays clean.           |
| After 50 turns, chat history is 80% filler. Model starts forgetting.                 | Compact removes irrelevant lines at 33,000 tok/s. 50-70% reduction. Every surviving line is verbatim. |
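
The "under a second" figure in the first row can be sanity-checked with a little arithmetic. The sketch below assumes about 10 tokens per line of source code, which is an illustrative guess rather than a Morph figure; the throughput comes from the table.

```python
# Back-of-envelope check of the "under a second" claim in the table above.
# TOKENS_PER_LINE is an assumption for illustration, not a Morph figure.
FAST_APPLY_TOK_PER_S = 10_500  # throughput quoted in the table
TOKENS_PER_LINE = 10           # assumed average for source code


def merge_seconds(file_lines: int) -> float:
    """Estimate the time to emit a merged file of `file_lines` lines."""
    return file_lines * TOKENS_PER_LINE / FAST_APPLY_TOK_PER_S


print(f"{merge_seconds(500):.2f} s")  # 500-line file → 0.48 s
```

Under that assumption, a 500-line file stays well below one second; a denser file with more tokens per line scales the estimate linearly.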

## Get running in 30 seconds

Install the MCP server with one command, and the `edit_file` and `codebase_search` tools appear in your editor.

<pre style={{backgroundColor: '#f3f4f6', padding: '14px 16px', borderRadius: '8px', overflowX: 'auto', fontFamily: 'ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace', fontSize: '14px', lineHeight: '1.5', color: '#111827', fontWeight: 'bold'}}><code>npx -y @morphllm/morph-setup --morph-api-key YOUR\_API\_KEY</code></pre>

This auto-detects Claude Code, Cursor, Codex, and VS Code, then configures them all.

<Tip>
  **Logged in?** Your API key auto-fills above. Otherwise, grab one from your [dashboard](https://morphllm.com/dashboard/api-keys).
</Tip>

<Card title="Full MCP setup guide" icon="plug" href="/mcpquickstart" horizontal>
  Per-client configuration, CLAUDE.md prompts, and troubleshooting
</Card>

## Building an agent? Use the SDK.

OpenAI-compatible API. Point any OpenAI SDK at `https://api.morphllm.com/v1`.

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { MorphClient } from '@morphllm/morphsdk';

  const morph = new MorphClient({ apiKey: process.env.MORPH_API_KEY });

  // Edit a file: 10,500 tok/s, 98% accuracy
  const edit = await morph.fastApply.execute({
    target_filepath: 'src/auth.ts',
    instructions: 'Add null check before session creation',
    code_edit: '// ... existing code ...\nif (!user) throw new Error("Not found");\n// ... existing code ...'
  });

  // Search a codebase: 3.8 steps, 8 parallel tool calls per turn
  const search = await morph.warpGrep.execute({
    searchTerm: 'Find authentication middleware',
    repoRoot: '.'
  });

  // Compress context: 33,000 tok/s, 50-70% reduction
  // (chatHistory is your agent's accumulated conversation, defined elsewhere)
  const compact = await morph.compact({
    input: chatHistory,
    query: 'JWT token validation'
  });
  ```

  ```python Python theme={null}
  from openai import OpenAI

  client = OpenAI(
      api_key="YOUR_API_KEY",
      base_url="https://api.morphllm.com/v1",
  )

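  # Example inputs; in a real agent these come from the model's tool call
  instructions = "Add null check before session creation"
  with open("src/auth.ts") as f:
      original_code = f.read()
  code_edit = '// ... existing code ...\nif (!user) throw new Error("Not found");\n// ... existing code ...'

  # Merge an edit snippet into a file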
  response = client.chat.completions.create(
      model="morph-v3-fast",
      messages=[{
          "role": "user",
          "content": f"<instruction>{instructions}</instruction>\n<code>{original_code}</code>\n<update>{code_edit}</update>"
      }],
  )

  merged_code = response.choices[0].message.content
  ```
</CodeGroup>

Install the TypeScript SDK with npm:

```bash theme={null}
npm install @morphllm/morphsdk
```

## Products

| Product                                         | What it does                                | Speed        | Key metric                                                |
| ----------------------------------------------- | ------------------------------------------- | ------------ | --------------------------------------------------------- |
| **[Fast Apply](/quickstart)**                   | Merges edit snippets into files             | 10,500 tok/s | 98% accuracy                                              |
| **[WarpGrep](/sdk/components/warp-grep/index)** | Searches code in an isolated context window | \~3.8 steps  | [#1 SWE-Bench Pro](https://morphllm.com/blog/warpgrep-v2) |
| **[Compact](/sdk/components/compact)**          | Removes irrelevant lines from chat history  | 33,000 tok/s | 50-70% reduction, verbatim                                |
| **[Router](/sdk/components/router)**            | Routes prompts to the right model tier      | \~430ms      | \$0.001/request                                           |

<AccordionGroup>
  <Accordion title="Fast Apply: how it works" icon="bolt">
    Your agent describes a change as a lazy edit snippet (just the changed lines, with `// ... existing code ...` markers). Fast Apply merges that snippet into the original file and returns the result.

    98% accuracy. Sub-second latency on typical files. This is the same approach [Cursor uses](https://web.archive.org/web/20240823050616/https://www.cursor.com/blog/instant-apply).

    Unlike `str_replace`, the agent never re-reads the file or reproduces source code verbatim.

    Edit format alone is one of the highest-leverage variables in agent performance. [Can Boluk benchmarked 15 LLMs](https://blog.can.ac/2026/02/12/the-harness-problem/) and found that switching edit format, with zero training compute, improved Gemini's success rate by 8%, more than most model upgrades deliver. Grok Code went from 6.7% to 68.3% just by changing how edits were expressed.

    <Warning>
      If your agent omits `// ... existing code ...` markers, Fast Apply treats missing sections as deletions. Make sure your agent prompt includes the marker format. See the [quickstart](/quickstart) for prompt templates.
    </Warning>

    [Full guide →](/quickstart)
  </Accordion>

  <Accordion title="WarpGrep: how it works" icon="search">
    WarpGrep is a separate LLM that searches your codebase in its own context window. It takes a natural language query, issues 8 parallel tool calls per turn, and returns file/line-range spans in \~3.8 steps (under 6 seconds on most repos).

    The key detail: it runs in isolation. Your main agent's context stays clean. No 200-file grep dumps polluting the conversation.

    Paired with Opus, Codex, or MiniMax, WarpGrep reaches [#1 on SWE-Bench Pro](https://morphllm.com/blog/warpgrep-v2), 15.6% cheaper and 28% faster than single-model approaches.

    <Tip>
      WarpGrep also searches public GitHub repos without cloning. Pass a GitHub URL instead of a local path.
    </Tip>

    [Full guide →](/sdk/components/warp-grep/index)
  </Accordion>

  <Accordion title="Compact: how it works" icon="compress">
    Shrinks chat history and code context before sending it to your LLM. 100K tokens compress in under 2 seconds. 50-70% reduction. Every surviving line is byte-for-byte identical to the original.

    The optional `query` parameter makes compression much better. It tells the model what the user is about to ask, so `query="auth middleware"` keeps auth code and drops DB setup.

    1M token context window. You can compress entire repositories in a single call.

    [Full guide →](/sdk/components/compact)
  </Accordion>
</AccordionGroup>
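
The Fast Apply warning about omitted markers can be checked before a request is sent. The sketch below is a hypothetical client-side guard, not part of the Morph SDK: `build_apply_message` and `MARKER` are names introduced here, and the message layout copies the `<instruction>`/`<code>`/`<update>` format from the Python example earlier on this page. It is a heuristic that only recognizes `//`-style markers, so an intentional full-file replacement or another comment syntax would need a different check.

```python
MARKER = "// ... existing code ..."


def build_apply_message(instructions: str, original_code: str, code_edit: str) -> dict:
    """Build the user message for a Fast Apply request.

    Raises ValueError when the snippet has no marker, because unmarked
    sections are treated as deletions.
    """
    if MARKER not in code_edit:
        raise ValueError("code_edit has no '// ... existing code ...' marker")
    content = (
        f"<instruction>{instructions}</instruction>\n"
        f"<code>{original_code}</code>\n"
        f"<update>{code_edit}</update>"
    )
    return {"role": "user", "content": content}


# Example: a lazy snippet that adds one line and keeps everything else.
message = build_apply_message(
    "Add null check before session creation",
    "function login(user) {\n  return createSession(user);\n}",
    f'{MARKER}\nif (!user) throw new Error("Not found");\n{MARKER}',
)
```

The returned dict can be used as the single entry in `messages` when calling `client.chat.completions.create` with `model="morph-v3-fast"`.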

## Common gotchas

<AccordionGroup>
  <Accordion title="My agent rewrites the whole file instead of using edit snippets">
    Fast Apply only helps if your agent outputs partial edits. You need to update your agent's system prompt to use `// ... existing code ...` markers. Without this, your agent generates full-file rewrites and there's nothing for Fast Apply to merge. See the [prompt templates](/quickstart).
  </Accordion>

  <Accordion title="WarpGrep results seem incomplete">
    WarpGrep needs [ripgrep](https://github.com/BurntSushi/ripgrep) installed locally for codebase search. If ripgrep isn't on your PATH, local searches fail silently. GitHub search runs in the cloud and doesn't need ripgrep.
  </Accordion>

  <Accordion title="Compact is dropping lines I need">
    Use the `query` parameter. Without it, Compact makes generic compression decisions. With a specific query like `"database connection pooling"`, it keeps the relevant lines and drops the rest.
  </Accordion>

  <Accordion title="I'm using Python, not TypeScript">
    The Morph API is OpenAI-compatible. Use the OpenAI Python SDK, point it at `https://api.morphllm.com/v1`, and pass your Morph API key. See the [quickstart](/quickstart) for Python examples. WarpGrep has a dedicated [Python guide](/guides/warp-grep-python).
  </Accordion>
</AccordionGroup>
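
Because a missing ripgrep binary makes local WarpGrep searches fail silently, it can help to check for it up front. A minimal sketch using only the Python standard library; `ripgrep_path` and `require_ripgrep` are hypothetical helper names, not part of any Morph package.

```python
import shutil
from typing import Optional


def ripgrep_path() -> Optional[str]:
    """Return the path to the `rg` binary, or None if it is not on PATH."""
    return shutil.which("rg")


def require_ripgrep() -> str:
    """Fail loudly before a local search instead of letting it fail silently."""
    path = ripgrep_path()
    if path is None:
        raise RuntimeError(
            "ripgrep (rg) not found on PATH; install it before running local searches"
        )
    return path
```

Calling `require_ripgrep()` once at agent startup turns a silent failure into an explicit error message.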

## If you're coming from...

<Tabs>
  <Tab title="Claude Code / Codex">
    Install the MCP server. `edit_file` and `codebase_search` appear as tools automatically. No code changes. [MCP quickstart →](/mcpquickstart)
  </Tab>

  <Tab title="Cursor">
    Cursor's apply feature [uses the same approach](https://web.archive.org/web/20240823050616/https://www.cursor.com/blog/instant-apply). Morph exposes it as an API for your own agents, CI pipelines, or any tool that edits code.
  </Tab>

  <Tab title="Aider / Continue">
    Fast Apply replaces search-and-replace blocks. Your agent outputs a lazy edit snippet instead of reproducing exact strings. No re-reads, no "String to replace not found" errors.
  </Tab>

  <Tab title="Building your own agent">
    Register three tools: `edit_file` (Fast Apply), `codebase_search` (WarpGrep), and context compression (Compact). All OpenAI-compatible. The [quickstart](/quickstart) has tool definitions you can copy directly.
  </Tab>
</Tabs>
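
For the "Building your own agent" path, registering `edit_file` with an OpenAI-style function-calling API could look roughly like the sketch below. The parameter names mirror the TypeScript example on this page (`target_filepath`, `instructions`, `code_edit`); the descriptions, the schema layout, and the `parse_edit_call` helper are assumptions for illustration. The tool definitions in the quickstart remain the authoritative version.

```python
import json

# Hypothetical tool schema in the OpenAI function-calling format.
EDIT_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "edit_file",
        "description": (
            "Edit a file by sending only the changed lines. "
            "Mark unchanged regions with '// ... existing code ...'."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "target_filepath": {"type": "string", "description": "Path of the file to edit"},
                "instructions": {"type": "string", "description": "One sentence describing the change"},
                "code_edit": {"type": "string", "description": "Lazy edit snippet with markers"},
            },
            "required": ["target_filepath", "instructions", "code_edit"],
        },
    },
}


def parse_edit_call(arguments_json: str) -> dict:
    """Decode a tool call's JSON arguments and check the required fields."""
    args = json.loads(arguments_json)
    required = EDIT_FILE_TOOL["function"]["parameters"]["required"]
    missing = [name for name in required if name not in args]
    if missing:
        raise ValueError(f"edit_file call is missing: {missing}")
    return args
```

Pass `[EDIT_FILE_TOOL]` as the `tools` argument of your main model's chat completion call, then hand the parsed arguments to Fast Apply to produce the merged file.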

## Next steps

<CardGroup cols={2}>
  <Card title="Fast Apply Quickstart" icon="bolt" href="/quickstart">
    Prompt templates, code examples, verification
  </Card>

  <Card title="WarpGrep Guide" icon="search" href="/sdk/components/warp-grep/index">
    Codebase search, GitHub search, streaming
  </Card>

  <Card title="Compact Guide" icon="compress" href="/sdk/components/compact">
    Query-conditioned compression, keepContext tags
  </Card>

  <Card title="MCP Integration" icon="plug" href="/mcpquickstart">
    Claude Code, Cursor, Codex, VS Code
  </Card>
</CardGroup>

<CardGroup cols={2}>
  <Card title="SDK Reference" icon="code" href="/sdk/reference">
    Full TypeScript SDK documentation
  </Card>

  <Card title="API Playground" icon="play" href="https://morphllm.com/dashboard/playground/apply">
    Test with live examples
  </Card>
</CardGroup>

## Enterprise

Dedicated instances, self-hosted deployments, zero data retention. 99.9% uptime SLA, SOC 2, SSO.

<Card title="Talk to Sales" icon="envelope" href="mailto:info@morphllm.com" horizontal>
  Custom deployments and volume pricing
</Card>
