# Morph

Specialized models and subagents for AI coding agents. Fast Apply edits files at 10,500 tok/s, WarpGrep searches codebases in ~6s, and Compact compresses context at 33,000 tok/s.

Base URL: `https://api.morphllm.com/v1` (OpenAI-compatible, works with any OpenAI SDK)

## When to use which product

| Need | Product | Model ID |
|------|---------|----------|
| Apply a code edit to a file | Fast Apply | `morph-v3-fast` (default) or `morph-v3-large` (complex edits) |
| Let the system pick fast vs large | Router + Apply | `auto` |
| Search a local codebase | WarpGrep | `morph-warp-grep-v1` |
| Search a public GitHub repo | WarpGrep (GitHub mode) | `morph-warp-grep-v1` |
| Compress chat history / context | Compact | `morph-compactor` |
| Generate code embeddings | Embedding | `morph-embedding-v4` |
| Rerank search results | Rerank | `morph-rerank-v4` |
| Route prompts by complexity | Router | `morph-routers` |

## Instructions for agents

- **All endpoints are OpenAI-compatible.** Use `base_url: https://api.morphllm.com/v1` with any OpenAI SDK.
- **Always set `temperature: 0`** for Apply and WarpGrep calls.
- **Apply is for edits, not file creation.** It merges a partial update into an existing file. Do not send empty `<code>` blocks.
- **Use `// ... existing code ...` markers** in update snippets to indicate unchanged regions. This is required.
- **Apply message format:** Single user message containing `<instruction>` (what the edit does), `<code>` (original file), and `<update>` (partial edit with markers) XML tags.
- **Include `<instruction>` in Apply calls.** Accuracy jumps from 92% to 98% when you describe what the edit does.
- **WarpGrep has built-in tools.** Do NOT pass a `tools` array in your request. The model uses `grep_search`, `read`, `list_directory`, `glob`, `finish` internally.
- **WarpGrep for local repos** requires `ripgrep` installed and a `<repo_structure>` block in the user message.
- **Compact preserves exact bytes.** Every surviving line is byte-for-byte identical to input. Use the `query` param to tell it what matters for the next LLM call.
- **Compact supports `<keepContext>` tags** to protect critical sections from compression.
- **Router returns a model recommendation**, not a completion. Use it to decide which downstream model to call, then call that model separately.
- **Never guess model names.** The current models are listed in the table above. Do not invent model IDs like `morph-v2` or `morph-fast`.
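
The Apply message format described above can be assembled with a small helper. This is a minimal sketch, not part of any Morph SDK: `build_apply_message` is a hypothetical name, and the function only builds the message string without calling the API.

```python
def build_apply_message(instruction: str, code: str, update: str) -> dict:
    """Build the single user message for an Apply call (hypothetical helper).

    instruction: what the edit does
    code:        the original file contents
    update:      the partial edit, with `// ... existing code ...` markers
    """
    content = (
        f"<instruction>{instruction}</instruction>\n"
        f"<code>{code}</code>\n"
        f"<update>{update}</update>"
    )
    return {"role": "user", "content": content}
```

The returned dict is intended to be the only entry in the `messages` list of a chat completions request, alongside `temperature=0`.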

## Gotchas

- Sending the full updated file as `<update>` instead of a partial snippet wastes tokens and defeats the purpose of Apply.
- Forgetting `// ... existing code ...` markers causes Apply to treat the snippet as the complete file content.
- Passing a `tools` array to WarpGrep overrides its built-in tools and breaks search.
- Using Apply to create new files (empty `<code>`) gives poor results. Write the file directly instead.
- Compact without `query` uses the last user message as the relevance signal. For best compression, always pass `query` explicitly.
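
Several of these gotchas can be caught before a request is sent. The following is a rough sketch under stated assumptions: `lint_apply` and `lint_warpgrep` are hypothetical helpers, and the length comparison is only a heuristic for spotting a full-file `<update>`, so it can produce false positives on edits that add a lot of code.

```python
MARKER = "// ... existing code ..."  # marker text as given in this document


def lint_apply(code: str, update: str) -> list[str]:
    """Return warnings for an Apply request (heuristic, hypothetical helper)."""
    warnings = []
    if not code.strip():
        warnings.append("empty <code>: write new files directly instead of using Apply")
    if MARKER not in update:
        warnings.append("no marker in <update>: snippet is treated as the complete file")
    elif code.strip() and len(update) >= len(code):
        # Rough heuristic only: a partial snippet is usually shorter than the file.
        warnings.append("<update> is as long as <code>: send a partial snippet")
    return warnings


def lint_warpgrep(request: dict) -> list[str]:
    """Return warnings for WarpGrep request keyword arguments (hypothetical helper)."""
    if "tools" in request:
        return ["remove 'tools': WarpGrep uses its own built-in tools"]
    return []
```

An empty list from either function means none of the checked gotchas were detected; it does not guarantee the request is otherwise valid.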

## Apply example

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.morphllm.com/v1", api_key="your-key")

response = client.chat.completions.create(
    model="morph-v3-fast",
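    messages=[{
        "role": "user",
        # <update> is a partial snippet: the marker stands in for the unchanged multiply() function.
        "content": "<instruction>Add error handling for division by zero</instruction>\n<code>function divide(a, b) {\n  return a / b;\n}\n\nfunction multiply(a, b) {\n  return a * b;\n}</code>\n<update>function divide(a, b) {\n  if (b === 0) throw new Error('Division by zero');\n  return a / b;\n}\n\n// ... existing code ...</update>"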
    }],
    temperature=0
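)

# The merged file comes back as the assistant message content.
print(response.choices[0].message.content)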
```

## Resources

- Full documentation index: https://docs.morphllm.com/llms.txt
- Complete docs in one file (~9k tokens): https://docs.morphllm.com/llms-full.txt
- MCP server: `npx @morphllm/morphmcp@latest`
- Dashboard & API keys: https://morphllm.com/dashboard