# Apply API

Source: https://docs.morphllm.com/api-reference/endpoint/apply

POST /v1/chat/completions

Apply code edits at 10,500 tok/s with 98% accuracy via OpenAI-compatible API

## Overview

The Apply API enables lightning-fast code editing at **10,500+ tokens/second** with **98% accuracy**. This OpenAI-compatible endpoint intelligently merges code changes while preserving structure and formatting.

## Models

Choose the model that best fits your use case:

| Model | Speed | Accuracy | Best For |
| -------------- | ------------------- | -------- | ---------------------------------------------------- |
| morph-v3-fast | 10,500+ tok/sec | 96% | Real-time applications, quick edits |
| morph-v3-large | 5000+ tok/sec | 98% | Complex changes, highest accuracy |
| auto | 5000-10,500 tok/sec | ~98% | Recommended - automatically selects optimal model |
## Message Format

The Apply API uses a structured XML format within the message content:

```
<instruction>Brief description of what you're changing</instruction>
<code>Original code content</code>
<update>Code snippet showing only the changes with // ... existing code ... markers</update>
```

### Format Guidelines

* **`<instruction>`**: Optional but recommended. Use first-person, clear descriptions
* **`<code>`**: The complete original code that needs modification
* **`<update>`**: Show only what changes, using `// ... existing code ...` for unchanged sections

## Usage Examples

```typescript TypeScript highlight={13} theme={null}
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.morphllm.com/v1",
});

const instruction = "I will add error handling to prevent division by zero";
const originalCode = "function divide(a, b) {\n return a / b;\n}";
const codeEdit = "function divide(a, b) {\n if (b === 0) {\n throw new Error('Cannot divide by zero');\n }\n return a / b;\n}";

const response = await openai.chat.completions.create({
  model: "morph-v3-fast",
  messages: [
    {
      role: "user",
      content: `<instruction>${instruction}</instruction>\n<code>${originalCode}</code>\n<update>${codeEdit}</update>`,
    },
  ],
});

const mergedCode = response.choices[0].message.content;
```

```python Python highlight={14} theme={null}
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.morphllm.com/v1"
)

instruction = "I will add error handling to prevent division by zero"
original_code = "function divide(a, b) {\n return a / b;\n}"
code_edit = "function divide(a, b) {\n if (b === 0) {\n throw new Error('Cannot divide by zero');\n }\n return a / b;\n}"

response = client.chat.completions.create(
    model="morph-v3-fast",
    messages=[
        {
            "role": "user",
            "content": f"<instruction>{instruction}</instruction>\n<code>{original_code}</code>\n<update>{code_edit}</update>"
        }
    ]
)

merged_code = response.choices[0].message.content
```

```bash cURL highlight={9} theme={null}
curl -X POST "https://api.morphllm.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "morph-v3-fast",
    "messages": [
      {
        "role": "user",
        "content": "<instruction>I will add error handling to prevent division by zero</instruction>\n<code>function divide(a, b) {\n return a / b;\n}</code>\n<update>function divide(a, b) {\n if (b === 0) {\n throw new Error(\"Cannot divide by zero\");\n }\n return a / b;\n}</update>"
      }
    ]
  }'
```

## Error Codes

| HTTP Status | Description |
| ----------- | ---------------------------------------------- |
| 200 | Success - chat completion response |
| 400 | Bad request - malformed request or parameters |
| 401 | Authentication error - invalid API key |
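As a rough sketch of how a client might assemble the message content and react to the status codes listed above, the helpers below wrap the three parts of a request in XML-style tags and translate a status code into a short description. The function names are hypothetical, and the tag names (`instruction`, `code`, `update`) are an assumption made for illustration; confirm them against your account's documentation before relying on them.

```python
# Illustrative helpers only: names and tag names are assumptions, not an official SDK.

APPLY_STATUS_MEANINGS = {
    200: "success",
    400: "bad request: malformed request or parameters",
    401: "authentication error: invalid API key",
}


def build_apply_content(instruction: str, original_code: str, code_edit: str) -> str:
    """Wrap the three parts of an apply request in XML-style tags."""
    return (
        f"<instruction>{instruction}</instruction>\n"
        f"<code>{original_code}</code>\n"
        f"<update>{code_edit}</update>"
    )


def describe_apply_status(status_code: int) -> str:
    """Translate a documented HTTP status into a short message."""
    return APPLY_STATUS_MEANINGS.get(status_code, f"unexpected status {status_code}")


content = build_apply_content(
    "I will add error handling to prevent division by zero",
    "function divide(a, b) {\n  return a / b;\n}",
    "function divide(a, b) {\n  if (b === 0) {\n    throw new Error('Cannot divide by zero');\n  }\n  return a / b;\n}",
)
```

The resulting `content` string is what would be sent as the single user message. Keeping the builder separate from the HTTP call makes it easy to unit-test the format without network access.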
# Compact API

Source: https://docs.morphllm.com/api-reference/endpoint/compact

POST /v1/compact

Compress chat history and code context at 33,000 tok/s with byte-identical output

## Overview

Compact compresses chat history and code context at **33,000 tok/s** by removing irrelevant lines. Every surviving line is byte-for-byte identical to the original input. 100K tokens compresses in under 2 seconds.

Pass `query` to tell the model what matters for the next LLM call. Without it, the model auto-detects from the last user message.

## Usage Examples

```typescript TypeScript theme={null}
import { MorphClient } from '@morphllm/morphsdk';

const morph = new MorphClient({ apiKey: "YOUR_API_KEY" });

const result = await morph.compact({
  input: chatHistory,
  query: "How do I validate JWT tokens?",
  compressionRatio: 0.5,
  preserveRecent: 3,
});

// result.output is the compressed text; pass it to your LLM
```

```python Python (OpenAI SDK) theme={null}
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.morphllm.com/v1",
)

response = client.chat.completions.create(
    model="morph-compactor",
    messages=[{"role": "user", "content": chat_history}],
)

compressed = response.choices[0].message.content
```

```python Python (requests) theme={null}
import requests

response = requests.post(
    "https://api.morphllm.com/v1/compact",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "input": source_code,
        "query": "authentication",
        "compression_ratio": 0.5,
        "preserve_recent": 0,
    },
)

data = response.json()
print(data["output"])

# Report which line ranges were removed from the first message
for r in data["messages"][0]["compacted_line_ranges"]:
    print(f" lines {r['start']}-{r['end']} removed")
```

```bash cURL theme={null}
curl -X POST "https://api.morphllm.com/v1/compact" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "def hello():\n return 1\n\ndef unused():\n pass\n\ndef world():\n return 2",
    "query": "hello function",
    "compression_ratio": 0.5,
    "preserve_recent": 0
  }'
```

## keepContext Tags

Wrap sections you never want compressed in `<keepContext>` / `</keepContext>` tags. Tagged content survives compression verbatim regardless of the compression ratio.

```
<keepContext>
// CRITICAL: Auth middleware - do not compress
function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'No token' });
  req.user = jwt.verify(token, process.env.JWT_SECRET);
  next();
}
</keepContext>
```

The response includes `kept_line_ranges` showing which lines were force-preserved.

## Compatible Endpoints

Compact also works through OpenAI-compatible endpoints with `model: "morph-compactor"`:

| Endpoint | Format | Use with |
| --------------------------- | ----------------------- | -------------------------------------------- |
| `POST /v1/compact` | Native Morph format | Direct HTTP, Morph SDK |
| `POST /v1/responses` | OpenAI Responses API | Any OpenAI SDK (`client.responses.create()`) |
| `POST /v1/chat/completions` | OpenAI Chat Completions | Any OpenAI-compatible client |

See the full [Compact documentation](/sdk/components/compact) for SDK reference, best practices, and advanced usage.

# Code Apply API

Source: https://docs.morphllm.com/api-reference/endpoint/direct

POST /v1/code/apply

Direct code apply endpoint with structured parameters for automated workflows

## Overview

The Code Apply API provides a direct interface for applying code edits using the Morph model. This endpoint merges code changes at **10,500+ tokens/second** with **99.2% accuracy** and is designed specifically for AI agents and development tools.

Unlike the chat-based API, this endpoint accepts structured parameters directly, making it easier to integrate into automated workflows and development environments.

## Quickstart

Add the `edit_file` tool to your agent. Use one of the formats below.
````xml Tool Description theme={null} Use this tool to make an edit to an existing file. This will be read by a less intelligent model, which will quickly apply the edit. You should make it clear what the edit is, while also minimizing the unchanged code you write. When writing the edit, you should specify each edit in sequence, with the special comment // ... existing code ... to represent unchanged code in between edited lines. For example: // ... existing code ... FIRST_EDIT // ... existing code ... SECOND_EDIT // ... existing code ... THIRD_EDIT // ... existing code ... You should still bias towards repeating as few lines of the original file as possible to convey the change. But, each edit should contain minimally sufficient context of unchanged lines around the code you're editing to resolve ambiguity. DO NOT omit spans of pre-existing code (or comments) without using the // ... existing code ... comment to indicate its absence. If you omit the existing code comment, the model may inadvertently delete these lines. If you plan on deleting a section, you must provide context before and after to delete it. If the initial code is ```code \n Block 1 \n Block 2 \n Block 3 \n code```, and you want to remove Block 2, you would output ```// ... existing code ... \n Block 1 \n Block 3 \n // ... existing code ...```. Make sure it is clear what the edit should be, and where it should be applied. Make edits to a file in a single edit_file call instead of multiple edit_file calls to the same file. The apply model can handle many distinct edits at once. ```` **Parameters:** * `target_file` (string, required): The target file to modify * `instructions` (string, required): A single sentence written in the first person describing what you're changing. Used to help disambiguate uncertainty in the edit. * `code_edit` (string, required): Specify ONLY the precise lines of code that you wish to edit. Use `// ... existing code ...` for unchanged sections. 
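To make the three parameters concrete, here is a minimal sketch of what an agent-side handler for an `edit_file` call could look like. The handler name and the injected `apply_fn` callback are hypothetical: `apply_fn` stands in for whatever function actually sends the original code and edit snippet to the apply model and returns the merged result.

```python
from pathlib import Path
from typing import Callable

# Hypothetical callback signature: (instructions, original_code, code_edit) -> merged code.
ApplyFn = Callable[[str, str, str], str]


def handle_edit_file(target_file: str, instructions: str, code_edit: str, apply_fn: ApplyFn) -> str:
    """Read the target file, merge the edit via apply_fn, and write the result back."""
    path = Path(target_file)
    original_code = path.read_text(encoding="utf-8")
    merged_code = apply_fn(instructions, original_code, code_edit)
    if not merged_code.strip():
        # Refuse to overwrite the file with an empty merge result.
        raise ValueError(f"apply step returned no code for {target_file}")
    path.write_text(merged_code, encoding="utf-8")
    return merged_code
```

Injecting the apply step keeps the file handling testable on its own and lets the same handler work with either the chat-based or the direct endpoint.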
````json Tool Definition theme={null} { "name": "edit_file", "description": "Use this tool to make an edit to an existing file.\n\nThis will be read by a less intelligent model, which will quickly apply the edit. You should make it clear what the edit is, while also minimizing the unchanged code you write.\nWhen writing the edit, you should specify each edit in sequence, with the special comment // ... existing code ... to represent unchanged code in between edited lines.\n\nFor example:\n\n// ... existing code ...\nFIRST_EDIT\n// ... existing code ...\nSECOND_EDIT\n// ... existing code ...\nTHIRD_EDIT\n// ... existing code ...\n\nYou should still bias towards repeating as few lines of the original file as possible to convey the change.\nBut, each edit should contain minimally sufficient context of unchanged lines around the code you're editing to resolve ambiguity.\nDO NOT omit spans of pre-existing code (or comments) without using the // ... existing code ... comment to indicate its absence. If you omit the existing code comment, the model may inadvertently delete these lines.\nIf you plan on deleting a section, you must provide context before and after to delete it. If the initial code is ```code \\n Block 1 \\n Block 2 \\n Block 3 \\n code```, and you want to remove Block 2, you would output ```// ... existing code ... \\n Block 1 \\n Block 3 \\n // ... existing code ...```.\nMake sure it is clear what the edit should be, and where it should be applied.\nMake edits to a file in a single edit_file call instead of multiple edit_file calls to the same file. The apply model can handle many distinct edits at once.", "input_schema": { "type": "object", "properties": { "target_file": { "type": "string", "description": "Name or path of target file to modify." }, "instructions": { "type": "string", "description": "A single sentence instruction describing what you are going to do for the sketched edit. 
This is used to assist the less intelligent model in applying the edit. Use the first person to describe what you are going to do. Use it to disambiguate uncertainty in the edit." }, "code_edit": { "type": "string", "description": "Specify ONLY the precise lines of code that you wish to edit. NEVER specify or write out unchanged code. Instead, represent all unchanged code using the comment of the language you're editing in - example: // ... existing code ..." } }, "required": ["target_file", "instructions", "code_edit"] } } ```` Instead of using tool calls, you can have the agent output code edits in markdown format that you can parse: ````markdown Agent Instruction theme={null} Use this approach to make edits to existing files by outputting code edits in a specific markdown format. This will be read by a less intelligent model, which will quickly apply the edit. You should make it clear what the edit is, while also minimizing the unchanged code you write. When writing the edit, you should specify each edit in sequence, with the special comment // ... existing code ... to represent unchanged code in between edited lines. For example: // ... existing code ... FIRST_EDIT // ... existing code ... SECOND_EDIT // ... existing code ... THIRD_EDIT // ... existing code ... You should still bias towards repeating as few lines of the original file as possible to convey the change. But, each edit should contain minimally sufficient context of unchanged lines around the code you're editing to resolve ambiguity. DO NOT omit spans of pre-existing code (or comments) without using the // ... existing code ... comment to indicate its absence. If you omit the existing code comment, the model may inadvertently delete these lines. If you plan on deleting a section, you must provide context before and after to delete it. If the initial code is ```code \n Block 1 \n Block 2 \n Block 3 \n code```, and you want to remove Block 2, you would output ```// ... existing code ... 
\n Block 1 \n Block 3 \n // ... existing code ...```. Make sure it is clear what the edit should be, and where it should be applied. Make edits to a file in a single response instead of multiple responses to the same file. The apply model can handle many distinct edits at once. When you want to edit a file, output your code edits using this markdown format: ```filepath=path/to/file.js instruction=A single sentence describing what you're changing // ... existing code ... YOUR_CODE_EDIT_HERE // ... existing code ... ``` The instruction should be written in the first person describing what you're changing. Used to help disambiguate uncertainty in the edit.
````

**IMPORTANT:** The `instructions` param should be generated by the model, not hardcoded. Example: "I am adding error handling to the user auth and removing the old auth functions"

Send the original code and edit snippet to the Code Apply endpoint:

```python theme={null}
import requests

url = "https://api.morphllm.com/v1/code/apply"
api_key = "YOUR_API_KEY"

headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}


def apply_edit(initial_code, edit_snippet):
    """POST the original code and the edit snippet; return the parsed JSON body."""
    data = {
        "initial_code": initial_code,
        "edit_snippet": edit_snippet,
    }
    response = requests.post(url, headers=headers, json=data)
    response.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return response.json()
```

Extract the final merged code from the response:

```python theme={null}
result = apply_edit(initial_code, edit_snippet)
merged_code = result["merged_code"]
```

**Response format:**

```json theme={null}
{
  "merged_code": "string",
  "usage": {
    "prompt_tokens": "number",
    "completion_tokens": "number",
    "total_tokens": "number"
  }
}
```

## Models

Choose the model that best fits your use case:

| Model | Speed | Accuracy | Best For |
| -------------- | ------------------- | -------- | ---------------------------------------------------------------- |
| morph-v3-fast | 10,500+ tok/sec | 97% | Real-time applications, best for most coding agents and files |
| morph-v3-large | 5000+ tok/sec | 98.8% | Complex changes, highest accuracy, best for complex edits |
| auto | 5000-10,500 tok/sec | ~98.8% | Recommended - automatically selects optimal model |
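The model names in the table can be checked on the client before a request is sent. The following is a small sketch under stated assumptions: the helper name is hypothetical, and it assumes the optional `instructions` and `model` fields are passed alongside `initial_code` and `edit_snippet` in the same JSON body.

```python
from typing import Optional

# Model identifiers taken from the table above.
KNOWN_MODELS = ("morph-v3-fast", "morph-v3-large", "auto")


def build_code_apply_payload(
    initial_code: str,
    edit_snippet: str,
    instructions: Optional[str] = None,
    model: str = "auto",
) -> dict:
    """Build a JSON-serializable request body, rejecting unknown model names early."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {KNOWN_MODELS}")
    payload = {
        "initial_code": initial_code,
        "edit_snippet": edit_snippet,
        "model": model,
    }
    if instructions:
        # Only include the optional field when the agent actually produced one.
        payload["instructions"] = instructions
    return payload
```

Failing fast on a misspelled model name avoids spending a round trip on a request the server would reject anyway.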
## Request Format

```json theme={null}
{
  "initial_code": "string",
  "edit_snippet": "string",
  "instructions": "string (optional)",
  "model": "string (optional)",
  "stream": "boolean (optional)"
}
```

### Parameters

* **`initial_code`** (required): The complete original code that needs modification
* **`edit_snippet`** (required): Code snippet showing the changes with `// ... existing code ...` markers for unchanged sections
* **`instructions`** (optional): Brief description of what you're changing to help disambiguate the edit
* **`model`** (optional): Model to use (`morph-v3-fast`, `morph-v3-large`, or `auto`; defaults to `auto`)
* **`stream`** (optional): Whether to stream the response (defaults to `false`)

## Response Format

### Non-Streaming Response

```json theme={null}
{
  "merged_code": "string",
  "usage": {
    "prompt_tokens": "number",
    "completion_tokens": "number",
    "total_tokens": "number"
  }
}
```

### Streaming Response

For streaming requests (`stream: true`), the response follows the Server-Sent Events (SSE) format with incremental code updates.
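The shape of each streamed event is not specified here, so the sketch below only handles the generic SSE framing: it collects `data:` fields and yields one payload string per event, leaving interpretation of the payload to the caller. The function name is hypothetical, and the parser is deliberately more lenient than the SSE specification in one place, which is noted in the comments.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of decoded lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        if line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient: flush a trailing event even without a final blank line.
        yield "\n".join(buffer)
```

With the `requests` library, the lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`; how the incremental updates are combined into the final code depends on the payload format the server actually sends.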
## Example Request

```bash theme={null}
curl -X POST "https://api.morphllm.com/v1/code/apply" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "initial_code": "function divide(a, b) {\n return a / b;\n}",
    "edit_snippet": "function divide(a, b) {\n if (b === 0) {\n throw new Error('\''Cannot divide by zero'\'');\n }\n return a / b;\n}",
    "instructions": "Add error handling to prevent division by zero"
  }'
```

## Example Response

```json theme={null}
{
  "merged_code": "function divide(a, b) {\n if (b === 0) {\n throw new Error('Cannot divide by zero');\n }\n return a / b;\n}",
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 28,
    "total_tokens": 73
  }
}
```

## Error Codes

| HTTP Status | Error Code | Description |
| ----------- | --------------------- | -------------------------------------------------------------- |
| 200 | - | Success - code successfully applied |
| 400 | `bad_request` | Bad request - missing required parameters or malformed request |
| 401 | `unauthorized` | Authentication required - invalid or missing API key |
| 500 | `code_apply_error` | Internal error during code application |
| 503 | `service_unavailable` | Model not available - service temporarily unavailable |
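One way a caller might act on these codes is to retry the server-side failures (500, 503) with backoff and fail immediately on client errors (400, 401). The sketch below is an assumption about reasonable client behavior, not documented guidance; `apply_with_retry` and the `send` callback are hypothetical names, with `send` expected to perform one request and return `(status_code, parsed_body)`.

```python
import time
from typing import Callable, Tuple

# Assumption: only server-side failures are worth retrying.
RETRYABLE_STATUSES = {500, 503}


def apply_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() until it returns 200, retrying 500/503 with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status not in RETRYABLE_STATUSES or attempt == max_attempts:
            raise RuntimeError(f"code apply failed with HTTP {status}: {body}")
        # Wait 0.5s, 1s, 2s, ... before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("retry loop exited unexpectedly")
```

Passing `sleep` as a parameter keeps the backoff testable without real delays.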
## Key Features

* **High Performance**: Up to 10,500+ tokens/second with morph-v3-fast
* **High Accuracy**: 99.2% accuracy with intelligent code merging
* **Preserves Structure**: Maintains code formatting, indentation, and comments
* **Streaming Support**: Real-time streaming for large code changes
* **Multiple Models**: Choose between speed and accuracy based on your needs
* **Direct Integration**: Simple JSON API designed for automated workflows

# Tab Next Action Prediction API

Source: https://docs.morphllm.com/api-reference/endpoint/donotshare

Tab Next Action Prediction API endpoints

## Base URL

```
http://192.222.50.238:8080
```

Faster proxy endpoint (in progress):

```
http://192.222.50.238:9000
```

***

## Health Check

Check server status and cache performance.

```http theme={null}
GET /health
```

```bash cURL theme={null}
curl http://192.222.50.238:8080/health
```

```python Python theme={null}
import requests

response = requests.get("http://192.222.50.238:8080/health")
print(response.json())
```

```javascript JavaScript theme={null}
const response = await fetch('http://192.222.50.238:8080/health');
const data = await response.json();
```

### Response

```json theme={null}
{
  "status": "healthy",
  "server_role": "standalone",
  "model": "morph-test",
  "gpu_available": true,
  "cache_enabled": true,
  "cache_stats": {
    "enabled": true,
    "hit_rate": 0.92,
    "num_cached_tokens": 15420
  },
  "uptime_seconds": 3847.2
}
```

* `status`: Service status: `healthy` or `degraded`
* `server_role`: Server role: `standalone`, `prefiller`, or `decoder`
* `model`: Model name being served
* `gpu_available`: Whether GPU is available and initialized
* `cache_enabled`: Whether prefix caching is enabled
* `cache_stats`: Cache performance statistics (if caching enabled)
  * `enabled`: Cache status
  * `hit_rate`: Cache hit rate (0.0 - 1.0)
  * `num_cached_tokens`: Number of tokens currently cached
* `uptime_seconds`: Server uptime in seconds

***

## Generate Prediction

Generate next action prediction from a prompt.
```http theme={null}
POST /v1/predict
```

```bash cURL theme={null}
curl -X POST http://192.222.50.238:8080/v1/predict \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "{\"type\":3,\"data\":{\"source\":2,\"type\":6,\"id\":42,\"x\":385,\"y\":127}}\n{\"type\":3,\"data\":{\"source\":2,\"type\":2,\"id\":42,\"x\":385,\"y\":127,\"pointerType\":0}}\n{\"type\":3,\"data\":{\"source\":2,\"type\":1,\"id\":56}}\n{\"type\":3,\"data\":{\"source\":5,\"text\":\"user@example.com\",\"isChecked\":false,\"id\":56}}",
    "max_tokens": 50,
    "temperature": 0.3
  }'
```

```python Python theme={null}
import requests

# rrweb events as prompt
rrweb_events = """{"type":3,"data":{"source":2,"type":6,"id":42,"x":385,"y":127}}
{"type":3,"data":{"source":2,"type":2,"id":42,"x":385,"y":127,"pointerType":0}}
{"type":3,"data":{"source":2,"type":1,"id":56}}
{"type":3,"data":{"source":5,"text":"user@example.com","isChecked":false,"id":56}}"""

response = requests.post(
    "http://192.222.50.238:8080/v1/predict",
    json={
        "prompt": rrweb_events,
        "max_tokens": 50,
        "temperature": 0.3
    }
)
print(response.json())
```

```javascript JavaScript theme={null}
// rrweb events as prompt
const rrwebEvents = `{"type":3,"data":{"source":2,"type":6,"id":42,"x":385,"y":127}}
{"type":3,"data":{"source":2,"type":2,"id":42,"x":385,"y":127,"pointerType":0}}
{"type":3,"data":{"source":2,"type":1,"id":56}}
{"type":3,"data":{"source":5,"text":"user@example.com","isChecked":false,"id":56}}`;

const response = await fetch('http://192.222.50.238:8080/v1/predict', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: rrwebEvents,
    max_tokens: 50,
    temperature: 0.3
  })
});
const data = await response.json();
```

```python Python (batch events) theme={null}
import json

import requests

# Send batch of rrweb events
rrweb_batch = [
    {"type": 3, "data": {"source": 2, "type": 6, "id": 42, "x": 385, "y": 127}},
    {"type": 3, "data": {"source": 2, "type": 2, "id": 42, "x": 385, "y": 127, "pointerType": 0}},
    {"type": 3, "data": {"source": 2, "type": 1, "id": 56}},
    {"type": 3, "data": {"source": 5, "text": "user@example.com", "isChecked": False, "id": 56}}
]

# Convert to newline-delimited JSON. json.dumps is required here: str() on a dict
# produces Python repr (single quotes, False), which is not valid JSON.
prompt = "\n".join(json.dumps(event) for event in rrweb_batch)

response = requests.post(
    "http://192.222.50.238:8080/v1/predict",
    json={
        "prompt": prompt,
        "max_tokens": 50,
        "temperature": 0.3
    }
)
```

### Request Body

* `prompt`: rrweb event data as newline-delimited JSON. Each line should be a valid rrweb event object
* `max_tokens`: Maximum number of tokens to generate (range: 1-512)
* `temperature`: Sampling temperature (range: 0.0-2.0). Lower values produce more deterministic outputs
* Streaming flag: Enable streaming response (currently not implemented)

### Response

```json theme={null}
{
  "text": "{\"type\":3,\"data\":{\"source\":2,\"type\":1,\"id\":67}}\n{\"type\":3,\"data\":{\"source\":5,\"text\":\"password123\",\"isChecked\":false,\"id\":67}}",
  "latency_ms": 287,
  "tokens_generated": 42
}
```

* `text`: Generated rrweb event predictions as newline-delimited JSON
* `latency_ms`: Request processing latency in milliseconds
* `tokens_generated`: Number of tokens generated in the response

***

## Error Responses

All errors return JSON with a standard format:

```json theme={null}
{
  "detail": "Error message describing what went wrong"
}
```

### Status Codes

* Request completed successfully
* Invalid request parameters (e.g., temperature out of range)
* Model not ready or server not initialized
* Unexpected server error during prediction

***

## Performance Tips

**Optimize Cache Hits**: Send rrweb events in consistent session sequences to maximize prefix cache reuse. Events from the same session with consistent ordering will achieve higher cache hit rates and lower latency.

**Typical Latency**:

* Single-node: ~800ms (P50), ~1.5s (P99)
* Disaggregated: ~250ms (P50), ~450ms (P99) (in progress)
* Cache hit rate of 90%+ dramatically reduces latency for similar event sequences

## rrweb Event Format

The API expects rrweb events as newline-delimited JSON strings.
Common event types:

* **Type 2 (FullSnapshot)**: Full DOM snapshot of the page
* **Type 3 (IncrementalSnapshot)**: User interactions and DOM changes (clicks, input, scroll, etc.)
  * `source: 0` = Mutation (DOM mutations)
  * `source: 1` = MouseMove
  * `source: 2` = MouseInteraction
  * `source: 3` = Scroll
  * `source: 5` = Input
* **Type 4 (Meta)**: Page metadata and viewport info

Example event structure:

```json theme={null}
{
  "type": 3,
  "data": {
    "source": 2,
    "type": 2,
    "id": 42,
    "x": 385,
    "y": 127,
    "pointerType": 0
  }
}
```

# Embedding API

Source: https://docs.morphllm.com/api-reference/endpoint/embedding

POST /v1/embeddings

Generate code and text embeddings with morph-embedding-v4 via OpenAI-compatible API

**Planned for deprecation.** The Embedding API will be removed in a future release. For code search, use [WarpGrep](/api-reference/endpoint/warpgrep) instead. WarpGrep is a search agent that handles retrieval, ranking, and file reading in one call, replacing the need to manage embeddings, vector databases, and reranking pipelines yourself.

## Overview

Morph provides an OpenAI-compatible API for generating embeddings from code and text. Our latest `morph-embedding-v4` model is state of the art on code retrieval tasks.

## Example Request

```typescript embedding.ts theme={null}
import { OpenAI } from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.morphllm.com/v1'
});

async function generateEmbeddings() {
  const response = await client.embeddings.create({
    model: "morph-embedding-v4",
    input: "function calculateSum(a, b) { return a + b; }"
  });

  return response.data[0].embedding;
}
```

```python embedding.py theme={null}
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.morphllm.com/v1"
)

def generate_embeddings():
    response = client.embeddings.create(
        model="morph-embedding-v4",
        input="function calculateSum(a, b) { return a + b; }"
    )
    return response.data[0].embedding
```

## Model Selection

We recommend using `morph-embedding-v4` for the best performance on code retrieval tasks.
This model offers:

* **State-of-the-Art Performance**: Achieves SoTA results across all coding benchmarks for accuracy:speed ratio
* **1536 Dimensions**: Optimal dimensionality for rich semantic representation while maintaining efficiency
* **Unmatched Speed**: Fastest inference in the market - no embedding model comes close on accuracy:speed
* **Enhanced Context**: Superior handling of longer code snippets and complex codebases

## Input Format

The request accepts the following parameters:

| Parameter | Type | Required | Description |
| ----------------- | --------------- | -------- | ------------------------------------------------------------------------------------------------------ |
| `model` | string | Yes | The model ID to use for embedding generation. Use `morph-embedding-v4` (latest). |
| `input` | string or array | Yes | The text to generate embeddings for. Can be a string or an array of strings. |
| `encoding_format` | string | No | The format in which the embeddings are returned. Options are `float` and `base64`. Default is `float`. |

## Batch Processing Example

```python theme={null}
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.morphllm.com/v1"
)

# Example with batch inputs
code_snippets = [
    "function add(a, b) { return a + b; }",
    "class User { constructor(name) { this.name = name; } }",
    "import pandas as pd\ndf = pd.read_csv('data.csv')"
]

response = client.embeddings.create(
    model="morph-embedding-v4",
    input=code_snippets
)

# Access embeddings for each input
for i, embedding_data in enumerate(response.data):
    embedding = embedding_data.embedding
    print(f"Embedding for snippet {i+1}: {len(embedding)} dimensions")
```

## Response Format

```json theme={null}
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ...],
      "index": 0
    }
  ],
  "model": "morph-embedding-v4",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

When multiple inputs are provided, the response includes embeddings for each input:

```json theme={null}
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ...],
      "index": 0
    },
    {
      "object": "embedding",
      "embedding": [0.0103662554, -0.007650322, ...],
      "index": 1
    },
    {
      "object": "embedding",
      "embedding": [0.0183664255, -0.002327742, ...],
      "index": 2
    }
  ],
  "model": "morph-embedding-v4",
  "usage": {
    "prompt_tokens": 24,
    "total_tokens": 24
  }
}
```

## Usage with Vector Databases

Embeddings can be stored in vector databases for efficient similarity searching:

```python theme={null}
# Example with Pinecone (current client API; pinecone.init() was removed in newer releases)
from openai import OpenAI
from pinecone import Pinecone

# Initialize clients
openai_client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.morphllm.com/v1"
)
pc = Pinecone(api_key="your-pinecone-api-key")
index = pc.Index("code-embeddings")

# Generate embedding for a code snippet
code_snippet = "def calculate_factorial(n):\n if n == 0:\n return 1\n else:\n return n * calculate_factorial(n-1)"
response = openai_client.embeddings.create(
    model="morph-embedding-v4",
    input=code_snippet
)
embedding = response.data[0].embedding

# Store in Pinecone
index.upsert(vectors=[
    ("snippet-1", embedding, {"snippet": code_snippet})
])

# Search for similar code
results = index.query(
    vector=embedding,
    top_k=5,
    include_metadata=True
)
```

# Enterprise Apply

Source: https://docs.morphllm.com/api-reference/endpoint/enterprise

POST /v1/chat/completions

Enterprise Apply API with custom model configurations

**🔒 CONFIDENTIAL - INTERNAL USE ONLY**

This page contains proprietary enterprise API documentation and is linked to your account. Do not share any information mentioned here with anyone external to your company. This documentation is for internal development and integration purposes only.

# Quickstart

Switch to instruction-guided editing with 98% accuracy in 3 steps.

## Prerequisites

* Enterprise API key from your Morph account
* Access to `https://api.morphllm.com/v1/`

| Model | Speed | Accuracy | Input Limit | Output Limit |
| -------------- | ------------------- | -------- | -------------- | -------------- |
| morph-v3-fast | 10,500+ tok/sec | **96%** | **16k tokens** | **16k tokens** |
| morph-v3-large | 5000+ tok/sec | **98%** | **16k tokens** | **16k tokens** |
| auto | 5000-10,500 tok/sec | **98%** | **16k tokens** | **16k tokens** |

## 1. Configure Your Edit Tool

Set up your AI agent to generate the instruction-guided format for the highest-accuracy editing.

**Edit File Tool Description:**

````xml theme={null}
Use this tool to make an edit to an existing file. This will be read by a less intelligent model, which will quickly apply the edit. You should make it clear what the edit is, while also minimizing the unchanged code you write. When writing the edit, you should specify each edit in sequence, with the special comment // ... existing code ... to represent unchanged code in between edited lines. For example: // ... existing code ... FIRST_EDIT // ... existing code ...
SECOND_EDIT // ... existing code ... THIRD_EDIT // ... existing code ... You should still bias towards repeating as few lines of the original file as possible to convey the change. But, each edit should contain minimally sufficient context of unchanged lines around the code you're editing to resolve ambiguity. DO NOT omit spans of pre-existing code (or comments) without using the // ... existing code ... comment to indicate its absence. If you omit the existing code comment, the model may inadvertently delete these lines. If you plan on deleting a section, you must provide context before and after to delete it. If the initial code is ```code \n Block 1 \n Block 2 \n Block 3 \n code```, and you want to remove Block 2, you would output ```// ... existing code ... \n Block 1 \n Block 3 \n // ... existing code ...```. Make sure it is clear what the edit should be, and where it should be applied. ALWAYS make all edits to a file in a single edit_file instead of multiple edit_file calls to the same file. The apply model can handle many distinct edits at once. ```` **Parameters:** * `target_filepath` (string, required): The path of the target file to modify * `instructions` (string, required): A single sentence written in the first person describing what you're changing. Used to help disambiguate uncertainty in the edit. * `code_edit` (string, required): Specify ONLY the precise lines of code that you wish to edit. Use `// ... existing code ...` for unchanged sections. **Tool Definition:** ````json theme={null} { "name": "edit_file", "description": "Use this tool to make an edit to an existing file.\n\nThis will be read by a less intelligent model, which will quickly apply the edit. You should make it clear what the edit is, while also minimizing the unchanged code you write.\nWhen writing the edit, you should specify each edit in sequence, with the special comment // ... existing code ... to represent unchanged code in between edited lines.\n\nFor example:\n\n// ... 
existing code ...\nFIRST_EDIT\n// ... existing code ...\nSECOND_EDIT\n// ... existing code ...\nTHIRD_EDIT\n// ... existing code ...\n\nYou should still bias towards repeating as few lines of the original file as possible to convey the change.\nBut, each edit should contain minimally sufficient context of unchanged lines around the code you're editing to resolve ambiguity.\nDO NOT omit spans of pre-existing code (or comments) without using the // ... existing code ... comment to indicate its absence. If you omit the existing code comment, the model may inadvertently delete these lines.\nIf you plan on deleting a section, you must provide context before and after to delete it. If the initial code is ```code \\n Block 1 \\n Block 2 \\n Block 3 \\n code```, and you want to remove Block 2, you would output ```// ... existing code ... \\n Block 1 \\n Block 3 \\n // ... existing code ...```.\nMake sure it is clear what the edit should be, and where it should be applied.\nALWAYS make all edits to a file in a single edit_file instead of multiple edit_file calls to the same file. The apply model can handle many distinct edits at once.", "parameters": { "properties": { "target_filepath": { "type": "string", "description": "Path of the target file to modify." }, "instructions": { "type": "string", "description": "A single sentence instruction describing what you are going to do for the sketched edit. This is used to assist the less intelligent model in applying the edit. Use the first person to describe what you are going to do. Use it to disambiguate uncertainty in the edit." }, "code_edit": { "type": "string", "description": "Specify ONLY the precise lines of code that you wish to edit. NEVER specify or write out unchanged code. Instead, represent all unchanged code using the comment of the language you're editing in - example: // ... existing code ..." 
} }, "required": ["target_filepath", "instructions", "code_edit"] } }
````

The `instructions` field should be generated by your AI model, not taken from user input. Follow the tool description above nearly verbatim, including phrases such as "use it to disambiguate uncertainty in the edit". Example: "I am adding error handling to the user authentication function"

## 2. Send to Morph Enterprise API

```typescript enterprise_apply.ts theme={null}
import { OpenAI } from 'openai';

const client = new OpenAI({
  apiKey: 'your-enterprise-api-key',
  baseURL: 'https://api.morphllm.com/v1'
});

const testOriginalCode = `
const a = 1
const b = 2

function add(a, b) {
  return a + b
}

function subtract(a, b) {
  return a - b
}

const authenticateUser = () => {
  return "Authenticated"
}
`;

// Test data - your agent should generate these
const testInstruction = "I will add the real user authentication function and remove the old authentication method";

const testUpdateSnippet = `
// ... existing code ...

const authenticateUser = async (email, password) => {
  const result = await verifyUser(email, password)
  if (result) {
    return "Authenticated"
  } else {
    return "Unauthenticated"
  }
}
`;

async function applyEnterpriseEdit(
  instruction: string,
  originalCode: string,
  updateSnippet: string
): Promise<string> {
  const response = await client.chat.completions.create({
    model: "morph-v3-fast",
    messages: [
      {
        role: "user",
        content: `<instruction>${instruction}</instruction>\n<code>${originalCode}</code>\n<update>${updateSnippet}</update>`
      }
    ]
  });

  return response.choices[0].message.content || '';
}

// Example usage
async function main() {
  try {
    const finalCode = await applyEnterpriseEdit(
      testInstruction,
      testOriginalCode,
      testUpdateSnippet
    );
    console.log("Final merged code:");
    console.log(finalCode);
  } catch (error) {
    console.error("Error applying edit:", error);
  }
}

// Run the example
main();
```

```python enterprise_apply.py theme={null}
import openai

client = openai.OpenAI(
    api_key="your-enterprise-api-key",
    base_url="https://api.morphllm.com/v1"
)

# The file being edited is passed to the API as plain text
test_original_code = """
a = 1
b = 2

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def authenticateUser():
    return "Authenticated"
"""

# Test data - your agent should generate these
test_instruction = "I will add the real user authentication function and remove the old authentication method"  # This is the instruction that your agent should generate

test_update_snippet = """
# ... existing code ...

async def authenticateUser(email, password):
    result = await verifyUser(email, password)
    if result:
        return "Authenticated"
    else:
        return "Unauthenticated"
"""

def apply_enterprise_edit(instruction: str, original_code: str, update_snippet: str):
    """Apply an enterprise edit using Morph's instruction-guided editing."""
    response = client.chat.completions.create(
        model="morph-v3-fast",
        messages=[
            {
                "role": "user",
                "content": f"<instruction>{instruction}</instruction>\n<code>{original_code}</code>\n<update>{update_snippet}</update>"
            }
        ]
    )

    return response.choices[0].message.content

# Example usage
if __name__ == "__main__":
    final_code = apply_enterprise_edit(
        test_instruction,
        test_original_code,
        test_update_snippet
    )
    print("Final merged code:")
    print(final_code)
```

## 3. Handle the Response

Extract the merged code from the enterprise API response.
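Before the merged code (`response.choices[0].message.content`) overwrites the target file, it may be worth guarding against an empty completion. The sketch below is one possible approach under stated assumptions: `write_merged_code` is a hypothetical helper, not part of any Morph SDK, and treating whitespace-only output as a failure is a design choice of this example rather than documented API behavior.

```python
import os
import tempfile
from typing import Optional


def write_merged_code(target_file: str, merged_code: Optional[str]) -> None:
    """Hypothetical helper: persist merged code, refusing empty output."""
    if merged_code is None or not merged_code.strip():
        raise ValueError("Apply response contained no merged code; file left untouched")

    directory = os.path.dirname(os.path.abspath(target_file))
    # Write to a temp file in the same directory, then swap it into place,
    # so an interrupted write never leaves a truncated target file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(merged_code)
        os.replace(tmp_path, target_file)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A caller would pass the extracted string straight through, for example `write_merged_code(target_file, final_code)`, and handle `ValueError` by retrying or reporting the completion.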
**Response Format:**

```python theme={null}
final_code = response.choices[0].message.content
```

**Extract the Final Code:**

```typescript extract_code.ts theme={null}
import { promises as fs } from 'fs';

// message.content can be null, so fall back to an empty string
const finalCode = response.choices[0].message.content ?? '';

// Write to file or return to your application
await fs.writeFile(targetFile, finalCode);
```

```python extract_code.py theme={null}
final_code = response.choices[0].message.content

# Write to file or return to your application
with open(target_file, 'w') as f:
    f.write(final_code)
```

***

## Enterprise Features

* Instruction-guided editing achieves 98% accuracy on complex code changes
* Handle entire large files, complete modules, and complex codebases
* Generate complete implementations, full refactors, and comprehensive updates

**Migration from Standard API:** Enterprise API requires an `<instruction>` field but maintains backward compatibility with existing `<code>/<update>` patterns.

# Report API

Source: https://docs.morphllm.com/api-reference/endpoint/report

POST /api/report

Report failed or problematic completions to improve Morph model quality

## Overview

Report failed or problematic completions to help improve Morph's quality. This endpoint lets you flag completions that produced incorrect, malformed, or problematic code so our team can investigate and improve the models.

**When to use this endpoint:**

* Generated code has syntax errors
* Applied changes broke existing functionality
* Model output doesn't match the intended instruction
* Generated code produces runtime errors or exceptions
* Code quality issues (security vulnerabilities, bad practices)

The completion ID can be found in the response headers (`x-completion-id`) or server logs from your original apply request.

## Request Body

* `completion_id` (string, required): The completion ID from the original request (found in response headers or logs)
* `failure_reason` (string, required): Description of what went wrong (error message, traceback, etc.)
* `user_query` (string, optional): The original user instruction that led to the problematic completion. This helps provide context for debugging and improving the model. Maximum 2000 characters.

## Response

* `success`: Whether the report was successfully recorded
* `message`: Confirmation message
* `data.request_log_id`: Internal ID of the reported request
* `data.reported_at`: ISO timestamp when the report was recorded

## Error Codes

| Status | Description                  |
| ------ | ---------------------------- |
| `200`  | Report successfully recorded |
| `400`  | Invalid request parameters   |
| `401`  | Invalid or missing API key   |
| `404`  | Completion ID not found      |
| `409`  | Request already reported     |

## Examples

### cURL

```bash theme={null}
curl -X POST "https://morphllm.com/api/report" \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "completion_id": "chatcmpl-9d9e2fc21c094f4eacbcee0009f2f12d",
    "failure_reason": "Generated code had syntax errors: SyntaxError: Unexpected token in JSON",
    "user_query": "Add error handling to the user login function"
  }'
```

### JavaScript (fetch)

```javascript theme={null}
const reportFailure = async (completionId, failureReason, userQuery = null) => {
  const payload = {
    completion_id: completionId,
    failure_reason: failureReason,
  };

  // Include user_query only if provided
  if (userQuery) {
    payload.user_query = userQuery;
  }

  const response = await fetch('https://morphllm.com/api/report', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer your-api-key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
  });

  if (!response.ok) {
    throw new Error(`HTTP error! 
status: ${response.status}`); } return await response.json(); }; // Usage with user query try { const result = await reportFailure( 'chatcmpl-9d9e2fc21c094f4eacbcee0009f2f12d', 'Generated code produces runtime error: TypeError: Cannot read property', 'Add validation to user input fields' ); console.log('Report submitted:', result); } catch (error) { console.error('Failed to submit report:', error); } // Usage without user query try { const result = await reportFailure( 'chatcmpl-9d9e2fc21c094f4eacbcee0009f2f12d', 'Generated code produces runtime error: TypeError: Cannot read property' ); console.log('Report submitted:', result); } catch (error) { console.error('Failed to submit report:', error); } ``` ### Python (requests) ```python theme={null} import requests import json def report_failure(completion_id, failure_reason, api_key, user_query=None): url = "https://morphllm.com/api/report" headers = { "Authorization": f"Bearer {api_key}", "Content-Type": "application/json" } payload = { "completion_id": completion_id, "failure_reason": failure_reason } # Include user_query only if provided if user_query: payload["user_query"] = user_query try: response = requests.post(url, headers=headers, json=payload) response.raise_for_status() # Raises an HTTPError for bad responses return response.json() except requests.exceptions.RequestException as e: print(f"Error submitting report: {e}") return None # Usage with user query api_key = "your-api-key" completion_id = "chatcmpl-9d9e2fc21c094f4eacbcee0009f2f12d" failure_reason = """ Traceback (most recent call last): File "generated_code.py", line 10, in result = process_data(invalid_input) File "generated_code.py", line 5, in process_data return data.split('.') AttributeError: 'NoneType' object has no attribute 'split' """ user_query = "Refactor the data processing function to handle null values" result = report_failure(completion_id, failure_reason, api_key, user_query) if result: print(f"Report submitted successfully: 
{result}") # Usage without user query result = report_failure(completion_id, failure_reason, api_key) if result: print(f"Report submitted successfully: {result}") ``` ### Node.js (axios) ```javascript theme={null} const axios = require('axios'); async function reportFailure(completionId, failureReason, apiKey, userQuery = null) { try { const payload = { completion_id: completionId, failure_reason: failureReason, }; // Include user_query only if provided if (userQuery) { payload.user_query = userQuery; } const response = await axios.post('https://morphllm.com/api/report', payload, { headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json', }, }); return response.data; } catch (error) { if (error.response) { // Server responded with error status console.error('Server error:', error.response.data); throw new Error(`Server error: ${error.response.status} - ${error.response.data.error?.message}`); } else if (error.request) { // Request was made but no response received console.error('Network error:', error.request); throw new Error('Network error: No response received'); } else { // Something else happened console.error('Request error:', error.message); throw new Error(`Request error: ${error.message}`); } } } // Usage with user query (async () => { try { const result = await reportFailure( 'chatcmpl-9d9e2fc21c094f4eacbcee0009f2f12d', 'Generated code fails unit tests: Expected 5 but got undefined', 'your-api-key', 'Add unit tests for the calculate function' ); console.log('Success:', result.message); console.log('Report ID:', result.data?.request_log_id); console.log('Reported at:', result.data?.reported_at); } catch (error) { console.error('Failed to report:', error.message); } })(); // Usage without user query (async () => { try { const result = await reportFailure( 'chatcmpl-9d9e2fc21c094f4eacbcee0009f2f12d', 'Generated code fails unit tests: Expected 5 but got undefined', 'your-api-key' ); console.log('Success:', result.message); 
console.log('Report ID:', result.data?.request_log_id); console.log('Reported at:', result.data?.reported_at); } catch (error) { console.error('Failed to report:', error.message); } })(); ``` ### Response Example Successful response (200 OK): ```json theme={null} { "success": true, "message": "Report successfully recorded", "data": { "request_log_id": "req_123456789", "reported_at": "2024-01-15T10:30:00Z" } } ``` Error response (400 Bad Request): ```json theme={null} { "error": { "message": "Missing required parameter: completion_id", "type": "invalid_request_error", "code": "missing_parameter" } } ``` # Rerank API Source: https://docs.morphllm.com/api-reference/endpoint/rerank POST /v1/rerank Reorder search results by relevance using morph-rerank-v4 **Planned for deprecation.** The Rerank API will be removed in a future release. For code search, use [WarpGrep](/api-reference/endpoint/warpgrep) instead. WarpGrep is a search agent that handles retrieval, ranking, and file reading in one call, replacing the need to manage embeddings, vector databases, and reranking pipelines yourself. ## Overview Morph's Rerank API improves search quality by reordering candidate results based on their relevance to a query. Our latest `morph-rerank-v4` model achieves state-of-the-art performance across all coding benchmarks for accuracy:speed ratio - no rerank model comes close. Unlike the Apply and Embedding endpoints, the Rerank API uses a custom endpoint specifically designed for reranking tasks. ## API Endpoint ``` POST https://api.morphllm.com/v1/rerank ``` ## Model Versions The latest version is `morph-rerank-v4` with state-of-the-art performance across all code benchmarks for its speed-accuracy ratio. 
## Example Request

```typescript theme={null}
async function rerankResults(
  query: string,
  documents: string[],
  topN: number = 5
) {
  const response = await fetch("https://api.morphllm.com/v1/rerank", {
    method: "POST",
    headers: {
      Authorization: "Bearer YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "morph-rerank-v4",
      query: query,
      documents: documents,
      top_n: topN,
    }),
  });

  return await response.json();
}
```

Note that the `top_n` request parameter is optional and defaults to the length of the `documents` field. Result documents are sorted by relevance, and the `index` property gives each result's position in the original input.

## Input Format

The request accepts the following parameters:

| Parameter       | Type    | Required | Description                                                                                                           |
| --------------- | ------- | -------- | --------------------------------------------------------------------------------------------------------------------- |
| `model`         | string  | Yes      | The model ID to use for reranking. Use `morph-rerank-v4`.                                                             |
| `query`         | string  | Yes      | The search query to compare documents against.                                                                        |
| `documents`     | array   | No\*     | An array of document strings to be reranked. Required if `embedding_ids` is not provided.                             |
| `embedding_ids` | array   | No\*     | An array of embedding IDs to rerank. Required if `documents` is not provided. Remote content storage must be enabled. |
| `top_n`         | integer | No       | Number of top results to return. Default is all documents.                                                            |

\* Either `documents` or `embedding_ids` must be provided.
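The parameter rules in the table can be checked on the client before any request is sent. The following is a minimal sketch, not an official client: `build_rerank_payload` and `original_positions` are hypothetical helper names, and rejecting a request that supplies both `documents` and `embedding_ids` is an assumption of this example, since the table does not say how the API treats that case.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_rerank_payload(
    query: str,
    documents: Optional[Sequence[str]] = None,
    embedding_ids: Optional[Sequence[str]] = None,
    top_n: Optional[int] = None,
    model: str = "morph-rerank-v4",
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the JSON body for POST /v1/rerank."""
    if not documents and not embedding_ids:
        raise ValueError("Provide either documents or embedding_ids")
    if documents and embedding_ids:
        # Assumption of this sketch: send one input source, not both.
        raise ValueError("Provide documents or embedding_ids, not both")

    payload: Dict[str, Any] = {"model": model, "query": query}
    if documents:
        payload["documents"] = list(documents)
    else:
        payload["embedding_ids"] = list(embedding_ids or [])
    if top_n is not None:
        payload["top_n"] = top_n  # when omitted, the API returns all documents
    return payload


def original_positions(results: List[Dict[str, Any]]) -> List[int]:
    """Return each result's position in the original input, most relevant first."""
    return [item["index"] for item in results]
```

The returned dictionary can be passed as the JSON body of the request, and `original_positions(response["results"])` maps the ranked output back to the caller's own document list.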
## Using Document Content ```python theme={null} import requests def rerank_results(query, documents, top_n=5): response = requests.post( "https://api.morphllm.com/v1/rerank", headers={ "Authorization": f"Bearer YOUR_API_KEY", "Content-Type": "application/json" }, json={ "model": "morph-rerank-v4", "query": query, "documents": documents, "top_n": top_n } ) return response.json() # Example usage with code documentation query = "How to implement JWT authentication in Express" documents = [ """const jwt = require('jsonwebtoken'); const express = require('express'); function authenticateToken(req, res, next) { const authHeader = req.headers['authorization']; const token = authHeader && authHeader.split(' ')[1]; if (token == null) return res.sendStatus(401); jwt.verify(token, process.env.ACCESS_TOKEN_SECRET, (err, user) => { if (err) return res.sendStatus(403); req.user = user; next(); }); }""", """const express = require('express'); const app = express(); const port = 3000; app.use(express.json()); app.get('/', (req, res) => { res.send('Hello World!'); }); app.listen(port, () => { console.log(`App listening at http://localhost:${port}`); });""", """const jwt = require('jsonwebtoken'); const user = { id: 123, username: 'john_doe' }; const accessToken = jwt.sign(user, process.env.ACCESS_TOKEN_SECRET, { expiresIn: '15m' }); const refreshToken = jwt.sign(user, process.env.REFRESH_TOKEN_SECRET); console.log('Access Token:', accessToken);""", """const express = require('express'); const router = express.Router(); router.get('/users', (req, res) => { res.json([{ id: 1, name: 'John' }, { id: 2, name: 'Jane' }]); }); router.post('/users', (req, res) => { const { name } = req.body; res.json({ id: 3, name }); }); module.exports = router;""", """const passport = require('passport'); const GoogleStrategy = require('passport-google-oauth20').Strategy; passport.use(new GoogleStrategy({ clientID: process.env.GOOGLE_CLIENT_ID, clientSecret: process.env.GOOGLE_CLIENT_SECRET, 
callbackURL: "/auth/google/callback" }, (accessToken, refreshToken, profile, done) => { return done(null, profile); }));""", """const express = require('express'); const passport = require('passport'); const JwtStrategy = require('passport-jwt').Strategy; const ExtractJwt = require('passport-jwt').ExtractJwt; passport.use(new JwtStrategy({ jwtFromRequest: ExtractJwt.fromAuthHeaderAsBearerToken(), secretOrKey: process.env.JWT_SECRET }, (payload, done) => { User.findById(payload.sub, (err, user) => { if (err) return done(err, false); if (user) return done(null, user); return done(null, false); }); }));""" ] results = rerank_results(query, documents, top_n=3) print(results) ``` ## Using Embedding IDs When you have previously generated embeddings and enabled remote content storage, you can rerank using embedding IDs: ```javascript theme={null} async function rerankWithEmbeddingIds(query, embeddingIds, topN = 5) { const response = await fetch("https://api.morphllm.com/v1/rerank", { method: "POST", headers: { Authorization: "Bearer YOUR_API_KEY", "Content-Type": "application/json", }, body: JSON.stringify({ model: "morph-rerank-v4", // Use the latest model version query: query, embedding_ids: embeddingIds, top_n: topN, }), }); return await response.json(); } // Example with embedding IDs const query = "React state management patterns"; const embeddingIds = [ "emb_123456789", "emb_987654321", "emb_456789123", "emb_789123456", "emb_321654987", ]; rerankWithEmbeddingIds(query, embeddingIds, 3).then((results) => console.log(results) ); ``` ## cURL Examples ### With Document Content ```bash theme={null} curl --request POST \ --url https://api.morphllm.com/v1/rerank \ --header 'Authorization: Bearer YOUR_API_KEY' \ --header 'Content-Type: application/json' \ --data '{ "model": "morph-rerank-v4", "query": "How to implement JWT authentication in Express", "documents": [ "const jwt = require(\"jsonwebtoken\");\nconst express = require(\"express\");\n\nfunction 
authenticateToken(req, res, next) {\n const authHeader = req.headers[\"authorization\"];\n const token = authHeader && authHeader.split(\" \")[1];\n \n if (token == null) return res.sendStatus(401);\n \n jwt.verify(token, process.env.ACCESS_TOKEN_SECRET, (err, user) => {\n if (err) return res.sendStatus(403);\n req.user = user;\n next();\n });\n}", "const express = require(\"express\");\nconst app = express();\nconst port = 3000;\n\napp.use(express.json());\n\napp.get(\"/\", (req, res) => {\n res.send(\"Hello World!\");\n});\n\napp.listen(port, () => {\n console.log(`App listening at http://localhost:${port}`);\n});", "const jwt = require(\"jsonwebtoken\");\n\nconst user = { id: 123, username: \"john_doe\" };\nconst accessToken = jwt.sign(user, process.env.ACCESS_TOKEN_SECRET, { expiresIn: \"15m\" });\nconst refreshToken = jwt.sign(user, process.env.REFRESH_TOKEN_SECRET);\n\nconsole.log(\"Access Token:\", accessToken);", "const express = require(\"express\");\nconst router = express.Router();\n\nrouter.get(\"/users\", (req, res) => {\n res.json([{ id: 1, name: \"John\" }, { id: 2, name: \"Jane\" }]);\n});\n\nrouter.post(\"/users\", (req, res) => {\n const { name } = req.body;\n res.json({ id: 3, name });\n});\n\nmodule.exports = router;", "const passport = require(\"passport\");\nconst GoogleStrategy = require(\"passport-google-oauth20\").Strategy;\n\npassport.use(new GoogleStrategy({\n clientID: process.env.GOOGLE_CLIENT_ID,\n clientSecret: process.env.GOOGLE_CLIENT_SECRET,\n callbackURL: \"/auth/google/callback\"\n}, (accessToken, refreshToken, profile, done) => {\n return done(null, profile);\n}));", "const express = require(\"express\");\nconst passport = require(\"passport\");\nconst JwtStrategy = require(\"passport-jwt\").Strategy;\nconst ExtractJwt = require(\"passport-jwt\").ExtractJwt;\n\npassport.use(new JwtStrategy({\n jwtFromRequest: ExtractJwt.fromAuthHeaderAsBearerToken(),\n secretOrKey: process.env.JWT_SECRET\n}, (payload, done) => {\n 
User.findById(payload.sub, (err, user) => {\n if (err) return done(err, false);\n if (user) return done(null, user);\n return done(null, false);\n });\n}));" ], "top_n": 3 }' ``` ### With Embedding IDs ```bash theme={null} curl --request POST \ --url https://api.morphllm.com/v1/rerank \ --header 'Authorization: Bearer YOUR_API_KEY' \ --header 'Content-Type: application/json' \ --data '{ "model": "morph-rerank-v4", "query": "React state management patterns", "embedding_ids": [ "emb_123456789", "emb_987654321", "emb_456789123", "emb_789123456", "emb_321654987" ], "top_n": 3 }' ``` ## Response Format ```json theme={null} { "id": "rerank-26b29083d49a4c1e82032a95549a8633", "model": "morph-rerank-v4", "usage": { "total_tokens": 21 }, "results": [ { "index": 0, "document": { "text": "const jwt = require('jsonwebtoken');\nconst express = require('express');\n\nfunction authenticateToken(req, res, next) {\n const authHeader = req.headers['authorization'];\n const token = authHeader && authHeader.split(' ')[1];\n \n if (token == null) return res.sendStatus(401);\n \n jwt.verify(token, process.env.ACCESS_TOKEN_SECRET, (err, user) => {\n if (err) return res.sendStatus(403);\n req.user = user;\n next();\n });\n}" }, "relevance_score": 0.92 }, { "index": 5, "document": { "text": "const express = require('express');\nconst passport = require('passport');\nconst JwtStrategy = require('passport-jwt').Strategy;\nconst ExtractJwt = require('passport-jwt').ExtractJwt;\n\npassport.use(new JwtStrategy({\n jwtFromRequest: ExtractJwt.fromAuthHeaderAsBearerToken(),\n secretOrKey: process.env.JWT_SECRET\n}, (payload, done) => {\n User.findById(payload.sub, (err, user) => {\n if (err) return done(err, false);\n if (user) return done(null, user);\n return done(null, false);\n });\n}));" }, "relevance_score": 0.87 }, { "index": 2, "document": { "text": "const jwt = require('jsonwebtoken');\n\nconst user = { id: 123, username: 'john_doe' };\nconst accessToken = jwt.sign(user, 
process.env.ACCESS_TOKEN_SECRET, { expiresIn: '15m' });\nconst refreshToken = jwt.sign(user, process.env.REFRESH_TOKEN_SECRET);\n\nconsole.log('Access Token:', accessToken);" }, "relevance_score": 0.75 } ] }
```

When using embedding IDs, the response will include the document content if available.

## Remote Content Storage

To use embedding IDs for reranking, you must enable remote content storage in your account settings. This allows Morph to retrieve the content associated with each embedding ID for reranking purposes. Without remote content storage enabled, you'll need to pass in the document content directly.

Benefits of using embedding IDs:

* Reduced payload size for large document collections
* Improved security as content is stored in your account's secure storage
* Ability to rerank content that was previously embedded

## Integration with Search Systems

The Rerank API is typically used as a second-pass ranking system in a multi-stage retrieval pipeline.

For best code search performance, we recommend using [WarpGrep](/sdk/components/warp-grep/index), our intelligent code search tool that combines fast retrieval with automatic reranking. WarpGrep handles the entire search pipeline for you, delivering 20x faster results than stock grepping.

```javascript theme={null}
import { OpenAI } from 'openai';
import fetch from 'node-fetch';

// Initialize OpenAI client for embeddings
const openaiClient = new OpenAI({
  apiKey: 'your-morph-api-key',
  baseURL: 'https://api.morphllm.com/v1'
});

// Example search pipeline
async function semanticSearch(query, codebase) {
  // 1. Generate embedding for the query
  const embeddingResponse = await openaiClient.embeddings.create({
    model: "morph-embedding-v4",
    input: query
  });
  const queryEmbedding = embeddingResponse.data[0].embedding;

  // 2. Retrieve initial candidates using vector similarity
  // (Simplified example - in practice, you would use a vector database)
  const candidateDocuments = retrieveSimilarDocuments(queryEmbedding, codebase);

  // 3. Rerank candidates for more accurate results
  // (assumes each candidate is a plain document string)
  const rerankedResults = await fetch('https://api.morphllm.com/v1/rerank', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'morph-rerank-v4',
      query: query,
      documents: candidateDocuments,
      top_n: 5
    })
  }).then(res => res.json());

  return rerankedResults;
}

// Example search pipeline with embedding IDs
async function semanticSearchWithEmbeddingIds(query, embeddingIds) {
  // Rerank candidates for more accurate results
  const rerankedResults = await fetch('https://api.morphllm.com/v1/rerank', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'morph-rerank-v4',
      query: query,
      embedding_ids: embeddingIds,
      top_n: 5
    })
  }).then(res => res.json());

  return rerankedResults;
}

// Helper function to simulate vector similarity search
function retrieveSimilarDocuments(queryEmbedding, codebase) {
  // In practice, this would be a call to a vector database
  return codebase.slice(0, 20); // Return first 20 documents as candidates
}
```

This two-stage approach combines the efficiency of initial retrieval methods with the accuracy of deep neural reranking models.

# WarpGrep API

Source: https://docs.morphllm.com/api-reference/endpoint/warpgrep

POST /v1/chat/completions

Semantic code search subagent that explores repositories in ~6 seconds

## Overview

WarpGrep is a code search agent that uses a multi-turn conversation to explore repositories. The model has its tools (`grep_search`, `read`, `list_directory`, `glob`, `finish`) **built in**, so you do not need to pass a `tools` array in your requests.

## Model

Use `morph-warp-grep-v2.1` as the model identifier.
## Message Format

WarpGrep uses a structured format in the initial user message with **flat absolute paths**:

```xml theme={null}
<repo_structure>
/home/user/myproject
/home/user/myproject/README.md
/home/user/myproject/package.json
/home/user/myproject/src
/home/user/myproject/src/auth
/home/user/myproject/src/auth/login.py
/home/user/myproject/src/db
/home/user/myproject/src/utils
/home/user/myproject/tests
/home/user/myproject/config.py
/home/user/myproject/main.py
</repo_structure>

<search_string>
Find where user authentication is implemented
</search_string>
```

### Format Components

* **`<repo_structure>`**: Flat list of absolute paths: repo root first, then all files/directories to depth 2. No indentation, no tree characters, no trailing `/` on directories.
* **`<search_string>`**: Natural language description of what code to find

## Example Request

```typescript TypeScript theme={null}
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.morphllm.com/v1",
});

const repoRoot = "/home/user/myapp";
const repoStructure = `${repoRoot}
${repoRoot}/src
${repoRoot}/src/auth
${repoRoot}/src/api
${repoRoot}/src/models
${repoRoot}/tests
${repoRoot}/package.json`;

const searchQuery = "Find where JWT tokens are validated";

const response = await openai.chat.completions.create({
  model: "morph-warp-grep-v2.1",
  messages: [
    {
      role: "user",
      content: `<repo_structure>\n${repoStructure}\n</repo_structure>\n\n<search_string>\n${searchQuery}\n</search_string>`
    }
  ],
  temperature: 0.0,
  max_tokens: 2048
});

// Response has tool_calls: execute them locally and continue the loop
const toolCalls = response.choices[0].message.tool_calls;
```

```python Python theme={null}
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.morphllm.com/v1"
)

repo_root = "/home/user/myapp"
repo_structure = f"""{repo_root}
{repo_root}/src
{repo_root}/src/auth
{repo_root}/src/api
{repo_root}/src/models
{repo_root}/tests
{repo_root}/package.json"""

search_query = "Find where JWT tokens are validated"

response = client.chat.completions.create(
    model="morph-warp-grep-v2.1",
    messages=[
        {
            "role": "user",
            "content": f"<repo_structure>\n{repo_structure}\n</repo_structure>\n\n<search_string>\n{search_query}\n</search_string>"
        }
    ],
    temperature=0.0,
    max_tokens=2048,
)

# Response has tool_calls: execute them locally and continue the loop
tool_calls = response.choices[0].message.tool_calls
```

```bash cURL theme={null}
curl -X POST "https://api.morphllm.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "morph-warp-grep-v2.1",
    "messages": [
      {
        "role": "user",
        "content": "<repo_structure>\n/home/user/myapp\n/home/user/myapp/src\n/home/user/myapp/src/auth\n</repo_structure>\n\n<search_string>\nFind where JWT tokens are validated\n</search_string>"
      }
    ],
    "temperature": 0.0,
    "max_tokens": 2048
  }'
```

See [Direct API Access](/sdk/components/warp-grep/direct) for the full protocol details including tool execution and multi-turn flow.

## Multi-Turn Conversation

WarpGrep uses built-in tool calling (up to 6 turns). The agent will:

1. **Turn 1**: Analyze your search query and call tools (`grep_search`, `list_directory`, `glob`) to explore
2. **Turns 2-5**: Refine search based on results, read specific files
3. **Final turn**: Call `finish` with code locations

You execute tool calls locally and return results as `{role: "tool", tool_call_id: "...", content: "..."}` messages.

## Request Parameters

| Parameter     | Type   | Required | Description                                  |
| ------------- | ------ | -------- | -------------------------------------------- |
| `model`       | string | Yes      | Must be `morph-warp-grep-v2.1`               |
| `messages`    | array  | Yes      | Array of conversation messages               |
| `temperature` | number | No       | Recommended: `0.0` for deterministic results |
| `max_tokens`  | number | No       | Recommended: `2048`                          |

Tools are built into the model, so you do **not** need to pass a `tools` parameter. The model will return `tool_calls` automatically.
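The multi-turn flow described above can be sketched as a small driver loop. This is an illustration under stated assumptions rather than the reference harness: `run_warpgrep_loop` and the caller-supplied `execute_tool` callback are hypothetical names, the `client` argument is assumed to behave like the OpenAI Python client, and the shape of the `finish` arguments is not specified here, so the sketch simply returns them as parsed JSON.

```python
import json
from typing import Any, Callable, Dict, List, Optional


def run_warpgrep_loop(
    client: Any,
    messages: List[Dict[str, Any]],
    execute_tool: Callable[[str, Dict[str, Any]], str],
    max_turns: int = 6,
) -> Optional[Dict[str, Any]]:
    """Drive the conversation until the model calls `finish` or turns run out.

    `messages` is extended in place with assistant and tool messages.
    """
    for _ in range(max_turns):
        response = client.chat.completions.create(
            model="morph-warp-grep-v2.1",
            messages=messages,
            temperature=0.0,
            max_tokens=2048,
        )
        message = response.choices[0].message
        tool_calls = message.tool_calls or []
        if not tool_calls:
            return None  # no tool calls and no finish: nothing more to do

        # Echo the assistant turn back so tool results can reference its call IDs.
        messages.append({
            "role": "assistant",
            "content": message.content or "",
            "tool_calls": [
                {
                    "id": call.id,
                    "type": "function",
                    "function": {
                        "name": call.function.name,
                        "arguments": call.function.arguments,
                    },
                }
                for call in tool_calls
            ],
        })

        for call in tool_calls:
            arguments = json.loads(call.function.arguments or "{}")
            if call.function.name == "finish":
                return arguments  # final answer from the agent
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": execute_tool(call.function.name, arguments),
            })
    return None
```

The callback keeps filesystem access on the caller's side: it receives a tool name such as `grep_search` plus its decoded arguments and returns the tool output as text.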
## Response Format

The agent responds with structured `tool_calls`:

```json theme={null}
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "morph-warp-grep-v2.1",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "",
      "tool_calls": [
        {"id": "chatcmpl-tool-abc123", "type": "function", "function": {"name": "grep_search", "arguments": "{\"pattern\": \"jwt|JWT\"}"}},
        {"id": "chatcmpl-tool-def456", "type": "function", "function": {"name": "list_directory", "arguments": "{\"command\": \"ls src/auth\"}"}}
      ]
    },
    "finish_reason": "tool_calls"
  }],
  "usage": {
    "prompt_tokens": 1180,
    "total_tokens": 1245,
    "completion_tokens": 65
  }
}
```

After you execute tools and return results, the agent continues until it calls `finish`.

## Available Tools

WarpGrep uses five tools:

* **`grep_search`**: Search for regex patterns across files
* **`read`**: Read file contents with optional line ranges
* **`list_directory`**: Explore directory structure
* **`glob`**: Find files by name/extension pattern (sorted by mtime)
* **`finish`**: Submit final answer with code locations

See the [Direct API Guide](/sdk/components/warp-grep/direct) for complete tool specifications.

## SDK Integration

For easier integration, use the WarpGrep SDK components:

* **[TypeScript Tool](/sdk/components/warp-grep/tool)**: Drop-in tool for AI SDKs
* **[Python Guide](/guides/warp-grep-python)**: Complete Python implementation

## Error Codes

| HTTP Status | Description                                        |
| ----------- | -------------------------------------------------- |
| 200         | Success - chat completion response with tool\_calls |
| 400         | Bad request - malformed request or parameters      |
| 401         | Authentication error - invalid API key             |
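For callers using raw HTTP instead of an SDK, the status codes and the `tool_calls` array documented above can be handled together. The sketch below assumes the response body has already been decoded from JSON into a dictionary; `parse_warpgrep_response` is a hypothetical helper name, and any status not listed in the table is treated as an unexpected failure.

```python
import json
from typing import Any, Dict, List, Tuple

# Descriptions copied from the error code table above.
ERROR_DESCRIPTIONS = {
    400: "Bad request - malformed request or parameters",
    401: "Authentication error - invalid API key",
}


def parse_warpgrep_response(
    status_code: int, response_body: Dict[str, Any]
) -> List[Tuple[str, str, Dict[str, Any]]]:
    """Return (tool_call_id, tool_name, parsed_arguments) for each tool call."""
    if status_code != 200:
        reason = ERROR_DESCRIPTIONS.get(status_code, "Unexpected status")
        raise RuntimeError(f"WarpGrep request failed ({status_code}): {reason}")

    message = response_body["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        # `arguments` arrives as a JSON-encoded string, so decode it here.
        calls.append((call["id"], function["name"], json.loads(function["arguments"])))
    return calls
```

Each returned tuple carries the ID needed for the matching `{role: "tool", tool_call_id: ...}` reply.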
Build your own WarpGrep harness: complete Python guide with examples.

# Self-Hosting

Source: https://docs.morphllm.com/api-reference/self-hosting

Run Morph models in your own environment with self-hosting options

## Overview

For organizations with strict security requirements, Morph offers self-hosting options that let you run our code transformation models in your own environment.

## Benefits of Self-Hosting

* **Zero data retention**: Your code never leaves your environment
* **No usage metering**: Predictable costs with no per-request billing
* **Full control**: Deploy behind your firewall with your own security controls
* **Same performance**: The exact same speed and accuracy as our cloud offering

## Deployment Options

Morph can be deployed in containers using Docker and Kubernetes, or directly in your private cloud infrastructure (AWS, GCP, Azure).

## Complete Suite for Coding Agents

Self-hosted Morph includes:

* **Fast Apply Model**: Transform code with unmatched speed and precision
* **Autocomplete Model**: Lightning-fast code completion
* **Embedding Model**: High-quality vector representations for semantic search
* **Reranker Model**: Enhance search quality with precise context ranking

## Get Started with Self-Hosting

For information about self-hosting options and enterprise licensing, please contact us at [info@morphllm.com](mailto:info@morphllm.com).

# Authentication

Source: https://docs.morphllm.com/auth

Learn how to authenticate with Morph API using Bearer tokens

**Prerequisite**: You'll need an account on [Morph](https://morphllm.com/dashboard) to obtain an API key.

## Authentication

All Morph API endpoints require authentication using Bearer tokens:

```bash theme={null}
Authorization: Bearer your-morph-api-key
```

To get your API key:

1. Visit the [Morph dashboard](https://morphllm.com/api-keys)
2. Create an account or sign in
3. Navigate to your API keys section
4. Generate a new API key

Keep your API key secure and never expose it in client-side code or public repositories.

## Base URL

All Morph API endpoints use the following base URL:

```bash theme={null}
https://api.morphllm.com/v1
```

## Test Your API Key

Verify your setup with a simple test request:

```python Python theme={null}
from openai import OpenAI

client = OpenAI(
    api_key="your-morph-api-key",
    base_url="https://api.morphllm.com/v1"
)

# Test the connection
response = client.chat.completions.create(
    model="morph-v3-fast",
    messages=[{
        "role": "user",
        "content": "<code>def hello():\n print('Hello World')</code>\n<update>def hello():\n print('Hello Morph!')</update>"
    }]
)

print(response.choices[0].message.content)
```

```javascript JavaScript theme={null}
import { OpenAI } from "openai";

const client = new OpenAI({
  apiKey: "your-morph-api-key",
  baseURL: "https://api.morphllm.com/v1",
});

// Test the connection
const response = await client.chat.completions.create({
  model: "morph-v3-fast",
  messages: [
    {
      role: "user",
      content:
        "<code>def hello():\n print('Hello World')</code>\n<update>def hello():\n print('Hello Morph!')</update>",
    },
  ],
});

console.log(response.choices[0].message.content);
```

```bash cURL theme={null}
curl --request POST \
  --url https://api.morphllm.com/v1/chat/completions \
  --header 'Authorization: Bearer your-morph-api-key' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "morph-v3-fast",
    "messages": [{
      "role": "user",
      "content": "<code>def hello():\n print(\"Hello World\")</code>\n<update>def hello():\n print(\"Hello Morph!\")</update>"
    }]
  }'
```

If the test succeeds, you should see the updated code with "Hello Morph!" instead of "Hello World".

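
One way to follow the advice about keeping the key out of source code is to read it from the environment and build the Bearer header at call time. The sketch below is illustrative only: the helper names `load_api_key` and `auth_headers` are hypothetical, and the variable name `MORPH_API_KEY` is an assumption rather than something the API requires.

```python
import os

BASE_URL = "https://api.morphllm.com/v1"  # base URL from the section above


def load_api_key(env=None, name="MORPH_API_KEY"):
    """Read the API key from an environment mapping (assumed variable name)."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"Set {name} before calling the Morph API")
    return key


def auth_headers(api_key):
    """Build the Bearer-token headers described in the Authentication section."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Demonstration with an explicit mapping so no real key is needed.
headers = auth_headers(load_api_key({"MORPH_API_KEY": "your-morph-api-key"}))
print(headers["Authorization"])  # prints: Bearer your-morph-api-key
```

In real use you would call `load_api_key()` with no arguments so the key comes from the process environment, and fail fast when it is missing instead of sending an unauthenticated request.
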
## Alternative Access Methods

You can also access Morph through these platforms:

* Access Morph models through OpenRouter's unified API platform
* Use Morph with Model Context Protocol servers and Claude Desktop

## Next Steps

Now that you've tested your API key, explore Morph's specialized models:

* Apply code changes with precision at 10,500 tokens per second and 98% accuracy
* Generate semantic embeddings optimized for code understanding and search
* Reorder search results by relevance with code-aware ranking algorithms

For access to our latest models, self-hosting, or business inquiries, please contact us at [info@morphllm.com](mailto:info@morphllm.com).

# Enterprise Solutions

Source: https://docs.morphllm.com/enterprise

Deploy Morph with enterprise-grade security, compliance, and support

## Enterprise Features

* Deploy on your infrastructure with full data control
* Dedicated instance with SOC2 compliance and SLAs
* Air-gapped deployment for maximum security
* 24/7 dedicated support with guaranteed response times

## Key Benefits

* **Enhanced Models**: 44k input / 36k output context windows (coming soon)
* **Security & Compliance**: SOC2 Type II certified, HIPAA compliant options
* **Enterprise Support**: 24/7 dedicated support with SLA guarantees
* **Flexible Deployment**: Multi-region, auto-scaling, custom endpoints

## Pricing

Enterprise pricing is customized based on deployment type, usage volume, support level, and additional features.

Get custom deployment options and enterprise features.

# Glossary

Source: https://docs.morphllm.com/glossary

Key terms and concepts used across Morph documentation

## Products

**Fast Apply (Apply)**: File editing model that merges partial code updates into existing files at 10,500 tok/s with 98% accuracy. Uses the `<instruction>/<code>/<update>` message format.

**WarpGrep**: Code search subagent that runs in a separate context window. Explores repositories using built-in tools (`grep_search`, `read`, `list_directory`, `glob`, `finish`). Returns matching code snippets in \~6 seconds.

**Compact (Compactor)**: Context compression model that removes irrelevant lines from chat history and code at 33,000 tok/s. Every surviving line is byte-for-byte identical to the original input.

**Router**: Prompt complexity classifier that returns a model recommendation in \~430ms. Does not generate completions itself. Routes to the optimal model (e.g., `claude-haiku` for simple, `claude-sonnet` for complex).

## Model IDs

| Model         | ID                   | Purpose                                    |
| ------------- | -------------------- | ------------------------------------------ |
| Apply (fast)  | `morph-v3-fast`      | Default file editing, highest speed        |
| Apply (large) | `morph-v3-large`     | Complex edits requiring more reasoning     |
| Apply (auto)  | `auto`               | Router selects fast vs large automatically |
| WarpGrep      | `morph-warp-grep-v1` | Codebase search (local and GitHub)         |
| Compact       | `morph-compactor`    | Context compression                        |
| Embedding     | `morph-embedding-v4` | Code and text embeddings                   |
| Rerank        | `morph-rerank-v4`    | Search result reranking                    |
| Router        | `morph-routers`      | Prompt complexity classification           |

## Message Format Tags

**`<instruction>`**: XML tag in Apply messages describing what the edit does. Including it raises accuracy from 92% to 98%.

**`<code>`**: XML tag containing the original file content to be edited.

**`<update>`**: XML tag containing the partial edit snippet with `// ... existing code ...` markers for unchanged regions.

**`// ... existing code ...`**: Marker placed in `<update>` snippets to indicate regions that should remain unchanged. Required for Apply to correctly merge partial edits.

**``**: XML block in WarpGrep messages describing the repository directory layout. Required for local codebase search.

**``**: XML tag wrapping sections of input that Compact should never remove, regardless of relevance scoring.

## API Concepts

**Base URL**: `https://api.morphllm.com/v1`. All endpoints are OpenAI-compatible and work with any OpenAI SDK.

**Bearer token**: Authentication method for all Morph API endpoints. Obtained from the [dashboard](https://morphllm.com/dashboard/api-keys).

**`query` parameter (Compact)**: Tells Compact what information matters for the next LLM call. Without it, the model infers relevance from the last user message.

**`code_context`**: Parameter in the `edit_file` tool definition containing the original file content to be edited.

**`search_context`**: Parameter in the `codebase_search` tool definition containing the repository structure for WarpGrep queries.

# Agent Tools (edit_file)

Source: https://docs.morphllm.com/guides/agent-tools

Build precise AI agents that edit code fast without full file rewrites using Morph's edit_file tool

## Essential Supporting Tools

Always read files before editing to understand the structure:

```json theme={null}
{
  "name": "read_file",
  "description": "Read the contents of a file to understand its structure before making edits",
  "parameters": {
    "properties": {
      "target_file": {
        "type": "string",
        "description": "The path of the file to read"
      },
      "start_line_one_indexed": {
        "type": "integer",
        "description": "Start line number (1-indexed)"
      },
      "end_line_one_indexed_inclusive": {
        "type": "integer",
        "description": "End line number (1-indexed, inclusive)"
      },
      "explanation": {
        "type": "string",
        "description": "Why you're reading this file"
      }
    },
    "required": ["target_file", "explanation"]
  }
}
```

**Best practice:** Read the relevant sections first, then edit with proper context.

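
On the agent side, a handler for the `read_file` definition above might look like the following. This is a sketch under assumptions, not Morph's implementation: `slice_lines` is a hypothetical helper, the parameter names are taken from the schema, and out-of-range line numbers are clamped instead of raising.

```python
def slice_lines(text, start_line_one_indexed=None, end_line_one_indexed_inclusive=None):
    """Return the requested 1-indexed, inclusive line range of `text`."""
    lines = text.splitlines(keepends=True)
    start = 1 if start_line_one_indexed is None else max(1, start_line_one_indexed)
    if end_line_one_indexed_inclusive is None:
        end = len(lines)
    else:
        # Clamp the end to the file length and never let it go negative.
        end = max(0, min(len(lines), end_line_one_indexed_inclusive))
    return "".join(lines[start - 1:end])


def read_file(target_file, explanation,
              start_line_one_indexed=None, end_line_one_indexed_inclusive=None):
    """Handle a `read_file` tool call; `explanation` is accepted but unused here."""
    with open(target_file, encoding="utf-8") as handle:
        return slice_lines(handle.read(),
                           start_line_one_indexed,
                           end_line_one_indexed_inclusive)


# Lines 2 through 3 of a four-line string.
print(slice_lines("a\nb\nc\nd\n", 2, 3), end="")
```

Keeping the slicing logic separate from file access makes the range handling easy to test without touching the filesystem.
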
Semantic search to locate relevant code:

```json theme={null}
{
  "name": "codebase_search",
  "description": "Find snippets of code from the codebase most relevant to the search query",
  "parameters": {
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to find relevant code"
      },
      "target_directories": {
        "type": "array",
        "items": {"type": "string"},
        "description": "Optional: limit search scope to specific directories"
      },
      "explanation": {
        "type": "string",
        "description": "Why you're searching for this"
      }
    },
    "required": ["query", "explanation"]
  }
}
```

**Best practice:** Search first to understand the codebase, then read specific files.

When you need exact text or pattern matches:

```json theme={null}
{
  "name": "grep_search",
  "description": "Fast text-based regex search that finds exact pattern matches within files",
  "parameters": {
    "properties": {
      "query": {
        "type": "string",
        "description": "The regex pattern to search for"
      },
      "include_pattern": {
        "type": "string",
        "description": "File types to include (e.g. '*.ts')"
      },
      "explanation": {
        "type": "string",
        "description": "Why you're searching for this pattern"
      }
    },
    "required": ["query", "explanation"]
  }
}
```

**Best practice:** Use for finding function names, imports, or specific strings.

Navigate and understand the codebase structure:

```json theme={null}
{
  "name": "list_dir",
  "description": "List the contents of a directory to understand project structure",
  "parameters": {
    "properties": {
      "relative_workspace_path": {
        "type": "string",
        "description": "Path to list contents of, relative to the workspace root"
      },
      "explanation": {
        "type": "string",
        "description": "Why you're listing this directory"
      }
    },
    "required": ["relative_workspace_path", "explanation"]
  }
}
```

**Best practice:** Use to explore unknown codebases or find related files before editing.

## Agent Workflow

Effective agents follow this pattern:

1. **🔍 Search**: Find relevant code with `codebase_search` or `grep_search`
2. **📖 Read**: Get context with `read_file` before editing
3. **✏️ Edit**: Make precise changes with `edit_file`
4. **✅ Verify**: Read again to confirm changes worked

## Common Patterns

**Delete a section in between:**

```javascript theme={null}
// ... existing code ...
function keepThis() {
  return "stay";
}

function alsoKeepThis() {
  return "also stay";
}
// ... existing code ...
```

**Add imports:**

```javascript theme={null}
import { useState, useEffect } from "react";
import { calculateTax } from "./utils"; // New import
// ... existing code ...
```

**Update configuration:**

```json theme={null}
{
  "name": "my-app",
  "version": "2.0.0",
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "test": "jest"
  }
}
```

**Add error handling:**

```javascript theme={null}
// ... existing code ...
function divide(a, b) {
  if (b === 0) {
    throw new Error("Cannot divide by zero");
  }
  return a / b;
}
// ... existing code ...
```

**Update function parameters:**

```javascript theme={null}
// ... existing code ...
// `async` is required because the body uses `await`
async function authenticateUser(email, password) {
  const result = await verifyUser(email, password);
  if (result) {
    return "Authenticated";
  } else {
    return "Unauthenticated";
  }
}
// ... existing code ...
```

**Add new methods to a class:**

```javascript theme={null}
// ... existing code ...
class UserService {
  async getUser(id) {
    return await this.db.findUser(id);
  }

  async updateUser(id, data) {
    return await this.db.updateUser(id, data);
  }
}
// ... existing code ...
```

## Error Handling

Morph is trained to be robust to poor-quality update snippets, but following the practices below still gives the best results.

When tools fail, follow these steps:

1. **Check file permissions**: Ensure the target file is writable
2. **Verify file path**: Confirm the file exists and the path is correct
3. **Review syntax**: Check that your edit snippet follows the `// ... existing code ...` pattern
4. **Retry with context**: Read the file again and provide more context around your changes
5. **Simplify changes**: Break complex edits into smaller, focused changes

**Common Error Patterns:**

```javascript theme={null}
// ❌ Wrong - missing context
function newFunction() {
  return "hello";
}

// ✅ Correct - with context
// ... existing code ...
function newFunction() {
  return "hello";
}
// ... existing code ...
```

## Next Steps

Ready to start building with Morph? Here's what to do next:

* Learn about the Apply API endpoints, models, and message formats for production use
* Step-by-step guide to configure your agent with the edit\_file tool and integrate with Morph's Fast Apply API

For complex refactoring across multiple files, consider using multiple `edit_file` calls in sequence.

For failed edits, read the file again and provide more context around your changes.

# Vercel AI SDK

Source: https://docs.morphllm.com/guides/ai-sdk

Stream fast code edits with Morph using the Vercel AI SDK

# Morph + Vercel AI SDK

Stream code edits at 10,500+ tokens/second using the Vercel AI SDK with Morph's fast apply model.

Use Vercel's AI Gateway for unified billing, rate limits, and failover across 100+ AI models.

## Setup

### Option 1: AI Gateway (Recommended)

1. Get an [AI Gateway API key](https://vercel.com/d?to=%2F%5Bteam%5D%2F%7E%2Fai%2Fapi-keys%3Futm_source%3Dai_sdk_code_generator_modal\&title=Get+an+AI+Gateway+API+Key) from Vercel
2. Add it to your environment variables as `OPENAI_API_KEY`
3. Install the AI SDK:

```bash theme={null}
npm install ai@beta
```

### Option 2: Direct API

1. Get a Morph API key from the [Morph dashboard](https://morphllm.com)
2. Add it to your environment variables as `MORPH_API_KEY`
3. Install the AI SDK:

```bash theme={null}
npm install ai@beta
```

## Implementation

```typescript AI Gateway theme={null}
import { streamText } from 'ai'
import { createOpenAI } from '@ai-sdk/openai'

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
  baseURL: 'https://gateway.ai.vercel.com/v1',
  headers: {
    'X-Vercel-AI-Provider': 'morph',
  },
})

export async function POST(req: Request) {
  const { editInstructions, originalCode, update } = await req.json()

  // Get the morph model through AI Gateway
  const model = openai('morph-v3-fast')

  // Call the language model with the prompt
  const result = streamText({
    model,
    messages: [
      {
        role: 'user',
        content: `<instruction>${editInstructions}</instruction>\n<code>${originalCode}</code>\n<update>${update}</update>`
      }
    ],
    topP: 1,
  })

  // Respond with a streaming response
  return result.toAIStreamResponse()
}
```

```typescript Direct API theme={null}
import { streamText } from 'ai'
import { createOpenAICompatible } from '@ai-sdk/openai-compatible'

const morph = createOpenAICompatible({
  // Read the key from the MORPH_API_KEY variable set during setup
  apiKey: process.env.MORPH_API_KEY!,
  name: 'morph',
  baseURL: 'https://api.morphllm.com/v1'
})

export async function POST(req: Request) {
  const { editInstructions, originalCode, update } = await req.json()

  // Get a language model
  const model = morph('morph-v3-fast')

  // Call the language model with the prompt
  const result = streamText({
    model,
    messages: [
      {
        role: 'user',
        content: `<instruction>${editInstructions}</instruction>\n<code>${originalCode}</code>\n<update>${update}</update>`
      }
    ],
    topP: 1,
  })

  // Respond with a streaming response
  return result.toAIStreamResponse()
}
```

````

```typescript components/CodeEditor.tsx
'use client'

import { useCompletion } from 'ai/react'
import { useState } from 'react'

export function CodeEditor() {
  const [originalCode, setOriginalCode] = useState('')
  const [editInstructions, setEditInstructions] = useState('')

  const { completion, isLoading, complete } = useCompletion({
    api: '/api/morph',
  })

  const handleApplyEdit = async () => {
    await complete('', {
      body: { originalCode, editInstructions },
    })
  }

  return (