POST /v1/chat/completions
cURL
curl --request POST \
  --url https://api.morphllm.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data @- <<EOF
{
  "model": "morph-v3-fast",
  "messages": [
    {
      "role": "user",
      "content": "<instruction>I will add error handling</instruction>\n<code>function divide(a, b) {\n  return a / b;\n}</code>\n<update>function divide(a, b) {\n  if (b === 0) throw new Error('Division by zero');\n  return a / b;\n}</update>"
    }
  ],
  "stream": false,
  "max_tokens": 150,
  "temperature": 0
}
EOF
Example response
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "function divide(a, b) {\n  if (b === 0) throw new Error('Division by zero');\n  return a / b;\n}"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 32,
    "total_tokens": 57
  }
}
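
The same code update request can also be made through the OpenAI SDK. The following is a minimal sketch, assuming MORPH_API_KEY is set in your environment (the same convention the Warp Grep examples below use); the endpoint and payload mirror the cURL example above.

import OpenAI from "openai";

// Sketch only: same request as the cURL example, via the OpenAI SDK.
const openai = new OpenAI({
  apiKey: process.env.MORPH_API_KEY,
  baseURL: "https://api.morphllm.com/v1",
});

const response = await openai.chat.completions.create({
  model: "morph-v3-fast",
  messages: [
    {
      role: "user",
      content:
        "<instruction>I will add error handling</instruction>\n" +
        "<code>function divide(a, b) {\n  return a / b;\n}</code>\n" +
        "<update>function divide(a, b) {\n  if (b === 0) throw new Error('Division by zero');\n  return a / b;\n}</update>",
    },
  ],
  max_tokens: 150,
  temperature: 0,
});

// The merged code is returned as the assistant message content.
console.log(response.choices[0].message.content);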

Overview

Warp Grep is an agentic code search model that finds relevant code across your codebase in 2-4 turns. It uses XML-based tool calls (grep, read, list_directory, finish) to navigate and locate code, returning precise file locations with line ranges.
For complete implementation details including tool executors and the agent loop, see the Direct API Access guide.

Model

morph-warp-grep-v1: agentic code search with a 4-turn maximum and 8 parallel tool calls per turn.

How It Works

Warp Grep runs a multi-turn conversation: the model outputs XML tool calls, you execute them locally, and you return the results in the next user message. The loop is:
  1. Send a system prompt plus an initial user message containing the repository structure and the search query.
  2. The model replies with optional <think> reasoning and one or more XML tool calls.
  3. Execute the tool calls locally and send the results back as the next user message, noting how many turns remain.
  4. Repeat until the model emits <finish> (at most 4 turns), which lists the relevant files and line ranges.

Message Format

Initial Request

The first user message contains:
  1. Repository structure — the output of list_directory, pre-run at depth 2 (a sketch for generating this follows the example below)
  2. Search query — what the agent needs to find
<repo_structure>
myproject/
  src/
    auth/
    db/
    utils/
  tests/
  config.py
</repo_structure>

<search_string>
Find where user authentication is implemented
</search_string>
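
How you produce the depth-2 repository structure is up to you. As a rough illustration, a recursive directory walk with Node's fs module could build it; buildRepoStructure is a hypothetical helper name, not part of the Warp Grep API:

import { readdirSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: render a directory tree two levels deep,
// in the same layout as the <repo_structure> block above.
function buildRepoStructure(root: string, depth = 2, indent = "  "): string {
  if (depth === 0) return "";
  let out = "";
  for (const entry of readdirSync(root, { withFileTypes: true })) {
    if (entry.name.startsWith(".") || entry.name === "node_modules") continue; // skip noise
    out += `${indent}${entry.name}${entry.isDirectory() ? "/" : ""}\n`;
    if (entry.isDirectory()) {
      out += buildRepoStructure(join(root, entry.name), depth - 1, indent + "  ");
    }
  }
  return out;
}

const repoStructure =
  `<repo_structure>\nmyproject/\n${buildRepoStructure("myproject")}</repo_structure>`;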

Agent Response

The model responds with thinking and XML tool calls:
<think>
Looking for authentication. I'll grep for auth-related patterns
and explore the auth directory structure.
</think>

<grep>
  <pattern>authenticate</pattern>
  <sub_dir>src/</sub_dir>
</grep>

<list_directory>
  <path>src/auth</path>
</list_directory>

Tools

The agent uses four XML tools:
grep: search with regex; key elements <pattern>, <sub_dir>, <glob>
read: read file contents; key elements <path>, <lines>
list_directory: show directory tree; key elements <path>, <pattern>
finish: return final results; key elements <file> with <path> and <lines>
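
For the parsing step in the examples below, a possible TypeScript shape for a parsed tool call (an assumption used in these sketches, not a type shipped by the API) is:

type ToolCall =
  | { type: "grep"; pattern: string; subDir?: string; glob?: string }
  | { type: "read"; path: string; lines?: string }
  | { type: "list_directory"; path: string; pattern?: string }
  | { type: "finish"; files: { path: string; lines: string }[] };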

Usage Examples

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.MORPH_API_KEY,
  baseURL: "https://api.morphllm.com/v1",
});

// System prompt (abbreviated - see Direct API docs for full version)
const systemPrompt = `You are a code search agent. Your task is to find all relevant code for a given search_string.
You have exactly 4 turns. Use tools: grep, read, list_directory, finish.`;

// Initial user message with repo structure
const repoStructure = `<repo_structure>
myapp/
  src/
    auth/
    api/
    models/
  tests/
</repo_structure>

<search_string>
Find where JWT tokens are validated
</search_string>`;

const messages = [
  { role: "system", content: systemPrompt },
  { role: "user", content: repoStructure },
];

// First API call
const response = await openai.chat.completions.create({
  model: "morph-warp-grep-v1",
  messages,
  temperature: 0.0,
  max_tokens: 2048,
});

const agentResponse = response.choices[0].message.content;
// Parse XML tool calls, execute locally, continue loop...

Multi-Turn Flow

After the first response, parse XML tool calls and execute them locally:
// Parse tool calls from response
const toolCalls = parseXmlToolCalls(agentResponse);

// Execute each tool locally
const results = await Promise.all(
  toolCalls.map(async (call) => {
    switch (call.type) {
      case "grep":
        return executeGrep(call.pattern, call.subDir, call.glob);
      case "read":
        return readFile(call.path, call.lines);
      case "list_directory":
        return listDirectory(call.path);
      case "finish":
        return call; // Terminal - extract file locations
    }
  })
);

// Format results as XML and continue conversation
const toolResults = formatToolResults(results);
messages.push({ role: "assistant", content: agentResponse });
messages.push({ role: "user", content: toolResults + "\nYou have used 1 turn and have 3 remaining." });

// Next API call...
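
The helpers above (parseXmlToolCalls, executeGrep, listDirectory, formatToolResults) come from the Direct API Access guide. As a rough illustration only, a minimal regex-based parseXmlToolCalls might look like the sketch below; it is an assumption, not the reference implementation.

// Minimal sketch: extracts flat <tool><child>value</child></tool> calls.
// Keys keep the raw XML element names (e.g. sub_dir), so map them to the
// shape your executors expect. The <finish> block nests <file> elements
// and needs further parsing of its children (omitted here).
function parseXmlToolCalls(text: string): Record<string, string>[] {
  const calls: Record<string, string>[] = [];
  const toolRe = /<(grep|read|list_directory|finish)>([\s\S]*?)<\/\1>/g;
  const childRe = /<(\w+)>([\s\S]*?)<\/\1>/g;
  for (const match of text.matchAll(toolRe)) {
    const call: Record<string, string> = { type: match[1] };
    for (const child of match[2].matchAll(childRe)) {
      call[child[1]] = child[2].trim();
    }
    calls.push(call);
  }
  return calls;
}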

Result Format

When the agent calls finish, it returns file locations:
<finish>
  <file>
    <path>src/auth/jwt.ts</path>
    <lines>1-60</lines>
  </file>
  <file>
    <path>src/middleware/auth.ts</path>
    <lines>1-40</lines>
  </file>
</finish>
Read the specified line ranges to get the final code context.
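A small helper for that last step might look like the sketch below; readLineRange is a hypothetical name, and the ranges are assumed to be 1-indexed and inclusive, matching the example above.

import { readFileSync } from "node:fs";

// Hypothetical helper: return the "start-end" slice of a file reported
// in a <finish> result, e.g. readLineRange("src/auth/jwt.ts", "1-60").
function readLineRange(path: string, range: string): string {
  const [start, end] = range.split("-").map(Number);
  return readFileSync(path, "utf8").split("\n").slice(start - 1, end).join("\n");
}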


Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Code update request

model
enum<string>
default:morph-v3-fast
required

ID of the model to use

Available options:
morph-v3-fast,
morph-v3-large,
auto
Example:

"morph-v3-fast"

messages
object[]
required

Array containing a single user message whose content uses the instruction-guided format shown in the example below
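
The message content uses the same tag layout as the cURL example at the top of this page, roughly:

<instruction>brief description of the intended change</instruction>
<code>the original code</code>
<update>the edited code</update>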

stream
boolean
default:false

Enable streaming response

Example:

false

max_tokens
integer

Maximum number of tokens to generate

Example:

150

temperature
number
default:0

Sampling temperature (0.0 for deterministic output)

Example:

0

Response

Chat completion response

id
string
required

Unique identifier for the completion

Example:

"chatcmpl-123"

object
string
required

Object type

Example:

"chat.completion"

created
integer
required

Unix timestamp of when the completion was created

Example:

1677652288

choices
object[]
required

List of completion choices

usage
object
required

Usage statistics for the completion request