Two endpoints. Same model, different response shapes.
  • POST /v1/responses returns compressed text in OpenAI Responses format. Works with any OpenAI SDK.
  • POST /v1/compress returns compressed messages with compacted_line_ranges per message, so you know exactly which lines were removed.

POST /v1/responses

OpenAI-compatible. Point your SDK at https://api.morphllm.com/v1 and call responses.create().

Parameters

Parameter     Type             Required  Description
model         string           Yes       morph-compactor
input         string or array  Yes       Text or {role, content} array
instructions  string           No        Guides what to keep. Maps to prompt in the Morph SDK.

Response

{
  "id": "resp_abc123",
  "output": [{
    "type": "message",
    "role": "assistant",
    "content": [{ "type": "output_text", "text": "compressed text..." }]
  }],
  "usage": { "input_tokens": 4200, "output_tokens": 1800, "total_tokens": 6000 },
  "model": "morph-compactor"
}

Examples

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.morphllm.com/v1",
});

const response = await client.responses.create({
  model: "morph-compactor",
  input: chatHistory,
  instructions: "The user is about to ask about JWT validation",
});

console.log(response.output[0].content[0].text);
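Indexing `output[0].content[0]` works for the sample above, but the output array can contain more than one item. A small helper makes the extraction more robust; this is a sketch, and the interface names here are illustrative, not part of the SDK:

```typescript
// Illustrative types modeled on the response shape shown above.
interface ContentPart { type: string; text?: string }
interface OutputItem { type: string; role?: string; content?: ContentPart[] }
interface ResponseLike { output: OutputItem[] }

// Concatenate every output_text part across all message items,
// rather than assuming output[0].content[0] holds the text.
function extractOutputText(response: ResponseLike): string {
  return response.output
    .filter((item) => item.type === "message")
    .flatMap((item) => item.content ?? [])
    .filter((part) => part.type === "output_text")
    .map((part) => part.text ?? "")
    .join("");
}
```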

POST /v1/compress

POST https://api.morphllm.com/v1/compress

Returns per-message compacted_line_ranges: the 1-indexed, inclusive line ranges that were replaced with (filtered N lines) markers. If a gap between kept blocks was too small to be worth a marker, those lines are kept as-is and no range is emitted.
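The replacement semantics above can be reproduced in a few lines. A sketch, assuming newline-delimited content and the 1-indexed, inclusive ranges just described:

```typescript
interface LineRange { start: number; end: number }

// Replace each 1-indexed, inclusive range with a "(filtered N lines)"
// marker, mirroring how the API rewrites message content.
function applyRanges(content: string, ranges: LineRange[]): string {
  const lines = content.split("\n");
  const out: string[] = [];
  let cursor = 0; // 0-indexed position in the original lines
  for (const r of [...ranges].sort((a, b) => a.start - b.start)) {
    out.push(...lines.slice(cursor, r.start - 1)); // kept lines before the range
    out.push(`(filtered ${r.end - r.start + 1} lines)`);
    cursor = r.end; // skip past the removed lines
  }
  out.push(...lines.slice(cursor));
  return out.join("\n");
}
```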

Parameters

Parameter          Type    Required  Description
messages           array   Yes       {role, content} messages to compress
compression_ratio  float   No        Fraction to keep. 0.3 = aggressive, 0.7 = light. Default 0.5.
preserve_recent    int     No        Keep last N messages uncompressed. Default 2.
query              string  No        Focus query. Auto-detected from last user message if omitted.
model              string  No        Default swe-pruner-0.6b
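Applying the documented defaults client-side keeps request construction explicit. A sketch; the helper name and types are illustrative:

```typescript
interface Message { role: string; content: string }
interface CompressRequest {
  messages: Message[];
  compression_ratio: number;
  preserve_recent: number;
  query?: string;
  model: string;
}

// Fill in the documented defaults: keep ~50% of tokens, leave the
// last 2 messages untouched, and use swe-pruner-0.6b.
function buildCompressRequest(
  messages: Message[],
  opts: Partial<Omit<CompressRequest, "messages">> = {}
): CompressRequest {
  return {
    messages,
    compression_ratio: opts.compression_ratio ?? 0.5,
    preserve_recent: opts.preserve_recent ?? 2,
    query: opts.query, // omitted → auto-detected from the last user message
    model: opts.model ?? "swe-pruner-0.6b",
  };
}
```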

Response

{
  "id": "cmpr-7373faf8af65",
  "object": "compression",
  "model": "swe-pruner-0.6b",
  "messages": [
    {
      "role": "user",
      "content": "def hello():\n    print(\"hello world\")\n    x = 1\n(filtered 6 lines)\ndef world():\n    return 42",
      "compacted_line_ranges": [
        { "start": 5, "end": 10 }
      ]
    }
  ],
  "usage": {
    "input_tokens": 101,
    "output_tokens": 65,
    "compression_ratio": 0.644,
    "processing_time_ms": 109
  }
}
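The ranges and the inline markers describe the same removals, so one can be cross-checked against the other. A sketch of both counts:

```typescript
// Total lines covered by the 1-indexed, inclusive ranges.
function rangesLineCount(ranges: { start: number; end: number }[]): number {
  return ranges.reduce((n, r) => n + (r.end - r.start + 1), 0);
}

// Total of the N values in "(filtered N lines)" markers in the content.
function markersLineCount(content: string): number {
  let n = 0;
  for (const m of content.matchAll(/\(filtered (\d+) lines\)/g)) {
    n += Number(m[1]);
  }
  return n;
}
```

For the sample response above, both functions return 6: the single range {start: 5, end: 10} covers six lines, matching the (filtered 6 lines) marker in the content.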

Examples

import { MorphClient } from '@morphllm/morphsdk';

const morph = new MorphClient({ apiKey: "YOUR_API_KEY" });

const result = await morph.responses.compress({
  messages: [{ role: "user", content: codeFile }],
  compression_ratio: 0.5,
  preserve_recent: 0,
  query: "authentication middleware",
});

for (const msg of result.messages) {
  console.log(`[${msg.role}] ${msg.content.length} chars`);
  // Fall back to [] in case a message was preserved uncompressed
  for (const r of msg.compacted_line_ranges ?? []) {
    console.log(`  lines ${r.start}-${r.end} removed`);
  }
}

Errors

Status  Meaning
400     Malformed request or input too large
401     Invalid API key
503     Model not loaded
504     Request timed out
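Of the statuses above, 503 and 504 are transient, so retrying with backoff is reasonable. A sketch using fetch; the endpoint URL is from this page, while the wrapper itself and its retry policy are illustrative:

```typescript
// Exponential backoff: 500ms, 1s, 2s, ... capped at 8s.
function retryDelayMs(attempt: number): number {
  return Math.min(500 * 2 ** attempt, 8000);
}

// Retry only on the transient statuses documented above (503, 504);
// 400 and 401 are caller errors and fail immediately.
async function compressWithRetry(
  body: unknown,
  apiKey: string,
  maxAttempts = 3
): Promise<unknown> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch("https://api.morphllm.com/v1/compress", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    });
    if (res.ok) return res.json();
    const transient = res.status === 503 || res.status === 504;
    if (!transient || attempt + 1 >= maxAttempts) {
      throw new Error(`compress failed: ${res.status}`);
    }
    await new Promise((resolve) => setTimeout(resolve, retryDelayMs(attempt)));
  }
}
```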