
# Compact API

> Compress chat history and code context at 33,000 tok/s with byte-identical output

## Overview

Compact compresses chat history and code context at **33,000 tok/s** by removing irrelevant lines. It never rewrites or paraphrases: every surviving line is byte-for-byte identical to the original input. A 100K-token input compresses in under 2 seconds.
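
Because surviving lines are byte-identical, you can sanity-check a result locally. The sketch below is a hypothetical helper (`unmatched_lines` is not part of any Morph SDK) that reports output lines it cannot match, in order, to input lines. It assumes the compressed text contains only surviving lines; if the service inserts any placeholder markers, they would show up in the returned list.

```python
def unmatched_lines(original: str, compressed: str) -> list[str]:
    """Return lines of `compressed` that cannot be matched, in order, to lines of `original`."""
    source = original.split("\n")
    position = 0  # index in `source` where the next search starts
    missing = []
    for line in compressed.split("\n"):
        try:
            # Search forward only, so matched lines must keep their original order.
            position = source.index(line, position) + 1
        except ValueError:
            missing.append(line)
    return missing


original = "def hello():\n    return 1\n\ndef unused():\n    pass\n\ndef world():\n    return 2"
# Hand-written compressed text for illustration, not real API output.
compressed = "def hello():\n    return 1\n\ndef world():\n    return 2"

print(unmatched_lines(original, compressed))  # an empty list means every line matched
```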

Pass `query` to tell the model what matters for the next LLM call. If you omit it, the model infers the focus from the last user message.
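
One way to handle the optional `query` is to leave the field out of the request body when you have nothing specific to ask for. The helper below is a minimal sketch: `build_compact_payload` is a hypothetical name, the field names are copied from the request examples on this page, and the default values are illustrative rather than documented defaults. It also assumes that omitting the key is how you trigger auto-detection.

```python
from typing import Optional


def build_compact_payload(
    text: str,
    query: Optional[str] = None,
    compression_ratio: float = 0.5,  # illustrative default, taken from the examples
    preserve_recent: int = 0,        # illustrative default, taken from the examples
) -> dict:
    """Build a JSON body for POST /v1/compact (hypothetical helper)."""
    payload = {
        "input": text,
        "compression_ratio": compression_ratio,
        "preserve_recent": preserve_recent,
    }
    # Send `query` only when the caller has one; otherwise omit it so the
    # service can infer the focus from the last user message.
    if query is not None:
        payload["query"] = query
    return payload


chat = "user: How do I validate JWT tokens?"
focused = build_compact_payload(chat, query="How do I validate JWT tokens?")
auto = build_compact_payload(chat)

print(sorted(focused))  # includes 'query'
print(sorted(auto))     # no 'query' key
```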

## Usage Examples

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { MorphClient } from '@morphllm/morphsdk';

  const morph = new MorphClient({ apiKey: "YOUR_API_KEY" });

  const result = await morph.compact({
    input: chatHistory,
    query: "How do I validate JWT tokens?",
    compressionRatio: 0.5,
    preserveRecent: 3,
  });

  // result.output is the compressed text — pass it to your LLM
  ```

  ```python Python (OpenAI SDK) theme={null}
  from openai import OpenAI

  client = OpenAI(
      api_key="YOUR_API_KEY",
      base_url="https://api.morphllm.com/v1",
  )

  response = client.chat.completions.create(
      model="morph-compactor",
      messages=[{"role": "user", "content": chat_history}],
  )

  compressed = response.choices[0].message.content
  ```

  ```python Python (requests) theme={null}
  import requests

  response = requests.post(
      "https://api.morphllm.com/v1/compact",
      headers={"Authorization": "Bearer YOUR_API_KEY"},
      json={
          "input": source_code,
          "query": "authentication",
          "compression_ratio": 0.5,
          "preserve_recent": 0,
      },
  )

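  response.raise_for_status()  # fail fast on HTTP errors such as a bad API key
  data = response.json()
  print(data["output"])  # the compressed text

  # Each range marks a span of input lines that was dropped
  for r in data["messages"][0]["compacted_line_ranges"]:
      print(f"  lines {r['start']}-{r['end']} removed")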
  ```

  ```bash cURL theme={null}
  curl -X POST "https://api.morphllm.com/v1/compact" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "input": "def hello():\n    return 1\n\ndef unused():\n    pass\n\ndef world():\n    return 2",
      "query": "hello function",
      "compression_ratio": 0.5,
      "preserve_recent": 0
    }'
  ```
</CodeGroup>
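
To see how much a call removed, you can total the `compacted_line_ranges` entries that the `requests` example iterates over. This is a sketch under two assumptions: that `start` and `end` are inclusive line numbers, and that every message carries its own list of ranges. `removed_line_count` is a hypothetical helper, and the sample dictionary is hand-written rather than real API output.

```python
def removed_line_count(data: dict) -> int:
    """Total lines removed across all messages, assuming inclusive start/end."""
    total = 0
    for message in data.get("messages", []):
        for r in message.get("compacted_line_ranges", []):
            total += r["end"] - r["start"] + 1
    return total


# Hand-written sample in the shape the requests example reads; not real API output.
sample = {
    "messages": [
        {"compacted_line_ranges": [{"start": 4, "end": 6}, {"start": 10, "end": 10}]}
    ]
}

print(removed_line_count(sample))  # 3 lines + 1 line = 4
```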

## keepContext Tags

Wrap sections you never want compressed in `<keepContext>` / `</keepContext>` tags. Tagged content survives compression verbatim regardless of the compression ratio.

```
<keepContext>
// CRITICAL: Auth middleware — do not compress
function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'No token' });
  req.user = jwt.verify(token, process.env.JWT_SECRET);
  next();
}
</keepContext>
```

The response includes `kept_line_ranges` showing which lines were force-preserved.
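
If you assemble the input programmatically, a small wrapper keeps the tags consistent. The sketch below uses a hypothetical `keep_context` helper and places each tag on its own line, as in the example above; whether the tags may also appear inline is not something this page specifies.

```python
def keep_context(block: str) -> str:
    """Wrap `block` in keepContext tags, each tag on its own line."""
    return f"<keepContext>\n{block}\n</keepContext>"


critical = "API_BASE = 'https://example.com'\nTIMEOUT_SECONDS = 30"
other = "def helper():\n    pass"

# Protected section first, then ordinary content that may be compressed.
document = keep_context(critical) + "\n" + other
print(document)
```

After compaction, checking `critical in compressed_output` is a cheap way to confirm that the protected block survived verbatim.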

## Compatible Endpoints

In addition to the native `/v1/compact` endpoint, Compact is available through OpenAI-compatible endpoints when you set `model: "morph-compactor"`:

| Endpoint                    | Format                  | Use with                                     |
| --------------------------- | ----------------------- | -------------------------------------------- |
| `POST /v1/compact`          | Native Morph format     | Direct HTTP, Morph SDK                       |
| `POST /v1/responses`        | OpenAI Responses API    | Any OpenAI SDK (`client.responses.create()`) |
| `POST /v1/chat/completions` | OpenAI Chat Completions | Any OpenAI-compatible client                 |
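
For the Responses route, a call through the OpenAI Python SDK might look like the sketch below. `client.responses.create()` and `output_text` belong to the OpenAI SDK; that Morph's response populates `output_text` is an assumption, and `compact_via_responses` is a hypothetical wrapper.

```python
def compact_via_responses(client, text: str) -> str:
    """Send `text` through the Responses-style endpoint and return the compressed text."""
    response = client.responses.create(model="morph-compactor", input=text)
    # `output_text` is the OpenAI SDK's convenience accessor; whether Morph's
    # response fills it in is an assumption.
    return response.output_text


# Usage (requires the `openai` package and a Morph API key):
#
#   from openai import OpenAI
#   client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.morphllm.com/v1")
#   compressed = compact_via_responses(client, chat_history)
```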

See the full [Compact documentation](/sdk/components/compact) for SDK reference, best practices, and advanced usage.
