Model Router - Morph Documentation

Not every prompt needs a $15/M-token model. A “fix this typo” request and a “design an event sourcing system” request look identical to your API call, but sending the typo fix to a top-tier model costs 10x more than it should. The Morph Router classifies a prompt in ~50ms and tells you how to route it. It is trained on millions of coding prompts.

Pricing: $0.005/request | Max input: 8,192 tokens
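As a rough illustration of the arithmetic, the sketch below estimates what one routed request saves. Only the $15/M-token price and the $0.005 router fee come from this page. The $1.50/M-token price for the cheaper model is a hypothetical figure chosen to match the “10x” ratio, the single blended price per token is a simplification, and both helper functions are illustrative rather than part of any Morph SDK.

```python
ROUTER_FEE_USD = 0.005  # per routed request, from the pricing line above


def net_saving_per_request(
    tokens: int,
    expensive_usd_per_m: float = 15.0,  # the $15/M-token model mentioned above
    cheap_usd_per_m: float = 1.5,       # hypothetical price, 10x cheaper
) -> float:
    """Dollars saved by sending one prompt to the cheap model, net of the router fee."""
    expensive_cost = tokens * expensive_usd_per_m / 1_000_000
    cheap_cost = tokens * cheap_usd_per_m / 1_000_000
    return expensive_cost - cheap_cost - ROUTER_FEE_USD


def break_even_tokens(
    expensive_usd_per_m: float = 15.0,
    cheap_usd_per_m: float = 1.5,
) -> float:
    """Token count at which the price gap exactly pays for the router fee."""
    return ROUTER_FEE_USD * 1_000_000 / (expensive_usd_per_m - cheap_usd_per_m)
```

Under these assumed prices, a request needs roughly 370 tokens before the price gap covers the routing fee, and only requests that the router actually sends to the cheaper model save anything.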

Quick Start

Ask the router which model to use, then call it:

cURL
TypeScript
Python

curl -s -X POST "https://api.morphllm.com/v1/router/multimodel" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Add error handling to this function",
    "allowed_providers": ["anthropic"]
  }'

// Ask the router which model to use
const res = await fetch("https://api.morphllm.com/v1/router/multimodel", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.MORPH_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    input: "Add error handling to this function",
    allowed_providers: ["anthropic"],
  }),
});
const { model } = await res.json(); // call this model next

import requests

MORPH_API_KEY = "YOUR_API_KEY"

# Ask the router which model to use
resp = requests.post(
    "https://api.morphllm.com/v1/router/multimodel",
    headers={
        "Authorization": f"Bearer {MORPH_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "input": "Add error handling to this function",
        "allowed_providers": ["anthropic"],
    },
    timeout=5,
)
resp.raise_for_status()  # surface HTTP errors before reading the body
model = resp.json()["model"]  # call this model next

Prefer to map labels to your own models? Use /v1/router/classify to get the raw signals instead, and /v1/router/multimodel when you’d rather hand Morph your model list and let it choose.

/router/classify

Runs the requested classifier heads against your prompt and returns the raw labels. You decide what each label means for your stack.

Request

Field	Type	Description
`input`	string	The prompt to classify (required).
`classes`	string[]	Which heads to run: `"difficulty"`, `"ambiguity"`, `"domain"`. Optional — defaults to all three.

cURL
Python
TypeScript

curl -s -X POST "https://api.morphllm.com/v1/router/classify" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Add error handling to this function",
    "classes": ["difficulty", "ambiguity", "domain"]
  }'

import requests

MORPH_API_KEY = "YOUR_API_KEY"

resp = requests.post(
    "https://api.morphllm.com/v1/router/classify",
    headers={
        "Authorization": f"Bearer {MORPH_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "input": "Add error handling to this function",
        "classes": ["difficulty", "ambiguity", "domain"],
    },
    timeout=5,
)
resp.raise_for_status()  # surface HTTP errors before reading the body
classifications = resp.json()["classifications"]
difficulty = classifications["difficulty"]["label"]

const res = await fetch("https://api.morphllm.com/v1/router/classify", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.MORPH_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    input: "Add error handling to this function",
    classes: ["difficulty", "ambiguity", "domain"],
  }),
});

const { classifications } = await res.json();
const difficulty = classifications.difficulty.label;

Response

{
  "classifications": {
    "difficulty": { "class_id": 0, "label": "easy",   "confidence": 0.93, "meets_threshold": true },
    "ambiguity":  { "class_id": 0, "label": "low",    "confidence": 0.88, "meets_threshold": true },
    "domain":     { "class_id": 2, "label": "coding", "confidence": 0.91, "meets_threshold": true }
  }
}

Each head returns label, class_id, confidence, and meets_threshold (whether confidence cleared the head’s threshold). When difficulty does not meet its threshold, treat it as needs_info — the prompt is too ambiguous to size confidently.
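The threshold rule can be sketched as a small helper. This is a minimal illustration that assumes the response shape shown above; `effective_difficulty` is a hypothetical name, not an SDK function.

```python
def effective_difficulty(classify_response: dict) -> str:
    """Return the difficulty label, or "needs_info" when it missed its threshold."""
    head = classify_response["classifications"]["difficulty"]
    if not head.get("meets_threshold", False):
        return "needs_info"
    return head["label"]


sample = {
    "classifications": {
        "difficulty": {"class_id": 0, "label": "easy", "confidence": 0.93, "meets_threshold": True},
    }
}
print(effective_difficulty(sample))  # easy
```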

Labels

The classifier heads return these labels. You decide what each means for your stack.

Difficulty

Label	What it means	Example mapping
`easy`	Trivial change, any model handles it	Haiku, DeepSeek Flash, Gemini Flash
`medium`	Moderate complexity, benefits from a capable model	Sonnet, GPT-5.5, Gemini Flash
`hard`	Complex task, needs a strong model	Opus, GPT-5.5, Gemini Pro
`needs_info`	Ambiguous prompt — difficulty didn’t clear the confidence threshold	Your default model
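One way to hold such a mapping in code is a plain dictionary with a default. The sketch below assumes an Anthropic-only stack and uses model IDs from the catalog on this page; the choice of models and the `model_for` helper are illustrative, so swap in whatever your stack uses.

```python
# Hypothetical mapping for an Anthropic-only stack; replace with your own models.
MODEL_FOR_DIFFICULTY = {
    "easy": "claude-haiku-4-5-20251001",
    "medium": "claude-sonnet-4-6",
    "hard": "claude-opus-4-8",
}
DEFAULT_MODEL = "claude-sonnet-4-6"  # used for needs_info and any unknown label


def model_for(difficulty: str) -> str:
    """Map a difficulty label to a model, falling back to DEFAULT_MODEL."""
    return MODEL_FOR_DIFFICULTY.get(difficulty, DEFAULT_MODEL)
```

Keeping `needs_info` out of the dictionary means it takes the same path as any label added later, so the lookup never raises.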

Ambiguity

Label	What it means
`low`	Well-specified request
`med`	Some detail missing
`high`	Underspecified; may need clarification
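The ambiguity label can drive a “let’s clarify” step before any model is called. The sketch below shows one possible policy, assuming the `classifications` object from the /v1/router/classify response; the rule itself (ask only when `high` cleared its threshold) and the `needs_clarification` name are assumptions, not documented behavior.

```python
def needs_clarification(classifications: dict) -> bool:
    """True when the ambiguity head confidently reports an underspecified prompt."""
    head = classifications.get("ambiguity")
    if head is None:  # the head was not requested via "classes"
        return False
    return head["label"] == "high" and head["meets_threshold"]
```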

Domain

Label	What it means
`general`	General-purpose prompt
`summary`	Summarization / extraction
`coding`	Code generation or editing
`design`	Design / architecture
`data`	Data / analytics

/router/multimodel

Hand the router your candidate models (or whole providers) plus a policy. It classifies the prompt and returns the single best model to call, so there is no mapping table to maintain.

Request

Field	Type	Description
`input`	string	The prompt to route (required).
`allowed_models`	string[]	Restrict selection to these exact models, e.g. `["gpt-5.5", "claude-opus-4-8"]`. Optional.
`allowed_providers`	string[]	Restrict selection to these providers (`openai`, `anthropic`, `gemini`, `deepseek`). Optional.
`policy`	string	`"balanced"` (default), `"cost_efficient"`, `"capability_heavy"`, or `"domain_skills"`.
`default_model`	string	Fallback returned as-is when the prompt is too ambiguous to size (`needs_info`). Must satisfy the allowed filter. Optional.

allowed_models and allowed_providers are unioned: a model qualifies if it matches either. Leave both empty to consider every model in Morph’s catalog. The example below allows every Anthropic and Gemini model, plus the one specific OpenAI model gpt-5.5.

Policies

balanced (default) — pick a capable model; break ties on cost.
cost_efficient — minimize cost; tolerate a slightly underqualified model.
capability_heavy — maximize capability for the request; never trade quality for cost.
domain_skills — route by domain match first; pick the specialist for the request’s domain.

Model catalog

Leave allowed_models and allowed_providers empty to consider every model below.

Provider	Models
`openai`	`gpt-5.5`
`anthropic`	`claude-haiku-4-5-20251001`, `claude-sonnet-4-6`, `claude-opus-4-8`
`gemini`	`gemini-3.5-flash`, `gemini-3.1-pro-preview`
`deepseek`	`deepseek-v4-flash`, `deepseek-v4-pro`
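To make the union rule concrete, the sketch below reproduces it locally over the catalog table. It only illustrates the documented semantics and is not Morph's server-side code; how the API treats names outside the catalog is not documented here, so this version simply ignores them.

```python
CATALOG = {
    "openai": ["gpt-5.5"],
    "anthropic": ["claude-haiku-4-5-20251001", "claude-sonnet-4-6", "claude-opus-4-8"],
    "gemini": ["gemini-3.5-flash", "gemini-3.1-pro-preview"],
    "deepseek": ["deepseek-v4-flash", "deepseek-v4-pro"],
}


def candidate_models(allowed_models=(), allowed_providers=()):
    """Catalog models that match either filter; every model when both are empty."""
    if not allowed_models and not allowed_providers:
        return [model for models in CATALOG.values() for model in models]
    return [
        model
        for provider, models in CATALOG.items()
        for model in models
        if provider in allowed_providers or model in allowed_models
    ]


print(candidate_models(allowed_models=["gpt-5.5"], allowed_providers=["anthropic", "gemini"]))
```

With the filters from the example request, six models qualify: gpt-5.5, the three Anthropic models, and the two Gemini models.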

cURL
Python
TypeScript

curl -s -X POST "https://api.morphllm.com/v1/router/multimodel" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Add error handling to this function",
    "allowed_providers": ["anthropic", "gemini"],
    "allowed_models": ["gpt-5.5"],
    "policy": "balanced",
    "default_model": "claude-sonnet-4-6"
  }'

import requests

MORPH_API_KEY = "YOUR_API_KEY"

resp = requests.post(
    "https://api.morphllm.com/v1/router/multimodel",
    headers={
        "Authorization": f"Bearer {MORPH_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "input": "Add error handling to this function",
        "allowed_providers": ["anthropic", "gemini"],
        "allowed_models": ["gpt-5.5"],
        "policy": "balanced",
        "default_model": "claude-sonnet-4-6",
    },
    timeout=5,
)
resp.raise_for_status()  # surface HTTP errors before reading the body
model = resp.json()["model"]  # call this model next

const res = await fetch("https://api.morphllm.com/v1/router/multimodel", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.MORPH_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    input: "Add error handling to this function",
    allowed_providers: ["anthropic", "gemini"],
    allowed_models: ["gpt-5.5"],
    policy: "balanced",
    default_model: "claude-sonnet-4-6",
  }),
});

const { model } = await res.json(); // call this model next

Response

{
  "model": "claude-haiku-4-5-20251001",
  "provider": "anthropic",
  "difficulty": "easy",
  "confidence": 0.93,
  "ambiguity": "low",
  "ambiguity_confidence": 0.88,
  "domain": "coding",
  "domain_confidence": 0.91
}

model is what you call next. The classifier signals (difficulty, ambiguity, domain) are echoed back so you can act on them too — e.g. show a “let’s clarify” prompt when difficulty is needs_info. The ambiguity and domain fields are present only when those heads cleared their threshold; treat a missing field as “no signal.” If the prompt resolves to needs_info and you passed a default_model, that model is returned as-is.
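A small parser makes the optional fields explicit. The sketch below assumes the response shape shown above and that `difficulty` is always present; the `Route` class and `parse_route` function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    model: str
    difficulty: str
    ambiguity: Optional[str]  # None means "no signal"
    domain: Optional[str]     # None means "no signal"

    @property
    def should_clarify(self) -> bool:
        """True when the prompt was too ambiguous to size."""
        return self.difficulty == "needs_info"


def parse_route(body: dict) -> Route:
    """Read a /v1/router/multimodel response, tolerating absent optional heads."""
    return Route(
        model=body["model"],
        difficulty=body["difficulty"],
        ambiguity=body.get("ambiguity"),
        domain=body.get("domain"),
    )
```

Using `dict.get` for `ambiguity` and `domain` keeps a missing field from raising, which matches the “no signal” reading described above.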

Real-World Example

Route dynamically in production to cut costs while keeping quality. Hand the router your candidate models, then call whatever it returns:

TypeScript
Python

import OpenAI from 'openai';

const openai = new OpenAI();

async function handleUserRequest(userInput: string) {
  // 1. Ask the router which model to use
  const res = await fetch("https://api.morphllm.com/v1/router/multimodel", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.MORPH_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      input: userInput,
      allowed_models: ["gpt-5.5"],
      policy: "balanced",
      default_model: "gpt-5.5",
    }),
  });
  const { model } = await res.json();

  // 2. Call the chosen model with your provider SDK
  return await openai.chat.completions.create({
    model,
    messages: [{ role: "user", content: userInput }],
  });
}

// "Add a TODO comment"           → easy → gpt-5.5
// "Design event sourcing system" → hard → gpt-5.5

import os
import requests
from openai import OpenAI

openai = OpenAI()
MORPH_API_KEY = os.environ["MORPH_API_KEY"]

def handle_user_request(user_input: str):
    # 1. Ask the router which model to use
    resp = requests.post(
        "https://api.morphllm.com/v1/router/multimodel",
        headers={
            "Authorization": f"Bearer {MORPH_API_KEY}",
            "Content-Type": "application/json",
        },
        json={
            "input": user_input,
            "allowed_models": ["gpt-5.5"],
            "policy": "balanced",
            "default_model": "gpt-5.5",
        },
        timeout=5,
    )
    resp.raise_for_status()  # surface HTTP errors before reading the body
    model = resp.json()["model"]

    # 2. Call the chosen model with your provider SDK
    return openai.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": user_input}],
    )

Wrap the router call in a try/catch (try/except in Python) and fall back to a safe default model if it ever fails. default_model already covers the needs_info case, so the fallback only has to handle network and HTTP errors.
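A minimal sketch of that fallback follows, using only the standard library so it runs without extra packages. The request body mirrors the example above; `call_router`, `choose_model`, and `FALLBACK_MODEL` are illustrative names, and the set of exceptions caught is one reasonable choice rather than an exhaustive list.

```python
import json
import urllib.request

ROUTER_URL = "https://api.morphllm.com/v1/router/multimodel"
FALLBACK_MODEL = "gpt-5.5"  # same value the example passes as default_model


def call_router(user_input: str, api_key: str, timeout: float = 5.0) -> str:
    """POST the prompt to the router and return the model it picked."""
    payload = json.dumps({
        "input": user_input,
        "allowed_models": ["gpt-5.5"],
        "policy": "balanced",
        "default_model": FALLBACK_MODEL,
    }).encode("utf-8")
    request = urllib.request.Request(
        ROUTER_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["model"]


def choose_model(user_input: str, api_key: str, route=call_router) -> str:
    """Return the routed model, or FALLBACK_MODEL if the router call fails."""
    try:
        return route(user_input, api_key)
    except (OSError, ValueError, KeyError):
        # OSError covers network errors, HTTP error statuses and timeouts
        # (urllib.error.URLError subclasses it); ValueError covers malformed
        # JSON; KeyError covers a body without "model".
        return FALLBACK_MODEL
```

Passing the router call in as `route` keeps the fallback logic testable without network access.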

Edge / Cloudflare Workers

fetch is available natively at the edge, so you can call the router from a Cloudflare Worker, Vercel Edge Function, or Deno with no SDK:

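// Bindings configured for this Worker (MORPH_API_KEY is stored as a secret).
interface Env {
  MORPH_API_KEY: string;
}

export default {
  async fetch(request: Request, env: Env) {
    const { input } = (await request.json()) as { input: string };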

    const res = await fetch("https://api.morphllm.com/v1/router/multimodel", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${env.MORPH_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ input, allowed_providers: ["anthropic"] }),
    });
    const { model } = await res.json();

    return Response.json({ model });
  }
};

The @morphllm/morphsdk/edge build ships a RawRouter helper, but it targets the legacy /router/raw endpoint. For the current endpoints, call them directly with fetch as shown above.

API Reference

Both endpoints are POST https://api.morphllm.com/... with an Authorization: Bearer YOUR_API_KEY header.


POST /v1/router/classify

Request:
{
  "input": "string",                                 // required
  "classes": ["difficulty", "ambiguity", "domain"]   // optional, defaults to all three
}

Response:
{
  "classifications": {
    "difficulty": { "class_id": 0, "label": "easy",   "confidence": 0.93, "meets_threshold": true },
    "ambiguity":  { "class_id": 0, "label": "low",    "confidence": 0.88, "meets_threshold": true },
    "domain":     { "class_id": 2, "label": "coding", "confidence": 0.91, "meets_threshold": true }
  }
}

POST /v1/router/multimodel

Request:
{
  "input": "string",                       // required
  "allowed_models": ["gpt-5.5"],           // optional
  "allowed_providers": ["anthropic"],      // optional: "openai" | "anthropic" | "gemini" | "deepseek"
  "policy": "balanced",                    // "balanced" (default) | "cost_efficient" | "capability_heavy" | "domain_skills"
  "default_model": "claude-sonnet-4-6"     // optional, returned as-is on needs_info
}

Response:
{
  "model": "claude-haiku-4-5-20251001",
  "provider": "anthropic",
  "difficulty": "easy",
  "confidence": 0.93,
  "ambiguity": "low",              // present only when the head clears its threshold
  "ambiguity_confidence": 0.88,
  "domain": "coding",
  "domain_confidence": 0.91
}

When to Use

Use the router when:

Processing varied user requests (simple typo fixes to complex architecture tasks)
You want to minimize API costs without manually classifying prompts
Building cost-conscious AI products with mixed complexity workloads

Skip the router when:

All tasks need the same model tier (e.g., always Opus for agentic coding)
The ~180ms routing latency matters more than cost savings
You need deterministic model selection for testing or compliance

Performance

Latency: ~180ms average
Parallel: Can run in parallel with other work
HTTP/2: Connection reuse for subsequent calls
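Running the router call in parallel with other work can hide most of its latency. The sketch below uses a thread pool, which suits I/O-bound calls; `route` and `prepare` are hypothetical callables you supply (for example, the router request and a context lookup).

```python
from concurrent.futures import ThreadPoolExecutor


def route_while_preparing(user_input, route, prepare):
    """Run route(user_input) and prepare(user_input) concurrently; return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        model_future = pool.submit(route, user_input)
        context_future = pool.submit(prepare, user_input)
        # result() blocks until each task finishes and re-raises its exception, if any.
        return model_future.result(), context_future.result()
```

The total wait is then roughly the slower of the two tasks rather than their sum.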

Deprecated endpoints

/v1/router/raw and /v1/router/{provider} are superseded by /v1/router/classify and /v1/router/multimodel. They remain fully supported for backward compatibility — existing integrations keep working with no changes — but new code should use the endpoints above. The provider endpoints will be removed in a future release.

/router/raw

Returns just a difficulty label. Use /v1/router/classify instead for new code.

cURL
Python
TypeScript SDK

curl -s -X POST "https://api.morphllm.com/v1/router/raw" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Add error handling to this function",
    "mode": "balanced"
  }'

Returns: { "difficulty": "easy", "confidence": 0.93 }

import requests

MORPH_API_KEY = "YOUR_API_KEY"

resp = requests.post(
    "https://api.morphllm.com/v1/router/raw",
    headers={
        "Authorization": f"Bearer {MORPH_API_KEY}",
        "Content-Type": "application/json",
    },
    json={"input": "Add error handling to this function", "mode": "balanced"},
)
difficulty = resp.json()["difficulty"]

import { MorphClient } from '@morphllm/morphsdk';

const morph = new MorphClient({ apiKey: "YOUR_API_KEY" });

const { difficulty } = await morph.routers.raw.classify({
  input: 'Add error handling to this function',
  mode: 'balanced', // 'balanced' (default) | 'aggressive'
});

Modes: balanced (default) balances cost and quality; aggressive optimizes harder for cost, pushing more prompts to easy.

Returns difficulty (easy | medium | hard | needs_info).

For edge environments (Cloudflare Workers, Vercel Edge, Deno), use @morphllm/morphsdk/edge:

import { RawRouter } from '@morphllm/morphsdk/edge';

export default {
  async fetch(request: Request, env: Env) {
    const { input } = await request.json();
    const router = new RawRouter({ apiKey: env.MORPH_API_KEY });
    const { difficulty } = await router.classify({ input });
    return Response.json({ difficulty });
  }
};

/router/{provider}

Returns a provider-specific model name directly instead of a difficulty label, for openai, anthropic, and gemini. Use /v1/router/multimodel instead — it does the same model selection with control over the candidate set and policy. Under the hood these now call the multimodel router constrained to that provider, so they keep working with no changes on your side.

curl -s -X POST "https://api.morphllm.com/v1/router/anthropic" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "your task", "mode": "balanced"}'

Returns: { "model": "claude-haiku-4-5-20251001", "confidence": 0.93 }

The SDK still exposes morph.routers.anthropic.selectModel(), morph.routers.openai.selectModel(), and morph.routers.gemini.selectModel() for backwards compatibility. Migrate to /v1/router/multimodel.
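For migration, a legacy provider request can be translated into a multimodel request by constraining the provider. The sketch below is an assumption-laden illustration: `to_multimodel_request` is a hypothetical helper, and because this page does not document how legacy modes map to policies, it drops `mode` and relies on the default policy.

```python
LEGACY_PROVIDERS = {"openai", "anthropic", "gemini"}


def to_multimodel_request(provider: str, legacy_body: dict) -> dict:
    """Translate a /v1/router/{provider} body into a /v1/router/multimodel body."""
    if provider not in LEGACY_PROVIDERS:
        raise ValueError(f"no legacy endpoint for provider: {provider}")
    # "mode" is dropped: how legacy modes map to policies is not documented,
    # so the request falls back to the default "balanced" policy.
    return {"input": legacy_body["input"], "allowed_providers": [provider]}
```

The multimodel response still carries `model`, so code that only reads that field should need no further changes.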
