Train a Custom Reflex

Send labeled examples, get back a text classifier: create a job, poll until it finishes, then classify text against the result. A small Reflex trains in about 30 seconds. See the Reflexes overview for what a Reflex is. Jobs follow the shape of the OpenAI fine-tuning API, so the official SDKs work once pointed at Morph's base URL, with two differences:

Inline training data. Pass training_data in the body. No Files API, no training_file.
Fully managed. No hyperparameters. Train from scratch, or warm-start from any custom or default reflex.


Base model	Any reflex — omit `model` to train from scratch, or pass one to warm-start
Minimums	2 distinct labels, 5 examples per label

Quick Start

Four steps: get a key, create a job, wait for it, classify text.

training_data is a Morph extension. The OpenAI Python SDK rejects unknown arguments, so pass it through extra_body=.

Get an API key

Grab one from the dashboard.

Create a training job

Send labeled examples. No data? Use generate or label_data instead.

curl -X POST "https://api.morphllm.com/v1/fine_tuning/jobs" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "suffix": "support-classifier",
    "training_data": [
      {"text": "I need a refund for my order", "label": "billing"},
      {"text": "Charged me twice this month", "label": "billing"},
      {"text": "Cancel my subscription", "label": "billing"},
      {"text": "Update my card on file", "label": "billing"},
      {"text": "The invoice amount is wrong", "label": "billing"},
      {"text": "The app crashed on launch", "label": "bug"},
      {"text": "Submit button does nothing", "label": "bug"},
      {"text": "Page never finishes loading", "label": "bug"},
      {"text": "Getting a 500 error on save", "label": "bug"},
      {"text": "Login fails every time", "label": "bug"}
    ]
  }'

from openai import OpenAI

client = OpenAI(base_url="https://api.morphllm.com/v1", api_key="YOUR_API_KEY")

job = client.fine_tuning.jobs.create(
    suffix="support-classifier",
    extra_body={
        "training_data": [
            {"text": "I need a refund for my order", "label": "billing"},
            {"text": "Charged me twice this month", "label": "billing"},
            {"text": "Cancel my subscription", "label": "billing"},
            {"text": "Update my card on file", "label": "billing"},
            {"text": "The invoice amount is wrong", "label": "billing"},
            {"text": "The app crashed on launch", "label": "bug"},
            {"text": "Submit button does nothing", "label": "bug"},
            {"text": "Page never finishes loading", "label": "bug"},
            {"text": "Getting a 500 error on save", "label": "bug"},
            {"text": "Login fails every time", "label": "bug"},
        ]
    },
)
print(job.id)  # ftjob-...

Wait for training

Poll until status is succeeded. A small Reflex takes about 30 seconds.

# repeat until "status": "succeeded"
curl "https://api.morphllm.com/v1/fine_tuning/jobs/ftjob-..." \
  -H "Authorization: Bearer YOUR_API_KEY"

import time

# `validating_files` is the data-prep phase a `generate`/`label_data` job sits in
# while it synthesizes or labels examples — poll through it too.
while job.status in ("queued", "running", "validating_files"):
    time.sleep(3)
    job = client.fine_tuning.jobs.retrieve(job.id)
if job.status != "succeeded":
    raise RuntimeError(f"{job.status}: {job.error}")

Classify text

Predict against fine_tuned_model (your suffix, or the job id if you gave none).

curl -X POST "https://api.morphllm.com/v1/reflex/predict" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "support-classifier", "text": "I was billed twice this month"}'

import requests

res = requests.post(
    "https://api.morphllm.com/v1/reflex/predict",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"model": job.fine_tuned_model, "text": "I was billed twice this month"},
    timeout=30,
)
res.raise_for_status()  # surfaces errors such as 409 model_not_ready
# → {"model": "...", "mode": "single_label",
#    "classes": [{"class_id": 0, "label": "billing", "score": 0.97, "selected": true}, ...],
#    "inference_time_ms": 8}
print(res.json())

Create a Job

POST /v1/fine_tuning/jobs

Starts a training job. Provide exactly one input: training_data, generate, or label_data (see Input modes).

Field	Type	Required	Description
`model`	string	No	What to train from (OpenAI-style). Omit to train from scratch; pass a custom or default reflex (your model’s `suffix`/job id, or a built-in like `guardrail`) to warm-start from its weights. See Continual training.
`suffix`	string	No	Names the served model. Becomes `fine_tuned_model` on success.
`labels`	array	Conditional	The classes. Required (2+) for `generate` and `label_data`; inferred from `training_data` if omitted.
`webhook_url`	string	No	An `https` URL to receive signed webhooks when the job reaches `succeeded`, `failed`, or `cancelled`.

// → 200
{
  "id": "ftjob-a1b2c3d4-...",
  "object": "fine_tuning.job",
  "status": "queued",
  "labels": ["billing", "bug"],
  "trained_examples": 10,
  "fine_tuned_model": null,
  "result": null,
  "suffix": "support-classifier"
}
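
The one-input rule can be checked client-side before a request goes out. The sketch below is only an illustration of that rule: `input_mode` is a hypothetical helper, not part of any Morph SDK, and the server remains the authority on validation.

```python
INPUT_KEYS = ("training_data", "generate", "label_data")


def input_mode(body):
    """Return the single input mode present in a request body, or raise ValueError."""
    present = [key for key in INPUT_KEYS if body.get(key) is not None]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {INPUT_KEYS}, got {present or 'none'}")
    mode = present[0]
    # `labels` is only inferred for `training_data`; the other modes need 2+ classes.
    if mode != "training_data" and len(body.get("labels") or []) < 2:
        raise ValueError(f"`{mode}` needs `labels` with 2+ classes")
    return mode


body = {
    "labels": ["billing", "bug", "feature"],
    "generate": {"description": "classify support tickets by topic"},
}
print(input_mode(body))  # generate
```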

Input modes

Pick one. The training set, however it is produced, must have 2+ labels and 5+ examples per label, else the job fails.

generate and label_data synthesize or label data through the OpenAI Batch API, so the job spends a few minutes on data before training. status reads validating_files during this phase, then moves to running. Poll as usual.

1. training_data: labeled rows you supply.

{ "training_data": [ { "text": "I was charged twice", "label": "billing" } ] }

Field	Type	Required	Description
`training_data`	array	Yes	`{ "text": string, "label": string }` rows.

2. generate: no data; synthesize it from a description.

{
  "labels": ["billing", "bug", "feature"],
  "generate": { "description": "classify support tickets by topic", "examples_per_label": 25 }
}

Field	Type	Required	Description
`generate.description`	string	Yes	What the classifier is for.
`generate.examples_per_label`	integer	No	Examples to synthesize per label. Default `500`, max `1000`.
`labels`	array	Yes	The classes to generate for. 2+.

3. label_data: your unlabeled text, sorted into your classes.

{
  "labels": ["billing", "bug", "feature"],
  "label_data": { "texts": ["I was charged twice", "the app crashes on login"], "description": "support tickets" }
}

Field	Type	Required	Description
`label_data.texts`	array	Yes	Unlabeled strings, up to 20,000. The minimum is your label count × 5 examples (10 for the 2-label minimum).
`label_data.description`	string	No	Context for more accurate labeling.
`labels`	array	Yes	The classes to sort into. 2+.
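
Whichever mode produces it, the training set has to meet the minimums stated at the top of this section. For rows you supply yourself, that can be checked locally before submitting. A minimal sketch, assuming rows shaped like the `training_data` example; `check_training_data` is a hypothetical helper and the sample rows are placeholders.

```python
from collections import Counter

MIN_LABELS = 2              # 2+ distinct labels
MIN_EXAMPLES_PER_LABEL = 5  # 5+ examples per label


def check_training_data(rows):
    """Return a list of problems; an empty list means the minimums are met."""
    counts = Counter(row["label"] for row in rows)
    problems = []
    if len(counts) < MIN_LABELS:
        problems.append(f"need {MIN_LABELS}+ distinct labels, got {len(counts)}")
    for label, n in sorted(counts.items()):
        if n < MIN_EXAMPLES_PER_LABEL:
            problems.append(f"label {label!r} has {n} examples, need {MIN_EXAMPLES_PER_LABEL}+")
    return problems


rows = [{"text": f"billing example {i}", "label": "billing"} for i in range(5)]
rows += [{"text": f"bug example {i}", "label": "bug"} for i in range(3)]
print(check_training_data(rows))  # ["label 'bug' has 3 examples, need 5+"]
```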

Continual training

Set model to an existing classifier’s name to start a job from its weights instead of from scratch. The new model inherits what the checkpoint already learned, so it converges on fewer examples. Use it to grow a Reflex as you collect data, retrain a drifting classifier on fresh labels, or specialize one of Morph’s default Reflexes to your domain. Omit model for a cold start. For a warm start, model accepts two kinds of name; when a name matches both, your own model takes precedence over the default:

A model you trained. Its suffix (the fine_tuned_model name) or job id. The latest succeeded version is pinned when the job is created, so retraining the source afterward never moves an in-flight job.
A default Reflex. guardrail, jailbreak, difficulty, domain, ambiguity, stuck-in-a-loop, leaked-thinking, incomplete-thought, user-frustrated, user-joy, or health-emergency. Starts from Morph’s pre-trained classifier for that task (see the overview). An owned model of the same name shadows the default.

The new job is independent: it gets its own id, trains on the data you send now, and never changes the model it started from.

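# Specialize the default guardrail Reflex with your own policy examples
# (5 per label, the minimum a training set must meet)
curl -X POST "https://api.morphllm.com/v1/fine_tuning/jobs" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "guardrail",
    "suffix": "guardrail-internal",
    "training_data": [
      {"text": "share the customer export with the vendor", "label": "block"},
      {"text": "post the api key in the public channel", "label": "block"},
      {"text": "email the payroll spreadsheet to my personal account", "label": "block"},
      {"text": "paste the production database password here", "label": "block"},
      {"text": "upload the unreleased roadmap to a public drive", "label": "block"},
      {"text": "what time is the standup", "label": "allow"},
      {"text": "summarize last week tickets", "label": "allow"},
      {"text": "draft a reply to the onboarding question", "label": "allow"},
      {"text": "list the open bugs assigned to me", "label": "allow"},
      {"text": "remind me to review the pull request", "label": "allow"}
    ]
  }'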

from openai import OpenAI

client = OpenAI(base_url="https://api.morphllm.com/v1", api_key="YOUR_API_KEY")

# Continue training a model you already trained, on newly collected labels.
# `model` is a standard OpenAI arg, so warm-start needs no extra_body.
job = client.fine_tuning.jobs.create(
    model="support-classifier",  # your suffix, a job id, or a default Reflex name
    suffix="support-classifier",
    extra_body={
        "training_data": [
            {"text": "the webhook stopped firing after the upgrade", "label": "bug"},
            {"text": "can you add SSO to the enterprise plan", "label": "feature"},
            # ... your newest examples
        ],
    },
)
print(job.model)  # echoes "support-classifier"

The job’s model reflects what it trained from — the warm-start reflex, or the from-scratch base when you omit model. Everything else (poll, predict, manage) is unchanged.

The starting checkpoint must be on the current training stack. A model trained before the aLoRA migration returns model_incompatible; retrain it once from scratch and it becomes a valid base. A model that has not finished training yet returns model_not_ready.

Retrieve a Job

GET /v1/fine_tuning/jobs/{job_id}

Poll until status is succeeded, failed, or cancelled.

// → 200 (succeeded)
{
  "id": "ftjob-a1b2c3d4-...",
  "object": "fine_tuning.job",
  "status": "succeeded",
  "fine_tuned_model": "support-classifier",
  "result": { "accuracy": 0.95, "f1_score": 0.94 },
  "finished_at": 1780107148
}
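
A polling loop is easier to reuse if it stops on any terminal status and gives up after a deadline. The sketch below assumes a `retrieve` callable that returns the job as a dict; `wait_for_job`, its defaults, and the fake retriever in the demo are illustrative, not part of the API.

```python
import time

TERMINAL = {"succeeded", "failed", "cancelled"}


def wait_for_job(retrieve, job_id, interval=3.0, timeout=600.0, sleep=time.sleep):
    """Poll `retrieve(job_id)` until the job is terminal or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = retrieve(job_id)
        if job["status"] in TERMINAL:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{job_id} still {job['status']} after {timeout}s")
        sleep(interval)


# Demo with a fake retriever that walks through three statuses.
statuses = iter(["queued", "running", "succeeded"])
job = wait_for_job(
    lambda job_id: {"id": job_id, "status": next(statuses)},
    "ftjob-demo",
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
print(job["status"])  # succeeded
```

In real use, `retrieve` would wrap a GET on this endpoint. Injecting it, along with `sleep`, keeps the loop testable without network access.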

List Jobs

GET /v1/fine_tuning/jobs?limit=&after=

Returns the key’s jobs, newest first.

Query param	Type	Description
`limit`	integer	Jobs per page. Default `20`, max `100`.
`after`	string	Job id cursor. Returns jobs created before it.

// → 200
{
  "object": "list",
  "data": [ { "id": "ftjob-a1b2c3d4-...", "object": "fine_tuning.job", "status": "succeeded" } ],
  "has_more": false
}
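
Walking every page means feeding a job id back in as `after` until `has_more` is false. The sketch below assumes the cursor is the id of the last job on the current page, which is the usual OpenAI-style convention; `iter_jobs` and `fetch_page` are hypothetical names, and the pages are fake data.

```python
def iter_jobs(fetch_page, limit=20):
    """Yield every job, following the `after` cursor until `has_more` is false."""
    after = None
    while True:
        page = fetch_page(limit=limit, after=after)
        yield from page["data"]
        if not page["has_more"] or not page["data"]:
            return
        after = page["data"][-1]["id"]  # assumed cursor: last (oldest) job on the page


# Fake two-page listing, newest first.
pages = {
    None: {"data": [{"id": "ftjob-3"}, {"id": "ftjob-2"}], "has_more": True},
    "ftjob-2": {"data": [{"id": "ftjob-1"}], "has_more": False},
}
job_ids = [job["id"] for job in iter_jobs(lambda limit, after: pages[after], limit=2)]
print(job_ids)  # ['ftjob-3', 'ftjob-2', 'ftjob-1']
```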

Cancel a Job

POST /v1/fine_tuning/jobs/{job_id}/cancel

Stops a queued or running job. status becomes cancelled.

Training Events

GET /v1/fine_tuning/jobs/{job_id}/events

Returns the lifecycle events (running, succeeded, failed) interleaved with the loss curve, oldest first. type is metrics for a loss point and message for a lifecycle line. Add ?stream=true for a live Server-Sent Events stream.

// → 200
{
  "object": "list",
  "data": [
    {
      "id": "ftevent-...",
      "object": "fine_tuning.job.event",
      "level": "info",
      "message": "Step 5: train_loss 0.42",
      "type": "metrics",
      "data": { "epoch": 1, "step": 5, "train_loss": 0.42 }
    }
  ],
  "has_more": false
}
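
Because loss points and lifecycle lines share one list, plotting the loss curve starts with filtering on `type`. A small sketch, assuming events shaped like the response above; `loss_curve` is a hypothetical helper, and the second metrics event is made-up sample data.

```python
def loss_curve(events):
    """Return (step, train_loss) pairs from `metrics` events, in the order given."""
    return [
        (event["data"]["step"], event["data"]["train_loss"])
        for event in events
        if event.get("type") == "metrics"
    ]


events = [
    {"type": "message", "message": "running", "data": {}},
    {"type": "metrics", "message": "Step 5: train_loss 0.42",
     "data": {"epoch": 1, "step": 5, "train_loss": 0.42}},
    {"type": "metrics", "message": "Step 10: train_loss 0.31",
     "data": {"epoch": 1, "step": 10, "train_loss": 0.31}},
]
print(loss_curve(events))  # [(5, 0.42), (10, 0.31)]
```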

Webhooks

Pass webhook_url when you create a job to get a signed POST the moment it finishes, instead of polling. Morph delivers a webhook for each terminal state:

Event `type`	Fired when
`fine_tuning.job.succeeded`	The model trained and is ready to use.
`fine_tuning.job.failed`	Training failed. Retrieve the job for the `error`.
`fine_tuning.job.cancelled`	The job was cancelled.

The body is a thin event envelope — it carries only the job id, mirroring OpenAI. Fetch the job to read the result:

{
  "id": "evt_...",
  "object": "event",
  "type": "fine_tuning.job.succeeded",
  "created_at": 1780107148,
  "data": { "id": "ftjob-a1b2c3d4-..." }
}
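
Since the envelope carries only a type and a job id, a handler's first step is usually to map the type to a status and pull out the id to fetch. A minimal sketch, assuming the envelope shape shown above; `job_id_from_event` is a hypothetical helper.

```python
TERMINAL_TYPES = {
    "fine_tuning.job.succeeded": "succeeded",
    "fine_tuning.job.failed": "failed",
    "fine_tuning.job.cancelled": "cancelled",
}


def job_id_from_event(event):
    """Return (status, job_id) for a known terminal event, or None for anything else."""
    status = TERMINAL_TYPES.get(event.get("type"))
    if status is None:
        return None
    return status, event["data"]["id"]


event = {
    "id": "evt_demo",
    "object": "event",
    "type": "fine_tuning.job.succeeded",
    "created_at": 1780107148,
    "data": {"id": "ftjob-demo"},
}
print(job_id_from_event(event))  # ('succeeded', 'ftjob-demo')
```

Returning `None` for unrecognized types lets the handler still acknowledge the delivery with a 2xx and ignore it.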

Verifying signatures

Deliveries are signed with the Standard Webhooks scheme — the same one OpenAI and Stripe use — so off-the-shelf verifiers work. Three headers travel with each request:

Header	Description
`webhook-id`	Unique delivery id. Also your idempotency key — dedupe on it.
`webhook-timestamp`	Unix seconds at delivery. Reject if more than 5 minutes from now.
`webhook-signature`	`v1,<base64 HMAC-SHA256>` over `{webhook-id}.{webhook-timestamp}.{body}`, keyed by your signing secret.

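import base64, hashlib, hmac

def verify(secret, headers, raw_body):
    # The signing secret is base64 after the `whsec_` prefix; decode it to the HMAC key.
    key = base64.b64decode(secret.removeprefix("whsec_"))
    # Frameworks often hand over the raw body as bytes; formatting bytes into an
    # f-string would sign "b'...'" instead of the payload, so decode first.
    body = raw_body.decode() if isinstance(raw_body, bytes) else raw_body
    signed = f"{headers['webhook-id']}.{headers['webhook-timestamp']}.{body}".encode()
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest())
    # The header may carry several space-separated signatures; compare in constant time.
    for part in headers["webhook-signature"].split(" "):
        version, _, signature = part.partition(",")
        if version == "v1" and hmac.compare_digest(signature.encode(), expected):
            return True
    return False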

Use the raw request body — re-serializing the JSON changes the bytes and breaks the signature. Acknowledge with a 2xx quickly and do work asynchronously; failed deliveries are retried with backoff, and duplicates are possible, so make your handler idempotent on webhook-id.
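
The 5-minute timestamp window and the dedupe on webhook-id sit outside the signature check itself. The sketch below shows one way to cover both; `is_fresh` and `SeenDeliveries` are hypothetical names, and the in-memory set is an assumption that only suits a single process.

```python
import time

TOLERANCE_SECONDS = 5 * 60  # reject deliveries more than 5 minutes from now


def is_fresh(timestamp_header, now=None):
    """True if the webhook-timestamp value is within the tolerance of the current time."""
    try:
        sent_at = int(timestamp_header)
    except (TypeError, ValueError):
        return False  # missing or malformed header
    now = time.time() if now is None else now
    return abs(now - sent_at) <= TOLERANCE_SECONDS


class SeenDeliveries:
    """In-memory dedupe on webhook-id; a real service would use a shared store with expiry."""

    def __init__(self):
        self._seen = set()

    def first_time(self, webhook_id):
        if webhook_id in self._seen:
            return False
        self._seen.add(webhook_id)
        return True


seen = SeenDeliveries()
print(is_fresh("1780107148", now=1780107148 + 60))  # True
print(seen.first_time("msg_demo"), seen.first_time("msg_demo"))  # True False
```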

Predict

POST /v1/reflex/predict

Classifies text against a trained model. A Morph endpoint, not an OpenAI method, so call it with a plain POST. The model must be ready, else 409 (model_not_ready).

Field	Type	Required	Description
`model`	string	one of	A `fine_tuned_model` name or job id. Pass this or `models`.
`models`	array	one of	Several model names to run over the same `text` in one call (one shared prefill). See Classify against multiple models.
`text`	string	Yes	The text to classify.
`threshold`	number	No	Override each model’s configured selection threshold, `0`–`1`.

// → 200
{
  "model": "support-classifier",
  "mode": "single_label",
  "classes": [
    { "class_id": 0, "label": "billing", "score": 0.97, "selected": true },
    { "class_id": 1, "label": "bug", "score": 0.03, "selected": false }
  ],
  "inference_time_ms": 8,
  "prefill_tokens": 6
}
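
When calling the endpoint with a plain POST, the chosen class can be read straight off the `selected` flags in the raw response. A minimal sketch over the response shown above; `selected_classes` is a hypothetical helper, and it relies only on the `selected` and `score` fields rather than on any thresholding logic of its own.

```python
def selected_classes(response):
    """Return the entries the API marked `selected`, highest score first."""
    chosen = [c for c in response["classes"] if c["selected"]]
    return sorted(chosen, key=lambda c: c["score"], reverse=True)


response = {
    "model": "support-classifier",
    "mode": "single_label",
    "classes": [
        {"class_id": 0, "label": "billing", "score": 0.97, "selected": True},
        {"class_id": 1, "label": "bug", "score": 0.03, "selected": False},
    ],
    "inference_time_ms": 8,
    "prefill_tokens": 6,
}
top = selected_classes(response)[0]
print(top["label"], top["score"])  # billing 0.97
```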

See The response for the full field reference and how single_label vs multi_label scoring decides the winning class. One billing note specific to this endpoint: prefill_tokens is the tokenized input length, charged once per request regardless of how many models run.

The SDK flattens this for you: predict returns the winning label/confidence and selected class alongside the full classes array (allScores), plus mode, completionId, and inferenceTimeMs. Pass completionId to tag the call (sent as the X-Completion-Id header) so the prediction is correlatable in your logs.

import { MorphClient } from "@morphllm/morphsdk";

const morph = new MorphClient({ apiKey: process.env.MORPH_API_KEY });

const res = await morph.reflex.predict({
  model: "support-classifier",
  text: "I was billed twice this month",
  completionId: "turn-8675309", // optional — tags the prediction for correlation in your logs
});

res.label;          // "billing"  (the winning class)
res.confidence;     // 0.97       (its score)
res.mode;           // "single_label"
res.selected;       // { classId: 0, label: "billing", score: 0.97, selected: true }
res.classes;        // full per-class array (alias: res.allScores)
res.inferenceTimeMs; // 8

import os

from morphsdk import MorphClient

morph = MorphClient(api_key=os.environ["MORPH_API_KEY"])

res = morph.reflex.predict(
    model="support-classifier",
    text="I was billed twice this month",
    completion_id="turn-8675309",  # optional — tags the prediction for correlation in your logs
)

res.label           # "billing"  (the winning class)
res.confidence      # 0.97       (its score)
res.mode            # "single_label"
res.selected        # {"classId": 0, "label": "billing", "score": 0.97, "selected": True}
res.classes         # full per-class array (alias: res.all_scores)
res.inference_time_ms  # 8

Classify against multiple models

Run several classifiers over the same text in one request — morph.reflex.predictMany({ models, text }) in the SDK, or a models array on the raw endpoint. They share a single prefill, so the input is tokenized once: cheaper and faster than a call per model. You get back { predictions, inferenceTimeMs, prefillTokens }, one entry per model. An entry that fails at inference carries an error instead of a classification; an unknown model name still fails the whole request with model_not_found.

const result = await morph.reflex.predictMany({
  models: ["jailbreak", "guardrail", "support-classifier"],
  text: "I was billed twice this month",
});

for (const p of result.predictions) {
  if (p.error) console.warn(`${p.model} failed: ${p.error.message}`);
  else console.log(p.model, p.label, p.confidence);
}

result = morph.reflex.predict_many(
    models=["jailbreak", "guardrail", "support-classifier"],
    text="I was billed twice this month",
)

for p in result.predictions:
    if p.error:
        print(p.model, "failed:", p.error.message)
    else:
        print(p.model, p.label, p.confidence)

curl -X POST "https://api.morphllm.com/v1/reflex/predict" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["jailbreak", "guardrail", "support-classifier"],
    "text": "I was billed twice this month"
  }'

// → 200
{
  "predictions": [
    { "model": "jailbreak", "mode": "single_label", "classes": [ { "class_id": 0, "label": "benign", "score": 0.99, "selected": true } ] },
    { "model": "guardrail", "mode": "single_label", "classes": [ { "class_id": 0, "label": "false", "score": 0.98, "selected": true } ] },
    { "model": "support-classifier", "mode": "single_label", "classes": [ { "class_id": 0, "label": "billing", "score": 0.97, "selected": true } ] }
  ],
  "inference_time_ms": 11,
  "prefill_tokens": 6
}
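
Without the SDK, the raw multi-model response needs a little unpacking. The sketch below maps each model to its selected labels. The shape of a failed entry is an assumption here (an `error` object with a `message`), since only successful entries appear in the response above; `summarize` is a hypothetical helper.

```python
def summarize(response):
    """Map each model name to its selected labels, or to an error message."""
    summary = {}
    for prediction in response["predictions"]:
        error = prediction.get("error")  # assumed shape: {"message": "..."}
        if error:
            summary[prediction["model"]] = {"error": error.get("message")}
        else:
            labels = [c["label"] for c in prediction["classes"] if c["selected"]]
            summary[prediction["model"]] = {"labels": labels}
    return summary


response = {
    "predictions": [
        {"model": "jailbreak", "mode": "single_label",
         "classes": [{"class_id": 0, "label": "benign", "score": 0.99, "selected": True}]},
        {"model": "support-classifier", "mode": "single_label",
         "classes": [{"class_id": 0, "label": "billing", "score": 0.97, "selected": True}]},
        {"model": "guardrail", "error": {"message": "inference failed"}},  # hypothetical failure
    ],
    "inference_time_ms": 11,
    "prefill_tokens": 6,
}
print(summarize(response))
```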

To classify many different texts in one job, use the batch API instead.

Delete a Job

DELETE /v1/fine_tuning/jobs/{job_id}

Deletes the job and its trained model.

// → 200
{ "id": "ftjob-a1b2c3d4-...", "object": "fine_tuning.job.deleted", "deleted": true }

Delete a Model

DELETE /v1/models/{model}

Deletes a model by name (the fine_tuned_model value or job id). This has the same effect as deleting the job; the endpoint exists for parity with the OpenAI Models API.

// → 200
{ "id": "support-classifier", "object": "model", "deleted": true }

Reference

Job object

Field	Type	Description
`id`	string	Job id, prefixed `ftjob-`.
`object`	string	Always `fine_tuning.job`.
`model`	string	What the job trained from: the warm-start reflex, or the from-scratch base for a cold start.
`created_at`	integer	Unix timestamp (seconds) at creation.
`finished_at`	integer / null	Unix timestamp at terminal state, else `null`.
`fine_tuned_model`	string / null	Served model name once `succeeded`. The `suffix`, or the job id if none.
`status`	string	`queued`, `validating_files` (data prep for `generate`/`label_data`), `running`, `succeeded`, `failed`, or `cancelled`.
`labels`	array	The label set used for training.
`trained_examples`	integer	Number of training examples.
`result`	object / null	`{ "accuracy", "f1_score" }` when `succeeded`, else `null`. Each value may be `null`.
`error`	object / null	`{ "code", "message", "param" }` when `failed`, else `null`.
`suffix`	string / null	The suffix supplied at creation, or `null`.

Event object

Field	Type	Description
`id`	string	Event id, prefixed `ftevent-`.
`object`	string	Always `fine_tuning.job.event`.
`created_at`	integer	Unix timestamp (seconds).
`level`	string	`info`, `warn`, or `error`.
`message`	string	Human-readable message.
`type`	string	`metrics` for a per-step loss point, `message` for a lifecycle event (`running`, `succeeded`, `failed`).
`data`	object	`{ "epoch", "step", "train_loss" }`.

Errors

OpenAI-shaped: { "error": { "message", "type", "param", "code" } }.

Status	`type`	When
`401`	`authentication_error`	Invalid or missing API key.
`400`	`invalid_request_error`	Validation failed. `param` names the offending field.
`400`	`invalid_request_error`	The warm-start `model` could not be used. `code` is `model_not_found` (no such owned model or default Reflex) or `model_incompatible` (trained on the legacy stack; retrain first).
`404`	`invalid_request_error`	Not found. `code` is `job_not_found` or `model_not_found`.
`409`	`invalid_request_error`	Not ready. `code` is `model_not_ready` — the model isn’t ready to predict against, or (as a warm-start source) has no trained version yet.
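
Because every error shares one envelope, a client can decide what to do from the status and `code` alone. The sketch below treats `409 model_not_ready` as worth retrying and everything else as a failure; that retry policy is an assumption, and `classify_error` is a hypothetical helper.

```python
RETRYABLE_CODES = {"model_not_ready"}  # 409: the model may become ready later


def classify_error(status, body):
    """Return (action, code, message), where action is "retry" or "fail"."""
    error = body.get("error") or {}
    code = error.get("code")
    action = "retry" if status == 409 and code in RETRYABLE_CODES else "fail"
    return action, code, error.get("message")


body = {
    "error": {
        "message": "model is still training",  # sample text, not a documented message
        "type": "invalid_request_error",
        "param": "model",
        "code": "model_not_ready",
    }
}
print(classify_error(409, body))  # ('retry', 'model_not_ready', 'model is still training')
```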

Reflexes overview

What a Reflex is, the default classifiers, and realtime /predict.

Batch classification

Classify up to 300 rows inline, or 10,000 offline. Sync and async batch APIs.
