Prompt for your coding agent
Read https://docs.morphllm.com/sdk/components/reflexes/custom and train a custom Reflex for our classification task. Search this codebase for where we currently classify, filter, or route text (regexes, keyword lists, LLM-as-judge calls) and collect labeled examples from that path. Create the fine-tuning job with inline training_data on base model morph-reflex-v1, poll until it finishes, then swap the old logic for POST /v1/reflex/predict. Plan first, then implement and verify accuracy on held-out examples.
Send labeled examples, get back a text classifier. Create a job, poll until it finishes, then classify text against it. A small Reflex trains in about 30 seconds. See the Reflexes overview for what a Reflex is. Jobs use the OpenAI fine-tuning API, so the official SDKs work unchanged, with two differences:
  • Inline training data. Pass training_data in the body. No Files API, no training_file.
  • Fully managed. No hyperparameters. One base model, morph-reflex-v1.
Base model: morph-reflex-v1 (default, optional on create)
Minimums: 2 distinct labels, 5 examples per label
Concurrency: 3 jobs run at once per key; the rest queue

Quick Start

Four steps: get a key, create a job, wait for it, classify text.
training_data is a Morph extension. The OpenAI Python SDK rejects unknown arguments, so pass it through extra_body=.
1. Get an API key

Grab one from the dashboard.
2. Create a training job

Send labeled examples. No data? Use generate or label_data instead.
curl -X POST "https://api.morphllm.com/v1/fine_tuning/jobs" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "morph-reflex-v1",
    "suffix": "support-classifier",
    "training_data": [
      {"text": "I need a refund for my order", "label": "billing"},
      {"text": "Charged me twice this month", "label": "billing"},
      {"text": "Cancel my subscription", "label": "billing"},
      {"text": "Update my card on file", "label": "billing"},
      {"text": "The invoice amount is wrong", "label": "billing"},
      {"text": "The app crashed on launch", "label": "bug"},
      {"text": "Submit button does nothing", "label": "bug"},
      {"text": "Page never finishes loading", "label": "bug"},
      {"text": "Getting a 500 error on save", "label": "bug"},
      {"text": "Login fails every time", "label": "bug"}
    ]
  }'
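The same request can be built from Python. This is a minimal sketch using only the standard library. `check_minimums` and `build_create_request` are hypothetical helper names, not part of any Morph SDK, and the client-side check only mirrors the documented minimums (2 distinct labels, 5 examples per label) so that an undersized set fails before it is sent.

```python
import json
import urllib.request
from collections import Counter

API_URL = "https://api.morphllm.com/v1/fine_tuning/jobs"

def check_minimums(rows, min_labels=2, min_per_label=5):
    """Raise ValueError if rows fall below the documented training minimums."""
    counts = Counter(row["label"] for row in rows)
    if len(counts) < min_labels:
        raise ValueError(f"need {min_labels}+ distinct labels, got {len(counts)}")
    short = sorted(label for label, n in counts.items() if n < min_per_label)
    if short:
        raise ValueError(f"labels below {min_per_label} examples: {short}")
    return counts

def build_create_request(rows, suffix, api_key):
    """Build (but do not send) the POST that creates a training job."""
    check_minimums(rows)
    body = {"model": "morph-reflex-v1", "suffix": suffix, "training_data": rows}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen` and decode the JSON response; the returned `id` is what the next step polls.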
3. Wait for training

Poll until status is succeeded. A small Reflex takes about 30 seconds.
# repeat until "status": "succeeded"
curl "https://api.morphllm.com/v1/fine_tuning/jobs/ftjob-..." \
  -H "Authorization: Bearer YOUR_API_KEY"
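In a script, the repeat-until step might look like the sketch below. `wait_for_job` is a hypothetical helper; `fetch_job` stands in for whatever function performs the GET above and returns the job as a dict. The 2-second interval and 10-minute timeout are arbitrary assumptions, not documented values.

```python
import time

# The three statuses after which a job never changes again.
TERMINAL = {"succeeded", "failed", "cancelled"}

def wait_for_job(fetch_job, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is any callable returning the job JSON as a dict, for example
    a thin wrapper around GET /v1/fine_tuning/jobs/{job_id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

Stopping on any terminal status, not just succeeded, keeps the loop from spinning forever on a failed or cancelled job; the caller should check `job["status"]` before predicting.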
4. Classify text

Predict against fine_tuned_model (your suffix, or the job id if you gave none).
curl -X POST "https://api.morphllm.com/v1/reflex/predict" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "support-classifier", "text": "I was billed twice this month"}'

Create a Job

POST /v1/fine_tuning/jobs
Starts a training job. Provide exactly one input: training_data, generate, or label_data (see Input modes).
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | No | morph-reflex-v1 (default and only value). |
| base_model | string | No | Start from an existing classifier’s weights instead of from scratch. A model you trained or a default Reflex name. Omit for a cold start. See Continual training. |
| suffix | string | No | Names the served model. Becomes fine_tuned_model on success. |
| labels | array | * | The classes. 2+ required for generate and label_data; inferred from training_data if omitted. |
| webhook_url | string | No | An https URL to receive signed webhooks when the job reaches succeeded, failed, or cancelled. |
// → 200
{
  "id": "ftjob-a1b2c3d4-...",
  "object": "fine_tuning.job",
  "model": "morph-reflex-v1",
  "status": "queued",
  "labels": ["billing", "bug"],
  "trained_examples": 10,
  "fine_tuned_model": null,
  "result": null,
  "suffix": "support-classifier"
}

Input modes

Pick one. The training set, however it is produced, must have 2+ labels and 5+ examples per label, else the job fails.
generate and label_data synthesize or label data through the OpenAI Batch API, so the job spends a few minutes on data before training. status stays running the whole time. Poll as usual.
1. training_data: labeled rows you supply.
{ "training_data": [ { "text": "I was charged twice", "label": "billing" } ] }
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| training_data | array | Yes | { "text": string, "label": string } rows. |
2. generate: no data; synthesize it from a description.
{
  "labels": ["billing", "bug", "feature"],
  "generate": { "description": "classify support tickets by topic", "examples_per_label": 25 }
}
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| generate.description | string | Yes | What the classifier is for. |
| generate.examples_per_label | integer | No | Examples to synthesize per label. Default 500, max 1000. |
| labels | array | Yes | The classes to generate for. 2+. |
3. label_data: your unlabeled text, sorted into your classes.
{
  "labels": ["billing", "bug", "feature"],
  "label_data": { "texts": ["I was charged twice", "the app crashes on login"], "description": "support tickets" }
}
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| label_data.texts | array | Yes | 10 to 20,000 unlabeled strings. |
| label_data.description | string | No | Context for more accurate labeling. |
| labels | array | Yes | The classes to sort into. 2+. |
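Because a job accepts exactly one input mode, it can help to check the request body before sending it. The sketch below is a hypothetical client-side guard (`check_input_mode` is not a Morph function); it encodes only the rules stated above: one of the three inputs, and an explicit set of 2+ labels for generate and label_data.

```python
INPUT_MODES = ("training_data", "generate", "label_data")

def check_input_mode(body):
    """Return the single input mode present in body, or raise ValueError."""
    present = [mode for mode in INPUT_MODES if mode in body]
    if len(present) != 1:
        raise ValueError(f"provide exactly one of {INPUT_MODES}, got {present}")
    mode = present[0]
    # generate and label_data need an explicit label set of 2+;
    # training_data may omit labels and have them inferred from the rows.
    if mode != "training_data" and len(set(body.get("labels", []))) < 2:
        raise ValueError(f"{mode} requires 2+ labels")
    return mode
```

The server remains the source of truth; a guard like this only turns an avoidable 400 into an earlier, local error.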

Continual training

Pass base_model to start a job from an existing classifier’s weights instead of from scratch. The new model inherits what the checkpoint already learned, so it converges on fewer examples. Use it to grow a Reflex as you collect data, retrain a drifting classifier on fresh labels, or specialize one of Morph’s default Reflexes to your domain. Omit base_model for a cold start.
base_model takes two kinds of name, resolved owned-first:
  • A model you trained. Its suffix (the fine_tuned_model name) or job id. The latest succeeded version is pinned when the job is created, so retraining the source afterward never moves an in-flight job.
  • A default Reflex. guardrail, jailbreak, difficulty, domain, ambiguity, stuck-in-a-loop, leaked-thinking, or incomplete-thought. Starts from Morph’s pre-trained classifier for that task (see the overview). An owned model of the same name shadows the default.
This is distinct from model, which selects the architecture (morph-reflex-v1). base_model selects the weights to warm-start from. The new job is independent: it gets its own id, trains on the data you send now, and never changes the model it started from.
# Specialize the default guardrail Reflex with your own policy examples
curl -X POST "https://api.morphllm.com/v1/fine_tuning/jobs" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "base_model": "guardrail",
    "suffix": "guardrail-internal",
    "training_data": [
      {"text": "share the customer export with the vendor", "label": "block"},
      {"text": "post the api key in the public channel", "label": "block"},
      {"text": "what time is the standup", "label": "allow"},
      {"text": "summarize last week tickets", "label": "allow"}
    ]
  }'
The job echoes the base_model you sent. Everything else (poll, predict, manage) is unchanged.
The starting checkpoint must be on the current training stack. A model trained before the aLoRA migration returns base_model_incompatible; retrain it once from scratch and it becomes a valid base. A model that has not finished training yet returns base_model_not_ready.

Retrieve a Job

GET /v1/fine_tuning/jobs/{job_id}
Poll until status is succeeded, failed, or cancelled.
// → 200 (succeeded)
{
  "id": "ftjob-a1b2c3d4-...",
  "object": "fine_tuning.job",
  "status": "succeeded",
  "fine_tuned_model": "support-classifier",
  "result": { "accuracy": 0.95, "f1_score": 0.94 },
  "finished_at": 1780107148
}

List Jobs

GET /v1/fine_tuning/jobs?limit=&after=
Returns the key’s jobs, newest first.
| Query param | Type | Description |
| --- | --- | --- |
| limit | integer | Jobs per page. Default 20, max 100. |
| after | string | Job id cursor. Returns jobs created before it. |
// → 200
{
  "object": "list",
  "data": [ { "id": "ftjob-a1b2c3d4-...", "object": "fine_tuning.job", "status": "succeeded" } ],
  "has_more": false
}
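Walking every page means feeding a job id back as `after` until `has_more` is false. A minimal sketch follows; `iter_jobs` and `fetch_page` are hypothetical names, and using the last id on each page as the next cursor is an assumption based on the `after` description, not a documented guarantee.

```python
def iter_jobs(fetch_page, limit=20):
    """Yield every job, newest first, following the `after` cursor.

    fetch_page(limit, after) is any callable that performs
    GET /v1/fine_tuning/jobs and returns the list response as a dict;
    after is None for the first page.
    """
    after = None
    while True:
        page = fetch_page(limit, after)
        jobs = page["data"]
        yield from jobs
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not jobs:
            return
        after = jobs[-1]["id"]  # assumption: the cursor is the last id on the page
```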

Cancel a Job

POST /v1/fine_tuning/jobs/{job_id}/cancel
Stops a queued or running job. status becomes cancelled.

Training Events

GET /v1/fine_tuning/jobs/{job_id}/events
Returns the lifecycle events (running, succeeded, failed) interleaved with the loss curve, oldest first. type is metrics for a loss point and message for a lifecycle line. Add ?stream=true for a live Server-Sent Events stream.
// → 200
{
  "object": "list",
  "data": [
    {
      "id": "ftevent-...",
      "object": "fine_tuning.job.event",
      "level": "info",
      "message": "Step 5: train_loss 0.42",
      "type": "metrics",
      "data": { "epoch": 1, "step": 5, "train_loss": 0.42 }
    }
  ],
  "has_more": false
}
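Since loss points and lifecycle lines arrive interleaved, a client usually splits them by `type`. The sketch below assumes the event shape shown above; `loss_curve` and `lifecycle_messages` are hypothetical helper names.

```python
def loss_curve(events):
    """Return [(step, train_loss), ...] from the metrics events, in order."""
    return [
        (event["data"]["step"], event["data"]["train_loss"])
        for event in events
        if event.get("type") == "metrics"
    ]

def lifecycle_messages(events):
    """Return the human-readable lines from the message events, in order."""
    return [event["message"] for event in events if event.get("type") == "message"]
```

Pass the `data` array of the list response to either function; because events are returned oldest first, the loss curve comes out in training order.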

Webhooks

Pass webhook_url when you create a job to get a signed POST the moment it finishes, instead of polling. Morph delivers a webhook for each terminal state:
| Event type | Fired when |
| --- | --- |
| fine_tuning.job.succeeded | The model trained and is ready to use. |
| fine_tuning.job.failed | Training failed. Retrieve the job for the error. |
| fine_tuning.job.cancelled | The job was cancelled. |
The body is a thin event envelope — it carries only the job id, mirroring OpenAI. Fetch the job to read the result:
{
  "id": "evt_...",
  "object": "event",
  "type": "fine_tuning.job.succeeded",
  "created_at": 1780107148,
  "data": { "id": "ftjob-a1b2c3d4-..." }
}

Verifying signatures

Deliveries are signed with the Standard Webhooks scheme, the same one OpenAI uses, so off-the-shelf verifiers work. Three headers travel with each request:
| Header | Description |
| --- | --- |
| webhook-id | Unique delivery id. Also your idempotency key; dedupe on it. |
| webhook-timestamp | Unix seconds at delivery. Reject if more than 5 minutes from now. |
| webhook-signature | v1,<base64 HMAC-SHA256> over {webhook-id}.{webhook-timestamp}.{body}, keyed by your signing secret. |
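import base64, hashlib, hmac, time

def verify(secret, headers, raw_body):
    # Reject deliveries timestamped more than 5 minutes from now.
    if abs(time.time() - int(headers["webhook-timestamp"])) > 300:
        return False
    # The signing secret is base64 after the `whsec_` prefix; decode it to the HMAC key.
    key = base64.b64decode(secret.removeprefix("whsec_"))
    # Sign the exact bytes received (an f-string would render bytes as "b'...'").
    body = raw_body if isinstance(raw_body, bytes) else raw_body.encode()
    signed = f"{headers['webhook-id']}.{headers['webhook-timestamp']}.".encode() + body
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest())
    # The header can hold several space-separated signatures; compare in constant time.
    parts = [p.split(",", 1) for p in headers["webhook-signature"].split(" ")]
    return any(len(p) == 2 and p[0] == "v1" and hmac.compare_digest(p[1].encode(), expected) for p in parts)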
Use the raw request body — re-serializing the JSON changes the bytes and breaks the signature. Acknowledge with a 2xx quickly and do work asynchronously; failed deliveries are retried with backoff, and duplicates are possible, so make your handler idempotent on webhook-id.
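Idempotent handling can be as small as remembering which webhook-id values have been processed. The sketch below is illustrative only: `handle_delivery` and `seen_ids` are hypothetical names, the in-memory set would be a durable store in a real service, and it assumes the signature has already been verified.

```python
import json

TERMINAL_EVENTS = {
    "fine_tuning.job.succeeded",
    "fine_tuning.job.failed",
    "fine_tuning.job.cancelled",
}

def handle_delivery(headers, raw_body, seen_ids):
    """Return the job id to fetch, or None for a duplicate or unknown event.

    Call this only after the signature check passes. seen_ids is any
    set-like store of already-processed webhook-id values.
    """
    delivery_id = headers["webhook-id"]
    if delivery_id in seen_ids:
        return None  # retried delivery that was already handled
    event = json.loads(raw_body)
    seen_ids.add(delivery_id)  # record only after the body parsed cleanly
    if event.get("type") not in TERMINAL_EVENTS:
        return None
    return event["data"]["id"]
```

The returned job id is what the handler would hand to a background worker, which then retrieves the job to read its result or error.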

Predict

POST /v1/reflex/predict
Classifies text against a trained model. A Morph endpoint, not an OpenAI method, so call it with a plain POST. The model must be ready, else 409 (model_not_ready).
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | A fine_tuned_model name or job id. |
| text | string | Yes | The text to classify. |
// → 200
{
  "model": "support-classifier",
  "mode": "single_label",
  "classes": [
    { "class_id": 0, "label": "billing", "score": 0.97, "selected": true },
    { "class_id": 1, "label": "bug", "score": 0.03, "selected": false }
  ],
  "inference_time_ms": 8
}
classes is one entry per label. mode is single_label (softmax, scores sum to 1, one selected) or multi_label (independent sigmoid scores, any number selected). The winning label is the selected class with the highest score.
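Reading the result out of a response follows directly from that rule. A minimal sketch, assuming the response shape shown above; `winning_label` and `selected_labels` are hypothetical helper names.

```python
def winning_label(response):
    """Return the label of the selected class with the highest score, or None."""
    selected = [c for c in response["classes"] if c["selected"]]
    if not selected:
        # Assumption: multi_label may select any number of classes, possibly none.
        return None
    return max(selected, key=lambda c: c["score"])["label"]

def selected_labels(response):
    """Return every selected label, highest score first (useful for multi_label)."""
    selected = [c for c in response["classes"] if c["selected"]]
    return [c["label"] for c in sorted(selected, key=lambda c: c["score"], reverse=True)]
```

For the sample response above, `winning_label` returns `"billing"`.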

Classify against multiple models

predict takes one model per request. To run several of your classifiers over the same text, make one call per model and fire them in parallel.
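import concurrent.futures
import requests

def predict(model, text):
    resp = requests.post(
        "https://api.morphllm.com/v1/reflex/predict",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        json={"model": model, "text": text},
        timeout=10,  # arbitrary value; requests has no default timeout, so a stalled call would hang
    )
    resp.raise_for_status()  # surface 4xx/5xx (e.g. 409 model_not_ready) instead of returning an error body
    return resp.json()

text = "I was billed twice this month"
models = ["billing-classifier", "urgency-classifier", "topic-classifier"]
# One request per model, run in parallel; results keep the order of `models`.
with concurrent.futures.ThreadPoolExecutor() as ex:
    results = list(ex.map(lambda m: predict(m, text), models))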
Each call is a separate request, independently subject to your per-key concurrency limit. There is no batched multi-model endpoint today.

Delete a Job

DELETE /v1/fine_tuning/jobs/{job_id}
Deletes the job and its trained model.
// → 200
{ "id": "ftjob-a1b2c3d4-...", "object": "fine_tuning.job.deleted", "deleted": true }

Delete a Model

DELETE /v1/models/{model}
Deletes a model by name (the fine_tuned_model value or job id). Same effect as deleting the job; OpenAI Models-API parity.
// → 200
{ "id": "support-classifier", "object": "model", "deleted": true }

Reference

Job object:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Job id, prefixed ftjob-. |
| object | string | Always fine_tuning.job. |
| model | string | Always morph-reflex-v1. |
| base_model | string / null | The starting checkpoint supplied at creation, or null for a cold start. |
| created_at | integer | Unix timestamp (seconds) at creation. |
| finished_at | integer / null | Unix timestamp at terminal state, else null. |
| fine_tuned_model | string / null | Served model name once succeeded. The suffix, or the job id if none. |
| status | string | queued, running, succeeded, failed, or cancelled. |
| labels | array | The label set used for training. |
| trained_examples | integer | Number of training examples. |
| result | object / null | { "accuracy", "f1_score" } when succeeded, else null. Each value may be null. |
| error | object / null | { "code", "message", "param" } when failed, else null. |
| suffix | string / null | The suffix supplied at creation, or null. |

Event object:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Event id, prefixed ftevent-. |
| object | string | Always fine_tuning.job.event. |
| created_at | integer | Unix timestamp (seconds). |
| level | string | info, warn, or error. |
| message | string | Human-readable message. |
| type | string | metrics for per-step loss, message for a lifecycle event. |
| data | object | { "epoch", "step", "train_loss" }. |
Errors are OpenAI-shaped: { "error": { "message", "type", "param", "code" } }.

| Status | type | When |
| --- | --- | --- |
| 401 | authentication_error | Invalid or missing API key. |
| 400 | invalid_request_error | Validation failed. param names the offending field. |
| 400 | invalid_request_error | base_model could not be used. code is base_model_not_found (no such owned model or default Reflex) or base_model_incompatible (trained on the legacy stack; retrain first). |
| 404 | invalid_request_error | Not found. code is job_not_found or model_not_found. |
| 409 | invalid_request_error | Not ready. code is model_not_ready, or base_model_not_ready when the starting checkpoint has no trained version yet. |
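A client can branch on `code` rather than on the message text. The sketch below parses the error envelope and flags the two not-ready codes; `parse_error` and `should_retry` are hypothetical helpers, and treating those 409s as worth retrying later is an assumption about client policy, not documented behavior.

```python
import json

# 409 codes that describe a model still in training rather than a bad request.
NOT_READY_CODES = {"model_not_ready", "base_model_not_ready"}

def parse_error(body_text):
    """Return (type, code, message, param) from an OpenAI-shaped error body."""
    error = json.loads(body_text).get("error", {})
    return error.get("type"), error.get("code"), error.get("message"), error.get("param")

def should_retry(status, body_text):
    """True when the response is a 409 carrying one of the not-ready codes."""
    _, code, _, _ = parse_error(body_text)
    return status == 409 and code in NOT_READY_CODES
```

Everything else in the table (401, 400, 404) points at the request itself, so retrying without changing it is unlikely to help.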