Skip to main content

Reflex status flow (read this before touching status code)

Reflex follows OpenAI fine-tuning conventions, so a job’s internal status is not the same as the OpenAI-compatible value the API returns — and there are two rows per workspace. Those two facts cause ~all of the status confusion. Keep both in your head.

Two rows per workspace

RowWhat it isIts status is…
Chat rowthe user-facing workspace (reflex_chats, has a title, source='dashboard')the dashboard status the UI renders
Job rowthe spawned fine-tuning job — also a reflex_chats row (title=NULL, source='api'), referenced by the chat’s latestJobIdthe backend/internal status
The UI only ever renders the chat row. So a status like prepared shows up only on a job row — the user never sees it. (Gotcha: reflex_training_queue.chat_id stores the job uuid, not the chat id.)

Three status vocabularies

  • Dashboard (chat row, what the UI keys off): chatting · generating · queued · training · ready · error
  • Internal / backend (job row): queued · preparing · generating · labeling · prepared · training · ready · error · stopped
  • OpenAI-compatible (what GET /v1/fine_tuning/jobs/{id} returns): validating_files · queued · running · succeeded · failed · cancelled

The intended lifecycle

chatting → generating → pending-approval → queued → training → ready
#StepDashboardInternal (job)OpenAI-compatWhat the user sees
1chattingchattingthe conversation / building the dataset
2generatinggeneratingpreparing / generating / labelingvalidating_files”Generating… / Relabeling…” progress
3pending approvalchatting ⚠️preparedqueuedthe review grid — approve / edit the rows
4queuedqueuedqueuedqueued”queued to train”
5trainingtrainingtrainingrunningtraining progress
6readyreadyreadysucceededmodel card / playground
error / stopped(→cancelled) are terminal off-paths from any active step.

The two lossy collapses (this is the trap)

tab/reflex api/wire.py::map_reflex_status maps internal → OpenAI-compatible, and it’s lossy:
  • preparing / generating / labelingvalidating_files (can’t tell which)
  • preparedqueued (so step 3 and step 4 both read queued)
  • training → running, ready → succeeded, error → failed, stopped → cancelled
Because OpenAI queued means either “prepared / pending approval” (step 3) or “queued to train” (step 4), the dashboard can’t trust the raw OpenAI value. The seam that disambiguates is src/lib/reflex/job-status.ts::resolveJobChatStatus:
OpenAI queued + no active training-queue row + review-gated (auto_train:false, not approved) + getJobDataset has rows ⇒ step 3 (prepared) ⇒ dashboard chatting (review grid). An active queue row ⇒ step 4 (queued/training).
Every status writer must go through resolveJobChatStatus, not jobStatusToChatStatus directly — bypassing it caused the #390 and #418 regressions.

Known smell: “pending approval” is not a first-class status

Step 3 isn’t its own dashboard status — it’s inferred as chatting + an unapproved prepared dataset. So chatting is overloaded: it means both “just talking” and “review-ready.” This works today (don’t rush to change it), but it’s why the review step is easy to mistake for “still going.” If we ever make it first-class, the seam is already there: check_status returns an explicit phase: 'awaiting_review'.

label_data (relabel) notes

  • Teacher-labeling runs through the OpenAI Batch API (completion_window: "24h", api/openai_batch.py) — minutes to hours, no synchronous path. That’s why fresh relabels are slow to test.
  • A relabel reaches step 3 correctly (chat chatting, job prepared). The review grid for relabel is wired via LabelDataCard (it was previously only wired for generate_data / map_upload), and check_status reports a clear phase instead of leaking the raw OpenAI queued + phantom epochs.