Lab Workspace

Training Jobs

List, inspect, and cancel asynchronous training runs on the HybrIE runtime.

List jobs

GET /v1/train/jobs

Returns every asynchronous job known to the runtime: PEFT LoRA RL runs (rl-grpo), supervised PEFT LoRA fine-tuning (sft-peft-lora), Doc-to-LoRA hypernetwork training (sft-d2l), model staging, and worker bootstrap jobs.
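The list request can be wrapped in a small shell helper. This is a minimal sketch, assuming the runtime listens on http://localhost:8080 as in the curl example on this page. `HYBRIE_URL` and `list_jobs` are hypothetical names introduced for illustration, and because the response schema is not described here, the body is printed unparsed.

```shell
# Base URL of the runtime. The default is an assumption taken from the
# curl example on this page; override HYBRIE_URL for other hosts.
HYBRIE_URL="${HYBRIE_URL:-http://localhost:8080}"

# list_jobs (hypothetical helper): fetch every job known to the runtime
# and print the raw response body. --fail makes curl exit non-zero on an
# HTTP error status instead of printing the error body.
list_jobs() {
  curl --silent --show-error --fail "${HYBRIE_URL}/v1/train/jobs"
}
```

Call it as `list_jobs` after sourcing the snippet.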

Get a job

GET /v1/train/jobs/:id

Returns the job's status and its progress points: a series of step, loss, and reward samples that you can plot to watch a run converge. For SFT PEFT-LoRA jobs (sft-peft-lora), the reward field carries the cross-entropy term rather than an environment reward.

curl
curl http://localhost:8080/v1/train/jobs/train-1718102400
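To watch a run converge, the same request can be repeated on a timer. The sketch below refetches a job a fixed number of times and prints each raw response. `watch_job` and `HYBRIE_URL` are hypothetical names, the interval and poll count defaults are arbitrary, and the loop does not inspect the status field because its possible values are not listed here.

```shell
# Base URL of the runtime (assumed default, taken from the example above).
HYBRIE_URL="${HYBRIE_URL:-http://localhost:8080}"

# watch_job JOB_ID [INTERVAL_SECONDS] [MAX_POLLS]  (hypothetical helper)
# Refetches the job MAX_POLLS times and prints each raw response on its
# own line. Stops early if a request fails.
watch_job() {
  job_id="$1"
  interval="${2:-10}"
  max_polls="${3:-6}"
  i=0
  while [ "$i" -lt "$max_polls" ]; do
    curl --silent --show-error --fail "${HYBRIE_URL}/v1/train/jobs/${job_id}" || return 1
    echo
    i=$((i + 1))
    # Sleep between polls, but not after the last one.
    if [ "$i" -lt "$max_polls" ]; then
      sleep "$interval"
    fi
  done
}
```

For example, `watch_job train-1718102400 30 20` polls every 30 seconds for up to 20 requests.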

Cancel a job

DELETE /v1/train/jobs/:id

Stops the run and keeps any checkpoints already written, including partial ones. Cancellation applies to RL jobs (rl-grpo) and SFT PEFT-LoRA jobs (sft-peft-lora). Doc-to-LoRA hypernetwork jobs (sft-d2l) still run to completion, and the runtime responds with an error if you try to cancel one.
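A script can check the job kind before sending the DELETE, so that it never triggers the error for a non-cancellable job. This is a sketch under stated assumptions: `is_cancellable`, `cancel_job`, and `HYBRIE_URL` are hypothetical names, the caller is assumed to already know the job kind, and any kind not named as cancellable on this page is conservatively skipped.

```shell
# Base URL of the runtime (assumed default, taken from the curl example
# on this page).
HYBRIE_URL="${HYBRIE_URL:-http://localhost:8080}"

# is_cancellable JOB_KIND  (hypothetical helper)
# Succeeds only for the kinds this page names as cancellable.
is_cancellable() {
  case "$1" in
    rl-grpo|sft-peft-lora) return 0 ;;
    *) return 1 ;;
  esac
}

# cancel_job JOB_ID JOB_KIND  (hypothetical helper)
# Sends the DELETE only when the kind is cancellable; otherwise prints a
# note to stderr and returns non-zero without contacting the runtime.
cancel_job() {
  if ! is_cancellable "$2"; then
    echo "job kind '$2' is not cancellable; skipping DELETE" >&2
    return 1
  fi
  curl --silent --show-error --fail --request DELETE \
    "${HYBRIE_URL}/v1/train/jobs/$1"
}
```

For example, `cancel_job train-1718102400 rl-grpo` sends the request, while the same call with `sft-d2l` returns early.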

From the CLI

bash
stimulir lab jobs list
stimulir lab jobs get <job-id>
stimulir lab jobs cancel <job-id>