Getting Started
Stimulir Documentation
Stimulir is an adaptive AI stack for self-improving workflows. The console gives every team three workspaces — Engineering, Lab, and Compute — backed by the HybrIE runtime.
Latest release
v0.1.170 publishes the Python SDK import surface and keeps prompt, data, and eval paths scriptable.
The current console release makes from stimulir import StimulirClient work from application code, keeps CLI installs separate from SDK dependencies, and connects curated assets and prompt versions to Lab evaluation runs before promotion.
Two API surfaces
Stimulir exposes two complementary APIs:
- Console platform API —
https://api.stimulir.com. Manages workspaces, API keys, BYOK credentials, usage, and billing, and serves OpenAI-compatible inference at/api/v1/inference/chat/completions. Platform endpoints authenticate with your session token; inference authenticates withhyb_*API keys. See Authentication & Workspaces. - HybrIE runtime API — the engine the Lab and Compute workspaces control. It serves an OpenAI-compatible HTTP API on port
8080(gRPC on9090) with local inference (Qwen3 / Qwen3-Coder via Candle on Metal or CUDA), training, evaluation, adapters, and compute orchestration. In BYOC deployments you run this runtime on your own nodes.
Latest capabilities
Prompt management
Create versions, read by key and label, update metadata, archive versions, and preserve lineage for evals.
Data asset hub
Upload, ingest from traces, update metadata, bulk-stage, unstage, snapshot, and hand off curated assets to Lab.
Lab evaluation runs
Run prompt, data, endpoint, adapter, RL, and NIAH evaluations with durable run records and reports.
Python SDK
Add the SDK with uv and use StimulirClient for prompts, data assets, eval runs, and capabilities.
CLI paths
Run prompt, data, inference, training, adapter, eval, and compute workflows from the terminal.
Explore the docs
Engineering Workspace
OpenAI-compatible inference, API keys, BYOK credentials, usage metering, prompt versions, and data assets.
Lab Workspace
PEFT LoRA training, Doc-to-LoRA context internalization, durable eval runs, and hot-swap adapter serving.
Compute Workspace
GPU offers and instances, worker registration, and edge deployment across local, hybrid, and P2P modes.
CLI
Install the terminal command with uv tool, then run prompt, data, eval, inference, and compute workflows.
Next steps
- Follow the Quickstart to install the CLI, create an API key, and make your first inference call.
- Use the Python SDK or prompts and data assets commands to seed client prompts, ingest traces, stage datasets, and create Lab-ready snapshots.
- Run Lab evals against prompt versions, staged data, inference endpoints, adapters, and RL policies before promotion.
