Getting Started

Stimulir Documentation

Stimulir is an adaptive AI stack for self-improving workflows. The console gives every team three workspaces — Engineering, Lab, and Compute — backed by the HybrIE runtime.

Latest release

v0.1.170 publishes the Python SDK import surface and keeps prompt, data, and eval paths scriptable.

The current console release makes from stimulir import StimulirClient work from application code, keeps CLI installs separate from SDK dependencies, and connects curated assets and prompt versions to Lab evaluation runs before promotion.

Two API surfaces

Stimulir exposes two complementary APIs:

Console platform API — https://api.stimulir.com. Manages workspaces, API keys, BYOK credentials, usage, and billing, and serves OpenAI-compatible inference at /api/v1/inference/chat/completions. Platform endpoints authenticate with your session token; inference authenticates with hyb_* API keys. See Authentication & Workspaces.
HybrIE runtime API — the engine the Lab and Compute workspaces control. It serves an OpenAI-compatible HTTP API on port 8080 (gRPC on 9090) with local inference (Qwen3 / Qwen3-Coder via Candle on Metal or CUDA), training, evaluation, adapters, and compute orchestration. In BYOC deployments you run this runtime on your own nodes.

Latest capabilities

Prompt management

Create versions, read by key and label, update metadata, archive versions, and preserve lineage for evals.

Data asset hub

Upload, ingest from traces, update metadata, bulk-stage, unstage, snapshot, and hand off curated assets to Lab.

Lab evaluation runs

Run prompt, data, endpoint, adapter, RL, and NIAH evaluations with durable run records and reports.

Python SDK

Add the SDK with uv and use StimulirClient for prompts, data assets, eval runs, and capabilities.

CLI paths

Run prompt, data, inference, training, adapter, eval, and compute workflows from the terminal.

Explore the docs

Engineering Workspace

OpenAI-compatible inference, API keys, BYOK credentials, usage metering, prompt versions, and data assets.

Lab Workspace

PEFT LoRA training, Doc-to-LoRA context internalization, durable eval runs, and hot-swap adapter serving.

Compute Workspace

GPU offers and instances, worker registration, and edge deployment across local, hybrid, and P2P modes.

CLI

Install the terminal command with uv tool, then run prompt, data, eval, inference, and compute workflows.

Next steps

Follow the Quickstart to install the CLI, create an API key, and make your first inference call.
Use the Python SDK or prompts and data assets commands to seed client prompts, ingest traces, stage datasets, and create Lab-ready snapshots.
Run Lab evals against prompt versions, staged data, inference endpoints, adapters, and RL policies before promotion.

v0.1.170 publishes the Python SDK import surface and keeps prompt, data, and eval paths scriptable.

Two API surfaces#

Latest capabilities#

Explore the docs#

Next steps#

Two API surfaces

Latest capabilities

Explore the docs

Next steps