Motivation

Training a strong coding agent is a multi-stage pipeline: curate tasks, roll out trajectories, run SFT, then run RL. Each stage needs its own data, its own environment, and its own GPUs, and the handoffs between stages tend to break silently when someone runs them by hand.

We built SWE-Lego-Live so the whole pipeline behaves like one project. Every stage is a self-contained block with the same shape — a config.yaml, a scripts/start.sh, an artifacts/ tree — wired into a parent by name. Run one block, run the whole tree, or swap a block out; the contracts hold.

GitHub PRs ─▶ swegen ─▶ trajgen ─▶ sft ─▶ rl
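
To make the shared block shape concrete, here is a minimal sketch that checks whether a directory contains the three entries named above (config.yaml, scripts/start.sh, artifacts/). The helper name missing_block_entries is hypothetical and not part of SWE-Lego-Live. The sketch also assumes artifacts/ exists before the first run, which may not hold in practice.

```python
from pathlib import Path

# Entries every block is described as having. This check is a hypothetical
# illustration, not a SWE-Lego-Live API.
REQUIRED_FILES = ("config.yaml", "scripts/start.sh")
REQUIRED_DIRS = ("artifacts",)  # assumption: present even before a first run


def missing_block_entries(block_dir):
    """Return the required entries that are absent from block_dir."""
    root = Path(block_dir)
    missing = [f for f in REQUIRED_FILES if not (root / f).is_file()]
    missing += [d + "/" for d in REQUIRED_DIRS if not (root / d).is_dir()]
    return missing
```

An empty list means the directory has the expected shape. Anything else names what is missing, which is the kind of contract check that makes swapping a block safe.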

SWE-Lego-Live provides:

A block abstraction that makes every pipeline stage swappable, restartable, and replayable
swegen — curated SWE tasks generated and NOP/Oracle-verified directly from GitHub PRs
trajgen — containerized agent rollouts at scale on Harbor, with pluggable agent scaffolds
sft — supervised fine-tuning on the resulting trajectories (LLaMA-Factory + DeepSpeed ZeRO-3)
rl — online RL on SWE-bench through Harbor + vLLM + verl
Per-run archives with config, scripts, and repo SHAs frozen for every execution
A Claude Code plugin (/root:check, /root:run, /root:create) that drives the tree end-to-end

Where to go next

Getting Started — the three skills (:setup, :check, :run) used to drive every block
Block Design — the abstraction the whole pipeline is built on
Blocks: swegen · trajgen · sft · rl — each links out to that block's full docs site
Reference — config.yaml schema and the archive format
