Getting Started
Three skills to operate any block — setup, check, run
Every block in SWE-Lego-Live is operated through the same three Claude Code skills, whether it's data curation, trajectory generation, SFT, or RL:
| Skill | What it does | Side effects |
|---|---|---|
| /<block>:setup | Install dependencies, build envs, fill default config values | Writes files; idempotent |
| /<block>:check | Validate config, inputs, environments, and live endpoints | Read-only |
| /<block>:run | Execute the block end-to-end and archive the run | Long-running; modifies artifacts |
Replace <block> with the block name — swegen, trajgen, sft, or rl. Same three verbs, every block.
A short note on blocks
A block is a self-contained directory that owns one stage of the pipeline. Every block has the same shape — config.yaml (inputs, outputs, dependencies), scripts/ (setup, dryrun, start, archive), artifacts/ (live state and per-run archives), and a CLAUDE.md that documents the agent contract. The parent block (this repo root) wires children together by name through meta_info.subblocks[].dependencies, so values flow from one block to the next without hand-copied paths.
For the full design — schema, dependency rules, archive format, the root plugin — see Block Design.
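As a rough illustration of the shared block shape, the sketch below lists which of the four entries named above are missing from a directory. The function name and the split into files versus directories are assumptions made for illustration; the real check lives in each block's own scripts.

```python
from pathlib import Path
from typing import List

# Entries every block is described as having. Treating config.yaml and
# CLAUDE.md as files and scripts/ and artifacts/ as directories is an assumption.
EXPECTED_FILES = ("config.yaml", "CLAUDE.md")
EXPECTED_DIRS = ("scripts", "artifacts")


def missing_block_entries(block_dir: Path) -> List[str]:
    """Return the expected entries that are absent from block_dir (hypothetical helper)."""
    missing = [name for name in EXPECTED_FILES if not (block_dir / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (block_dir / name).is_dir()]
    return missing
```

An empty result means the directory has the expected shape; it says nothing about whether the contents are valid.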
Prerequisites
- Claude Code with this repo's plugin loaded: `/reload-plugins` should show `1 plugin · 3 skills`.
- Docker on whichever host runs `swegen` and `trajgen` (containerized rollouts).
- 8× GPU node for `sft` and `rl`.
- An OpenAI-compatible LLM endpoint with an API key; `swegen` and `trajgen` both call it.
- GitHub token(s) with `repo` read scope, for PR collection in `swegen`.
1. Clone
git clone --recurse-submodules <repo-url> SWE-Lego-Live
cd SWE-Lego-Live
If you cloned without --recurse-submodules, run git submodule update --init --recursive.
2. /<block>:setup — install dependencies
Run setup once per block, in pipeline order:
/swegen:setup # clone repos, build envs, prepare default config
/trajgen:setup
/sft:setup
/rl:setup
Each :setup is idempotent and safe to re-run: it only fills in fields that aren't already set, and only builds environments that don't already exist. After setup, the block's config.yaml will still have a handful of runtime_info.input fields that only you can provide (API keys, GitHub tokens, model names). Fill those by hand.
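The "only fill what isn't already set" behavior can be sketched as a recursive merge. This is a minimal sketch, not the setup script itself: the function name, the `workers` field, and the choice to treat `None` as "unset" are all assumptions.

```python
import copy
from typing import Any, Dict


def fill_defaults(config: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with missing (or None) fields taken from defaults.

    Hypothetical helper: values that are already set are never overwritten,
    so applying it twice gives the same result as applying it once.
    """
    merged = dict(config)
    for key, default in defaults.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(default, dict):
            # Recurse into nested sections such as runtime_info.input.
            merged[key] = fill_defaults(current, default)
        elif current is None:
            # Assumption: a missing key and an explicit None both mean "unset".
            merged[key] = copy.deepcopy(default)
    return merged
```

A field whose default is itself empty, such as an API key, stays empty after the merge, which matches the note that some inputs must still be filled by hand.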
3. /<block>:check — preflight
/<block>:check is read-only. It validates the schema, every filled input, every inter-block dependency, repo pins, environment paths, and live LLM endpoints (via a no-cost GET /models probe). It never modifies anything and never costs anything.
/swegen:check
/trajgen:check
/sft:check
/rl:check
Run :check after every config edit. It tells you exactly which input is unfilled, which submodule is missing, and whether the remote node is reachable, so you can fix gaps before committing to a long run.
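For readers curious what a GET /models probe looks like, here is a minimal standard-library sketch. The function names, the example URL, and the assumption that the base URL already includes any version prefix (such as `/v1`) are illustrative; the block's own check may be implemented differently.

```python
import urllib.request


def build_models_probe(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the endpoint's model list."""
    # Assumption: base_url already carries the version prefix, e.g. ".../v1".
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def endpoint_is_live(base_url: str, api_key: str, timeout: float = 10.0) -> bool:
    """Send the probe; True only on HTTP 200. Network and HTTP errors count as not live."""
    try:
        request = build_models_probe(base_url, api_key)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are both OSError subclasses
        return False
```

Listing models generates no tokens, which is why a probe of this kind is typically free on OpenAI-compatible servers.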
4. /<block>:run — execute
Only run after the matching :check passes:
/swegen:run # generate and verify SWE tasks
/trajgen:run # roll the agent out across verified tasks
/sft:run # convert trajectories + SFT
/rl:run # online RL from the SFT checkpoint
Each :run executes the block's scripts/start.sh. Long-running operations (multi-hour GPU training, multi-container rollouts) launch in tmux so they survive disconnects. Every run is archived automatically: start.sh installs an EXIT trap that fires archive_run.sh regardless of how the run exits (success, error, SIGINT, SIGTERM). One entry lands in artifacts/index.yaml, and the snapshot lives in artifacts/archives/run_NNN/.
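The always-archive guarantee is implemented as a shell EXIT trap in start.sh. The sketch below shows the same idea in Python with try/finally instead of a trap, purely as an illustration. The function name, the exit codes, and the mapping from outcome to status string are assumptions, and SIGTERM is not covered here (a real port would need a signal handler for it).

```python
from typing import Callable


def run_with_archive(body: Callable[[], None], archive: Callable[[str, int], None]) -> int:
    """Run body, then always call archive(status, exit_code), much like an EXIT trap.

    Hypothetical helper. Status strings follow the completed / failed /
    interrupted values the index uses; the exit codes are conventions only.
    """
    status, code = "completed", 0
    try:
        body()
    except KeyboardInterrupt:
        # Python raises KeyboardInterrupt on SIGINT; 130 is the usual 128 + 2.
        status, code = "interrupted", 130
    except Exception:
        status, code = "failed", 1
    finally:
        # Runs on every path above, so no run goes unarchived.
        archive(status, code)
    return code
```

The point of the pattern is that archiving sits on the exit path itself rather than at the end of the happy path, so a crash or Ctrl-C still leaves a record.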
5. Read the result
The block's own artifacts are the source of truth:
- artifacts/index.yaml: newest entry's status (completed | failed | interrupted)
- artifacts/archives/run_NNN/metadata.yaml: id, timestamps, exit code, repo SHAs
- artifacts/archives/run_NNN/config.yaml: the frozen config that produced this run
For visual progress, every block also ships a :dashboard skill — e.g. /trajgen:dashboard for trajectory rollouts, /sft:dashboard for training loss. See each block's own docs under Sub-block.
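To jump straight to the newest snapshot, a small helper can pick the highest-numbered run directory. This is a sketch that assumes NNN is a plain decimal number; the helper name is hypothetical, and artifacts/index.yaml remains the authoritative record of run status.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: archive directories are named run_<digits>, e.g. run_007.
RUN_DIR = re.compile(r"run_(\d+)")


def latest_archive(archives_dir: Path) -> Optional[Path]:
    """Return the run_NNN directory with the highest number, or None if there is none."""
    runs = []
    for entry in archives_dir.iterdir():
        match = RUN_DIR.fullmatch(entry.name)
        if match and entry.is_dir():
            runs.append((int(match.group(1)), entry))
    if not runs:
        return None
    # Compare numerically so run_010 sorts after run_009.
    return max(runs, key=lambda pair: pair[0])[1]
```

For example, `latest_archive(Path("artifacts/archives"))` would return the directory whose metadata.yaml and config.yaml describe the most recent run.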
Working at the root
/root:check and /root:run recursively apply the same lifecycle across the whole tree, with a free-form argument that selects the target block:
/root:check # check root + every subblock
/root:check swegen # one block at a time
/root:run trajgen # equivalent to /trajgen:run
/root:run start the data pipeline # paraphrased; resolves to the root orchestrator
Both forms, /<block>:check and /root:check <block>, are equivalent. Use whichever reads more naturally for what you're doing.