From b57f3d3973bda37de088b67121a66bf5fedbed74 Mon Sep 17 00:00:00 2001 From: Guillaume ARM Date: Tue, 9 Jun 2026 18:08:48 +0200 Subject: [PATCH] feat(craftos): add safe headless exec recipes --- CLAUDE.md | 2 +- Justfile | 147 +++++++++++++++++- PLAN.md | 6 +- docs/adrs/adr-0005-craftos-pc-harness.md | 2 +- ...headless-craftos-pc-as-hypothesis-probe.md | 23 +-- docs/install-craftos-pc.md | 12 +- 6 files changed, 171 insertions(+), 21 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index 0b1fe05..85f19d2 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -11,7 +11,7 @@ Use [`docs/README.md`](docs/README.md) as the entrypoint for CC:Tweaked, CraftOS ## Constraints - Do not add a standalone Lua test harness unless asked. Local execution happens through the CraftOS-PC harness (see [`docs/install-craftos-pc.md`](docs/install-craftos-pc.md), [`docs/craftos_pc_glossary.md`](docs/craftos_pc_glossary.md), and [ADR-0005](docs/adrs/adr-0005-craftos-pc-harness.md)); code otherwise executes in-game. -- Do not run `just repl` as an LLM agent; it is a human-only interactive CraftOS-PC wrapper. Use `just trapos --headless --exec '; os.shutdown()'` for automated probes against the TrapOS dev environment, or `just craftos --headless --exec '; os.shutdown()'` for probes against vanilla CraftOS (no TrapOS mounts). Headless probes are the recommended way to verify hypotheses about CC:Tweaked behavior; see [ADR-0012](docs/adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md). +- Do not run `just repl` as an LLM agent; it is a human-only interactive CraftOS-PC wrapper. Use `just trapos-exec ''` for automated probes against the TrapOS dev environment, or `just craftos-exec ''` for probes against vanilla CraftOS (no TrapOS mounts). These wrappers shut down the machine and include a host watchdog. Headless probes are the recommended way to verify hypotheses about CC:Tweaked behavior; see [ADR-0012](docs/adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md). 
- When changing behavior, add as many useful CraftOS-PC tests as practical. It is acceptable to skip tests that require human-only validation, such as complex turtle motion, in-game UX feel, or visual approval, but still add unit-style non-regression tests for deterministic parts when possible. - Use `/apis/libtest.lua` for test scripts under `tests/`; `/programs/runtest.lua` prints `__TRAPOS_TEST_OK__` only after the suite passes. - `libtest` cancels each case after `3`s (`--timeout ` / `--no-timeout` to override); never commit a hanging test to `tests/`. Slow harness fixtures go in `tests/harness/` behind dedicated recipes. See [`docs/adrs/adr-0009-layered-test-timeouts.md`](docs/adrs/adr-0009-layered-test-timeouts.md). diff --git a/Justfile b/Justfile index 55affa3..6236ac8 100644 --- a/Justfile +++ b/Justfile @@ -112,8 +112,8 @@ generate-env: done < .env.test > .env printf '%s\n' 'Generated .env' -# Pass args through to `craftos`, for example: -# just trapos --headless --exec 'print("__TRAPOS_TEST_OK__"); os.shutdown()' +# Pass args through to `craftos`. Prefer `just trapos-exec ''` for +# automated probes that must not hang the terminal. # Launch the TrapOS dev environment in CraftOS-PC with repo-local data # (.craftos/) and read-only repo mounts. See ADR-0005 and ADR-0012. [positional-arguments] @@ -148,6 +148,149 @@ craftos *args: check-install fi exec craftos "${argv[@]}" "$@" +# Safely run a Lua snippet in the TrapOS dev environment. The wrapper always +# shuts the machine down after normal completion or Lua errors, while the host +# watchdog catches snippets that block before reaching shutdown. 
+[positional-arguments] +trapos-exec code: + #!/usr/bin/env bash + set -uo pipefail + repo='{{justfile_directory()}}' + timeout_seconds="${TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS:-10}" + case "$timeout_seconds" in ''|*[!0-9]*) printf '%s\n' 'TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS must be a positive integer' >&2; exit 1 ;; esac + if [ "$timeout_seconds" -lt 1 ]; then printf '%s\n' 'TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS must be >= 1' >&2; exit 1; fi + rom_arg=() + if [ "$(uname -s)" = "Darwin" ]; then + rom_arg=(--rom /Applications/CraftOS-PC.app/Contents/Resources) + fi + data_dir="$(mktemp -d)" + stage_dir="$(mktemp -d)" + tmp="$(mktemp)" + output_path="$data_dir/computer/0/headless-output" + status_path="$data_dir/computer/0/headless-status" + runner="$stage_dir/exec.lua" + { + printf '%s\n' 'local output = fs.open("/headless-output", "w")' + printf '%s\n' 'local function emitLine(...)' + printf '%s\n' ' for i = 1, select("#", ...) do' + printf '%s\n' ' if i > 1 then output.write("\t") end' + printf '%s\n' ' output.write(tostring(select(i, ...)))' + printf '%s\n' ' end' + printf '%s\n' ' output.write("\n")' + printf '%s\n' 'end' + printf '%s\n' 'print = emitLine' + printf '%s\n' 'write = function(value) output.write(tostring(value)); end' + printf '%s\n' 'local function main()' + printf '%s\n' "$1" + printf '%s\n' 'end' + printf '%s\n' 'local ok, err = xpcall(main, debug.traceback)' + printf '%s\n' 'if not ok then' + printf '%s\n' ' output.writeLine(err)' + printf '%s\n' 'end' + printf '%s\n' 'output.close()' + printf '%s\n' 'local status = fs.open("/headless-status", "w")' + printf '%s\n' 'status.writeLine(ok and "OK" or "FAIL")' + printf '%s\n' 'status.close()' + printf '%s\n' 'os.shutdown()' + } > "$runner" + mount_arg=(--mount-ro "/trapos=$repo" --mount-ro "/apis=$repo/apis" --mount-ro "/programs=$repo/programs" --mount-ro "/servers=$repo/servers" --mount-ro "/startup=$repo/startup" --mount-ro "/tests=$repo/tests" --mount-ro "/headless=$stage_dir") + craftos 
--directory "$data_dir" --headless "${rom_arg[@]}" "${mount_arg[@]}" --exec "shell.run('/headless/exec.lua')" >"$tmp" 2>&1 & + pid="$!" + ( sleep "$timeout_seconds"; kill -TERM "$pid" >/dev/null 2>&1 ) & + watchdog="$!" + wait "$pid" >/dev/null 2>&1 + status="$?" + kill "$watchdog" >/dev/null 2>&1 || true + wait "$watchdog" >/dev/null 2>&1 || true + red=$(printf '\033[31m'); reset=$(printf '\033[0m') + if [ -f "$status_path" ] && grep -q '^OK$' "$status_path"; then + if [ -f "$output_path" ]; then cat "$output_path"; fi + rm -f "$tmp"; rm -rf "$data_dir"; rm -rf "$stage_dir" + else + if [ "$status" -eq 143 ]; then + printf '%s\n' "${red}FAIL${reset} TrapOS headless exec timed out after ${timeout_seconds}s" >&2 + else + printf '%s\n' "${red}FAIL${reset} TrapOS headless exec failed" >&2 + fi + if [ -f "$output_path" ]; then + cat "$output_path" >&2 + else + cat "$tmp" >&2 + fi + rm -f "$tmp"; rm -rf "$data_dir"; rm -rf "$stage_dir" + exit 1 + fi + +# Safely run a Lua snippet in vanilla CraftOS-PC with no TrapOS mounts. +[positional-arguments] +craftos-exec code: + #!/usr/bin/env bash + set -uo pipefail + repo='{{justfile_directory()}}' + timeout_seconds="${TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS:-10}" + case "$timeout_seconds" in ''|*[!0-9]*) printf '%s\n' 'TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS must be a positive integer' >&2; exit 1 ;; esac + if [ "$timeout_seconds" -lt 1 ]; then printf '%s\n' 'TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS must be >= 1' >&2; exit 1; fi + rom_arg=() + if [ "$(uname -s)" = "Darwin" ]; then + rom_arg=(--rom /Applications/CraftOS-PC.app/Contents/Resources) + fi + data_dir="$(mktemp -d)" + stage_dir="$(mktemp -d)" + tmp="$(mktemp)" + output_path="$data_dir/computer/0/headless-output" + status_path="$data_dir/computer/0/headless-status" + runner="$stage_dir/exec.lua" + { + printf '%s\n' 'local output = fs.open("/headless-output", "w")' + printf '%s\n' 'local function emitLine(...)' + printf '%s\n' ' for i = 1, select("#", ...) 
do' + printf '%s\n' ' if i > 1 then output.write("\t") end' + printf '%s\n' ' output.write(tostring(select(i, ...)))' + printf '%s\n' ' end' + printf '%s\n' ' output.write("\n")' + printf '%s\n' 'end' + printf '%s\n' 'print = emitLine' + printf '%s\n' 'write = function(value) output.write(tostring(value)); end' + printf '%s\n' 'local function main()' + printf '%s\n' "$1" + printf '%s\n' 'end' + printf '%s\n' 'local ok, err = xpcall(main, debug.traceback)' + printf '%s\n' 'if not ok then' + printf '%s\n' ' output.writeLine(err)' + printf '%s\n' 'end' + printf '%s\n' 'output.close()' + printf '%s\n' 'local status = fs.open("/headless-status", "w")' + printf '%s\n' 'status.writeLine(ok and "OK" or "FAIL")' + printf '%s\n' 'status.close()' + printf '%s\n' 'os.shutdown()' + } > "$runner" + craftos --directory "$data_dir" --headless "${rom_arg[@]}" --mount-ro "/headless=$stage_dir" --exec "shell.run('/headless/exec.lua')" >"$tmp" 2>&1 & + pid="$!" + ( sleep "$timeout_seconds"; kill -TERM "$pid" >/dev/null 2>&1 ) & + watchdog="$!" + wait "$pid" >/dev/null 2>&1 + status="$?" + kill "$watchdog" >/dev/null 2>&1 || true + wait "$watchdog" >/dev/null 2>&1 || true + red=$(printf '\033[31m'); reset=$(printf '\033[0m') + if [ -f "$status_path" ] && grep -q '^OK$' "$status_path"; then + if [ -f "$output_path" ]; then cat "$output_path"; fi + rm -f "$tmp"; rm -rf "$data_dir"; rm -rf "$stage_dir" + else + if [ "$status" -eq 143 ]; then + printf '%s\n' "${red}FAIL${reset} CraftOS headless exec timed out after ${timeout_seconds}s" >&2 + else + printf '%s\n' "${red}FAIL${reset} CraftOS headless exec failed" >&2 + fi + if [ -f "$output_path" ]; then + cat "$output_path" >&2 + else + cat "$tmp" >&2 + fi + rm -f "$tmp"; rm -rf "$data_dir"; rm -rf "$stage_dir" + exit 1 + fi + # End-to-end install probe: drive the real ccpm bootstrap # (install-ccpm.lua -> `ccpm update` -> `ccpm install trapos`) on a fresh, # ephemeral CraftOS-PC state. 
Reflects the currently checked-out git branch: diff --git a/PLAN.md b/PLAN.md index 296a4bc..a98da75 100644 --- a/PLAN.md +++ b/PLAN.md @@ -136,14 +136,14 @@ Apres implementation: ```text just check -just trapos --headless --exec 'shell.run("/programs/carre", "-size", "5", "-clear"); os.shutdown()' -just trapos --headless --exec 'shell.run("/programs/carre", "-random", "-count", "3", "-delay", "0"); os.shutdown()' +just trapos-exec 'shell.run("/programs/carre", "-size", "5", "-clear")' +just trapos-exec 'shell.run("/programs/carre", "-random", "-count", "3", "-delay", "0")' ``` Si des tests sont ajoutes: ```text -just trapos --headless --exec 'shell.run("/programs/runtest", "/tests/carre_test.lua"); os.shutdown()' +just trapos-exec 'shell.run("/programs/runtest", "/tests/carre_test.lua")' ``` ## Packaging diff --git a/docs/adrs/adr-0005-craftos-pc-harness.md b/docs/adrs/adr-0005-craftos-pc-harness.md index 687838a..74f2811 100644 --- a/docs/adrs/adr-0005-craftos-pc-harness.md +++ b/docs/adrs/adr-0005-craftos-pc-harness.md @@ -41,7 +41,7 @@ The existing [`CLAUDE.md`](../../CLAUDE.md) constraint ("Do not run Lua locally - Each test process is guarded by `TRAP_CCLIBS_TEST_TIMEOUT_SECONDS`, defaulting to `3`, so a blocked ComputerCraft event loop fails quickly and prints captured output. - The macOS install symlinks the binary into `/usr/local/bin`, which makes CraftOS-PC unable to auto-discover the ROM that ships inside the `.app` bundle (`Could not mount ROM`). The `test:` recipe works around this by passing `--rom /Applications/CraftOS-PC.app/Contents/Resources` on Darwin. Linux (AppImage) and Windows (installer) auto-discover correctly, so no flag is passed there. - `just trapos` uses repository-local save data under `.craftos/config/` and `.craftos/computer/`. This keeps emulator state out of `~/Library/Application Support/CraftOS-PC` during repository work and keeps repo files visible through read-only mounts instead of copying them into the VM save. 
-- `just repl` is a human-only interactive wrapper around `just trapos --cli`; automation and LLM agents must use headless `just trapos --headless --exec '; os.shutdown()'` (TrapOS dev env) or `just craftos --headless --exec '; os.shutdown()'` (vanilla emulator) invocations instead. [ADR-0012](adr-0012-headless-craftos-pc-as-hypothesis-probe.md) frames these as the canonical hypothesis-probe pattern. +- `just repl` is a human-only interactive wrapper around `just trapos --cli`; automation and LLM agents must use `just trapos-exec ''` (TrapOS dev env) or `just craftos-exec ''` (vanilla emulator) instead. [ADR-0012](adr-0012-headless-craftos-pc-as-hypothesis-probe.md) frames these as the canonical hypothesis-probe pattern. - The harness version becomes a project-level concern. When CC:Tweaked ships breaking changes that require a newer CraftOS-PC build, we bump the minimum version in [`docs/install-craftos-pc.md`](../install-craftos-pc.md) and `check-craftos` keeps contributors honest. - No CI integration yet. Running CraftOS-PC headless in GitHub Actions is feasible (the AppImage works on Ubuntu runners) but is out of scope here; the contract is local-only for now. diff --git a/docs/adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md b/docs/adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md index 5833a75..d5ced07 100644 --- a/docs/adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md +++ b/docs/adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md @@ -15,15 +15,15 @@ Accepted [ADR-0008](adr-0008-keep-tests-runnable-in-craftos-and-in-game.md) / [ADR-0009](adr-0009-layered-test-timeouts.md) wired it into the test suite via `just test`. That work focused on the *test* path. Headless CraftOS-PC is also a cheap, deterministic -*interactive* tool: `craftos --headless --exec '; os.shutdown()'` boots the emulator, -runs an arbitrary Lua snippet against the real CC:Tweaked ROM, prints output to stdout, -and exits in well under a second. 
Humans and LLM agents can use it to verify hypotheses +*interactive* tool: `just craftos-exec ''` boots the emulator, runs an arbitrary +Lua snippet against the real CC:Tweaked ROM, prints output to stdout, and exits in well +under a second. Humans and LLM agents can use it to verify hypotheses about CC:Tweaked behavior *before* writing code or tests — "does `os.epoch('utc')` return ms?", "does my new API factory `require` cleanly?", "does `fs.exists` follow symlinks inside `--mount-ro`?". Today this usage was implicit: the harness existed, but no document framed -`--headless --exec '...'` as the recommended first move when an agent is unsure about +safe headless exec recipes as the recommended first move when an agent is unsure about CC:Tweaked behavior. The original recipe was also named `just craftos` even though it mounted the entire TrapOS dev environment — so probes against it were never against vanilla CC:Tweaked, even when the agent thought they were. @@ -40,14 +40,15 @@ Two concrete changes triggered this ADR: ## Decision -Frame headless CraftOS-PC as the canonical hypothesis-probe pattern, with two flavors: +Frame headless CraftOS-PC as the canonical hypothesis-probe pattern, with two safe +exec flavors: -- `just trapos --headless --exec '; os.shutdown()'` — probe against the **TrapOS dev +- `just trapos-exec ''` — probe against the **TrapOS dev environment**. Mounts of `/apis`, `/programs`, `/servers`, `/startup`, `/tests`, and the repo root at `/trapos` are live, so `require('/apis/eventloop')` and friends work against the current branch. Use this when the question involves repo code. -- `just craftos --headless --exec '; os.shutdown()'` — probe against **vanilla +- `just craftos-exec ''` — probe against **vanilla CraftOS-PC**. No mounts, no startup scripts. Use this when the question is purely about CC:Tweaked behavior and TrapOS files would be a distraction, or to confirm a behavior is upstream rather than something the dev env layered on. 
@@ -58,9 +59,9 @@ Frame headless CraftOS-PC as the canonical hypothesis-probe pattern, with two fl Conventions: -- Always terminate the snippet with `os.shutdown()`. The shell watchdog from - [ADR-0009](adr-0009-layered-test-timeouts.md) governs `just test`, not these recipes; - a missing shutdown will hang until the user kills the process. +- Prefer the safe exec recipes over raw `--headless --exec`. They wrap snippets with + `xpcall`, call `os.shutdown()` on success or Lua error, and use + `TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS` (default `10`) as a host watchdog for true hangs. - Keep snippets minimal and side-effect-free. If the probe reveals a fact worth defending, add a `libtest` case under `tests/` — probes are not a substitute for committed tests. - LLM agents SHOULD prefer a quick headless probe over speculation when answering @@ -73,7 +74,7 @@ Conventions: good trade. - Faster convergence on correct fixes: agents stop committing speculative changes that pass `luacheck` but fail in-game. -- A named pattern (`--headless --exec '; os.shutdown()'`) shows up in [`CLAUDE.md`](../../CLAUDE.md) and +- A named pattern (`just trapos-exec ''` / `just craftos-exec ''`) shows up in [`CLAUDE.md`](../../CLAUDE.md) and [`docs/install-craftos-pc.md`](../install-craftos-pc.md), so contributors and agents reach for it without rediscovery. - `.craftos-vanilla/` is added to `.gitignore` alongside `.craftos/`. - `just trapos-install` is *not* part of `just ci`: it is network-dependent and slower diff --git a/docs/install-craftos-pc.md b/docs/install-craftos-pc.md index fded105..932d82b 100644 --- a/docs/install-craftos-pc.md +++ b/docs/install-craftos-pc.md @@ -92,13 +92,19 @@ On macOS, use the `--rom` form shown above if the command fails with `Could not `just trapos-install` exercises the real ccpm bootstrap (`install-ccpm.lua` → `ccpm update` → `ccpm install trapos`) end-to-end on a fresh, ephemeral CraftOS-PC state. 
Network-dependent and slower than `just test`, so not part of `just ci`. Override the watchdog with `TRAP_CCLIBS_INSTALL_TIMEOUT_SECONDS` (default `60`). -Pass CraftOS-PC flags directly after the recipe name, for example: +For automated probes, use the safe wrappers. They mount the right environment, +shut down the machine after completion or Lua errors, and kill the host process +if the snippet blocks before shutdown: ```sh -just trapos --headless --exec 'print("__TRAPOS_TEST_OK__"); os.shutdown()' -just craftos --headless --exec 'print(_HOST); os.shutdown()' +just trapos-exec 'print("__TRAPOS_TEST_OK__")' +just craftos-exec 'print(_HOST)' ``` +Override the watchdog with `TRAP_CCLIBS_HEADLESS_TIMEOUT_SECONDS` (default `10`). +Pass CraftOS-PC flags directly after `just trapos` or `just craftos` only for +manual launches where you want raw emulator control. + See [`docs/adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md`](adrs/adr-0012-headless-craftos-pc-as-hypothesis-probe.md) for the canonical headless probe pattern used to verify hypotheses about CC:Tweaked behavior. `just repl` delegates to `just trapos --cli` for human interactive use only. LLM agents must not run `just repl`.
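
The host watchdog that `trapos-exec` and `craftos-exec` share can be read in isolation as the sketch below. It is a minimal illustration, not the recipes themselves: `run_with_watchdog` is a hypothetical helper name that does not exist in the Justfile, and the command it runs stands in for the `craftos --headless ...` invocation. The exit status `143` is the value the recipes test for, which is `128 + 15` for a process ended by `SIGTERM`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Justfile): run a command in the
# background and send it SIGTERM if it outlives the given number of seconds.
run_with_watchdog() {
  local timeout_seconds="$1"
  shift
  local pid watchdog status

  # Start the command to supervise in the background.
  "$@" &
  pid="$!"

  # Watchdog subshell: sleep, then terminate the command if it still runs.
  # Its output goes to /dev/null so a leftover sleep never holds the
  # caller's stdout open.
  ( sleep "$timeout_seconds"; kill -TERM "$pid" ) >/dev/null 2>&1 &
  watchdog="$!"

  # Collect the command's exit status; 143 means it was ended by SIGTERM.
  wait "$pid" >/dev/null 2>&1
  status="$?"

  # The command is done, so the watchdog is no longer needed.
  kill "$watchdog" >/dev/null 2>&1 || true
  wait "$watchdog" >/dev/null 2>&1 || true

  return "$status"
}
```

Calling `run_with_watchdog 10 some-command` returns the command's own exit status when it finishes in time, so a caller can treat `143` as a timeout and anything else as the command's result, which mirrors how the recipes choose between the "timed out" and "failed" messages.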