# ADR 0009: Layered Test Timeouts

## Status

Accepted

## Date

2026-06-08

## Context

ADR 0005 made CraftOS-PC the local harness and ADR 0007 split `libtest` (cases and assertions) from `runtest` (suite orchestration). The only timeout in that harness was the shell watchdog in the `Justfile` `test:` recipe: it `kill -TERM`s the whole CraftOS-PC process after `TRAP_CCLIBS_TEST_TIMEOUT_SECONDS`.

That single layer is coarse. It kills the entire process, so it cannot say *which* case hung, it produces one generic message, and it cannot tell a cooperatively-blocked event loop (the common failure: waiting on an event or `sleep` that never resolves) apart from a genuinely wedged process. A per-case timeout inside Lua is both finer and faster, but the shell watchdog is still needed for the cases Lua cannot interrupt.

## Decision

Run two independent timeout layers, ordered so the finer one fires first.

**Layer 1: `libtest` per-case timeout (primary).** `/apis/libtest.lua` races each test case against a timer with `parallel.waitForAny(runner, timer)`. The default is `DEFAULT_TIMEOUT_SECONDS = 3`. When the timer wins, the case fails with a distinct message containing the token `libtest timeout` and, in `--verbose`, an extra `TIMEOUT … (libtest)` diagnostic. `--timeout` overrides the default; `--no-timeout` disables the layer. `/programs/runtest.lua` forwards both flags to each case script. This only interrupts cases that yield (the usual hang); a non-yielding CPU loop cannot be preempted in ComputerCraft.

**Layer 2: shell watchdog (backstop).** The `Justfile` `test:` recipe keeps its existing `TRAP_CCLIBS_TEST_TIMEOUT_SECONDS` watchdog unchanged, as an independent double-check. Its default sits *above* the libtest default (`.env.sample` ships `7`; the recipe falls back to `7`) so libtest fires first in normal runs and the watchdog only catches what Lua cannot: a non-yielding loop, a wedged libtest, or a deliberately bypassed case.
Its SIGTERM message is worded differently from the `libtest timeout` message, so the two layers are never confused.

## How To Write Tests Properly

- Normal tests live in `tests/*.lua`, use `/apis/libtest.lua`, and must finish under the libtest timeout. `runtest` auto-discovers them; `just test` runs the suite.
- Never commit a hanging or intentionally-slow test to `tests/`: it would fail every run.
- Intentionally-slow fixtures that exercise the harness itself live in `tests/harness/`. `runtest` discovery skips subdirectories, so they never run with the normal suite; they are driven only by dedicated recipes (`just test-timeout-5s`, `just test-timeout-10s`).
- Use `--no-timeout` only for harness fixtures that must outlive the libtest layer to prove the shell watchdog, never for ordinary tests.

## Consequences

- A hung case now fails in ~3s with a per-case message instead of taking down the whole process anonymously.
- The two `test-timeout-*` recipes are self-asserting harness tests: `test-timeout-5s` proves Layer 1 (libtest cancels a 5s case before the watchdog), `test-timeout-10s` proves Layer 2 (the watchdog kills a 10s case with libtest bypassed). They are intentionally excluded from `ci`/`test`.
- `libtest` stays a normal ComputerCraft program: `parallel` and `sleep` are sandbox globals, so the timeout works in CraftOS-PC and in-game alike.

## Future Work

- Per-case timing in `--verbose` output if slow-but-passing cases become hard to spot.
- A `libtest`-level marker for "expected timeout" if more harness fixtures appear.
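
## Example: Layer 1 Race (Sketch)

For illustration, the Layer 1 race described under Decision could look roughly like the sketch below. It is not the actual `/apis/libtest.lua` source. The helper name `run_with_timeout`, its return convention, and the message wording beyond the `libtest timeout` token are assumptions made for this example; only `parallel.waitForAny`, `sleep`, `pcall`, and the `DEFAULT_TIMEOUT_SECONDS = 3` default come from ComputerCraft or from this ADR.

```lua
-- Hypothetical sketch of the per-case race; not the real libtest code.
local DEFAULT_TIMEOUT_SECONDS = 3

-- Runs case_fn against a timer. Returns true on success, or
-- false plus a message on failure or timeout (assumed convention).
local function run_with_timeout(case_fn, timeout_seconds)
  timeout_seconds = timeout_seconds or DEFAULT_TIMEOUT_SECONDS
  local ok, err = true, nil

  -- Runner: executes the case and captures any error it raises.
  local function runner()
    ok, err = pcall(case_fn)
  end

  -- Timer: finishes after the timeout. It can only win while the
  -- case is yielding, for example while it waits on an event.
  local function timer()
    sleep(timeout_seconds)
  end

  -- waitForAny returns the index of whichever function finished first.
  local winner = parallel.waitForAny(runner, timer)
  if winner == 2 then
    return false, "libtest timeout after " .. tostring(timeout_seconds) .. "s"
  end
  return ok, err
end
```

Under these assumptions, a case that blocks on `os.pullEvent` for an event that never arrives keeps yielding, so the timer wins and the call returns `false` with a message containing `libtest timeout`. A busy `while true do end` loop never yields, so the timer never gets a chance to run; that case is left to the Layer 2 watchdog.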