From 7b19de7945c7051b439faa8c75f79414fe8fbb38 Mon Sep 17 00:00:00 2001 From: Guillaume ARM Date: Mon, 8 Jun 2026 19:22:32 +0200 Subject: [PATCH] test(harness): use test env for timeouts --- .env.sample | 1 - .env.test | 5 +++++ DEVELOPMENT.md | 6 +++--- Justfile | 23 ++++++++------------- docs/adrs/adr-0009-layered-test-timeouts.md | 15 +++++++------- 5 files changed, 25 insertions(+), 25 deletions(-) delete mode 100644 .env.sample create mode 100644 .env.test diff --git a/.env.sample b/.env.sample deleted file mode 100644 index b9a8c81..0000000 --- a/.env.sample +++ /dev/null @@ -1 +0,0 @@ -TRAP_CCLIBS_TEST_TIMEOUT_SECONDS=3 diff --git a/.env.test b/.env.test new file mode 100644 index 0000000..8b8ebb6 --- /dev/null +++ b/.env.test @@ -0,0 +1,5 @@ +# Host-side watchdog for the normal `just test` CraftOS-PC process. +TRAP_CCLIBS_TEST_TIMEOUT_SECONDS=3 + +# Dedicated `just test-timeout` fixture timings. +TRAP_CCLIBS_TEST_TIMEOUT_WATCHDOG_SECONDS=1 diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index a551658..e3e368f 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -12,7 +12,7 @@ After cloning the repository, run: just install ``` -This creates `.env` from `.env.sample` when needed and installs the local Git hooks: pre-commit runs `just test`, and pre-push runs `just ci`. +This installs the local Git hooks: pre-commit runs `just test`, and pre-push runs `just ci`. `just ci` is the full local verification entry point. Today it verifies that `craftos --version` reports v2.8.3 or newer, runs `just check` for `luacheck`, runs `just test` for CraftOS-PC headless tests, and runs the harness timeout regression guard. Use `just craftos` to launch CraftOS-PC with repo-local save data under `.craftos` and read-only mounts for `/trapos`, `/apis`, `/programs`, `/servers`, and `/startup`. `just repl` opens the same environment with `--cli` for human interactive use only; LLM agents must not run it. Use the CraftOS-PC glossary when adjusting `--headless`, `--exec`, `--script`, `--rom`, or `--mount-*` usage. @@ -23,6 +23,6 @@ Tests live under `tests/` and run inside CraftOS-PC through `just test`, which l Test timeouts run in two independent layers (see [`docs/adrs/adr-0009-layered-test-timeouts.md`](docs/adrs/adr-0009-layered-test-timeouts.md)): - **libtest per-case timeout (primary).** Each `libtest` case is cancelled after `3` seconds by default, failing with a distinct `libtest timeout` message. Override with `--timeout `, or disable with `--no-timeout` (both forwarded by `runtest` to each case). This catches a single hung case quickly without taking down the whole run. -- **Shell watchdog (backstop).** The whole CraftOS-PC process is killed if it does not finish within `TRAP_CCLIBS_TEST_TIMEOUT_SECONDS` (`.env.sample` ships `7`; the recipe falls back to `7`). Keep it above the `3`s libtest default so libtest fires first; the watchdog only catches what Lua cannot interrupt. Override in `.env` for slower local probes. +- **Shell watchdog (backstop).** The whole CraftOS-PC process is killed if it does not finish within `TRAP_CCLIBS_TEST_TIMEOUT_SECONDS` (`.env.test` ships `3`; the recipe falls back to `3`). Keep it at or above the `3`s libtest default so libtest fires first; the watchdog only catches what Lua cannot interrupt. Override in your shell for slower local probes. -`just test-timeout` is a self-asserting regression guard for these layers and runs automatically as part of `just ci`. It chains `just test-timeout-lua` (proves libtest cancels a slow case at 0.1s, before a 2s shell backstop) and `just test-timeout-shell` (proves the 1s shell watchdog kills a slow case run with `--no-timeout`). Both drive the `tests/harness/slow-case.lua` fixture, which is never picked up by the normal `just test` suite (`runtest` skips `tests/` subdirectories). +`just test-timeout` is a self-asserting regression guard for these layers and runs automatically as part of `just ci`. It chains `just test-timeout-lua` (proves libtest cancels a slow case immediately with `--timeout 0`, before the shell backstop) and `just test-timeout-shell` (proves the `TRAP_CCLIBS_TEST_TIMEOUT_WATCHDOG_SECONDS` watchdog, default `1`, kills a slow case run with `--no-timeout`). Both drive the `tests/harness/slow-case.lua` fixture, which is never picked up by the normal `just test` suite (`runtest` skips `tests/` subdirectories). diff --git a/Justfile b/Justfile index e192713..f0ffe0a 100644 --- a/Justfile +++ b/Justfile @@ -6,14 +6,7 @@ default: @just --list # Install local development tooling. -install: init-env install-git-hooks check-install - -# Create a local environment file when one does not exist. -init-env: - @if [ ! -f .env ]; then \ - cp .env.sample .env; \ - printf '%s\n' 'Created .env from .env.sample'; \ - fi +install: install-git-hooks check-install # Install Git hooks for this repository. install-git-hooks: @@ -109,10 +102,10 @@ ci *args: check-craftos check # Run CraftOS-PC headless integration tests. Pass `--pretty` for grouped output. test *args: - @if [ -f .env ]; then set -a; . ./.env; set +a; fi; \ + @if [ -f .env.test ]; then set -a; . ./.env.test; set +a; fi; \ pretty=0; \ verbose=0; \ - timeout_seconds="${TRAP_CCLIBS_TEST_TIMEOUT_SECONDS:-7}"; \ + timeout_seconds="${TRAP_CCLIBS_TEST_TIMEOUT_SECONDS:-3}"; \ case "$timeout_seconds" in ''|*[!0-9]*) printf '%s\n' 'TRAP_CCLIBS_TEST_TIMEOUT_SECONDS must be a positive integer' >&2; exit 1 ;; esac; \ if [ "$timeout_seconds" -lt 1 ]; then printf '%s\n' 'TRAP_CCLIBS_TEST_TIMEOUT_SECONDS must be >= 1' >&2; exit 1; fi; \ for a in {{args}}; do case "$a" in --pretty) pretty=1 ;; --verbose|-v) pretty=1; verbose=1 ;; esac; done; \ @@ -163,6 +156,7 @@ _timeout-fixture script shell_timeout extra_flag expect: check-install #!/usr/bin/env bash set -uo pipefail repo='{{justfile_directory()}}' + if [ -f "$repo/.env.test" ]; then set -a; . "$repo/.env.test"; set +a; fi rom_arg="" if [ "$(uname -s)" = "Darwin" ]; then rom_arg="--rom /Applications/CraftOS-PC.app/Contents/Resources" @@ -172,9 +166,10 @@ _timeout-fixture script shell_timeout extra_flag expect: check-install data_dir="$(mktemp -d)" output_path="$data_dir/computer/0/trapos-test-output" exec_code="shell.run('/programs/runtest.lua', '{{script}}', '--verbose', '--output', '/trapos-test-output', {{extra_flag}} '--shutdown')" + shell_timeout="{{shell_timeout}}" craftos --directory "$data_dir" --headless $rom_arg $mount_arg --exec "$exec_code" >"$tmp" 2>&1 & pid="$!" - ( sleep {{shell_timeout}}; kill -TERM "$pid" >/dev/null 2>&1 ) & + ( sleep "$shell_timeout"; kill -TERM "$pid" >/dev/null 2>&1 ) & watchdog="$!" wait "$pid" >/dev/null 2>&1 status="$?" @@ -194,7 +189,7 @@ _timeout-fixture script shell_timeout extra_flag expect: check-install ;; shell) if [ "$status" -eq 143 ]; then - printf '%s\n' "${green}OK${reset} shell watchdog killed the run after {{shell_timeout}}s (status 143; libtest timeout bypassed)"; \ + printf '%s\n' "${green}OK${reset} shell watchdog killed the run after ${shell_timeout}s (status 143; libtest timeout bypassed)"; \ else printf '%s\n' "${red}FAIL${reset} expected the shell watchdog to kill the run (status=$status)" >&2 rc=1 @@ -212,11 +207,11 @@ _timeout-fixture script shell_timeout extra_flag expect: check-install # Prove the libtest (Lua) timeout layer: libtest cancels the slow case quickly, # before the shell watchdog backstop can fire. -test-timeout-lua: (_timeout-fixture "/tests/harness/slow-case.lua" "2" "'--timeout', '0.1'," "lua") +test-timeout-lua: (_timeout-fixture "/tests/harness/slow-case.lua" "${TRAP_CCLIBS_TEST_TIMEOUT_WATCHDOG_SECONDS:-1}" "'--timeout', '0'," "lua") # Prove the shell watchdog backstop: the slow case runs with the libtest timeout # bypassed (--no-timeout), so the shell watchdog kills the whole process. -test-timeout-shell: (_timeout-fixture "/tests/harness/slow-case.lua" "1" "'--no-timeout'," "shell") +test-timeout-shell: (_timeout-fixture "/tests/harness/slow-case.lua" "${TRAP_CCLIBS_TEST_TIMEOUT_WATCHDOG_SECONDS:-1}" "'--no-timeout'," "shell") # Fast regression guard for both timeout layers. Wired into `ci`. test-timeout: test-timeout-lua test-timeout-shell diff --git a/docs/adrs/adr-0009-layered-test-timeouts.md b/docs/adrs/adr-0009-layered-test-timeouts.md index 8cc9bb3..5a41715 100644 --- a/docs/adrs/adr-0009-layered-test-timeouts.md +++ b/docs/adrs/adr-0009-layered-test-timeouts.md @@ -35,11 +35,11 @@ that yield (the usual hang); a non-yielding CPU loop cannot be preempted in Comp **Layer 2 — shell watchdog (backstop).** The `Justfile` `test:` recipe keeps its existing `TRAP_CCLIBS_TEST_TIMEOUT_SECONDS` watchdog unchanged, as an independent double-check. Its -default sits *above* the libtest default (`.env.sample` ships `7`; the recipe falls back to -`7`) so libtest fires first in normal runs and the watchdog only catches what Lua cannot — -a non-yielding loop, a wedged libtest, or a deliberately bypassed case. Its SIGTERM message -is worded differently from the `libtest timeout` message, so the two layers are never -confused. +default matches the libtest default (`.env.test` ships `3`; the recipe falls back to `3`) +so libtest should fire first for yielding cases in normal runs and the watchdog only catches +what Lua cannot — a non-yielding loop, a wedged libtest, or a deliberately bypassed case. Its +SIGTERM message is worded differently from the `libtest timeout` message, so the two layers +are never confused. ## How To Write Tests Properly @@ -58,8 +58,9 @@ confused. - A hung case now fails in ~3s with a per-case message instead of taking down the whole process anonymously. - `just test-timeout` is a self-asserting harness regression guard wired into `just ci`. It - chains `test-timeout-lua` (Layer 1: libtest cancels the slow case at 0.1s, before a 2s - shell backstop) and `test-timeout-shell` (Layer 2: the 1s watchdog kills the slow case with + chains `test-timeout-lua` (Layer 1: libtest cancels the slow case immediately with + `--timeout 0`, before the shell backstop) and `test-timeout-shell` (Layer 2: the + `TRAP_CCLIBS_TEST_TIMEOUT_WATCHDOG_SECONDS` watchdog, default `1`, kills the slow case with libtest bypassed). Both drive a single `tests/harness/slow-case.lua` fixture; the tight timeouts — not the fixture's sleep length — decide which layer fires, so the harness itself is covered against regressions on every `ci`.