| name | strict-agent-loop |
|---|---|
| description | Enforce strict iterative execution for interactive long tasks and unattended Codex-supervised runs, with one small verified task per round, explicit round announcements, append-only disk logs, recoverable state, progress broadcasts, and machine-checkable stop conditions. Use when Codex must not skip the middle, must inherit prior context across rounds, and should continue until an external stop rule or real blocker is reached. |
Use this skill when the work is large, quality-sensitive, or easy for Codex to compress into a vague summary. The current Codex session is always the controller for the task it is handling. This skill supports two operating modes:
- `interactive`: the controller stays in front of the user and reports every round
- `unattended`: an outer supervisor repeatedly runs or resumes Codex while the inner controller keeps following the same strict loop protocol
Read protocol.md before starting. Read management.md before choosing or creating a task id. Read modes.md when choosing a mode. Read stop_checks.md when defining machine-checkable stop rules. Read recovery.md when recovering from executor loss, supervisor loss, or context pressure.
When the user asks how to use this skill, answer concretely.
- Explain that there are two modes:
  - `interactive`: the current Codex session remains user-facing
  - `unattended`: `scripts/supervise.py` owns the outer while-loop
- If the user did not choose a mode, show both quick starts.
- Always explain the managed layout:
  - `.codex-loop/registry.json`
  - `.codex-loop/tasks/<task-id>/state.json`
  - task-local `events.jsonl`, `iterations.jsonl`, `status-history.jsonl`, `latest-status.txt`, `latest-stop-report.json`, `run-summary.md`, `rounds/`, and unattended-only `supervisor/`
- Always distinguish workspace outputs from bookkeeping:
  - real deliverables belong under `<workspace_root>/`
  - loop state and ledgers belong under `.codex-loop/tasks/<task-id>/`
- Always mention `list_tasks.py` and `show_task.py` when the repo may host multiple loops.
- Always mention that unattended runs should rely on machine-checkable stop conditions, not only natural-language claims.
- For unattended usage, explain that fresh disk-based recovery is the default and `--supervisor-resume-existing-thread` is opt-in.
- Mention that `--supervisor-reasoning-effort low|medium` can improve unattended availability when the provider is overloaded.
- Mention that `supervise.py` can be interrupted with `SIGINT` or `SIGTERM`, saves state, and exits `130`.
- Give the user an exact prompt or shell command, not only prose.
Default to the managed layout under `<workspace_root>/.codex-loop/`.
- Before starting work, check whether `.codex-loop/registry.json` already exists.
- Each long-running objective should have one task id and one task root under `.codex-loop/tasks/<task-id>/`.
- If the user is starting new work and did not specify a task id, derive one from the goal by using `scripts/init_state.py`.
- If the user wants to resume and gives a task id, use that exact managed task.
- If the user wants to resume but did not name a task:
  - if exactly one plausible running task exists, use it
  - if several plausible tasks exist, show `scripts/list_tasks.py` output and ask which task to continue
- Do not ask the user to enumerate storage paths unless they explicitly want custom paths.
- Keep actual work artifacts in the workspace root and keep the task root for loop bookkeeping unless the task explicitly requires otherwise.
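
The resume rules above can be sketched as a small decision function. This is only an illustration: the registry shape (`{"tasks": {task_id: {"status": ...}}}`), the `"running"` status value, and the name `pick_task_to_resume` are assumptions, not the actual schema of `registry.json` or the behavior of `scripts/list_tasks.py`.

```python
from typing import Optional


def pick_task_to_resume(registry: dict, requested_id: Optional[str] = None) -> tuple:
    """Return (action, task_id); action is 'use', 'ask', or 'none'.

    Assumed (hypothetical) registry shape: {"tasks": {task_id: {"status": str}}}.
    """
    tasks = registry.get("tasks", {})
    if requested_id is not None:
        # The user named a task: use exactly that one if it is managed.
        return ("use", requested_id) if requested_id in tasks else ("none", None)
    running = sorted(tid for tid, meta in tasks.items() if meta.get("status") == "running")
    if len(running) == 1:
        return ("use", running[0])  # exactly one plausible task
    if len(running) > 1:
        return ("ask", None)  # several candidates: show the list and ask
    return ("none", None)
```

Returning `"ask"` instead of guessing mirrors the rule that the controller should show the task list and let the user choose when several tasks are plausible.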
Before iteration 1, define all of these:
- `goal`
- `global_stop_condition`
- `workspace_root`
- `success_evidence`
- `blocker_definition`
- `operating_mode`
- `stop_checks`
- `hard_limits`
- `max_iterations`
- `max_no_progress_rounds`
- optional context compaction threshold
- unattended only:
  - `max_rounds_per_invocation`
  - `max_consecutive_failures`
Do not start the loop while the goal or stop rule is materially unclear.
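
A pre-flight check for these fields might look like the sketch below. It assumes the fields are stored as flat keys in a plain dict, which may not match how `scripts/init_state.py` actually lays out `state.json`; the helper name `missing_setup_fields` is hypothetical.

```python
REQUIRED_FIELDS = [
    "goal", "global_stop_condition", "workspace_root", "success_evidence",
    "blocker_definition", "operating_mode", "stop_checks", "hard_limits",
    "max_iterations", "max_no_progress_rounds",
]
UNATTENDED_ONLY = ["max_rounds_per_invocation", "max_consecutive_failures"]


def _is_blank(value) -> bool:
    # Treat missing values and whitespace-only strings as "not defined".
    return value is None or (isinstance(value, str) and not value.strip())


def missing_setup_fields(config: dict) -> list:
    """List the setup fields that are still undefined (flat layout assumed)."""
    required = list(REQUIRED_FIELDS)
    if config.get("operating_mode") == "unattended":
        required += UNATTENDED_ONLY
    return [name for name in required if _is_blank(config.get(name))]
```

A controller could refuse to begin iteration 1 while this list is non-empty.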
Do not rely on memory alone. At minimum, keep these artifacts current and queryable:
- `.codex-loop/registry.json`
- `.codex-loop/tasks/<task-id>/state.json`
- `.codex-loop/tasks/<task-id>/events.jsonl`
- `.codex-loop/tasks/<task-id>/iterations.jsonl`
- `.codex-loop/tasks/<task-id>/status-history.jsonl`
- `.codex-loop/tasks/<task-id>/latest-status.txt`
- `.codex-loop/tasks/<task-id>/latest-stop-report.json`
- `.codex-loop/tasks/<task-id>/run-summary.md`
- `.codex-loop/tasks/<task-id>/rounds/iteration-XXXX.md`
If unattended mode is active, also keep:
- `.codex-loop/tasks/<task-id>/supervisor/`
The in-memory history window in state.json may be compacted.
The full append-only record still lives in iterations.jsonl, events.jsonl, status-history.jsonl, and rounds/.
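
To illustrate what "append-only" means for the `.jsonl` ledgers, here is a minimal sketch of writing and reading one JSON object per line. The helper names and any record fields are assumptions for illustration; the real writers are the skill's own scripts such as `scripts/append_event.py`.

```python
import json
from pathlib import Path


def append_jsonl(path: Path, record: dict) -> None:
    """Append one record as a single JSON line; earlier lines are never rewritten."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def read_jsonl(path: Path) -> list:
    """Read every record back in the order it was appended."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because records are only ever appended, compacting the history window in `state.json` does not lose information: the full sequence can be re-read from the ledger.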
Interactive mode is for long tasks where a human is present and wants to see every round.
Rules:
- Keep the current Codex session as controller.
- Use one persistent executor subagent whenever possible.
- Before each round, tell the user:
- iteration number
- completed rounds so far
- this round
- local done condition
- global stop condition
- stop after this round if
- recent average round time and estimated remaining time when available
- Write the same announcement to `events.jsonl` with `scripts/append_event.py`.
- After the round, verify, record it with `scripts/update_state.py`, re-check stop conditions with `scripts/check_stop.py`, then refresh `latest-status.txt`, `status-history.jsonl`, `latest-stop-report.json`, and `run-summary.md` with `scripts/report_status.py`.
Unattended mode is for long-running work where the outer while-loop must survive beyond one Codex invocation.
Architecture:
- the outer loop lives in `scripts/supervise.py`
- the supervisor starts or resumes Codex with `codex exec` or `codex exec resume`
- the inner Codex session still uses this skill as controller
- disk artifacts bridge one invocation to the next
- fresh disk-based recovery is the default; reusing the same Codex thread is opt-in through `--supervisor-resume-existing-thread`
Rules:
- The supervisor owns outer repetition.
- The inner controller still owns task decomposition, verification, and executor management for its current invocation.
- Each unattended invocation must stop cleanly after:
- the global stop condition is met
- a real blocker is reached
- the per-invocation round budget is consumed
- The supervisor relays inner Codex messages plus command start/completion events to outer stdout.
- `SIGINT` and `SIGTERM` should cause the supervisor to persist state, record the interruption, and exit with code `130`.
- Because no human is present, round announcements must still be written to disk.
- The supervisor must refresh durable progress broadcasts so the run does not look dead.
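
The outer-loop rules above could be sketched roughly as follows. This is not the implementation of `scripts/supervise.py`: the callables `run_invocation` and `save_state`, the outcome strings, and the non-130 exit codes are all hypothetical. Only the "persist state and exit with 130 on `SIGINT`/`SIGTERM`" behavior is taken from the text.

```python
import signal


class StopRequested:
    """Signal handler that only remembers which signal arrived."""

    def __init__(self):
        self.signum = None

    def __call__(self, signum, frame):
        self.signum = signum


def supervise(run_invocation, save_state, max_cycles: int, stop=None) -> int:
    """Repeat inner invocations until done, blocked, interrupted, or out of budget."""
    if stop is None:
        stop = StopRequested()
        signal.signal(signal.SIGINT, stop)
        signal.signal(signal.SIGTERM, stop)
    for cycle in range(1, max_cycles + 1):
        if stop.signum is not None:
            # Persist state and record the interruption before exiting.
            save_state({"interrupted_by": stop.signum, "cycle": cycle})
            return 130
        outcome = run_invocation(cycle)  # assumed to return 'continue', 'done', or 'blocked'
        if outcome in ("done", "blocked"):
            save_state({"outcome": outcome, "cycle": cycle})
            return 0 if outcome == "done" else 1  # hypothetical exit codes
    save_state({"outcome": "budget_exhausted", "cycle": max_cycles})
    return 2  # hypothetical exit code
```

The handler only sets a flag, so the loop always gets a chance to save state between invocations rather than being killed mid-write.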
Broadcasting is mandatory in both modes.
Interactive mode:
- tell the user directly
- write the round announcement to `events.jsonl`
- refresh `status-history.jsonl`, `latest-status.txt`, and `run-summary.md`
Unattended mode:
- write `round.started` announcements to `events.jsonl`
- refresh `status-history.jsonl`, `latest-status.txt`, `latest-stop-report.json`, and `run-summary.md`
- keep visible outer-stdout broadcasts informative; operators should be able to see progress without opening the logs first
- let the supervisor print heartbeat-style summaries that include:
- completed iteration count
- approximate progress bar
- recent iteration times
- recent average iteration time
- estimated remaining time when possible
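
A heartbeat line with those fields might be assembled as in this sketch. The output format and the function name `format_heartbeat` are assumptions, and progress is measured against `max_iterations`, which is only an upper bound, so the bar and estimate are approximate.

```python
def format_heartbeat(completed: int, max_iterations: int,
                     recent_seconds: list, bar_width: int = 20) -> str:
    """Build a one-line progress summary for outer stdout."""
    fraction = min(completed / max_iterations, 1.0) if max_iterations > 0 else 0.0
    filled = int(fraction * bar_width)
    bar = "#" * filled + "-" * (bar_width - filled)
    parts = [f"iter {completed}/{max_iterations}", f"[{bar}]"]
    if recent_seconds:
        avg = sum(recent_seconds) / len(recent_seconds)
        remaining = max(max_iterations - completed, 0) * avg
        parts.append("recent " + ",".join(f"{s:.1f}s" for s in recent_seconds))
        parts.append(f"avg {avg:.1f}s")
        parts.append(f"eta {remaining:.0f}s")
    return " ".join(parts)


print(format_heartbeat(5, 10, [2.0, 4.0], bar_width=10))
# prints: iter 5/10 [#####-----] recent 2.0s,4.0s avg 3.0s eta 15s
```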
Each round must stay objectively small.
Good atomic tasks:
- reproduce one failure
- compute one next hailstone number
- patch one isolated bug
- add one regression test
- update one document section
- run one validation command and interpret it
Bad atomic tasks:
- finish the whole feature
- fix all remaining issues
- implement, test, document, and polish everything
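
The hailstone item in the good list shows the intended granularity: one round computes exactly one next value, which is trivially verifiable. A minimal sketch, using the standard hailstone (Collatz) rule and a hypothetical function name:

```python
def next_hailstone(n: int) -> int:
    """Return the next hailstone number: n/2 if n is even, 3n+1 if n is odd."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n // 2 if n % 2 == 0 else 3 * n + 1


print(next_hailstone(6))  # prints: 3
print(next_hailstone(3))  # prints: 10
```

By contrast, "compute the whole sequence" would be a bad atomic task, because it bundles an unknown number of steps into one round.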
For every round:
- Announce the round.
- Log the announcement with `scripts/append_event.py`.
- Do exactly one atomic task.
- Verify the result with evidence.
- Record the verified round with `scripts/update_state.py`.
- Re-check machine stop conditions with `scripts/check_stop.py`.
- Refresh durable progress outputs with `scripts/report_status.py`.
- Compact state with `scripts/compact_state.py` when context pressure grows.
Never mark progress as complete without evidence. Never skip verification because the executor probably did it right.
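
The per-round order can be sketched as a single function that takes each step as a callable. All parameter names and return values here are hypothetical stand-ins for the skill's scripts; the point is only that recording happens after verification and never without evidence.

```python
def run_round(iteration, announce, do_task, verify, record, check_stop, report) -> str:
    """Run one round in the prescribed order; return 'continue', 'stop', or 'unverified'."""
    announce(iteration)            # announce and log the round first
    result = do_task(iteration)    # exactly one atomic task
    evidence = verify(result)      # falsy evidence means the round is not verified
    if not evidence:
        report(iteration)          # still refresh status so the run does not look dead
        return "unverified"
    record(iteration, evidence)    # only verified rounds are recorded
    should_stop = check_stop()     # machine stop conditions decide, not prose
    report(iteration)
    return "stop" if should_stop else "continue"
```

Returning `"unverified"` without calling `record` keeps an unproven round out of the ledger.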
Natural-language stop conditions are not enough for unattended runs. Whenever possible, define machine checks such as:
- `--stop-command "pytest -q"`
- `--stop-command "ruff check ."`
- `--require-path docs/feature.md`
- `--require-text "README.md::hailstone sequence"`
If stop checks exist, treat them as the authority. Do not claim success while stop checks still fail.
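
As a rough illustration of what evaluating such checks involves, the sketch below runs commands, tests for paths, and searches for text. It is not `scripts/check_stop.py`: the function name and report shape are hypothetical, commands are passed as argument lists rather than shell strings, and reading `path::text` as "file contains text" is inferred from the example above.

```python
import subprocess
from pathlib import Path


def evaluate_stop_checks(root: Path, commands=(), require_paths=(), require_texts=()) -> dict:
    """Return {'passed': bool, 'failures': [...]} for the given machine checks."""
    failures = []
    for argv in commands:
        # A command check passes only when the process exits with status 0.
        if subprocess.run(argv, cwd=root, capture_output=True).returncode != 0:
            failures.append("command failed: " + " ".join(argv))
    for rel in require_paths:
        if not (root / rel).exists():
            failures.append(f"missing path: {rel}")
    for spec in require_texts:
        rel, _, needle = spec.partition("::")  # assumed "<file>::<text>" format
        target = root / rel
        if not target.is_file() or needle not in target.read_text(encoding="utf-8"):
            failures.append(f"missing text: {spec}")
    return {"passed": not failures, "failures": failures}
```

Under this sketch the loop may claim success only when `passed` is true, regardless of what the executor reports in prose.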
If the executor disappears or loses context:
- compact the state
- rebuild from `context_snapshot`, `run-summary.md`, recent `history`, `iterations.jsonl`, and `events.jsonl`
- spawn a replacement executor
- continue without restarting iteration numbering
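
Rebuilding from disk without resetting the counter might look like the sketch below. It assumes each line of `iterations.jsonl` is a JSON object with an integer `iteration` field, which is a guess at the schema; the function name and returned keys are hypothetical.

```python
import json
from pathlib import Path


def rebuild_context(task_root: Path, recent: int = 5) -> dict:
    """Reconstruct a replacement executor's briefing from the disk trail."""
    log = task_root / "iterations.jsonl"
    records = []
    if log.exists():
        with log.open(encoding="utf-8") as fh:
            records = [json.loads(line) for line in fh if line.strip()]
    # Continue numbering from the highest recorded iteration, never from 1.
    last = max((r.get("iteration", 0) for r in records), default=0)
    summary_path = task_root / "run-summary.md"
    summary = summary_path.read_text(encoding="utf-8") if summary_path.exists() else ""
    return {
        "next_iteration": last + 1,
        "recent_history": records[-recent:],
        "summary": summary,
    }
```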
If unattended mode loses its Codex thread:
- keep the same state file and append-only logs
- clear the stored thread id
- let the supervisor start a fresh Codex invocation
- recover from disk state, not from memory
If unattended mode hits a real read-only filesystem write failure:
- preserve the current task state
- disable stored thread resume
- recover on the next cycle from the disk trail only
Interactive quick start:
- Use `$strict-agent-loop`.
- Tell Codex the goal, stop condition, and evidence.
- Require durable round announcements and progress reporting.
Unattended quick start:
- Initialize a managed task with `scripts/init_state.py --workspace-root <repo> --task-id <task-id> --operating-mode unattended`.
- Define machine stop checks.
- Start `scripts/supervise.py` against `.codex-loop/tasks/<task-id>/state.json`.