cyberfabric/cyber-pilot

Version: 3.7.0-beta

Status: Active

Audience: Developers using AI coding tools, technical leads, engineering teams

Convention: 💬 = paste into AI coding tool chat. 🖥️ = run in terminal.

Overview

Cyber Pilot is a traceable delivery system for requirements, design, plans, and code.

Stable identifiers and references connect requirements, design, plans, and implementation so drift is surfaced early instead of being reconstructed ad hoc during review and delivery.

For teams already using an AI coding tool, Cyber Pilot provides the operating controls needed to keep requirements, design, plans, and code traceable, reviewable, and enforceable as artifacts and implementation change:

  • stable identifiers and cross-link validation to prove alignment across requirements, design, plans, and code
  • deterministic cpt validation to check structure, references, consistency, and traceability locally and in CI
  • templates, checklists, and staged workflows to gate generation, review, and validation through explicit stages with defined inputs, outputs, and checks

Jump to: Product shape | Fit and non-fit | Operating model | Traceability and validation model | Workflow model | Typical delivery sequence | Supported hosts | Evaluate Cyber Pilot | Installation and setup reference

Product shape

Authoritative delivery artifacts

  • Requirements and design artifacts become the approved, file-backed source of scope, intent, and constraints for downstream work.
  • Plans turn that approved intent into bounded execution shape before implementation sprawls across one long chat.
  • Checklists make review and validation expectations visible instead of leaving them implicit in chat or memory.
  • Implementation changes are reviewed against those approved artifacts rather than as isolated code diffs.

What Cyber Pilot adds to a repo

After cpt init and cpt generate-agents, Cyber Pilot typically adds a setup directory named cypilot/, generated AI coding tool integration files, and user-editable configuration under config/ inside that setup directory.

cypilot/ is the normal default for user projects. Self-hosted development in this repository uses .bootstrap/ as a repo-specific exception described in CONTRIBUTING.md.

This repo-installed control surface is how Cyber Pilot becomes operationally real inside a repository rather than staying a chat convention. It is also the first concrete proof surface most teams can inspect directly: what is generated, what remains user-editable, what is optional, and what deterministic validation can see.

  • Generated: AI coding tool integration files and repository wiring
  • User-editable: project configuration, rules, and any installed kit content meant for local use
  • Optional: installed kit content extends the base platform only when you want a more opinionated delivery model
  • Validator-visible: artifacts, plans, and configuration participate in deterministic cpt checks when those configured surfaces are in use

| Surface | Typical location | Ownership |
| --- | --- | --- |
| Setup directory | cypilot/ | Created by setup; contains both generated and user-editable material |
| Host integration files | .windsurf/, .cursor/, .claude/, .github/, .codex/, .agents/ | Generated by cpt generate-agents; regenerate when host integration changes |
| Project config | cypilot/config/ | User-editable and reviewable in the repo |
| Installed kit content | cypilot/config/kits/{slug}/ | User-editable local delivery surface for that kit |
| Self-hosted bootstrap copy in this repo only | .bootstrap/ | Contributor-only special case; not the normal user-project layout |
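
To see which of these surfaces exist in a given checkout, a short script can walk the table. This is a minimal sketch, assuming the default `cypilot/` setup directory; the function name `describe_surfaces` and the ownership labels are illustrative and are not part of Cyber Pilot itself.

```python
from pathlib import Path

# Locations come from the ownership table above; labels are shorthand for this sketch.
SURFACES = {
    "cypilot": "setup directory (generated and user-editable material)",
    "cypilot/config": "project config (user-editable)",
    "cypilot/config/kits": "installed kit content (user-editable)",
}
# Host integration directories listed in the table.
# Note: .github/ often exists for unrelated reasons, so its presence alone proves little.
for host_dir in (".windsurf", ".cursor", ".claude", ".github", ".codex", ".agents"):
    SURFACES[host_dir] = "host integration files (generated by cpt generate-agents)"


def describe_surfaces(repo_root):
    """Return {relative_path: ownership_label} for surfaces present under repo_root."""
    root = Path(repo_root)
    return {rel: label for rel, label in SURFACES.items() if (root / rel).is_dir()}
```

Calling `describe_surfaces(".")` from the repository root returns only the directories that are actually present, which makes it easy to compare a fresh setup against the table.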

Core platform and optional kits

Cyber Pilot has two main parts:

  • Core platform: the repository wiring, workflow routing, configuration surfaces, deterministic validation, and chat-facing skill that make the delivery model operational and repeatable
  • Kits: optional add-ons that specialize that same delivery model with domain-specific templates, rules, workflows, and validation material

Most teams should start with the core platform and add a kit later only if they want a ready-made delivery model for a specific domain or way of working. Kits extend the same underlying system rather than introducing a separate product shape.

How teams encounter Cyber Pilot

Teams usually interact with Cyber Pilot through four main surfaces in the repository and toolchain:

| Surface | Form | Role |
| --- | --- | --- |
| Primary AI surface | cypilot <workflow>: <request> | Main chat entry point for plan, generate, and analyze requests |
| Deterministic CLI | cpt <command> | Setup, validation, updates, and repeatable local or CI checks |
| Generated AI coding tool integration files | generated files in the repository | Connect the repository or workspace to supported tools without manual setup in each host |
| Optional kit content | installed kit content | Add domain-specific templates, rules, workflows, and validation material |

Fit and non-fit

Use Cyber Pilot if you already work with an AI coding tool and the cost of ambiguity, rework, or review failure is high enough to justify more structure and control.

Helps most when you are responsible for

  • Implementation work that needs to stay bounded, inspectable, and safer to continue across more than one step
  • Alignment across handoffs or review checkpoints so requirements, design, plans, and implementation do not drift apart
  • Review or delivery accountability when approved scope needs reviewable evidence, clearer status surfaces, or deterministic checks before merge

Good fit when

  • you have a multi-step, higher-risk, or review-sensitive change where the cost of ambiguity or rework is higher than the cost of added structure
  • the work needs bounded execution and reviewable alignment across approved inputs, implementation, and review, not just a quick diff
  • you are working in a brownfield or unfamiliar area and need understanding before editing, not just speed while editing
  • coordination, handoffs, or deterministic validation materially reduce risk before review or merge

Not the best fit when

  • the task is a tiny edit, throwaway spike, or open-ended exploration
  • the change is already well understood, low risk, and fast to make locally without added coordination or review structure
  • speed matters more than structure and the shape of the work is still unclear
  • you do not want artifact-backed process, staged review, or deterministic validation overhead even when the task is non-trivial
  • your team rejects workflow discipline or does not want to maintain the delivery surfaces that make review and validation easier

Where value appears first

  • a first bounded plan that makes a risky or unfamiliar change easier to execute in stages
  • a first inspectable understanding surface for an unfamiliar area before you start changing code
  • a first deterministic validation result or drift signal before review or merge instead of trusting one generation pass
  • a first reviewable linkage between approved inputs and the implementation under review

Operating model

System boundary and control model

Cyber Pilot is best understood as the workflow, context, and validation layer around your AI coding tool.

Four actors shape the operating model:

  • The AI coding tool provides the environment, chat interface, and model access.
  • The agent performs the reasoning and writing inside that environment.
  • Cyber Pilot governs the repo-attached workflow, configuration, and validation surface around the work.
  • The human decides approval, adequacy, risk acceptance, and whether the result is acceptable to merge or ship.

Cyber Pilot makes that repo-attached surface more explicit by controlling what context and rules are loaded, what structured artifacts or checkpoints the task is expected to use, and what deterministic checks can later be run with cpt. It does not supply the underlying intelligence of the model, and it does not decide whether the final implementation is correct, well-designed, or acceptable to merge.

  • Use the agent for

    • reasoning
    • writing
    • transformation
    • implementation judgment
  • Use Cyber Pilot for

    • workflow selection and task framing
    • task-matched context and rule loading
    • templates, rules, and checklists
    • governing the repo-attached workflow, configuration, and validation surface around the work
    • bounding larger tasks into more controllable execution steps

Deterministic vs non-deterministic boundary

For the same configured project surface and the same command or request shape, Cyber Pilot should make the same routing, context-loading, and check-execution decisions.

  • Deterministic

    • config and resource resolution
    • routing into workflows and specialized commands
    • loading the same configured context and rules for the same task shape
    • invoking the same checks against the same configured project surface
  • Non-deterministic

    • the agent's reasoning, writing quality, design quality, and implementation judgment
    • adequacy of the final solution
    • human approval, review, and merge decisions

This does not imply the same reasoning trace, implementation approach, code, or solution quality from run to run.

Cyber Pilot can constrain process, route work, and surface evidence repeatably, but it cannot guarantee implementation quality or replace human review.

  • What tradeoff does Cyber Pilot make?
    • more maintained artifacts, explicit checkpoints, and review surface in exchange for more control, auditability, and repeatability

For the full fit / non-fit guidance, practical anti-patterns, planning heuristics, and workflow-choice rules, use guides/USAGE-GUIDE.md.

Traceability and validation model

Cyber Pilot is strongest when the delivery surface is explicit and checkable.

The inspectable surface is the file-backed repository material a human can open, diff, review, and compare over time. The configured enforcement surface is the validator-visible subset of that material that the repository has explicitly chosen to subject to deterministic cpt checks.

Inspectable delivery surface

  • File-backed artifacts keep requirements, design, plans, and implementation visible as inspectable delivery inputs and outputs.
  • Stable identifiers and cross-links connect those artifacts through one shared traceability surface.
  • Templates, checklists, and file-backed plans create review surfaces that can be inspected, diffed, and repeated.
  • Validation and review outputs become visible evidence alongside the work products they refer to.
  • Drift signals become operationally visible through broken links, failed checks, and missing required structure instead of being reconstructed ad hoc later.

Configured enforcement surface

  • Not every inspectable artifact is automatically enforced; deterministic enforcement applies only to file-backed, validator-visible material the repository has configured cpt to check.
  • Enforceable means configured + validator-visible + deterministic rather than inferred from everything a human can see in the repository.
  • IDs, required links, document structure, plans, and stage completeness become enforceable when they are part of that configured validation surface.
  • The same configured surface can be checked locally and in CI so enforcement is repeatable instead of chat-dependent.

Evidence chain across a change

  • Requirement captures the approved scope.
  • Design records the intended structure, constraints, or boundary decisions.
  • Plan breaks the change into bounded execution steps.
  • Implementation provides traceable linked evidence back to that approved scope.
  • Validation result shows whether the configured structure, links, and review surfaces still hold.

The chain exists through explicit linked artifacts, stable identifiers or references, file-backed plans or checkpoints, and validation outputs tied to the configured surface. It helps surface drift and broken alignment operationally; it does not prove semantic equivalence between the requirement and the implementation.

What cpt enforces

These are the main deterministic conformance classes applied to that configured surface.

  • Artifact and document structure such as required shape, expected sections, and validator-visible files
  • Identifier and reference integrity across requirements, design, plans, code, and their cross-links
  • Required links and traceability rules that keep artifacts aligned through the same stable identifiers
  • TOC and document consistency where those checks are part of the configured validation surface
  • Plan, checklist, and stage completeness when those surfaces are file-backed and explicitly configured for checking
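
To make identifier and reference integrity concrete, the sketch below flags identifiers that are referenced but never defined. It is an illustration of the idea only, not how cpt is implemented: the `REQ-001` identifier shape, the `ID:` definition marker, and the function name are all hypothetical, and a real project's identifier rules come from its own configuration.

```python
import re

# Hypothetical identifier shape for illustration only (e.g. "REQ-001", "DES-014").
ID_PATTERN = re.compile(r"\b[A-Z]{3}-\d{3}\b")
# Hypothetical definition marker: a line of the form "ID: <identifier>".
DEF_PATTERN = re.compile(r"^ID:[ \t]*([A-Z]{3}-\d{3})[ \t]*$", re.MULTILINE)


def find_dangling_references(documents):
    """documents maps a file name to its text.

    Returns sorted (file_name, identifier) pairs for identifiers that are
    mentioned somewhere but defined nowhere.
    """
    defined = set()
    for text in documents.values():
        defined.update(DEF_PATTERN.findall(text))

    dangling = []
    for name, text in documents.items():
        for identifier in set(ID_PATTERN.findall(text)) - defined:
            dangling.append((name, identifier))
    return sorted(dangling)


docs = {
    "requirements.md": "ID: REQ-001\nUsers can reset their password.\n",
    "design.md": "ID: DES-001\nImplements REQ-001 and REQ-002.\n",
}
print(find_dangling_references(docs))  # [('design.md', 'REQ-002')]
```

The check is deterministic in the sense used above: the same files always produce the same list, whether it runs on a laptop or in CI. It says nothing about whether the design actually satisfies the requirement.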

What Cyber Pilot cannot prove

  • Behavioral correctness, absence of defects, and implementation quality remain non-deterministic and still require review.
  • Soundness of design decisions and adequacy of tests remain judgment-based even when the artifacts, structure, and links are checkable.
  • Business or product adequacy remains outside deterministic proof.
  • Human approval, merge, and ship decisions remain judgment-based even when the evidence surface is strong.

Cyber Pilot can surface missing, broken, stale, or inconsistent evidence without proving that the implementation is correct or adequate.

Workflow model

Cyber Pilot has three core workflows. Each has a portable chat form and, in some hosts, a matching slash-command alias.

| Workflow | Portable chat form | Matching alias in some hosts | Use it when |
| --- | --- | --- | --- |
| Plan | cypilot plan: ... | /cypilot-plan | the task is too large, risky, or context-heavy for one conversation |
| Generate | cypilot generate: ... | /cypilot-generate | you want to create, update, implement, or configure something |
| Analyze | cypilot analyze: ... | /cypilot-analyze | you want to validate, review, inspect, compare, or audit |

The portable cypilot <workflow>: ... form is the best default. Slash commands are host-specific aliases, not separate capabilities.

plan, generate, and analyze are reusable workflow modes, not a fixed mandatory sequence. They define how work is framed; the next section shows one common delivery order in which teams often combine them.
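
The portable form is regular enough to parse mechanically, which is part of what makes routing repeatable. The following is a minimal sketch of that idea, not Cyber Pilot's actual router: the function name `parse_request` is hypothetical, and the strict lowercase matching is an assumption made for the example.

```python
# The three core workflows named in the table above.
WORKFLOWS = ("plan", "generate", "analyze")


def parse_request(message):
    """Split 'cypilot <workflow>: <request>' into (workflow, request).

    Returns None when the message does not use the portable form.
    Assumption for this sketch: matching is exact and lowercase.
    """
    head, separator, request = message.partition(":")
    if not separator:
        return None  # no colon, so this is not a workflow request
    words = head.split()
    if len(words) != 2 or words[0] != "cypilot":
        return None
    workflow = words[1]
    if workflow not in WORKFLOWS:
        return None
    return workflow, request.strip()
```

Because the function has no hidden state, the same message always maps to the same workflow. A message such as `cypilot on` returns `None` here because it is an activation command rather than a workflow request.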

For default routing priorities and detailed workflow-choice advice, use guides/USAGE-GUIDE.md.

Typical delivery sequence

This is one common order for combining the workflows when an early idea or PoC needs to become a production-ready change without losing scope or design intent.

In practice, teams usually move through four visible stages:

  1. Approve the requirement and design so the change starts from explicit scope and constraints.
  2. Use plan to split larger work into bounded phases before execution sprawls across one long chat.
  3. Use generate within approved scope so implementation stays tied to the intended change.
  4. Use analyze and deterministic checks before merge so review sees both the implementation and its validation surface.

This sequence creates clearer boundaries, earlier drift detection, and more reliable review evidence than one long mixed-purpose chat.

Supported hosts

Cyber Pilot works across multiple AI coding tools through the same portable cypilot workflow model, but some hosts preserve its workflow boundaries more fully than others. The differences are mainly in orchestration control, workflow separation, subagent support, manual discipline burden, and first-run clarity.

| Host | Workflow support profile | Operational tradeoff |
| --- | --- | --- |
| Claude Code | Strongest starting point for the full Cyber Pilot workflow | Preserves workflow separation, subagent-assisted isolation, and separate generation/review passes with the least manual reconstruction |
| Cursor | Good editor-first support for everyday Cyber Pilot use | Portable workflows still work well, but orchestration boundaries and isolation are less explicit than in stronger workflow-oriented hosts |
| GitHub Copilot | Usable for structured GitHub-centered Cyber Pilot work | The same portable workflow model applies, but phase separation and task orchestration need more manual steering than in Claude Code |
| OpenAI Codex | Best for bounded, tightly scoped Cyber Pilot work | Works best when workflow boundaries are narrow and explicit; less natural for broader multi-stage delivery flow |
| Windsurf | Usable when you enforce workflow discipline manually | Portable workflows still apply, but weaker isolation means generation and review should stay in separate chats by convention |

If you are unsure where to start, Claude Code currently gives the clearest first experience for the full Cyber Pilot workflow because it best preserves workflow separation, orchestration control, and subagent-assisted isolation.

For host-specific setup guidance, deeper tradeoffs, and the full support matrix, use guides/AGENT-TOOLS.md.

Evaluate Cyber Pilot

Use this path if you are evaluating Cyber Pilot in a real repository and want one concrete result quickly.

Minimal evaluation path

  1. Pick one real repository and one narrow real input such as a requirement, design note, or change request that should produce a bounded, reviewable output.
  2. Complete the one-time setup for that repository using the installation and setup reference below so the repo is initialized and ready for Cyber Pilot.
  3. Activate Cyber Pilot in chat with 💬 cypilot on in the AI coding tool attached to that repository.
  4. Run one focused request with 💬 cypilot analyze: ... when you want an inspectable assessment of the input, or 💬 cypilot plan: ... when you want bounded execution steps before implementation.

Validation checkpoint

  • Run one deterministic check with 🖥️ cpt validate --local-only when you want to verify only the current repository, or 🖥️ cpt validate when cross-repo or workspace resolution is part of the trial.
  • Use that validation step as the proof surface of the trial; this is where Cyber Pilot shows that it produced deterministic, validator-visible signals instead of only conversational output.
  • Treat either a clean pass or an actionable failure as useful evidence; the most useful failures are localized, inspectable, and actionable rather than vague.
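
If you want to script this checkpoint, for example as a pre-merge step, a thin wrapper is enough. This is a sketch under stated assumptions: the helper names are hypothetical, and it assumes cpt reports failure through a non-zero exit code, which this README does not document. Only the two command forms shown above are used.

```python
import subprocess


def build_validate_command(local_only=True):
    """Return the argument list for the validation commands shown above."""
    command = ["cpt", "validate"]
    if local_only:
        command.append("--local-only")
    return command


def run_validation(local_only=True):
    """Run cpt and return (passed, combined_output).

    Assumption: a zero exit code means the configured checks passed.
    Raises FileNotFoundError if cpt is not installed or not on PATH.
    """
    result = subprocess.run(
        build_validate_command(local_only),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr
```

A caller would typically print the combined output and fail the surrounding job when `passed` is false, so that a clean pass and an actionable failure are both recorded as evidence.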

What success looks like

  • The output stays anchored to the real requirement, design note, or change request you started from.
  • One bounded and reviewable output appears such as a plan, inspectable summary, or validation surface you could act on immediately.
  • One deterministic validation signal appears as either a clean local pass or a concrete failure you can inspect and act on.
  • The next decision is clearer than it was before the trial, whether that means continue, narrow scope, or stop.

What to inspect after the trial

  • Scope anchoring: whether the output stayed tied to the real requirement, design note, or change request.
  • Reviewability: whether the resulting artifacts, plans, or validation outputs are easier to inspect than one long mixed-purpose chat.
  • Evidence quality: whether the outputs or failures are localized, inspectable, and usable by someone other than the original chat author.
  • Signal-to-effort: whether the trial produced enough useful signal to justify the setup and process overhead.
  • Trust signal: whether you would trust the resulting surface enough to continue with a larger change.

Jump to: Installation and setup reference | Configuration files | Extended operating modes | Project extensibility | Further reading

Installation and setup reference

Prerequisites

For a first trial, you need Python 3.11+, Git, and one supported AI coding tool such as Claude Code, Cursor, Windsurf, GitHub Copilot, or OpenAI Codex.

Python 3.11+ is the runtime for Cyber Pilot's repository-local scripts and CI, even when you do not install cpt globally yourself.

pipx is recommended when you want to install the cpt CLI globally and run it yourself. gh is optional for PR review and PR status workflows.
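
A quick way to confirm these prerequisites on a machine is sketched below. The function name and the split into required and optional tools are choices made for this example; the tool names themselves come from the text above. The AI coding tool is not checked because it cannot be detected reliably from a script.

```python
import shutil
import sys


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return (missing_required, missing_optional) as two lists of names.

    version_info and which are parameters so the check can be exercised
    without depending on the machine it runs on.
    """
    missing_required = []
    if tuple(version_info[:2]) < (3, 11):
        missing_required.append("Python 3.11+")
    if which("git") is None:
        missing_required.append("git")

    # pipx is recommended for a global cpt install; gh is optional for PR workflows.
    missing_optional = [tool for tool in ("pipx", "gh") if which(tool) is None]
    return missing_required, missing_optional
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and PATH. An empty first list means the required tools for a first trial are in place.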

Setup paths and commands

Choose the path that matches the repository state.

  • If the repository already includes Cyber Pilot

    • ensure Python 3.11+ is available for the repository-local scripts and CI
    • clone or open the repository in your supported AI coding tool
    • activate Cyber Pilot in chat with 💬 cypilot on
    • send one focused request with 💬 cypilot analyze: ... or 💬 cypilot plan: ...
  • If the repository does not yet include Cyber Pilot

    • install cpt globally if you want to bootstrap the repository yourself
    • run the one-time repository setup steps below
    • then activate Cyber Pilot in chat and send the first focused request

If you need to bootstrap the repository yourself, use this one-time path:

  1. Install the CLI

    🖥️ Terminal:

    pipx install git+https://github.com/cyberfabric/cyber-pilot.git
    cpt --version
  2. Initialize the repository

    🖥️ Terminal:

    cpt init
    cpt generate-agents

cpt init and cpt generate-agents are one-time repository bootstrap steps, not steps every downstream user must repeat.

This creates a default setup directory `cypilot/`, generated AI coding tool integration files, and user-editable configuration under `config/` inside that setup directory.

  3. Activate Cyber Pilot in the AI coding tool chat:

    cypilot on

  4. Run one focused request with 💬 cypilot analyze: ... or 💬 cypilot plan: ...

For detailed host-specific setup, troubleshooting, and operational walkthroughs, use guides/AGENT-TOOLS.md and guides/USAGE-GUIDE.md.

Configuration files

The main top-level user-editable configuration lives under config/ inside your Cyber Pilot setup directory. Other parts of the setup directory may contain generated or supporting material, and installed kits can add their own editable surfaces.

A quick ownership rule:

| Surface | Ownership |
| --- | --- |
| cypilot/config/ | User-editable control surface |
| cypilot/config/kits/{slug}/ | Editable installed-kit content |
| Host integration files such as .windsurf/, .cursor/, .claude/, .github/, .codex/, .agents/ | Generated by cpt generate-agents |
| .bootstrap/ | Self-hosted contributor-only context |

You do not need full configuration mastery immediately. Treat these as the main top-level control files you can inspect, review, edit, and version in the repository.

| File | What it controls |
| --- | --- |
| core.toml | Top-level project settings, installed kits, and workspace registration |
| artifacts.toml | The project's artifact model, codebase mappings, and traceability structure |
| AGENTS.md | Task navigation rules that tell the agent which files to load for each job |
| SKILL.md | Always-on project instructions that apply across requests |
| rules/*.md | Optional topic-specific rules the agent loads for relevant tasks |

For full configuration details, advanced surfaces, and editing patterns, see Configuration guide.

Extended operating modes

You do not need these on day one. Add them when your use case justifies the extra surface area.

Multi-repo workspaces

Cyber Pilot supports multi-repo workspaces, so related docs, code, and shared kit assets can live in separate repositories.

Use this when docs, code, or shared kit assets live in separate repos and still need to stay aligned.

Workspaces expand the reachable repository set. They do not replace project-level extensibility inside one repository.

For practical guidance, see guides/USAGE-GUIDE.md. For the full model and configuration rules, see requirements/workspace.md.

RalphEx delegation

RalphEx support is optional.

When available, Cyber Pilot can hand off selected execution work to RalphEx under human supervision.

Use this when you want supervised execution handoff for bounded tasks instead of keeping all work interactive inside the current AI coding tool.

For when to delegate and how human review fits, see guides/USAGE-GUIDE.md.

Project extensibility

Cyber Pilot supports project-level extensibility, not just installable kits.

Cyber Pilot can also load project-defined skills, subagents, workflows, and rules, so teams can extend behavior without packaging everything as a kit.

Project extensibility changes the behavior available inside one repository. Workspaces connect multiple repositories. Teams can use both together: keep cross-repo traceability through workspaces while extending the local project behavior through project-defined skills, workflows, and rules.

For the full model and examples, see guides/PROJECT-EXTENSIBILITY.md.

Further reading

Recommended reading path: README -> Usage guide -> Agent tools guide -> Configuration guide.


Feedback and issues

If you think a workflow is unclear, instructions behave incorrectly, a script behaves incorrectly, or important corner cases are missing, please open a GitHub issue.

The most useful issue reports usually include:

  • A short summary
  • Affected file, workflow, script, or exact command
  • Minimal reproduction steps
  • Expected vs actual behavior
  • Evidence such as exact command output, logs, validator output, screenshots, or a minimal prompt or plan slice
  • Environment details such as OS, AI coding tool, model, and Cyber Pilot version if known

Contributing

If you want to contribute, start with CONTRIBUTING.md.


License

Cyber Pilot is licensed under the Apache License 2.0. See LICENSE for details.