Japanese version: README.ja.md
If uncertain, stop. If risky, escalate.
Research / educational governance simulations for agentic workflows.
Maestro Orchestrator is a research-oriented orchestration framework for supervising agent workflows with fail-closed safety, HITL escalation, and audit-ready traceability.
This repository focuses on governance / mediation / negotiation-style simulations and implementation references for traceable, reproducible, safety-first orchestration.
It is designed to help inspect how orchestration layers should behave when a system encounters:
- uncertainty
- insufficient evidence
- relative / unstable judgments
- policy or ethics violations
- escalation conditions requiring human review
The repository is intentionally structured as a research / educational bench, not as a production autonomy framework.
Maestro Orchestrator is built around three priorities:
- **Fail-closed:** if uncertain, unstable, or risky, do not continue silently.
- **HITL escalation:** decisions requiring human judgment are explicitly escalated.
- **Traceability:** decision flows are reproducible and audit-ready through minimal ARL logs.
This repository is best read as a:
- research prototype
- educational reference
- governance / safety simulation bench
It is not a production autonomy framework.
This repository prioritizes fail-closed behavior.
If a workflow becomes uncertain, policy-violating, unstable, or insufficiently grounded, it should:
- STOP
- PAUSE_FOR_HITL
- or remain blocked until reviewed
The design goal is to avoid silent continuation under ambiguity.
- Uncertain → stop or escalate
- Risky → stop
- Human judgment required → HITL
- Sealed decisions remain sealed
- Unknown external side effects are denied by default
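The rules above can be sketched as a small decision function. This is a hypothetical illustration, not the repository's actual API; the names `Decision` and `decide` are assumptions, and the real emitted vocabulary is defined by the repository's log codebook:

```python
from enum import Enum

class Decision(Enum):
    STOP = "STOP"
    PAUSE_FOR_HITL = "PAUSE_FOR_HITL"
    ALLOW = "ALLOW"

def decide(uncertain: bool, risky: bool,
           needs_human: bool, sealed: bool) -> Decision:
    """Fail-closed: any flagged condition prevents silent continuation."""
    if sealed or risky:
        # Sealed decisions remain sealed; risk stops the run outright.
        return Decision.STOP
    if needs_human or uncertain:
        # Uncertainty escalates to a human rather than continuing.
        return Decision.PAUSE_FOR_HITL
    return Decision.ALLOW
```

Note that `ALLOW` is only reachable when every flag is clear, which is the fail-closed posture stated above.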
By default, the framework assumes a deny-by-default posture for actions that could affect the outside world, such as:
- network access
- filesystem writes
- shell / command execution
- messaging / email / DM
- account, billing, or purchase actions
- access to PII-bearing sources
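A deny-by-default gate for such actions can be sketched as an allow-list check, where anything not explicitly permitted is denied, including action kinds the gate has never seen. The action names here are hypothetical, not the repository's actual vocabulary:

```python
# Hypothetical allow-list: only side-effect-free, explicitly approved
# action kinds may proceed. Everything else, including unknown kinds,
# is denied by default.
ALLOWED_ACTIONS = {"read_local_file", "emit_log"}

def gate_action(action_kind: str) -> bool:
    """Return True only for explicitly allow-listed actions."""
    return action_kind in ALLOWED_ACTIONS
```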
This repository is primarily about control logic, mediation logic, and auditable simulation behavior, not unrestricted action execution.
This repository provides:
- fail-closed + HITL orchestration benches for governance-style workflows
- reproducible simulators with seeded runs and pytest-based contract checks
- audit-ready traces via minimal ARL logs
- reference implementations for orchestration / gating behavior
Typical themes in this repository include:
- orchestration
- mediation
- negotiation
- governance simulation
- escalation policy
- contract-style invariants
- replayability
- lightweight audit logs
v5.1.x is the recommended line for reproducibility and contract checks. v4.x is retained as a legacy stable bench.
Start with one simulator, confirm behavior and logs, then expand.
```shell
python mediation_emergency_contract_sim_v5_1_2.py
```

This is the recommended entry point if you want:
- reproducibility-oriented runs
- contract-style checks
- minimal audit output for inspection
- incident-oriented abnormal-run analysis
```shell
pytest -q
```

Look for:
- emitted `layer` / `decision` / `final_decider` / `reason_code`
- fail-closed stops
- HITL-required paths
- minimal ARL behavior
- reproducible seeded outcomes
```shell
python mediation_emergency_contract_sim_v4_1.py
```

Use the v4.x line if you want an older stable benchmark path for comparison.
If you are new to the repository, this order is the easiest:
1. `README.md`
2. `README.ja.md`
3. `mediation_emergency_contract_sim_v5_1_2.py`
4. `tests/`
5. `.github/workflows/python-app.yml`
6. `.github/workflows/tasukeru-analysis.yml`
Then branch out into older simulators and related governance / mediation experiments.
Below is the practical map of the repository.
- `mediation_emergency_contract_sim_v5_1_2.py`: recommended reproducible emergency-contract simulator
- `mediation_emergency_contract_sim_v5_0_1.py`: earlier v5 line
- `mediation_emergency_contract_sim_v4_1.py`: legacy stable bench
- `ai_doc_orchestrator_kage3_v1_2_4.py`: document-oriented orchestration / gating reference
- `ai_doc_orchestrator_kage3_v1_3_5.py`: expanded orchestration reference with benchmark-related helpers
- `loop_policy_stage3.py`: stage-3 loop policy and HITL / stop logic
- `tests/`: contract tests, regression tests, orchestration behavior checks
- `benchmarks/`: benchmark-oriented tests and negotiation-pattern checks
- `docs/`: supporting documentation and diagrams
- `archive/`: archived experiments and older artifacts
- `.github/workflows/`: CI and analysis workflow definitions
- `README.ja.md`: Japanese README
- `LICENSE`: license file
- `requirements.txt`: Python dependencies
- `pytest.ini`: pytest configuration
- `log_codebook_v5_1_demo_1.json`: demo codebook for emitted vocabulary / logging consistency
- `log_format.md`: log-related documentation
Recommended when you want:
- stronger reproducibility
- contract-style vocabulary checks
- minimal ARL / abnormal-run trace handling
- benchmark-oriented inspection
Earlier v5 line. Useful if you want to compare design evolution.
Legacy stable benchmark line. Good for:
- simpler baseline comparison
- historical progression
- compatibility checks with older tests or notes
The repository also contains multiple experimental or thematic simulators related to:
- governance mediation
- alliance / persuasion dynamics
- hierarchy dynamics
- reeducation / social dynamics
- all-in-one mediation experiments
These are useful as reference material, but the recommended starting point remains v5.1.2.
A central design goal is audit-ready behavior without overcomplicating the log surface.
The repository uses lightweight audit patterns such as:
- explicit `decision`
- explicit `reason_code`
- explicit `final_decider`
- sealed vs non-sealed control paths
- reproducible seeded runs
- testable emitted vocabularies
In practical terms, the logs are meant to answer:
- what was blocked
- where it was blocked
- why it was blocked
- whether human intervention was required
- whether the outcome can be reproduced
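A minimal audit record answering those questions might look like the following sketch. The field names `layer`, `decision`, `reason_code`, and `final_decider` mirror the vocabulary mentioned in this README; the concrete values and the `seed` field are illustrative assumptions, not the repository's actual log schema:

```python
import json

# Hypothetical minimal audit record (one line per decision event).
record = {
    "layer": "mediation_gate",           # where it was blocked
    "decision": "STOP",                  # what was blocked
    "reason_code": "POLICY_VIOLATION",   # why it was blocked
    "final_decider": "orchestrator",     # or "human" when HITL resolved it
    "seed": 42,                          # assumed field: enables replay
}

# Serialize deterministically so traces compare cleanly across runs.
line = json.dumps(record, sort_keys=True)
```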
The repository treats HITL as a first-class control path, not as an afterthought.
Typical behavior:
- uncertain but non-sealed conditions → `PAUSE_FOR_HITL`
- user continuation may allow progress in allowed cases
- sealed safety outcomes remain non-overrideable
- important judgment calls are surfaced explicitly
This makes the orchestration model easier to inspect, test, and replay.
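Those continuation semantics can be sketched as a single resume rule. This is an assumed illustration of the behavior described above, not code from the repository; the function name and string values are hypothetical:

```python
# Hypothetical HITL continuation rule: a human may resume a paused,
# non-sealed run, but sealed safety outcomes cannot be overridden.
def resume_after_hitl(decision: str, sealed: bool, human_approved: bool) -> str:
    if sealed:
        return "STOP"      # sealed outcomes are non-overrideable
    if decision == "PAUSE_FOR_HITL" and human_approved:
        return "CONTINUE"  # explicit human approval allows progress
    return "STOP"          # fail-closed default
```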
Reproducibility matters throughout the repository.
Common patterns include:
- deterministic seeds
- fixed emitted vocabularies
- contract-style assertions in tests
- explicit abnormal-run inspection
- stable decision categories
The intent is not just to “run a simulation,” but to make its control behavior observable and comparable across runs.
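The deterministic-seed pattern is the simplest of these to demonstrate. The sketch below is a generic illustration (the function name and step count are assumptions), showing how a locally seeded RNG makes a run's decision trace reproducible:

```python
import random

def run_simulation(seed: int, steps: int = 5) -> list:
    """Seeded run: the same seed always yields the same decision trace."""
    # A local Random instance avoids leaking state through the global RNG,
    # so runs stay comparable regardless of what else has executed.
    rng = random.Random(seed)
    return [rng.randrange(3) for _ in range(steps)]
```

Two runs with the same seed produce identical traces, which is what lets contract tests assert on control behavior rather than on chance.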
The repository uses pytest-based checks to validate orchestration behavior.
Typical checks include:
- emitted vocabulary consistency
- gate invariants
- fail-closed behavior
- HITL continuation / stop semantics
- benchmark output structure
- regression behavior for known scenarios
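An emitted-vocabulary check of the kind listed above might look like this pytest-style sketch. The vocabulary set and trace are stand-ins, not the repository's actual codebook:

```python
# Hypothetical contract-style check: emitted decisions must stay within
# a fixed, closed vocabulary so logs remain comparable across runs.
EMITTED_VOCAB = {"ALLOW", "STOP", "PAUSE_FOR_HITL"}

def test_emitted_vocabulary_is_closed():
    emitted = ["STOP", "PAUSE_FOR_HITL"]  # stand-in for a real run's trace
    assert set(emitted) <= EMITTED_VOCAB, "unknown decision emitted"
```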
Run all tests with:
```shell
pytest -q
```

Run a focused subset if needed:

```shell
pytest tests/test_benchmark_profiles_v1_0.py -q
```

The repository includes CI and analysis workflows under `.github/workflows/`.
These workflows are used to validate:
- Python test execution
- YAML validity
- static analysis
- repository hygiene
- security-oriented reporting
The two primary badges in this README correspond to:
- Python App CI
- Tasukeru Analysis
This repository is most useful when you want to answer questions like:
- How should an orchestrator behave under uncertainty?
- When should a system stop instead of rerouting?
- What should be escalated to HITL?
- How can decision paths remain inspectable and reproducible?
- How can orchestration rules be tested like contracts?
It is less about maximizing autonomy, and more about making orchestration behavior governable.
This repository is not intended to be:
- a production agent platform
- a general-purpose autonomous execution engine
- a fail-open multi-tool runtime
- a “keep going no matter what” orchestration layer
The emphasis is on controlled behavior, not maximum autonomy.
This repository is provided for research and educational purposes.
It is intended to demonstrate:
- orchestration control patterns
- mediation / governance simulation structures
- fail-closed guardrails
- audit / replay-oriented design
- HITL escalation semantics
It is not a promise of production readiness, completeness, or universal policy coverage.
See LICENSE.
- English README: `README.md`
- Japanese README: `README.ja.md`
Maestro Orchestrator is a safety-first orchestration framework for studying how agent workflows should behave when they encounter uncertainty, risk, or human-judgment boundaries.
Its core stance is simple:
If uncertain, stop. If risky, escalate.