Coordination Testing Strategy

Testing strategy for the shard coordination protocol and its simulation infrastructure. Four tiers target distinct quality dimensions: isolation, invariant interaction, workflow correctness, and large-scale randomized validation.

Split-replace planner construction (SplitReplacePlan, plan_split_replace*) is contracts-owned in gossip-contracts::coordination::split; coordination tests validate backend execution semantics using those shared planner surfaces.

For protocol details see boundary-2-coordination.md. For simulation architecture see simulation-harness.md.

A family-oriented execution seam in crates/gossip-scanner-runtime/ and gossip_contracts::connector bridges coordination and scan execution. Coordination tests continue to validate protocol invariants independently of scan execution internals, while runtime integration tests exercise the same family entrypoints used by the CLI and worker binaries.

The etcd scaffold in crates/gossip-coordination-etcd/ adds backend-specific tests alongside the shared protocol suite: configuration/keyspace/codec tests stay local and deterministic, while ignored local-etcd integration coverage auto-provisions quay.io/coreos/etcd:v3.5.15 via testcontainers (or reuses ETCD_ENDPOINTS) and verifies EtcdCoordinator::connect() / AsyncEtcdCoordinator::connect() + status(), persisted run creation/registration, acquire/checkpoint/renew, terminal shard transitions (complete, park_shard), unpark, split round trips, shard-limit rejection paths, derived-ID collision handling, and deterministic split-atomicity aborts via test-only owner-drop fault injection. A separate #[ignore] test (lease_ttl_expiry_full_cycle) exercises wall-clock lease TTL expiry against live etcd: it verifies that the owner binding is auto-deleted while the shard record survives, that a second worker can reacquire with an advanced fence, and that the original worker's stale lease is rejected for both renew and checkpoint. The binary codec (codec.rs) has dedicated round-trip fuzz targets and proptest coverage for ShardRecord serialization in codec_tests.rs. Deterministic simulator tests cover the in-process etcd path via SimulatedEtcdKV and SimEtcdCoordinator, including compare-miss contention, uncertain commits, lease-expiry races, and retry-exhaustion bursts. The ignored local-etcd tests remain the coverage layer for live gRPC transport and real-member integration. A dedicated behavioral_conformance.rs module mirrors the shared coordination conformance scenarios against the real EtcdCoordinator, using only public CoordinationBackend / RunManagement operations plus persisted read-back helpers (test_load_shard_snapshot, test_load_run_snapshot) instead of simulation-only introspection.

Allocation Policy Scope (Tiered)

Coordination follows a HOT/WARM/COLD allocation policy:

HOT paths keep allocation-silent behavior where practical (acquire_and_restore_into, checkpoint, complete, claim internals).
WARM query/admin paths prioritize simpler contracts and may grow caller buffers (list_shards_into, collect_claim_candidates_into).
COLD lifecycle/setup paths optimize for clarity and atomicity over preallocation plumbing (register_shards preflight + staged rollback).

Simulation/test-support code (crates/gossip-coordination/src/sim/) remains free to allocate for readability, workload generation, and diagnostics.

Feature-aware enforcement remains in place for production-facing signatures:

#[cfg(not(feature = "test-support"))] compile-time guards in crates/gossip-coordination/src/conformance_tests.rs preserve borrowed/scratch hot-path forms.
The same guards are intentionally absent with test-support enabled, so simulation infrastructure can evolve without hot-path false positives.

1. Testing Pyramid

                                                ┌─────────────────────┐
                    ┌──────────────┐            │  TLA+ Model Check   │
                    │  Simulation  │  Tier 4    │  (specs/)           │
                    │  (sim/)      │            │                     │
                    ├──────────────┤            │  Exhaustive design  │
                    │  Scenario    │  Tier 3    │  verification of    │
                    │              │            │  protocol safety    │
                ┌───┴──────────────┴───┐       │  and liveness       │
                │    Conformance       │ T2    │  properties.        │
                │                      │       │                     │
            ┌───┴──────────────────────┴───┐   │  Parallel layer,    │
            │         Unit Tests           │ T1│  not a pyramid tier │
            │                              │   │  (verifies design,  │
            └──────────────────────────────┘   │  not Rust code).    │
                                                └─────────────────────┘

Each tier builds on the one below it. Unit tests validate individual operations against the InMemoryCoordinator. Conformance tests compose two or more invariants and verify they hold simultaneously. Scenario tests chain operations into realistic workflows. Simulation tests sweep hundreds of random seeds under fault injection, validating all nine safety invariants at every step. TLA+ model checking (Section 11) operates as a parallel verification layer that exhaustively verifies the protocol design.

2. Tier 1 — Unit Tests

File: crates/gossip-coordination/src/in_memory_tests.rs Declared in: crates/gossip-coordination/src/in_memory.rs (via #[path = "in_memory_tests.rs"] attribute)

Single-operation isolation tests against InMemoryCoordinator. Each test exercises one backend operation or one edge case, with deterministic logical time advanced via now(t).

Coverage areas

Area	Example tests
Acquire	`acquire_basic`, `acquire_not_found`, `acquire_already_leased`, `acquire_after_lease_expiry`, `acquire_terminal_rejected`
Renew	`renew_basic`, `renew_stale_fence`
Checkpoint	`checkpoint_basic`, `checkpoint_op_id_conflict`
Complete	`complete_basic`, `complete_replay_after_terminal`
Park	`park_basic`, `park_replay_after_terminal`
Split (replace)	`split_replace_basic`, `split_replace_replay`, `split_replace_child_id_determinism`
Split (residual)	`split_residual_basic`, `split_residual_cursor_out_of_bounds`, `split_residual_replay_via_spawned_after_eviction`
Fencing	`only_latest_fence_holder_can_mutate`
Op-log eviction	`op_log_eviction_treats_old_op_as_new`
Tenant isolation	`acquire_wrong_tenant_returns_not_found`, `checkpoint_wrong_tenant_returns_not_found`, etc.
Shard count limits	`split_replace_exceeds_per_tenant_limit`, `split_residual_exceeds_global_limit`, `register_shards_exceeds_per_tenant_limit`
Run lifecycle	`terminal_op_happy_path`, `cancel_run_from_initializing`, `terminal_op_idempotent_replay`, `complete_run_not_found`
Unpark	`unpark_shard_happy_path`, `unpark_shard_fence_epoch_bumped`, `unpark_shard_cursor_preserved`, `unpark_shard_rejected_when_run_terminal`
List shards	`list_shards_filter_active`, `list_shards_filter_available`, `list_shards_filter_parked`, `list_shards_filter_root_only`
Integration	`full_lifecycle_acquire_checkpoint_split_residual_complete`, `full_run_lifecycle_create_register_process_complete`

The module also includes property-based tests (proptest) that fuzz random operation sequences against the coordinator and assert structural invariants (record consistency, fence monotonicity, cursor monotonicity, idempotent replay) after every step.

All tests use shared fixtures from test_fixtures.rs (see Section 5).

3. Tier 2 — Conformance Tests

Harness: crates/gossip-coordination/src/conformance.rs In-memory binding: crates/gossip-coordination/src/conformance_tests.rs Declared in: lib.rs (#[cfg(any(test, feature = "test-support"))] pub mod conformance;, #[cfg(test)] mod conformance_tests;)

Invariant-interaction tests. Each test composes two or more of the protocol's safety invariants and verifies they hold simultaneously. The focus is on invariant combinations — these are the tests most likely to catch regressions where a fix for one invariant violates another. conformance.rs owns the reusable backend-agnostic harness, while conformance_tests.rs keeps the in-memory backend entrypoint and the production-only compile-time API guards.

Group A: Cross-Cutting Invariant Interactions

Test	Invariants composed
`fence_monotonicity_across_full_lifecycle`	Fence monotonicity + lease expiry enables re-acquisition + checkpoint/complete do not mutate fence
`cursor_monotonicity_combined_with_split_residual`	Cursor bounds + split-residual spec narrowing + cursor preservation across splits
`idempotency_before_terminal_state_rejection`	OpId idempotency + terminal irreversibility + their priority ordering
`owner_divergence_with_matching_fence`	Lease validation (fence + owner identity) + tenant isolation
`terminal_clears_lease`	Terminal transitions clear leases + correct terminal status per operation (complete, park, split_replace)

Group B: Supplementary Coverage Tests

Edge cases with zero or minimal coverage elsewhere.

Test	Gap filled
`cursor_semantics_dispatched_through_coordinator`	`CursorSemantics::Dispatched` propagation through acquire, checkpoint, complete
`lease_deadline_at_exact_boundary`	Half-open lease interval: `now < deadline` active, `now == deadline` expired
`split_coverage_key_range_partition`	Split-replace children's key ranges form a contiguous, gap-free partition
`oplog_eviction_then_replay`	Op-log eviction interaction with idempotency on terminal shards
`unpark_lifecycle_fence_and_cursor_preserved`	Full park/unpark round-trip preserves cursor, bumps fence, allows re-acquire
`same_worker_reacquire_bumps_fence`	Same-worker reacquire bumps fence, old lease rejected with `StaleFence`

Group C: Run-Level Conformance

Run lifecycle state machine tests.

Test	Property
`run_terminal_irreversibility`	Done run rejects complete_run, fail_run, and cancel_run
`register_shards_on_non_initializing_rejected`	Registration requires Initializing status
`run_completed_at_consistency`	`completed_at` is `Some` iff run is terminal
`unpark_after_run_terminal_rejected`	Terminal run blocks shard-level unpark (`RunTerminal`)

Compile-time guards

const _: () = assert!(ShardRecord::OP_LOG_CAP == 16);
const _: () = assert!(MAX_SPAWNED_PER_SHARD == 1024);
const _: () = assert!(LEASE_DURATION == 100);

These prevent silent constant changes from invalidating test assumptions. If OP_LOG_CAP, MAX_SPAWNED_PER_SHARD, or LEASE_DURATION change, the conformance tests fail at compile time.

4. Tier 3 — Scenario Tests

File: crates/gossip-coordination/src/scenario_tests.rs Declared in: lib.rs (#[cfg(test)])

Multi-step workflow tests that exercise realistic end-to-end stories. Each scenario chains operations in production-realistic order, including state restoration across ownership transfers.

ID	Scenario	Core property
S1	Full run lifecycle	Baseline happy path: create_run → register → acquire → checkpoint×3 → complete (shard) → complete_run
S2	Lease expiry + reacquire	Cursor restoration across ownership transfer; stale writer fenced with `StaleFence`
S3	Split-replace + children	Parent becomes `Split`; children are independently acquirable and completable by different workers
S4	Double residual split chain	Iterative parent narrowing `[a,z)→[a,m)→[a,j)` with spawned list tracking; all three pieces complete
S5	Op-log eviction boundary	Evicted `OpId` treated as new (`Executed`); surviving `OpId` with wrong payload returns `OpIdConflict`
S6	Cancel from Initializing	Early run termination blocks registration; idempotent replay; terminal irreversibility
S7	Worker self-recovery	Same worker recovers from own lease expiry with cursor restoration and fence bump
S8	Claim contention	3 workers contend for 2 shards via `claim_next_available`; exactly 2 succeed, 1 gets `NoneAvailable`

5. Test Infrastructure

Test Fixtures (`crates/gossip-coordination/src/test_fixtures.rs`)

Shared factory functions consumed by all three coordination test modules.

Function	Returns
`test_tenant()`	`TenantId` from fixed bytes
`test_run()`	`RunId::from_raw(1)`
`test_shard()`	`ShardId::from_raw(10)`
`test_spec()`	`ShardSpec` with range `[a, z)`
`test_worker(id)`	`WorkerId::from_raw(id)`
`now(t)`	`LogicalTime::from_raw(t)`
`test_key()`	`ShardKey::new(test_run(), test_shard())`
`test_cursor(key)`	`CursorUpdate::new(key)`
`seeded_coordinator()`	`InMemoryCoordinator` with one run containing one shard `[a, z)`, `CursorSemantics::Completed`, already in Active status
`seeded_coordinator_with_semantics(s)`	Same as above but with the specified `CursorSemantics`
`test_split_replace_plan()`	Canonical `[a,m) + [m,z)` `SplitReplacePlan`
`test_split_residual_plan()`	Canonical `[a,m)` parent + `[m,z)` residual `SplitResidualPlan`
`acquire_shard(coord, t, worker_id)`	Shorthand for `acquire_and_restore`; returns the `Lease`
`checkpoint_ok(coord, t, lease, cursor_key, op_id)`	Fire-and-forget `checkpoint` using `test_tenant()`
`complete_ok(coord, t, lease, cursor_key, op_id)`	Fire-and-forget `complete` using `test_tenant()`
`park_ok(coord, t, lease, reason, op_id)`	Fire-and-forget `park_shard` using `test_tenant()`
`LEASE_DURATION`	`100` logical time ticks

seeded_coordinator() calls create_run and register_shards during construction, so the run is Active and the shard is ready for acquire. The register-shards op uses OpId::MAX to avoid collisions with test ops.

Tiger Style Assertions

Every conformance test asserts both the expected outcome and the absence of unexpected side effects. This pattern is called "Tiger Style" (after TigerBeetle's testing philosophy).

Example from fence_monotonicity_across_full_lifecycle:

// After checkpoint: verify cursor advanced AND fence did not change.
let rec = coord.shard_lookup(&test_tenant(), &test_key()).unwrap();
assert_eq!(rec.fence_epoch, f1, "checkpoint must not change fence");
assert_eq!(rec.status, ShardStatus::Active);

The positive assertion (fence_epoch == f1) proves the expected behavior. The status check proves no unexpected side effect (e.g., an accidental terminal transition). This catches regressions where satisfying one invariant accidentally violates another.

6. Tier 4 — Simulation Tests

Primary files: crates/gossip-coordination/src/sim/sim_behavioral_tests.rs, crates/gossip-coordination/src/sim/mega_sim_tests.rs, crates/gossip-coordination/src/sim/overload_tests.rs, crates/gossip-coordination/src/sim/proptest_state_machine_tests.rs, crates/gossip-coordination/src/sim/composition_tests.rs Declared in: crates/gossip-coordination/src/sim/mod.rs (#[cfg(test)])

Eight sub-tiers exercise the full simulation harness.

Test-support builds also include a composition harness for the coordination + done-ledger boundary. It reuses the same SimContext determinism model but adds a synthetic scan driver and cross-component provenance log on top of the core coordination harness.

Overload Scenario Tests (`crates/gossip-coordination/src/sim/overload_tests.rs`)

Targeted tests for the scripted overload path (run_overload). Fixed-seed tests cover each OverloadKind under different fault levels, a deterministic replay test verifies seed stability, a D1 accuracy test validates availability reporting, and a proptest sweeps random seeds/kinds/rounds for safety.

Test	Config	Assertions
`test_overload_burst_claim_sunny`	seed=42, SunnyDay, 10 rounds	No violations, L1 liveness
`test_overload_capacity_drop_stormy`	seed=77, Stormy, 8 rounds	No violations
`test_overload_burst_shards_radioactive` (`#[ignore]`)	seed=9, Radioactive, 12 rounds	No violations
`test_overload_deterministic_replay`	seed=123, Stormy, 12 rounds	Field-identical reports
`test_d1_accuracy_sunny`	seed=7, SunnyDay, 10 rounds	D1 reported == ground truth
`test_overload_zero_rounds_warmup_recovery_only`	seed=1, SunnyDay, 0 rounds	No violations, goodput == 0.0
`proptest_overload_safety`	50 cases, Stormy, 0-6 rounds	No violations

Behavioral Regression Tests (`sim_behavioral_tests.rs`)

Fixed-seed behavioral tests that pin safety properties without depending on exact PRNG-stream counts. A legitimate harness change that reorders random calls shifts counts but must not break behavioral assertions.

Test	Config	Assertions
`behavioral_seed_42_stormy`	seed=42, Stormy, 3 workers, 5 shards, 700 ops	No violations, converged, event coverage
`behavioral_seed_99_sunny`	seed=99, SunnyDay, 2 workers, 3 shards, 300 ops	No violations, converged, event coverage
`behavioral_seed_7_radioactive`	seed=7, Radioactive, 4 workers, 8 shards, 1500 ops	No violations (convergence not asserted under aggressive faults)
`deterministic_replay_cross_config`	Runs each config twice	Field-identical reports (`event_counts`, `ops_executed`, `end_time`)
`behavioral_cooldown_fires_under_simulation`	seed=42, SunnyDay, 8 workers, 20 shards, cooldown=99, 2000 ops	Cooldown throttling fires, `ClaimThrottled` event present

A compile-time const match block provides exhaustiveness enforcement: if a variant is added to SimEventKind without updating the match, the build fails. This replaces the former all_event_kinds_enumerated runtime test, which was strictly weaker.

Event coverage checks are fault-level-dependent: under SunnyDay with a small op budget, WorkerPaused/WorkerResumed may or may not appear depending on the PRNG sequence, so those are only required under Stormy/Radioactive where higher op counts make them reliable.

Mega Simulation Tests (`mega_sim_tests.rs`)

Thread-parallel seed sweep and proptest-based sweeper. Both are #[ignore] because they are too slow for the default cargo test cycle.

mega_sim_10k_steps — Divides seeds across available_parallelism() OS threads. Each seed runs 4 workers contending over 15 shards through 10K safety ops + 2K liveness ops. Assertions: zero invariant violations across all seeds; aggregate event-kind coverage includes the five core kinds. Failures include reproduction commands.

proptest_mega_sim — Delegates seed generation to proptest (100 cases by default), gaining automatic shrinking and .proptest-regressions file persistence. Same simulation config as mega_sim_10k_steps.

Stress Tests (`mega_sim_tests.rs`)

Scale-sensitive tests that push configurations well beyond the normal test suite. Both are #[ignore].

stress_200_shards_stormy — 8 workers, 200 shards, Stormy, 5 seeds, 5K safety + 2K liveness ops. At ~13x the shard count of the mega sweep, this test exercises two scale-sensitive paths: InvariantChecker pruning overhead as many shards reach terminal states, and worker contention with 8 workers competing for 200 shards.

stress_split_cascade — 4 workers, 20 shards, Radioactive, 10 seeds, 3K safety + 1K liveness ops. Radioactive mode's aggressive time-jumps and high lease-expiry rate create conditions for multi-level split cascades. Asserts at least one SplitReplaceOk event fires across all seeds, catching regressions in op-generation weights or split-plan construction. Invariants S1–S9 are validated even as the shard set grows dynamically from splits.

Convergence Proptests (`mega_sim_tests.rs`)

Bounded-liveness tests that complement the mega sweep's safety focus. The Alpern-Schneider decomposition (1985) states that every correctness property is the intersection of a safety property and a liveness property. The mega sweep covers safety (S1–S9); these tests cover the complementary liveness half — that every shard eventually reaches a terminal state. Both are #[ignore].

proptest_convergence_sunny — SunnyDay, 200 cases, 50 safety + 2000 liveness ops. Asserts both safety (no violations) and liveness (all shards terminal). Zero faults, but splits during the safety phase grow the shard count from the initial 15 to ~20-25, requiring proportionally more liveness ops for reliable acquire→complete cycles.

proptest_convergence_stormy — Stormy, 200 cases, 500 safety + 15000 liveness ops. Same dual assertion under ~10% fault pressure. Faults cause lease expiry mid-operation, forcing re-acquisition cycles that consume extra ops. The ~7.5× raw budget over SunnyDay compensates for the ~10% rejection rate and larger active shard set from the longer safety phase.

Multi-Tenant Isolation (`mega_sim_tests.rs`)

multi_tenant_isolation — Two tenants sharing one InMemoryCoordinator instance. This is the only test that exercises the cross-tenant rejection path (tenant-scoped lookup returning ShardNotFound) and tenant-scoped checker history. Runs in the default cargo test cycle (not #[ignore]).

The test verifies three properties:

Cross-tenant rejection: tenant B cannot acquire tenant A's shards (tenant-scoped lookup returns ShardNotFound).
Independent lifecycles: each tenant's shards progress through acquire → checkpoint → terminal independently, exercising both terminal paths (Complete for tenant A, Park for tenant B).
Checker isolation: the InvariantChecker uses (TenantId, RunId, ShardId) history keys, so tenant A's fence epoch history does not contaminate tenant B's S2 checks.

Regression Guard

zero_seed_count_does_not_panic — Protects against a chunks(0) panic when GOSSIP_SIM_SEEDS=0. The mega sweep's div_ceil and chunks() calls would panic unconditionally without the early-return guard; this test mirrors the arithmetic path to verify the guard remains effective.

Proptest State Machine Tests (`proptest_state_machine_tests.rs`)

Proptest-driven operation sequence tests that generate Vec<SimOp> via stateless strategies and feed them through CoordinationSim::step(). Unlike the mega-sim tests (which shrink only the seed), these tests shrink the operation sequence itself — when a bug is found, proptest minimizes to the smallest failing sequence of coordinator operations.

Operations are generated without state tracking. The harness handles rejected/skipped ops gracefully, and the invariant checker (S1–S9) runs after every step regardless of outcome. Expected rejection rate is ~45-55% for held-shard ops; rejections verify the coordinator's rejection path preserves safety.

Test	Config	Assertions
`prop_short_sequences_preserve_invariants`	SunnyDay, 256 cases, 5-50 ops	No violations (CI gate)
`prop_safety_under_random_ops` (`#[ignore]`)	Stormy, 200 cases, 5-200 ops	No violations
`prop_safety_across_fault_levels` (`#[ignore]`)	All levels, 100 cases, 5-200 ops	No violations

# CI-friendly (not ignored)
cargo test -p gossip-coordination prop_short_sequences_preserve_invariants

# Deep tests (ignored)
cargo test -p gossip-coordination proptest_state_machine -- --ignored --nocapture

For simulation architecture, the invariant table (S1–S9), determinism model, fault injection levels, and three-stage run model, see simulation-harness.md.

7. Running Tests

Quick test cycles

# All coordination protocol tests (unit + conformance + scenario)
cargo test -p gossip-coordination

# Data model tests (shard_spec, cursor, pooled)
cargo test -p gossip-contracts --lib coordination

# Individual tiers (module pattern matching)
cargo test -p gossip-coordination in_memory_tests
cargo test -p gossip-coordination conformance_tests
cargo test -p gossip-coordination scenario_tests

Overload tests

# Overload scenario tests (fast, runs in default cargo test)
cargo test -p gossip-coordination overload

# Including the slow radioactive test
cargo test -p gossip-coordination overload -- --ignored --nocapture

Simulation tests

# Behavioral regression tests (fast, runs in default cargo test)
cargo test --all-features -p gossip-coordination -- sim

# Mega simulation (slow, #[ignore])
cargo test -p gossip-coordination mega_sim -- --ignored --nocapture

# Convergence proptests (slow, #[ignore])
cargo test -p gossip-coordination proptest_convergence -- --ignored --nocapture

# Stress tests (slow, #[ignore])
cargo test -p gossip-coordination stress_ -- --ignored --nocapture

# Multi-tenant isolation (fast, runs in default cargo test)
cargo test -p gossip-coordination multi_tenant -- --nocapture

Environment variables

Variable	Purpose	Default
`GOSSIP_SIM_SEEDS`	Number of seeds in the parallel sweep	100
`GOSSIP_SIM_SEED`	Single seed for failure reproduction (bypasses sweep)	—
`GOSSIP_SIM_FAULT`	Fault level: `sunny`, `stormy`, `radioactive`	`stormy`

Reproducing failures

The mega sim prints reproduction commands for every failing seed:

GOSSIP_SIM_SEED=<seed> cargo test -p gossip-coordination mega_sim -- --ignored --nocapture

Or programmatically:

let report = CoordinationSim::new(FAILING_SEED, FaultLevel::Stormy)
    .with_workers_and_shards(4, 15)
    .run(10_000, 2_000);
assert!(report.violations.is_empty());

The simulation is fully deterministic: same seed, same fault level, same worker/shard counts, same op counts produce identical results on any platform.

8. Choosing the Right Tier

Signal	Tier
Testing a single backend operation in isolation	Tier 1 — Unit
Two or more invariants interact and must hold simultaneously	Tier 2 — Conformance
Multi-step workflow or end-to-end user story	Tier 3 — Scenario
Randomized large-scale validation or fault injection	Tier 4 — Simulation

When adding a new protocol feature:

Start with Tier 1 tests for each new operation or error path.
If the feature interacts with existing invariants, add Tier 2 tests that compose the relevant invariants.
If the feature changes user-visible workflow behavior, add a Tier 3 scenario.
Run the existing simulation suite to verify no regressions. If the feature adds new randomized behavior (e.g., a new fault type), update the simulation harness and add behavioral assertions.

9. Invariants Under Test

The simulation validates nine safety properties at every step. Unit, conformance, and scenario tests exercise these implicitly via ShardRecord::assert_invariants() (called on every mutation path), while the simulation validates them explicitly via InvariantChecker::check_all().

Label	Name	Rule
S1	MutualExclusion	At most one worker holds a non-expired lease per shard
S2	FenceMonotonicity	`fence_epoch` never decreases for a given `(RunId, ShardId)`
S3	TerminalIrreversibility	Terminal states (Done, Split, Parked) never revert to non-terminal
S4	RecordInvariant	`ShardRecord::assert_invariants()` does not panic
S5	CursorMonotonicity	`cursor.last_key()` never decreases per shard
S6	CursorBounds	Non-initial cursors remain within shard spec key range
S7	SplitCoverage	Split-parent's spawned children exist and reference the correct parent
S8	RunTerminalIrreversibility	Terminal run states never revert
S9	CooldownViolation	A worker must not claim twice within cooldown interval

Full invariant definitions and the checker implementation are in simulation-harness.md.

10. Source Files

File	Role
`crates/gossip-coordination/src/in_memory_tests.rs`	Tier 1: unit tests + proptest property tests
`crates/gossip-coordination/src/error_tests.rs`	Tier 1: error type Display/source/Debug coverage
`crates/gossip-coordination/src/record_tests.rs`	Tier 1: `ShardRecord`, `ShardStatus`, `ParkReason`, `OpLogEntry` tests
`crates/gossip-coordination/src/run_tests.rs`	Tier 1: run lifecycle types, validation, and state machine tests
`crates/gossip-coordination/src/facade_tests.rs`	Tier 1: `CoordinationFacade`, `ShardClaiming`, claim algorithm tests
`crates/gossip-coordination/src/session_tests.rs`	Tier 1: `WorkerSession` RAII handle and delegation tests
`crates/gossip-coordination/src/in_memory_filter_tests.rs`	Tier 1: `ShardFilter` predicates for `list_shards_into`
`crates/gossip-coordination/src/in_memory_run_tests.rs`	Tier 1: `RunManagement` trait implementation tests
`crates/gossip-coordination/src/conformance.rs`	Tier 2: backend-agnostic invariant-interaction harness (Groups A, B, C)
`crates/gossip-coordination/src/conformance_tests.rs`	Tier 2: InMemoryCoordinator binding + compile-time API guards
`crates/gossip-coordination/src/scenario_tests.rs`	Tier 3: multi-step end-to-end workflows (S1–S9)
`crates/gossip-coordination/src/test_fixtures.rs`	Shared factory functions and seeded coordinator setup
`crates/gossip-coordination/src/sim/sim_behavioral_tests.rs`	Tier 4a: fixed-seed behavioral regression + deterministic replay
`crates/gossip-coordination/src/sim/mega_sim_tests.rs`	Tier 4b: thread-parallel seed sweep + proptest sweeper
`crates/gossip-coordination/src/sim/proptest_state_machine_tests.rs`	Tier 4c: proptest state machine tests with per-op shrinking
`crates/gossip-coordination/src/sim/harness.rs`	Simulation driver (`run` + `run_overload`)
`crates/gossip-coordination/src/sim/harness_tests.rs`	Harness unit and property tests for bookkeeping and op generation
`crates/gossip-coordination/src/sim/invariants.rs`	External invariant checker (S1–S9)
`crates/gossip-coordination/src/sim/invariants_tests.rs`	Targeted unit tests for `InvariantChecker` edge cases
`crates/gossip-coordination/src/sim/overload.rs`	Overload scenario/reports and scripted op generators
`crates/gossip-coordination/src/sim/overload_tests.rs`	Overload scenario tests: per-kind, replay, D1 accuracy, proptest
`crates/gossip-coordination/src/sim/composition.rs`	Test-support composition harness coupling coordinator and done-ledger
`crates/gossip-coordination/src/sim/composition_invariants.rs`	Cross-component invariant checker for coordinator ↔ done-ledger provenance (C1–C4)
`crates/gossip-coordination/src/sim/composition_invariants_tests.rs`	Targeted unit tests for `CompositionInvariantChecker` and C1–C4 violations
`crates/gossip-coordination/src/sim/composition_tests.rs`	Four-tier composition test suite for the coordinator ↔ done-ledger boundary
`crates/gossip-coordination/src/sim/scan_driver_sim.rs`	Deterministic synthetic scan-outcome generator for composition runs
`crates/gossip-coordination/src/sim/shared.rs`	Internal helper aliases and utility functions shared by both harnesses
`crates/gossip-coordination/src/sim/backend.rs`	`SimIntrospection` and `SimulationBackend` trait definitions
`crates/gossip-coordination/src/sim/fault_injector.rs`	Fault-injecting introspector for invariant checker validation
`crates/gossip-coordination/src/sim/worker.rs`	Simulated worker bookkeeping
`crates/gossip-coordination/src/sim/test_util.rs`	Shared test helpers: record builder, proptest strategies
`crates/gossip-coordination/src/sim/mod.rs`	SimContext (PRNG + clock), FaultConfig, FaultLevel

11. Formal Verification (TLA+)

Tiers 1-4 verify the Rust implementation of the coordination protocol. TLA+ model checking verifies the protocol design -- an abstract model that exhaustively explores all reachable states to prove safety and liveness properties hold regardless of timing, interleaving, or worker behavior.

Invariant correspondence

TLA+ invariant	Sim label	Property
`MutualExclusion`	S1	At most one valid lease per shard
`AlwaysFenceMonotonicity`	S2	Fence epoch never decreases
`AlwaysTerminalIrreversibility`	S3	Done and Split never revert
`TerminalUnleased`	S4	Terminal shards hold no lease
`CursorMonotonicity`	S5	Cursor never decreases
`AlwaysCursorNonRegression`	S5	Cursor never decreases (action property)
--	S6	CursorBounds (byte-key ranges not modeled in TLA+)
`SplitAtomicity`	S7	Split parent has non-empty spawned set
`ChildImpliesParentSplit`	S7	Child non-NotCreated implies parent Split
`ZombieRejection`	--	Stale-epoch worker cannot hold valid lease
`Liveness` (INV-L01)	--	Active unleased shard eventually leased

Gaps are intentional: S6 (CursorBounds) depends on byte-key ranges not present in the abstract model; ZombieRejection and Liveness have no simulation label because they were introduced by the TLA+ spec.

Running the model checker

See specs/coordination/README.md for prerequisites, TLC commands, and mutation tests.

Source files

File	Purpose
`specs/coordination/ShardFencing.tla`	TLA+ specification
`specs/coordination/ShardFencing.cfg`	Production safety config
`specs/coordination/ShardFencing_liveness.cfg`	Liveness config
`specs/coordination/run_mutations.sh`	Mutation test suite (14 mutations)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Coordination Testing Strategy

Allocation Policy Scope (Tiered)

1. Testing Pyramid

2. Tier 1 — Unit Tests

Coverage areas

3. Tier 2 — Conformance Tests

Group A: Cross-Cutting Invariant Interactions

Group B: Supplementary Coverage Tests

Group C: Run-Level Conformance

Compile-time guards

4. Tier 3 — Scenario Tests

5. Test Infrastructure

Test Fixtures (`crates/gossip-coordination/src/test_fixtures.rs`)

Tiger Style Assertions

6. Tier 4 — Simulation Tests

Overload Scenario Tests (`crates/gossip-coordination/src/sim/overload_tests.rs`)

Behavioral Regression Tests (`sim_behavioral_tests.rs`)

Mega Simulation Tests (`mega_sim_tests.rs`)

Stress Tests (`mega_sim_tests.rs`)

Convergence Proptests (`mega_sim_tests.rs`)

Multi-Tenant Isolation (`mega_sim_tests.rs`)

Regression Guard

Proptest State Machine Tests (`proptest_state_machine_tests.rs`)

7. Running Tests

Quick test cycles

Overload tests

Simulation tests

Environment variables

Reproducing failures

8. Choosing the Right Tier

9. Invariants Under Test

10. Source Files

11. Formal Verification (TLA+)

Invariant correspondence

Running the model checker

Source files

FilesExpand file tree

coordination-testing.md

Latest commit

History

coordination-testing.md

File metadata and controls

Coordination Testing Strategy

Allocation Policy Scope (Tiered)

1. Testing Pyramid

2. Tier 1 — Unit Tests

Coverage areas

3. Tier 2 — Conformance Tests

Group A: Cross-Cutting Invariant Interactions

Group B: Supplementary Coverage Tests

Group C: Run-Level Conformance

Compile-time guards

4. Tier 3 — Scenario Tests

5. Test Infrastructure

Test Fixtures (crates/gossip-coordination/src/test_fixtures.rs)

Tiger Style Assertions

6. Tier 4 — Simulation Tests

Overload Scenario Tests (crates/gossip-coordination/src/sim/overload_tests.rs)

Behavioral Regression Tests (sim_behavioral_tests.rs)

Mega Simulation Tests (mega_sim_tests.rs)

Stress Tests (mega_sim_tests.rs)

Convergence Proptests (mega_sim_tests.rs)

Multi-Tenant Isolation (mega_sim_tests.rs)

Regression Guard

Proptest State Machine Tests (proptest_state_machine_tests.rs)

7. Running Tests

Quick test cycles

Overload tests

Simulation tests

Environment variables

Reproducing failures

8. Choosing the Right Tier

9. Invariants Under Test

10. Source Files

11. Formal Verification (TLA+)

Invariant correspondence

Running the model checker

Source files

Test Fixtures (`crates/gossip-coordination/src/test_fixtures.rs`)

Overload Scenario Tests (`crates/gossip-coordination/src/sim/overload_tests.rs`)

Behavioral Regression Tests (`sim_behavioral_tests.rs`)

Mega Simulation Tests (`mega_sim_tests.rs`)

Stress Tests (`mega_sim_tests.rs`)

Convergence Proptests (`mega_sim_tests.rs`)

Multi-Tenant Isolation (`mega_sim_tests.rs`)

Proptest State Machine Tests (`proptest_state_machine_tests.rs`)