Commit d9046e6
fix: raft HA production hardening — leader fencing, log compaction, election timeout, audit log (#3230)
* Fix raft leader handoff regression after SIGTERM
* fix: follower crash on restart when EVM is ahead of stale raft snapshot
Bug A: RecoverFromRaft crashed with "invalid block height" when the node
restarted after SIGTERM and the EVM state (persisted before kill) was ahead
of the raft FSM snapshot (which hadn't finished log replay yet). The function
now verifies the hash of the local block at raftState.Height — if it matches
the snapshot hash the EVM history is correct and recovery is safely skipped;
a mismatch returns an error indicating a genuine fork.
Bug B: waitForMsgsLanded used two repeating tickers with the same effective
period (SendTimeout/2 poll, SendTimeout timeout), so both could fire
simultaneously in select and the timeout would win even when AppliedIndex >=
CommitIndex. Replaced the deadline ticker with a one-shot time.NewTimer,
added a final check in the deadline branch, and reduced poll interval to
min(50ms, timeout/4) for more responsive detection.
Fixes the crash-restart Docker backoff loop observed in SIGTERM HA test
cycle 7 (poc-ha-2 never rejoining within the 300s kill interval).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): guard FSM apply callback with RWMutex to prevent data race
SetApplyCallback(nil) called from raftRetriever.Stop() raced with
FSM.Apply reading applyCh: wg.Wait() only ensures the consumer goroutine
exited, but the raft library can still invoke Apply concurrently.
Add applyMu sync.RWMutex to FSM; take write lock in SetApplyCallback and
read lock in Apply before reading the channel pointer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(raft): add ResignLeader() public method on Node
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(node): implement LeaderResigner interface on FullNode
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(shutdown): resign raft leadership before cancelling context on SIGTERM
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(config): add election_timeout, snapshot_threshold, trailing_logs to RaftConfig; fix SnapCount default 0→3
Add three new Raft config parameters:
- ElectionTimeout: timeout for candidate to wait for votes (defaults to 1s)
- SnapshotThreshold: outstanding log entries that trigger snapshot (defaults to 500)
- TrailingLogs: log entries to retain after snapshot (defaults to 200)
Fix critical default: SnapCount was 0 (broken, retains no snapshots) → 3
This enables control over Raft's snapshot frequency and recovery behavior to prevent
resync debt from accumulating unbounded during normal operation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): wire snapshot_threshold, trailing_logs, election_timeout into hashicorp/raft config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(raft): annotate FSM apply log and RaftApplyMsg with raft term for block provenance audit
Add Term field to RaftApplyMsg struct to track the raft term in which each
block was committed. Update FSM.Apply() debug logging to include both
raft_term and raft_index fields alongside block height and hash. This
enables better audit trails and debugging of replication issues.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(ci): fix gci comment alignment in defaults.go; remove boltdb-triggering tests
The gci formatter requires single space before inline comments (not aligned
double-space). Also removed TestNodeResignLeader_NotLeaderNoop and
TestNewNode_SnapshotConfigApplied which create real boltdb-backed raft nodes:
boltdb@v1.3.1 has an unsafe pointer alignment issue that panics under
Go 1.25's -checkptr. The nil-receiver test (TestNodeResignLeader_NilNoop)
is retained as it exercises the same guard without touching boltdb.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): suppress boltdb 'Rollback failed: tx closed' log noise
hashicorp/raft-boltdb uses defer tx.Rollback() as a safety net on every
transaction. When Commit() succeeds, the deferred Rollback() returns
bolt.ErrTxClosed and raft-boltdb logs it as an error — even though it is
the expected outcome of every successful read or write. The message has no
actionable meaning and floods logs at high block rates.
Add a one-time stdlib log filter (sync.Once in NewNode) that silently drops
lines containing 'tx closed' and forwards everything else to stderr.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): address PR review — shutdown wiring, error logging, snap docs, tests
- Call raftRetriever.Stop() in Syncer.Stop() so SetApplyCallback(nil) is
actually reached and the goroutine is awaited before wg.Wait()
- Log leadershipTransfer error at warn level in Node.Stop() instead of
discarding it silently
- Fix SnapCount comments in config.go: it retains snapshot files on disk
(NewFileSnapshotStore retain param), not log-entry frequency
- Extract buildRaftConfig helper from NewNode to enable config wiring tests
- Add TestNodeResignLeader_NotLeaderNoop (non-nil node, nil raft → noop)
- Add TestNewNode_SnapshotConfigApplied (table-driven, verifies
SnapshotThreshold and TrailingLogs wiring with custom and zero values)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): address code review issues — ShutdownTimeout, resign fence, election validation
- Add ShutdownTimeout field (default 5s) to raft Config so Stop() drains
committed logs with a meaningful timeout instead of the 200ms SendTimeout
- Wrap ResignLeader() in a 3s goroutine+select fence in the SIGTERM handler
so a hung leadership transfer cannot block graceful shutdown indefinitely
- Validate ElectionTimeout >= HeartbeatTimeout in RaftConfig.Validate() to
prevent hashicorp/raft panicking at startup with an invalid config
- Fix stale "NewNode creates a new raft node" comment that had migrated onto
buildRaftConfig after the function was extracted
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* style(raft): fix gci struct field alignment in node_test.go
gofmt/gci requires minimal alignment; excessive spaces in the
TestNewNode_SnapshotConfigApplied struct literal caused a lint failure.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test: improve patch coverage for raft shutdown and resign paths
Add unit tests for lines flagged by Codecov:
- boltTxClosedFilter.Write: filter drops "tx closed", forwards others
- buildRaftConfig: ElectionTimeout > 0 applied, zero uses default
- FullNode.ResignLeader: nil raftNode no-op; non-leader raftNode no-op
- Syncer.Stop: raftRetriever.Stop is called when raftRetriever is set
- Syncer.RecoverFromRaft: GetHeader failure when local state is ahead of
stale raft snapshot returns "cannot verify hash" error
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(config): reject negative ElectionTimeout in RaftConfig.Validate
A negative ElectionTimeout was silently ignored (buildRaftConfig only
applies values > 0), allowing a misconfigured node to start with the
library default instead of failing fast. Add an explicit < 0 check
that returns an error; 0 remains valid as the "use library default"
sentinel.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): preserve stdlib logger writer in bolt filter; propagate ctx through ResignLeader
- suppressBoltNoise.Do now wraps log.Writer() instead of os.Stderr so any
existing stdlib logger redirection is preserved rather than clobbered
- ResignLeader now accepts a context.Context: leadershipTransfer runs in a
goroutine and a select abandons the caller at ctx.Done(), returning
ctx.Err(); the goroutine itself exits once the inner raft transfer
completes (bounded by ElectionTimeout)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(node): propagate context through LeaderResigner.ResignLeader interface
- LeaderResigner.ResignLeader() → ResignLeader(ctx context.Context) error
- FullNode.ResignLeader passes ctx down to raft.Node.ResignLeader
- run_node.go calls resigner.ResignLeader(resignCtx) directly — no wrapper
goroutine/select needed; context.DeadlineExceeded vs other errors are
logged distinctly
- Merge TestFullNode_ResignLeader_NilRaftNode and
TestFullNode_ResignLeader_NonLeaderRaftNode into single table-driven test
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): abdicate leadership when store is significantly behind raft state
When a node wins election but its local store is more than 1 block behind
the raft FSM state, RecoverFromRaft cannot replay the gap (it only handles
the single latest block in the raft snapshot). Previously the node would
log "recovery successful" and start leader operations anyway, then stall
block production while the executor repeatedly failed to sync the missing
blocks from a store that did not have them.
Fix: in DynamicLeaderElection.Run, detect diff < -1 at the
follower→leader transition and immediately transfer leadership to a
better-synced peer. diff == -1 is preserved: RecoverFromRaft can apply
exactly one block from the raft snapshot, so that path is unchanged.
Closes #3255
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): address julienrbrt review — logger, boltdb filter, ShutdownTimeout
- Remove stdlib log filter (boltTxClosedFilter / suppressBoltNoise): it
redirected the global stdlib logger which is the wrong scope. raft-boltdb
uses log.Printf directly with no Logger option, so the "tx closed" noise
is now accepted as-is from stderr.
- Wire hashicorp/raft's internal hclog output through zerolog: set
raft.Config.Logger to an hclog.Logger backed by the zerolog writer so
all raft-internal messages appear in the structured log stream under
component=raft-hashicorp.
- Remove ShutdownTimeout from public config: it was a second "how long to
wait" knob that confused operators. ShutdownTimeout is now computed
internally as 5 × SendTimeout at the initRaftNode call site.
- Delete TestRaftRetrieverStopClearsApplyCallback: tested an implementation
detail (Stop clears the apply callback pointer) rather than observable
behaviour. The stubRaftNode helper it defined is moved to syncer_test.go
where it is still needed.
- Rename TestNewNode_SnapshotConfigApplied → TestBuildRaftConfig_SnapshotConfigApplied
to reflect that it tests buildRaftConfig, not NewNode.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(ci): promote go-hclog to direct dep; fix gci alignment in syncer_test
go mod tidy promotes github.com/hashicorp/go-hclog from indirect to
direct now that pkg/raft/node.go imports it explicitly.
gci auto-formatted stubRaftNode method stubs in syncer_test.go.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): address coderabbitai feedback — ShutdownTimeout clamp, transfer error propagation, deterministic test
ShutdownTimeout zero-value panic (critical):
NewNode now clamps ShutdownTimeout to 5*SendTimeout when the caller passes
zero, preventing a panic in time.NewTicker inside waitForMsgsLanded. The
normal path through initRaftNode already sets it explicitly; this guard
protects direct callers (e.g. tests) that omit the field.
Leadership transfer error propagation (major):
When store-lag abdication calls leadershipTransfer() and it fails, the
error is now returned instead of being logged and silently continuing.
Continuing after a failed transfer left the node as leader-without-worker,
stalling the cluster.
Deterministic abdication test (major):
Replace time.Sleep(10ms) + t.Fatal-in-goroutine with channel-based
synchronization: leader runFn signals leaderStarted; the test goroutine
waits up to 50ms for that signal and calls t.Error (safe from goroutines)
if it arrives, then cancels the context either way.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(changelog): add unreleased entries for raft HA hardening (#3230)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): wait for block-store sync before abdicating on leader election
When all nodes restart simultaneously their block stores can lag behind
the raft FSM height (block data arrives via p2p, not raft). With the
previous code every elected node saw diff < -1 and immediately called
leadershipTransfer(), creating an infinite hot-potato: no node ever
stabilised as leader and block production stalled.
Instead of abdicating immediately, the new waitForBlockStoreSync helper
polls IsSynced for up to ShutdownTimeout (default ~1s). The fastest-
syncing peer proceeds as leader; nodes that cannot catch up in time still
abdicate and yield to a better candidate. Leadership also checks mid-wait
so a lost-leadership event aborts the wait early.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): distinguish sync wait outcomes with syncResult enum
waitForBlockStoreSync previously returned bool, conflating three distinct
failure modes (ctx canceled, timeout, lost leadership). The caller in Run
then unconditionally called leadershipTransfer() on any false return, which
is wrong when leadership was already lost.
Introduce a syncResult enum (syncResultSynced, syncResultTimeout,
syncResultLostLeadership, syncResultCanceled) and update Run to handle each
case correctly:
- syncResultCanceled → return ctx.Err()
- syncResultLostLeadership → continue without calling leadershipTransfer()
- syncResultTimeout → leadershipTransfer() + continue as before
- syncResultSynced → refresh raftState/diff and proceed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(raft): fix gci alignment in syncResult const block
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>1 parent 4c7323f commit d9046e6
17 files changed
Lines changed: 695 additions & 22 deletions
File tree
- block/internal/syncing
- node
- pkg
- cmd
- config
- raft
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
12 | 16 | | |
13 | 17 | | |
14 | 18 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
74 | 74 | | |
75 | 75 | | |
76 | 76 | | |
| 77 | + | |
77 | 78 | | |
78 | 79 | | |
79 | 80 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
247 | 247 | | |
248 | 248 | | |
249 | 249 | | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
250 | 255 | | |
251 | 256 | | |
252 | 257 | | |
| |||
1240 | 1245 | | |
1241 | 1246 | | |
1242 | 1247 | | |
1243 | | - | |
| 1248 | + | |
| 1249 | + | |
| 1250 | + | |
| 1251 | + | |
| 1252 | + | |
| 1253 | + | |
| 1254 | + | |
| 1255 | + | |
| 1256 | + | |
| 1257 | + | |
| 1258 | + | |
| 1259 | + | |
| 1260 | + | |
| 1261 | + | |
| 1262 | + | |
| 1263 | + | |
| 1264 | + | |
| 1265 | + | |
| 1266 | + | |
| 1267 | + | |
1244 | 1268 | | |
1245 | 1269 | | |
1246 | 1270 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
| 9 | + | |
9 | 10 | | |
10 | 11 | | |
11 | 12 | | |
| |||
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
38 | 63 | | |
39 | 64 | | |
40 | 65 | | |
| |||
422 | 447 | | |
423 | 448 | | |
424 | 449 | | |
| 450 | + | |
| 451 | + | |
| 452 | + | |
| 453 | + | |
| 454 | + | |
| 455 | + | |
| 456 | + | |
| 457 | + | |
| 458 | + | |
| 459 | + | |
| 460 | + | |
| 461 | + | |
| 462 | + | |
| 463 | + | |
| 464 | + | |
| 465 | + | |
| 466 | + | |
| 467 | + | |
| 468 | + | |
| 469 | + | |
| 470 | + | |
| 471 | + | |
| 472 | + | |
| 473 | + | |
| 474 | + | |
| 475 | + | |
| 476 | + | |
| 477 | + | |
| 478 | + | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
| 488 | + | |
| 489 | + | |
| 490 | + | |
| 491 | + | |
| 492 | + | |
| 493 | + | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
| 497 | + | |
| 498 | + | |
| 499 | + | |
| 500 | + | |
| 501 | + | |
| 502 | + | |
| 503 | + | |
| 504 | + | |
| 505 | + | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
| 514 | + | |
| 515 | + | |
| 516 | + | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
| 525 | + | |
| 526 | + | |
| 527 | + | |
| 528 | + | |
| 529 | + | |
| 530 | + | |
| 531 | + | |
| 532 | + | |
| 533 | + | |
| 534 | + | |
| 535 | + | |
| 536 | + | |
| 537 | + | |
| 538 | + | |
| 539 | + | |
| 540 | + | |
| 541 | + | |
| 542 | + | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
| 546 | + | |
| 547 | + | |
| 548 | + | |
| 549 | + | |
| 550 | + | |
| 551 | + | |
| 552 | + | |
| 553 | + | |
| 554 | + | |
| 555 | + | |
| 556 | + | |
| 557 | + | |
| 558 | + | |
| 559 | + | |
| 560 | + | |
| 561 | + | |
| 562 | + | |
| 563 | + | |
| 564 | + | |
| 565 | + | |
| 566 | + | |
| 567 | + | |
| 568 | + | |
| 569 | + | |
| 570 | + | |
| 571 | + | |
| 572 | + | |
| 573 | + | |
| 574 | + | |
| 575 | + | |
| 576 | + | |
| 577 | + | |
| 578 | + | |
| 579 | + | |
| 580 | + | |
| 581 | + | |
| 582 | + | |
| 583 | + | |
| 584 | + | |
| 585 | + | |
| 586 | + | |
| 587 | + | |
| 588 | + | |
| 589 | + | |
| 590 | + | |
| 591 | + | |
| 592 | + | |
| 593 | + | |
| 594 | + | |
| 595 | + | |
| 596 | + | |
| 597 | + | |
| 598 | + | |
| 599 | + | |
| 600 | + | |
| 601 | + | |
| 602 | + | |
| 603 | + | |
| 604 | + | |
| 605 | + | |
| 606 | + | |
| 607 | + | |
| 608 | + | |
| 609 | + | |
| 610 | + | |
| 611 | + | |
| 612 | + | |
| 613 | + | |
| 614 | + | |
425 | 615 | | |
426 | 616 | | |
427 | 617 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| 25 | + | |
25 | 26 | | |
26 | 27 | | |
27 | 28 | | |
| |||
102 | 103 | | |
103 | 104 | | |
104 | 105 | | |
105 | | - | |
106 | 106 | | |
107 | 107 | | |
108 | 108 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
| 38 | + | |
38 | 39 | | |
39 | 40 | | |
40 | 41 | | |
| |||
154 | 155 | | |
155 | 156 | | |
156 | 157 | | |
| 158 | + | |
157 | 159 | | |
158 | 160 | | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
159 | 164 | | |
160 | 165 | | |
161 | 166 | | |
| |||
384 | 389 | | |
385 | 390 | | |
386 | 391 | | |
| 392 | + | |
| 393 | + | |
| 394 | + | |
| 395 | + | |
| 396 | + | |
| 397 | + | |
| 398 | + | |
| 399 | + | |
| 400 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| 17 | + | |
17 | 18 | | |
18 | 19 | | |
19 | 20 | | |
| |||
82 | 83 | | |
83 | 84 | | |
84 | 85 | | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
| 4 | + | |
| 5 | + | |
4 | 6 | | |
5 | 7 | | |
6 | 8 | | |
| |||
21 | 23 | | |
22 | 24 | | |
23 | 25 | | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
24 | 33 | | |
25 | 34 | | |
26 | 35 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
224 | 224 | | |
225 | 225 | | |
226 | 226 | | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
227 | 243 | | |
228 | 244 | | |
229 | 245 | | |
| |||
0 commit comments