Enhance KissModem frame processing and timeout handling #2490

Open
tuzzmaniandevil wants to merge 1 commit into meshcore-dev:dev from tuzzmaniandevil:dev
@tuzzmaniandevil

Fixes several bugs in the KISS modem TX state machine that could cause the modem to permanently stop sending packets over serial.

Problem

The KISS modem TX state machine had multiple paths that could lock up permanently, requiring a device reboot:

  1. TX_SENDING stuck forever — If isSendComplete() never returns true (missed radio interrupt, SPI glitch), the state machine stays in TX_SENDING indefinitely. No more packets can be sent or queued.
  2. startSendRaw() return value ignored — If the radio fails to start transmitting, the modem enters TX_SENDING waiting for a completion that never started.
  3. TX_WAIT_CLEAR stuck forever — If the radio gets stuck detecting a phantom carrier, isReceiving() returns true indefinitely and the state machine never progresses.
  4. Silent packet drops — When a DATA frame arrives while a TX is already pending, it's silently discarded with no feedback to the host application, which may hang waiting for a TX completion that will never come.
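To make the lock-up paths concrete, here is a minimal host-side model of the pre-fix behaviour. It is a sketch, not the actual KissModem code: `FakeRadio`, `NaiveModem` and `runNaive` are hypothetical names, the state set is reduced to three states, and only the radio calls named above are modelled (their signatures are assumed).

```cpp
#include <stdint.h>

// Hypothetical stand-in for the radio driver. Method names follow the
// description above; the signatures are assumptions for this sketch.
struct FakeRadio {
    bool start_ok = true;        // value returned by startSendRaw()
    bool send_completes = true;  // whether isSendComplete() ever reports true
    bool carrier = false;        // value returned by isReceiving()

    bool startSendRaw(const uint8_t*, int) { return start_ok; }
    bool isSendComplete() const { return send_completes; }
    bool isReceiving() const { return carrier; }
};

enum TxState { TX_IDLE, TX_WAIT_CLEAR, TX_SENDING };

// Simplified pre-fix state machine: no timeouts, start result ignored.
struct NaiveModem {
    FakeRadio& radio;
    TxState state = TX_IDLE;
    bool pending = false;
    int dropped = 0;  // frames discarded without any reply to the host

    explicit NaiveModem(FakeRadio& r) : radio(r) {}

    // Returns false when the frame was discarded silently (path 4).
    bool onDataFrame() {
        if (pending) { dropped++; return false; }
        pending = true;
        state = TX_WAIT_CLEAR;
        return true;
    }

    void loop() {
        static const uint8_t payload[1] = {0};
        switch (state) {
        case TX_IDLE:
            break;
        case TX_WAIT_CLEAR:
            if (radio.isReceiving()) break;  // path 3: waits as long as carrier is seen
            radio.startSendRaw(payload, 1);  // path 2: return value ignored
            state = TX_SENDING;
            break;
        case TX_SENDING:
            // path 1: the only way out is a completion report
            if (radio.isSendComplete()) { pending = false; state = TX_IDLE; }
            break;
        }
    }
};

// Queues one frame, polls the modem `polls` times, returns the final state.
TxState runNaive(FakeRadio radio, int polls) {
    NaiveModem m(radio);
    m.onDataFrame();
    for (int i = 0; i < polls; i++) m.loop();
    return m.state;
}
```

With a healthy fake radio, `runNaive` ends in `TX_IDLE`. With `send_completes = false` it stays in `TX_SENDING`, and with `carrier = true` it stays in `TX_WAIT_CLEAR`, no matter how many times it is polled.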

Changes

  • Dynamic TX timeout — Added a timeout to TX_SENDING using getEstAirtimeFor() * 1.5, matching the approach used by the Dispatcher in companion/repeater firmware. Adapts automatically to radio configuration instead of using a fixed 10s constant.
  • startSendRaw() error handling — Check return value; on failure, drop the packet and return to TX_IDLE instead of entering a dead state.
  • TX_WAIT_CLEAR timeout — If the channel stays busy longer than the worst-case max packet airtime × 1.5, force-proceed to TX_DELAY.
  • TX_SLOT_WAIT timer reset — Reset _tx_timer when cycling back to TX_WAIT_CLEAR so the channel-busy timeout measures time in that state, not cumulative time since the TX was queued.
  • TX-busy rejection — When a DATA frame is received while a TX is already pending, respond with HW_RESP_TX_DONE (result=0x00) so the host knows the packet was rejected.
  • TX failure notification — On TX timeout, notify the host with HW_RESP_TX_DONE (result=0x00) instead of silently dropping.
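The changes above can be sketched as a small host-side model. This is an illustration under stated assumptions, not the code in this PR: `FakeRadio`, `Modem` and `run` are hypothetical, the airtime figures are made up, `TX_SLOT_WAIT` is omitted, `TX_DELAY` is collapsed to a single step, and a result of 1 for a successful TX is assumed (the description only specifies 0x00 for rejection and failure). The ×1.5 factor is written as `* 3 / 2` to stay in integer arithmetic.

```cpp
#include <stdint.h>
#include <vector>

// Hypothetical radio stand-in; signatures and airtime values are assumptions.
struct FakeRadio {
    bool start_ok = true;
    bool send_completes = true;
    bool carrier = false;
    uint32_t airtime_ms = 200;       // estimated airtime of the queued packet
    uint32_t max_airtime_ms = 1000;  // assumed worst-case packet airtime

    bool startSendRaw(const uint8_t*, int) { return start_ok; }
    bool isSendComplete() const { return send_completes; }
    bool isReceiving() const { return carrier; }
    uint32_t getEstAirtimeFor(int) const { return airtime_ms; }
};

enum TxState { TX_IDLE, TX_WAIT_CLEAR, TX_DELAY, TX_SENDING };

struct Modem {
    FakeRadio& radio;
    TxState state = TX_IDLE;
    uint32_t tx_timer = 0;    // time at which the current timed state was entered
    uint32_t tx_timeout = 0;  // deadline length for TX_SENDING
    // Stand-in for HW_RESP_TX_DONE replies: 0 = rejected/failed, 1 = sent (assumed).
    std::vector<uint8_t> results;

    explicit Modem(FakeRadio& r) : radio(r) {}

    // Unsigned subtraction keeps the comparison valid across counter wraparound.
    static bool elapsed(uint32_t now, uint32_t since, uint32_t limit) {
        return (uint32_t)(now - since) >= limit;
    }
    void finish(uint8_t result) { results.push_back(result); state = TX_IDLE; }

    void onDataFrame(uint32_t now) {
        if (state != TX_IDLE) { results.push_back(0); return; }  // TX-busy rejection
        state = TX_WAIT_CLEAR;
        tx_timer = now;
    }

    void loop(uint32_t now) {
        static const uint8_t payload[1] = {0};
        switch (state) {
        case TX_IDLE:
            break;
        case TX_WAIT_CLEAR:
            // Proceed when the channel clears, or force-proceed after the busy timeout.
            if (!radio.isReceiving() || elapsed(now, tx_timer, radio.max_airtime_ms * 3 / 2))
                state = TX_DELAY;
            break;
        case TX_DELAY:
            if (!radio.startSendRaw(payload, 1)) { finish(0); break; }  // start failed
            tx_timer = now;
            tx_timeout = radio.getEstAirtimeFor(1) * 3 / 2;  // dynamic TX timeout
            state = TX_SENDING;
            break;
        case TX_SENDING:
            if (radio.isSendComplete()) finish(1);
            else if (elapsed(now, tx_timer, tx_timeout)) finish(0);  // TX timeout
            break;
        }
    }
};

// Queues one frame at t=0, polls once per millisecond through `until_ms`,
// and returns the results reported to the host.
std::vector<uint8_t> run(FakeRadio radio, uint32_t until_ms) {
    Modem m(radio);
    m.onDataFrame(0);
    for (uint32_t t = 0; t <= until_ms; t++) m.loop(t);
    return m.results;
}
```

In this model every fault path ends in `TX_IDLE` with exactly one result reported, so a host waiting on a TX completion always gets an answer. Passing `now` into `loop()` rather than reading a clock is a choice made for the sketch so the timeouts can be exercised deterministically.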

Testing

These are all state machine edge cases triggered by radio hardware faults (missed interrupts, stuck carrier detect). The changes were verified by code review against the Dispatcher's equivalent timeout handling in src/Dispatcher.cpp.
