Summary
When PX4 is receiving a mission upload and stops getting the next MISSION_ITEM, the GETLIST retry branch can keep resending MISSION_REQUEST often enough that the protocol timeout handler never runs.
The result is that _transfer_in_progress can remain set until reboot, and later MISSION_COUNT messages are rejected with MAV_MISSION_ERROR.
Observed behavior
- Mission upload stalls after PX4 has already entered
MAVLINK_WPM_STATE_GETLIST.
- PX4 keeps retrying
MISSION_REQUEST.
- The mission protocol never reaches the timeout path that calls
switch_to_idle_state() and clears _transfer_in_progress.
- Subsequent upload attempts are rejected in
handle_mission_count() because _transfer_in_progress is still true.
- Reboot clears the problem immediately.
Why this looks like a real bug
In affected trees, the timeout logic is ordered like this:
if (_state == MAVLINK_WPM_STATE_GETLIST && retry_timeout_expired) {
send_mission_request(...);
} else if (_state != MAVLINK_WPM_STATE_IDLE && protocol_timeout_expired) {
switch_to_idle_state();
_transfer_in_progress = false;
}
If no new mission item is received, _time_last_recv keeps aging toward the protocol timeout, but the GETLIST retry branch keeps matching first every retry interval. That prevents the protocol-timeout branch from executing.
Reproduction outline
- Start a mission upload so PX4 enters
MAVLINK_WPM_STATE_GETLIST.
- Let PX4 receive the initial mission metadata and at least one request transition into GETLIST.
- Drop or stop the next expected
MISSION_ITEM from the uploader.
- Wait longer than
MAVLINK_MISSION_PROTOCOL_TIMEOUT_DEFAULT.
- Observe that PX4 keeps resending
MISSION_REQUEST and does not clear _transfer_in_progress.
- Send a fresh
MISSION_COUNT.
- Observe that PX4 rejects it because
_transfer_in_progress is still set.
Scope
This is closely related to older mission-transfer timeout bugs, but this report is specifically about retry ordering preventing the timeout cleanup from ever running.
Current upstream main may need the same ordering review for any resend branch that sits ahead of protocol-timeout handling, including the newer SENDLIST resend path for dropped MISSION_COUNT.
Proposed fix
Handle protocol timeout before retry resends:
if (_state != MAVLINK_WPM_STATE_IDLE && protocol_timeout_expired) {
switch_to_idle_state();
_transfer_in_progress = false;
} else if (_state == MAVLINK_WPM_STATE_GETLIST && retry_timeout_expired) {
send_mission_request(...);
}
That preserves normal retry behavior before the timeout window, but once the protocol has been silent too long, timeout cleanup wins and PX4 can accept a fresh upload again.
Field impact
On low-bandwidth or lossy links, this can leave the vehicle requiring a reboot to recover mission upload capability even though the protocol already has timeout cleanup logic intended to recover from exactly this kind of stall.
Summary
When PX4 is receiving a mission upload and stops getting the next
MISSION_ITEM, the GETLIST retry branch can keep resendingMISSION_REQUESToften enough that the protocol timeout handler never runs.The result is that
_transfer_in_progresscan remain set until reboot, and laterMISSION_COUNTmessages are rejected withMAV_MISSION_ERROR.Observed behavior
MAVLINK_WPM_STATE_GETLIST.MISSION_REQUEST.switch_to_idle_state()and clears_transfer_in_progress.handle_mission_count()because_transfer_in_progressis still true.Why this looks like a real bug
In affected trees, the timeout logic is ordered like this:
If no new mission item is received,
_time_last_recvkeeps aging toward the protocol timeout, but the GETLIST retry branch keeps matching first every retry interval. That prevents the protocol-timeout branch from executing.Reproduction outline
MAVLINK_WPM_STATE_GETLIST.MISSION_ITEMfrom the uploader.MAVLINK_MISSION_PROTOCOL_TIMEOUT_DEFAULT.MISSION_REQUESTand does not clear_transfer_in_progress.MISSION_COUNT._transfer_in_progressis still set.Scope
This is closely related to older mission-transfer timeout bugs, but this report is specifically about retry ordering preventing the timeout cleanup from ever running.
Current upstream
mainmay need the same ordering review for any resend branch that sits ahead of protocol-timeout handling, including the newer SENDLIST resend path for droppedMISSION_COUNT.Proposed fix
Handle protocol timeout before retry resends:
That preserves normal retry behavior before the timeout window, but once the protocol has been silent too long, timeout cleanup wins and PX4 can accept a fresh upload again.
Field impact
On low-bandwidth or lossy links, this can leave the vehicle requiring a reboot to recover mission upload capability even though the protocol already has timeout cleanup logic intended to recover from exactly this kind of stall.