fix(inkless:consolidation): don't fence a consolidating leader below the seal #677
Pull request overview
This PR prevents a switched diskless leader (classic-to-diskless seal committed) from being fenced offline when LEO < seal if the topic can rebuild its classic prefix from remote storage (consolidating diskless with remote enabled). This ensures the ConsolidationReconciler can arm consolidation and recover a wiped leader instead of leaving the partition offline and causing KAFKA_STORAGE_ERROR on reads.
Changes:
- Update `ReplicaManager.maybeReconcileSwitchedLeader` to avoid fencing leaders below the seal when remote recovery is possible.
- Extend the reconciliation docstring to document the new recovery path and why fencing blocks it.
- Add a unit test covering the “consolidating leader below seal with remote enabled stays online” case and adjust the test harness to correctly enable `UnifiedLog.remoteLogEnabled()`.
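The behavior change described above can be sketched as a small decision function. This is a minimal illustration, not the code in `ReplicaManager`: `SealFencingSketch`, `LeaderState`, `Action`, and `decide` are hypothetical names, and the flags stand in for whatever `isConsolidatingPartition` and `log.remoteLogEnabled()` return in the real code.

```scala
// Hypothetical model of the LEO < seal decision; none of these names exist in Kafka.
object SealFencingSketch {
  sealed trait Action
  case object NotBelowSeal extends Action          // LEO >= seal: this check does not apply
  case object StayOnlineForRebuild extends Action  // [0, seal) can be rebuilt from the remote tier
  case object FenceOffline extends Action          // no remote copy: treat as corruption

  final case class LeaderState(
    logEndOffset: Long,
    seal: Long,                 // committed classic-to-diskless start offset
    consolidating: Boolean,     // assumed result of isConsolidatingPartition(partition)
    remoteLogEnabled: Boolean   // assumed result of log.remoteLogEnabled()
  )

  def decide(s: LeaderState): Action =
    if (s.logEndOffset >= s.seal) NotBelowSeal
    else if (s.consolidating && s.remoteLogEnabled) StayOnlineForRebuild
    else FenceOffline

  def main(args: Array[String]): Unit = {
    // Wiped consolidating leader with remote storage: kept online.
    println(decide(LeaderState(0L, 100L, consolidating = true, remoteLogEnabled = true)))
    // Non-consolidating leader below the seal: still fenced.
    println(decide(LeaderState(0L, 100L, consolidating = false, remoteLogEnabled = true)))
  }
}
```

Only the middle branch is new in this PR; the other LEO < seal cases keep the previous fencing behavior.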
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| core/src/main/scala/kafka/server/ReplicaManager.scala | Changes switched-leader reconciliation behavior for LEO < seal to preserve availability for remote rebuild. |
| core/src/test/scala/unit/kafka/server/ReplicaManagerInklessTest.scala | Adds coverage for the new recovery behavior and fixes test setup so remote log is actually enabled in UnifiedLog. |
Force-pushed: 7faa8c0 to 53c392e
…the seal

maybeReconcileSwitchedLeader fenced a switched leader offline whenever its LEO fell below the committed seal. It runs during applyLocalLeadersDelta, before the ConsolidationReconciler, so the wiped-leader recovery from #673 never ran. The reconciler only sees online partitions, and a wiped consolidating leader (LEO 0 < seal) was offline before it could rebuild. Every fetch and offset request then failed with KAFKA_STORAGE_ERROR.

For a consolidating diskless topic with remote storage enabled, [0, seal) is in the remote tier and can be rebuilt, so leave the partition online and let the reconciler arm consolidation at the current LEO. Non-consolidating topics still fence, since their classic prefix has no remote copy and LEO < seal is unrecoverable corruption.

Co-authored-by: Cursor <cursoragent@cursor.com>
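The ordering problem in this commit message can be reproduced with a toy model. The sketch below is an assumption-laden illustration: `ReconcileOrderSketch`, `Partition`, and the three functions are hypothetical stand-ins for `maybeReconcileSwitchedLeader`, the `ConsolidationReconciler`, and `applyLocalLeadersDelta`, and the `fenceRecoverable` flag only exists here to switch between the old and new behavior.

```scala
// Toy model of the two-step ordering; all names are hypothetical.
object ReconcileOrderSketch {
  final case class Partition(
    name: String,
    leo: Long,
    seal: Long,
    consolidating: Boolean,
    remoteEnabled: Boolean,
    online: Boolean = true,
    consolidationArmedAt: Option[Long] = None
  )

  // Step 1: stands in for maybeReconcileSwitchedLeader.
  // fenceRecoverable = true models the old behavior (always fence below the seal).
  def reconcileSwitchedLeader(p: Partition, fenceRecoverable: Boolean): Partition = {
    val belowSeal = p.leo < p.seal
    val recoverable = p.consolidating && p.remoteEnabled
    if (belowSeal && (fenceRecoverable || !recoverable)) p.copy(online = false) else p
  }

  // Step 2: stands in for the ConsolidationReconciler, which only sees online partitions.
  def consolidationReconcile(ps: Seq[Partition]): Seq[Partition] =
    ps.map { p =>
      if (p.online && p.consolidating && p.leo < p.seal) p.copy(consolidationArmedAt = Some(p.leo))
      else p
    }

  // Step 1 always runs before step 2, as in applyLocalLeadersDelta.
  def applyDelta(ps: Seq[Partition], fenceRecoverable: Boolean): Seq[Partition] =
    consolidationReconcile(ps.map(reconcileSwitchedLeader(_, fenceRecoverable)))

  def main(args: Array[String]): Unit = {
    val wiped = Partition("t-0", leo = 0L, seal = 100L, consolidating = true, remoteEnabled = true)
    println(applyDelta(Seq(wiped), fenceRecoverable = true).head)  // old: offline, never armed
    println(applyDelta(Seq(wiped), fenceRecoverable = false).head) // new: online, armed at LEO
  }
}
```

With the old behavior the wiped leader is already offline when step 2 runs, so consolidation is never armed; with the new behavior it stays online and is armed at its current LEO.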
Force-pushed: 53c392e to e526d64
jeqo left a comment
LGTM, just a small logging improvement
    s"classic-to-diskless start offset $seal; cannot catch up from another replica. " +
    s"Marking the partition offline as its local log is corrupt below the committed seal.")
    markPartitionOffline(tp)
    if (isConsolidatingPartition(partition) && log.remoteLogEnabled()) {
personal note: this can be simplified when #678 lands -- not a blocker