fix(inkless:consolidation): recover a switched consolidated leader after local-log loss#673

Merged
jeqo merged 2 commits into main from svv/ts-unification-recover-leader
Jun 30, 2026

@viktorsomogyi

Contributor

A leader that comes up below the seal with an empty local log (a full local-storage wipe / disaster recovery where the controller, whose metadata survived, elects a replica that came back empty) has no peer to replicate the classic prefix from -- that prefix lives only in the remote tier. The ConsolidationReconciler now starts consolidation for such a leader (when remote storage is enabled) instead of retrying forever, so the fetcher lands below the diskless WAL start, hits OFFSET_MOVED_TO_TIERED_STORAGE, and rebuilds the leader-epoch cache and producer snapshot from remote.
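
The decision described above can be sketched as a small pure function. This is a simplified illustration, not the actual ConsolidationReconciler code: `LeaderState`, `Decision`, `decide`, and the `sealOffset` parameter are hypothetical names, and the behavior of the branches other than the empty-leader case is an assumption made only to keep the example complete.

```scala
object RecoverLeaderSketch {
  // Hypothetical, simplified view of a leader partition; not the real Inkless types.
  final case class LeaderState(
    logEndOffset: Long,           // next offset the local log would write
    localLogEmpty: Boolean,       // true after a full local-storage wipe
    remoteStorageEnabled: Boolean // whether the remote tier is available for this log
  )

  sealed trait Decision
  case object StartConsolidation extends Decision
  case object RetryLater extends Decision

  def decide(state: LeaderState, sealOffset: Long): Decision = {
    val belowSeal = state.logEndOffset < sealOffset
    if (!belowSeal) {
      // Assumption: a leader at or past the seal already holds the classic prefix.
      StartConsolidation
    } else if (state.localLogEmpty && state.remoteStorageEnabled) {
      // The case from the description: no peer holds the classic prefix, so
      // waiting cannot help; start consolidation and let the fetcher rebuild
      // state from the remote tier.
      StartConsolidation
    } else {
      // Assumption: every other below-seal leader keeps retrying.
      RetryLater
    }
  }
}
```

With these assumptions, `decide(LeaderState(0L, localLogEmpty = true, remoteStorageEnabled = true), sealOffset = 100L)` yields `StartConsolidation`, while the same state with remote storage disabled yields `RetryLater`.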

@viktorsomogyi

Contributor Author

Reopened #653 as it disappeared after a faulty rebase of the base branch.

@viktorsomogyi force-pushed the svv/ts-unification-consolidation-start-fix branch from b767198 to 7118217 on June 29, 2026 12:49
@viktorsomogyi force-pushed the svv/ts-unification-recover-leader branch from b225b4f to 5bb36f0 on June 29, 2026 12:55
Base automatically changed from svv/ts-unification-consolidation-start-fix to main June 29, 2026 13:56
…ter local-log loss

A leader that comes up below the seal with an empty local log (a full
local-storage wipe / disaster recovery where the controller, whose metadata
survived, elects a replica that came back empty) has no peer to replicate the
classic prefix from -- that prefix lives only in the remote tier. The
ConsolidationReconciler now starts consolidation for such a leader (when remote
storage is enabled) instead of retrying forever, so the fetcher lands below the
diskless WAL start, hits OFFSET_MOVED_TO_TIERED_STORAGE, and rebuilds the
leader-epoch cache and producer snapshot from remote.

Co-authored-by: Cursor <cursoragent@cursor.com>
@viktorsomogyi force-pushed the svv/ts-unification-recover-leader branch 2 times, most recently from c551884 to 89f76e2 on June 29, 2026 15:53
jeqo previously approved these changes Jun 29, 2026

@jeqo left a comment

Contributor

LGTM, left a couple of minor comments to improve observability and align test case names.

Comment on lines +90 to +91
val isConsolidating = inklessMetadataView.isConsolidatingDisklessTopic(tp.topic) ||
(inklessMetadataView.isDisklessTopic(tp.topic) && partition.log.exists(_.remoteLogEnabled()))

Contributor

Just a note for future work: this can be collapsed to isDisklessTopic once the switch enables remote.storage.enable atomically, as suggested here.
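
To make the suggested simplification concrete, here is a minimal sketch of the two forms of the predicate. `MetadataView` is a hypothetical stand-in that models only the two calls visible in the diff, and the `remoteLogEnabled: Option[Boolean]` parameter stands in for `partition.log.exists(_.remoteLogEnabled())`. The collapsed form matches the current one only under two assumptions: every diskless topic has remote storage enabled (which is what the atomic switch would provide), and a consolidating diskless topic is always reported as diskless.

```scala
object IsConsolidatingSketch {
  // Hypothetical stand-in for the metadata view; not the real Inkless interface.
  trait MetadataView {
    def isConsolidatingDisklessTopic(topic: String): Boolean
    def isDisklessTopic(topic: String): Boolean
  }

  // Current form, mirroring the condition in the diff. None models a
  // partition without a local log; Some(flag) carries the log's remote flag.
  def isConsolidatingCurrent(
    view: MetadataView,
    topic: String,
    remoteLogEnabled: Option[Boolean]
  ): Boolean =
    view.isConsolidatingDisklessTopic(topic) ||
      (view.isDisklessTopic(topic) && remoteLogEnabled.contains(true))

  // Possible future form, valid only under the assumptions stated above.
  def isConsolidatingCollapsed(view: MetadataView, topic: String): Boolean =
    view.isDisklessTopic(topic)
}
```

The two forms differ today exactly when a diskless topic has no local log yet or has remote storage disabled, which is the gap the atomic switch is meant to close.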

@jeqo merged commit dfd7275 into main on Jun 30, 2026
5 checks passed
@jeqo deleted the svv/ts-unification-recover-leader branch on June 30, 2026 09:36
2 participants