Skip to content

Sort lineage entries by timestamp then id for deterministic list output#18859

Draft
spapin wants to merge 1 commit into
apache:masterfrom
spapin:fix/segment-lineage-ordering
Draft

Sort lineage entries by timestamp then id for deterministic list output#18859
spapin wants to merge 1 commit into
apache:masterfrom
spapin:fix/segment-lineage-ordering

Conversation

@spapin

@spapin spapin commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Problem

SegmentLineage.toJsonObject() sorted entries by millisecond timestamp only, over a HashMap. Entries sharing a millisecond fell back to arbitrary UUID-hash iteration order, so the list API output was non-deterministic. testListSegmentLineage asserts two back-to-back startReplaceSegments entries appear in a fixed order; on a fast machine the two share a millisecond ~57% of the time, and the assertion failed ~29% of runs (measured over 1000 runs).

Fix

Add the entry id as a sort tiebreaker: entries are ordered by (timestamp, then id). This makes the list output deterministic for a given set of entries. Entries remain identified by UUID. The code generally doesn't make any use of the ordering, but segmentLineage prints a summary where segments at the same millisecond may appear out of order.

The test derives the expected order with the same (timestamp, id) sort and asserts the response lists the two entries in that order — deterministic regardless of whether the timestamps tie.

@spapin

spapin commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

The biggest benefit for this PR is to fix the flaky test on fast machines. This isn't an impactful bug otherwise.

@codecov-commenter

codecov-commenter commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.82%. Comparing base (bec3dcc) to head (0dd64ac).

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18859      +/-   ##
============================================
+ Coverage     57.00%   64.82%   +7.81%     
- Complexity        7     1322    +1315     
============================================
  Files          2607     3393     +786     
  Lines        152679   211308   +58629     
  Branches      24816    33226    +8410     
============================================
+ Hits          87041   136971   +49930     
- Misses        58227    63277    +5050     
- Partials       7411    11060    +3649     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.82% <100.00%> (+7.81%) ⬆️
temurin 64.82% <100.00%> (+7.81%) ⬆️
unittests 64.81% <100.00%> (+7.81%) ⬆️
unittests1 57.00% <0.00%> (-0.01%) ⬇️
unittests2 37.19% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@spapin spapin force-pushed the fix/segment-lineage-ordering branch 7 times, most recently from 51b27a4 to 4b7ef85 Compare June 27, 2026 14:56
SegmentLineage.toJsonObject() sorted entries by millisecond timestamp only, over a
HashMap, so entries sharing a millisecond came out in arbitrary (UUID-hash) order
-- non-deterministic list output, and a ~29% flake in testListSegmentLineage (two
back-to-back replaces share a millisecond ~57% of the time on a fast machine).

Add the entry id as a sort tiebreaker so the output is deterministic for a given
set of entries: (timestamp, then id). No change to id generation -- entries remain
identified by UUID. testListSegmentLineage derives the expected order with the same
(timestamp, id) sort and asserts the response matches.
@spapin spapin force-pushed the fix/segment-lineage-ordering branch from 4b7ef85 to 0dd64ac Compare June 27, 2026 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants