Skip to content

Enforce validDocIds consensus in upsert task generators and add includeBitmaps to validDocIdsMetadata API#18853

Open
deepthi912 wants to merge 5 commits into
apache:masterfrom
deepthi912:includeBitmapsInValidDocIdsMetadata
Open

Enforce validDocIds consensus in upsert task generators and add includeBitmaps to validDocIdsMetadata API#18853
deepthi912 wants to merge 5 commits into
apache:masterfrom
deepthi912:includeBitmapsInValidDocIdsMetadata

Conversation

@deepthi912

@deepthi912 deepthi912 commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Summary

Make the upsert task generators reject segments with inconsistent replicas before scheduling, instead of relying on the executor to fail those tasks after the fact. Concretely, this PR:

  1. Adds an opt-in includeBitmaps flag to the validDocIdsMetadata endpoint so the controller can batch-fetch each replica's validDocIds bitmap in one call.
  2. Applies the executor's validDocIds consensus checks (CRC match, server health, and EQUAL / UNSAFE / MOST_VALID_DOCS) in UpsertCompactionTaskGenerator and UpsertCompactMergeTaskGenerator, skipping any segment whose replicas disagree.
  3. Adds a validDocIdsValidationMode config (STRICT default / EXECUTOR_ONLY) to toggle the generator-side checks.

Follow-up to #17696 (which added the executor-side consensus), per this review comment.

Background

#17696 made the upsert compaction executor reconcile validDocIds across replicas before compacting — it checks CRC, server health, and a configurable validDocIdsConsensusMode (UNSAFE / EQUAL / MOST_VALID_DOCS), failing the task rather than letting a less-complete replica overwrite a more-complete one.

The generator, however, still scheduled a task for every eligible segment, even when its replicas were inconsistent (a CRC mismatch mid-reload, an unhealthy server, or divergent validDocIds). The executor then has to fail those tasks to stay safe — which wastes a segment download and a task slot every cycle, and the same segment keeps getting re-picked and re-failed.

What this PR does

  1. includeBitmaps option on the validDocIdsMetadata endpoint. POST /tables/{tableNameWithType}/validDocIdsMetadata?includeBitmaps=true returns the serialized validDocIds bitmap in each per-segment-replica entry, so the controller can compare replica bitmaps in one batched call instead of a request per segment-replica. Off by default; response unchanged when not requested (@JsonInclude(NON_NULL)), backward compatible.

  2. Generator-side consensus in UpsertCompactionTaskGenerator and UpsertCompactMergeTaskGenerator — the same CRC / server-health / validDocIdsConsensusMode checks the executor uses, skipping any segment whose replicas disagree. Bitmaps are fetched only for EQUAL; strict modes also require responses from all assigned replicas.

  3. New validDocIdsValidationMode config: STRICT (default) enforces in both generator and executor; EXECUTOR_ONLY keeps the prior executor-only behavior.

Backward compatibility

validDocIdsValidationMode defaults to STRICT, so generators now enforce consensus by default; set EXECUTOR_ONLY to restore the old behavior. The endpoint change is additive/opt-in.

Testing

Unit tests cover each mode through the generators — EQUAL (agree / disagree / replica missing / CRC mismatch / unhealthy / missing bitmap), UNSAFE, MOST_VALID_DOCS — plus the config parsing/resolution helpers.

@codecov-commenter

codecov-commenter commented Jun 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 56.95364% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.79%. Comparing base (dc4e957) to head (bb8e52f).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
...che/pinot/plugin/minion/tasks/MinionTaskUtils.java 70.00% 15 Missing and 6 partials ⚠️
...tcompactmerge/UpsertCompactMergeTaskGenerator.java 51.61% 13 Missing and 2 partials ⚠️
...t/controller/util/ServerSegmentMetadataReader.java 0.00% 11 Missing ⚠️
...psertcompaction/UpsertCompactionTaskGenerator.java 60.71% 10 Missing and 1 partial ⚠️
...mon/restlet/resources/ValidDocIdsMetadataInfo.java 0.00% 5 Missing ⚠️
.../org/apache/pinot/core/common/MinionConstants.java 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18853      +/-   ##
============================================
- Coverage     64.81%   64.79%   -0.02%     
- Complexity     1322     1346      +24     
============================================
  Files          3393     3393              
  Lines        211246   211401     +155     
  Branches      33208    33250      +42     
============================================
+ Hits         136917   136980      +63     
- Misses        63284    63362      +78     
- Partials      11045    11059      +14     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.79% <56.95%> (-0.02%) ⬇️
temurin 64.79% <56.95%> (-0.02%) ⬇️
unittests 64.79% <56.95%> (-0.02%) ⬇️
unittests1 57.00% <0.00%> (-0.02%) ⬇️
unittests2 37.17% <56.95%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@deepthi912 deepthi912 changed the title Add includeBitmaps option to validDocIdsMetadata endpoint Enforce validDocIds consensus in upsert task generators and add includeBitmaps to validDocIdsMetadata API Jun 26, 2026
@deepthi912 deepthi912 force-pushed the includeBitmapsInValidDocIdsMetadata branch from ef91178 to 8571ed6 Compare June 26, 2026 17:08
Mirror the executor's validDocIds enforcement at generation time so inconsistent
segments are never scheduled. UpsertCompactionTaskGenerator and
UpsertCompactMergeTaskGenerator now validate each segment's replicas (CRC match,
server health, and EQUAL/UNSAFE/MOST_VALID_DOCS consensus) via the shared
MinionTaskUtils.selectValidDocIdsMetadataForConsensus, requesting includeBitmaps
only for EQUAL and requiring all assigned replicas to respond for the strict
modes.

A new validDocIdsValidationMode config (STRICT default, EXECUTOR_ONLY) gates the
generator-side checks: STRICT runs them in both generator and executor;
EXECUTOR_ONLY downgrades the generator to a lenient pick and leaves the executor
as the sole gate.
@deepthi912 deepthi912 force-pushed the includeBitmapsInValidDocIdsMetadata branch from 8571ed6 to cac3f5e Compare June 26, 2026 17:45
@deepthi912 deepthi912 added upsert Related to upsert functionality configuration Config changes (addition/deletion/change in behavior) labels Jun 26, 2026

@xiangfu0 xiangfu0 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one high-signal issue; see inline comment.

if (consensusMode == MinionConstants.ValidDocIdsConsensusMode.EQUAL) {
ValidDocIdsMetadataInfo first = usableReplicas.get(0);
RoaringBitmap consensusBitmap = deserializeBitmapOrNull(first);
if (consensusBitmap == null) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes controller-first rolling upgrades incompatible under the new default STRICT/EQUAL path. Older servers still answer this endpoint but omit bitmap, and ValidDocIdsMetadataInfo explicitly treats that as expected for old servers; here we convert that mixed-version response into a hard skip, so upsert compaction/compact-merge task generation stops until every server is upgraded. Please keep the generator default executor-only, or add an old-server fallback before making bitmap-based prescheduling the default.

@deepthi912 deepthi912 added the backward-incompat Introduces a backward-incompatible API or behavior change label Jun 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backward-incompat Introduces a backward-incompatible API or behavior change configuration Config changes (addition/deletion/change in behavior) upsert Related to upsert functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants