Add ranged-read support to PinotFS #18861

Open
davecromberge wants to merge 1 commit into apache:master from permutive-engineering:feature-contrib/pinotfs-ranged-read

@davecromberge

Member

Description

Adds an additive ranged-read capability to the PinotFS SPI so callers can read a specific byte range of a file without downloading the whole object.

  • New default methods openForRead(URI uri, long offset, long length) and supportsRangedRead() on PinotFS. By default, openForRead throws UnsupportedOperationException and supportsRangedRead returns false, so all existing
    implementations remain source- and binary-compatible, with no behavior change unless an implementation opts in.
  • LocalPinotFS: implemented via RandomAccessFile + a bounded stream that releases the file handle on close.
  • GcsPinotFS: implemented via ReadChannel.seek/limit (a single ranged GET, truncated at EOF).
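
The shape described above can be sketched roughly as follows. This is not the code from the PR: RangedReadFS and LocalRangedReadFS are hypothetical stand-ins for PinotFS and LocalPinotFS, the InputStream return type is an assumption (the description does not state it), and the argument validation is only one plausible choice.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.URI;

// Hypothetical stand-in for the PinotFS SPI; only the two new methods are shown.
interface RangedReadFS {
  // Default: implementations that do not opt in report no support.
  default boolean supportsRangedRead() {
    return false;
  }

  // Default: throw, so existing implementations compile and behave as before.
  // The InputStream return type is an assumption made for this sketch.
  default InputStream openForRead(URI uri, long offset, long length) throws IOException {
    throw new UnsupportedOperationException("Ranged read not supported by " + getClass().getName());
  }
}

// Hypothetical local implementation: RandomAccessFile plus a bounded stream.
class LocalRangedReadFS implements RangedReadFS {
  @Override
  public boolean supportsRangedRead() {
    return true;
  }

  @Override
  public InputStream openForRead(URI uri, long offset, long length) throws IOException {
    if (offset < 0 || length < 0) {
      throw new IllegalArgumentException("offset and length must be non-negative");
    }
    RandomAccessFile file = new RandomAccessFile(new File(uri), "r");
    try {
      file.seek(offset);
    } catch (IOException e) {
      file.close(); // do not leak the handle if the seek fails
      throw e;
    }
    return new BoundedFileStream(file, length);
  }

  // Reads at most `remaining` bytes, stops early at EOF, and closes the file handle on close().
  private static final class BoundedFileStream extends InputStream {
    private final RandomAccessFile file;
    private long remaining;

    BoundedFileStream(RandomAccessFile file, long remaining) {
      this.file = file;
      this.remaining = remaining;
    }

    @Override
    public int read() throws IOException {
      if (remaining <= 0) {
        return -1;
      }
      int b = file.read();
      if (b >= 0) {
        remaining--;
      }
      return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
      if (len == 0) {
        return 0;
      }
      if (remaining <= 0) {
        return -1;
      }
      int n = file.read(buf, off, (int) Math.min(len, remaining));
      if (n > 0) {
        remaining -= n;
      }
      return n;
    }

    @Override
    public void close() throws IOException {
      file.close();
    }
  }
}
```

A caller would typically check supportsRangedRead() first and fall back to a full read when it returns false, which is what keeps the change additive for implementations that have not opted in.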

Motivation

Enables targeted reads of file regions (e.g. Parquet footers and column chunks) for engines/plugins that query columnar data in place, avoiding full-object transfers.
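
As an illustration of the footer case: a Parquet file ends with a 4-byte little-endian footer length followed by the 4-byte magic PAR1, so the footer can be fetched with two small ranged reads. The sketch below is not part of the PR; ParquetFooterSketch and RangedReader are hypothetical names, with RangedReader standing in for a call such as openForRead(uri, offset, length) on a fixed URI, and it assumes the caller already knows the file length.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ParquetFooterSketch {
  // Hypothetical stand-in for a ranged-read call bound to one file.
  interface RangedReader {
    InputStream open(long offset, long length) throws IOException;
  }

  private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);
  private static final int TAIL_SIZE = 8; // 4-byte footer length + 4-byte magic

  // Returns the raw (still Thrift-encoded) footer bytes using two ranged reads.
  static byte[] readFooter(RangedReader reader, long fileLength) throws IOException {
    if (fileLength < MAGIC.length + TAIL_SIZE) {
      throw new IOException("File too small to be Parquet: " + fileLength);
    }
    // Read 1: the last 8 bytes of the file.
    byte[] tail = readFully(reader, fileLength - TAIL_SIZE, TAIL_SIZE);
    if (!Arrays.equals(Arrays.copyOfRange(tail, 4, 8), MAGIC)) {
      throw new IOException("Missing PAR1 magic at end of file");
    }
    int footerLength = ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    long footerStart = fileLength - TAIL_SIZE - footerLength;
    if (footerLength < 0 || footerStart < MAGIC.length) {
      throw new IOException("Invalid footer length: " + footerLength);
    }
    // Read 2: exactly the footer, nothing else.
    return readFully(reader, footerStart, footerLength);
  }

  private static byte[] readFully(RangedReader reader, long offset, int length) throws IOException {
    try (InputStream in = reader.open(offset, length)) {
      byte[] bytes = in.readNBytes(length);
      if (bytes.length != length) {
        throw new IOException("Short read at offset " + offset);
      }
      return bytes;
    }
  }
}
```

Both reads together transfer only 8 bytes plus the footer itself, regardless of how large the data pages are.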

Testing

LocalPinotFSTest covers mid-range, from-start, EOF truncation, zero-length, whole-file, and invalid-argument cases (runs in CI). GcsPinotFSTest adds a ranged-read case (credentials-gated, consistent with the existing GCS integration test).

PR Tags

  1. feature
  2. performance

release-notes:

  • Signature changes to public methods/interfaces

Add additive default methods openForRead(uri, offset, length) and
supportsRangedRead() to the PinotFS SPI. Defaults throw / return false so
existing implementations are unaffected. Implement for LocalPinotFS (via
RandomAccessFile) and GcsPinotFS (via ReadChannel seek/limit), enabling
targeted reads of byte ranges (e.g. Parquet footers and column chunks)
without downloading whole objects.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
davecromberge added a commit to permutive-engineering/pinot that referenced this pull request Jun 26, 2026
Add additive default methods openForRead(uri, offset, length) and
supportsRangedRead() to the PinotFS SPI (default-throws / false, so existing
implementations are unaffected). Implemented for LocalPinotFS (RandomAccessFile)
and GcsPinotFS (ReadChannel seek/limit), enabling targeted byte-range reads
(e.g. Parquet footers and column chunks) without downloading whole objects.

Upstream PR: apache#18861 (open)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented Jun 26, 2026

Codecov Report

❌ Patch coverage is 0% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 37.16%. Comparing base (a9b5207) to head (19eb9a1).
⚠️ Report is 1 commit behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../org/apache/pinot/spi/filesystem/LocalPinotFS.java | 0.00% | 25 Missing ⚠️ |
| .../java/org/apache/pinot/spi/filesystem/PinotFS.java | 0.00% | 2 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (a9b5207) and HEAD (19eb9a1). HEAD has 2 fewer uploads than BASE.

| Flag | BASE (a9b5207) | HEAD (19eb9a1) |
|---|---|---|
| unittests1 | 1 | 0 |
| unittests | 2 | 1 |

Additional details and impacted files
@@              Coverage Diff              @@
##             master   #18861       +/-   ##
=============================================
- Coverage     64.79%   37.16%   -27.64%     
+ Complexity     1322     1321        -1     
=============================================
  Files          3393     3393               
  Lines        211265   211292       +27     
  Branches      33212    33216        +4     
=============================================
- Hits         136896    78519    -58377     
- Misses        63320   125570    +62250     
+ Partials      11049     7203     -3846     

| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (ø) |
| integration | 100.00% <ø> (ø) |
| integration1 | 100.00% <ø> (ø) |
| integration2 | 0.00% <ø> (?) |
| java-21 | 37.16% <0.00%> (-27.64%) ⬇️ |
| temurin | 37.16% <0.00%> (-27.64%) ⬇️ |
| unittests | 37.15% <0.00%> (-27.64%) ⬇️ |
| unittests1 | ? |
| unittests2 | 37.15% <0.00%> (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.