
[GLUTEN-12378][CORE] Return a defensive copy from ConsistentHash.getPartition()#12379

Draft
LuciferYang wants to merge 2 commits into apache:main from LuciferYang:gluten-12378-getpartition-defensive-copy


LuciferYang commented Jun 26, 2026

What changes are proposed in this pull request?

ConsistentHash is @ThreadSafe and its accessors return defensive copies — except getPartition(), which returned the internal set directly:

return nodes.get(node);

That set is the same instance held in the internal nodes map, so a caller could mutate the ring through it (e.g. getPartition(node).clear()), and the reference escaped after the read lock was released. This PR makes getPartition() a safe snapshot accessor:

  • Return a defensive copy of the set, matching getNodes().
  • Make Partition.setSlot() non-public. Copying the set alone still left the Partition elements shared and mutable; since setSlot() was public, a caller could change a slot and corrupt the ring (e.g. so removeNode() later drops the wrong entry). Only add() assigns slots during construction, so setSlot() is now private (reached as a nestmate).
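The first bullet can be illustrated with a minimal sketch. It assumes a simplified ring whose slots are plain `Integer` values and whose map is guarded by a `ReentrantReadWriteLock`; the class name `SnapshotRing`, the `addNode` signature, and the lock type are illustrative assumptions, not the actual `ConsistentHash` code. Only the null check followed by `new HashSet<>(...)` mirrors the change in this PR.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for the ring; not the real ConsistentHash.
final class SnapshotRing {
    private final Map<String, Set<Integer>> nodes = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void addNode(String node, Collection<Integer> slots) {
        lock.writeLock().lock();
        try {
            nodes.put(node, new HashSet<>(slots));
        } finally {
            lock.writeLock().unlock();
        }
    }

    // The copy is made while the read lock is held, so the reference that
    // escapes after unlock() is a private snapshot, never the map value itself.
    Set<Integer> getPartition(String node) {
        lock.readLock().lock();
        try {
            Set<Integer> partitions = nodes.get(node);
            return partitions == null ? null : new HashSet<>(partitions);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, `ring.getPartition(node).clear()` only empties the caller's copy; a later `getPartition(node)` still sees every slot.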

This is latent today — getPartition() has no production caller and the internal set isn't mutated after a node is added — so there's no user-facing change; it's an encapsulation/consistency fix.

Fixes #12378.

How was this patch tested?

Added a unit test that adds a node, clears the set returned by getPartition(), and verifies the ring still reports all partitions. Making setSlot() non-public is enforced at compile time (no external caller exists). The existing ConsistentHashTest cases still pass.
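The shape of that regression check might look roughly like the sketch below. The actual test in `ConsistentHashTest` is not shown on this page, so the sketch runs against a hypothetical stand-in (`RingStandIn`, with `String` nodes and `Integer` slots) and uses a plain boolean helper instead of a test framework; all names here are assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the ring; the real ConsistentHash API may differ.
final class RingStandIn {
    private final Map<String, Set<Integer>> nodes = new HashMap<>();

    void addNode(String node, List<Integer> slots) {
        nodes.put(node, new HashSet<>(slots));
    }

    Set<Integer> getPartition(String node) {
        Set<Integer> partitions = nodes.get(node);
        return partitions == null ? null : new HashSet<>(partitions);
    }
}

final class GetPartitionCopyCheck {
    // Returns true when clearing the returned set leaves the ring's own view unchanged.
    static boolean clearingSnapshotLeavesRingIntact() {
        RingStandIn ring = new RingStandIn();
        ring.addNode("node-1", List.of(10, 20, 30));

        Set<Integer> snapshot = ring.getPartition("node-1");
        int before = snapshot.size();
        snapshot.clear(); // mutate only the caller's copy

        return ring.getPartition("node-1").size() == before;
    }
}
```

If `getPartition()` returned the map value directly, the second call would see an empty set and the helper would return `false`.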

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

Copilot AI review requested due to automatic review settings June 26, 2026 07:03
github-actions bot added the CORE (works for Gluten Core) label Jun 26, 2026
@github-actions

Run Gluten Clickhouse CI on x86

Copilot AI left a comment

Pull request overview

This PR strengthens ConsistentHash’s encapsulation/thread-safety contract by ensuring getPartition() does not expose the internal mutable Set stored in the nodes map, aligning it with the existing defensive-copy behavior of other accessors (e.g., getNodes()).

Changes:

  • Update ConsistentHash.getPartition() to return a defensive copy of the partition set (or null if the node is absent).
  • Add a unit test that verifies mutating the returned set does not affect the ring’s internal state.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| gluten-core/src/main/java/org/apache/gluten/hash/ConsistentHash.java | Returns a defensive copy from getPartition() instead of exposing the internal set. |
| gluten-core/src/test/java/org/apache/gluten/hash/ConsistentHashTest.java | Adds a regression test ensuring callers can’t mutate the ring by clearing the returned partition set. |

Comment on lines +165 to +168
// Return a defensive copy: the map value is internal mutable state, and getNodes() copies
// for the same reason. Callers get a snapshot they can't use to mutate the ring.
Set<Partition<T>> partitions = nodes.get(node);
return partitions == null ? null : new HashSet<>(partitions);

LuciferYang (Contributor, Author) replied:

Good point — addressed in 4d10483. setSlot() is now private (only add() assigns slots during construction, accessible as a nestmate), so callers can no longer mutate ring state through the partitions returned by getPartition(). I kept it to the minimal non-public change rather than restructuring construction for full Partition immutability, since that would touch add()/getPartitionKey() — which #12351 is already changing — and I'd rather not overlap the two PRs.
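Why a private `setSlot()` is still reachable from `add()` can be sketched as follows. This assumes `Partition` is a nested class of the ring, which is what "nestmate" implies; the names `SlotRing`, `getNode`, and the `int` slot type are illustrative assumptions rather than the real Gluten code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified ring with a nested Partition type.
final class SlotRing {
    static final class Partition {
        private final String node;
        private int slot;

        Partition(String node) {
            this.node = node;
        }

        String getNode() {
            return node;
        }

        int getSlot() {
            return slot;
        }

        // Private: only code inside SlotRing and its nested classes can call this.
        private void setSlot(int slot) {
            this.slot = slot;
        }
    }

    private final Map<String, Set<Partition>> nodes = new HashMap<>();

    void add(String node, int... slots) {
        Set<Partition> partitions = new HashSet<>();
        for (int slot : slots) {
            Partition p = new Partition(node);
            p.setSlot(slot); // legal: the enclosing class may use the nested class's private members
            partitions.add(p);
        }
        nodes.put(node, partitions);
    }

    // Shallow copy: the set is new, but the Partition elements are shared with the ring.
    Set<Partition> getPartition(String node) {
        Set<Partition> partitions = nodes.get(node);
        return partitions == null ? null : new HashSet<>(partitions);
    }
}
```

Because the copy is shallow, two snapshots hand out the same `Partition` instances. Code outside `SlotRing` can read `getSlot()` but fails to compile if it calls `setSlot()`, which is the compile-time enforcement mentioned in the PR description.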

LuciferYang marked this pull request as draft June 26, 2026 07:07

…artition()

getPartition() returned the internal partition set directly, so a caller
could mutate the ring's state through it and the reference escaped after the
read lock was released. Every other accessor (e.g. getNodes()) already
copies. Return a defensive copy for consistency and to keep the @ThreadSafe
contract intact. Add a test that mutating the returned set does not affect
the ring.
…n() is a safe snapshot

The defensive copy stops callers from mutating the returned set, but the
Partition elements were still shared and setSlot() was public, so a caller
could change a slot and corrupt the ring (e.g. removeNode() would then drop
the wrong entry). Only add() assigns slots during construction, so make
setSlot private (accessible as a nestmate).
LuciferYang force-pushed the gluten-12378-getpartition-defensive-copy branch from 4d10483 to 12f31b9 June 26, 2026 09:25


Labels

CORE works for Gluten Core

Development

Successfully merging this pull request may close these issues.

ConsistentHash.getPartition() returns the internal partition set instead of a copy

2 participants