fix(warp_reduce): add explicit add_op overload to resolve CUB template ambiguity on CUDA 13.2+ by zbrad · Pull Request #3050 · NVIDIA/raft

zbrad · 2026-06-08T11:23:30Z

Summary

On CUDA 13.2 (SM 121, DGX Spark), IVF-PQ builds fail with an ambiguous template instantiation error. When CUB scan kernels call warpReduce(val, raft::add_op{}), both raft::warpReduce<T, ReduceLambda> and cub::detail::scan::warpReduce<Tp, ScanOpT&> match, producing a compile error.

Fix: Add an explicit overload that fixes the operator type to raft::add_op in cpp/include/raft/util/reduction.cuh:

template <typename T>
DI T warpReduce(T val, raft::add_op reduce_op)

The explicit overload is preferred by the compiler over the generic ReduceLambda overload, resolving the ambiguity without changing any existing behavior.
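The overload-resolution argument above can be sketched in plain host C++. This is a simplified analogue under stated assumptions, not the raft or CUB code: `mylib` and `vendor` are hypothetical stand-ins for raft and cub::detail::scan, all parameters are taken by value, and each function returns a tag that identifies the selected overload instead of performing a reduction.

```cpp
// Host-only sketch; `mylib` and `vendor` are hypothetical stand-ins
// for raft and cub::detail::scan. Return values are tags that show
// which overload was selected.
namespace mylib {
struct add_op {
  template <typename T>
  constexpr T operator()(T a, T b) const { return a + b; }
};

// Generic overload: accepts any reduction functor.
template <typename T, typename ReduceLambda>
constexpr int warpReduce(T, ReduceLambda) { return 1; }

// Overload with the functor type fixed. It is more specialized than any
// (T, Op) template, so partial ordering of function templates prefers it.
template <typename T>
constexpr int warpReduce(T, add_op) { return 2; }
}  // namespace mylib

namespace vendor {
template <typename T, typename Op>
constexpr int warpReduce(T, Op) { return 3; }

// Unqualified call: argument-dependent lookup also searches mylib
// when Op is mylib::add_op.
template <typename T, typename Op>
constexpr int scan(T val, Op op) { return warpReduce(val, op); }
}  // namespace vendor

// With only the two generic templates this call would be ambiguous;
// with the add_op overload present it resolves to tag 2.
static_assert(vendor::scan(1, mylib::add_op{}) == 2, "add_op overload selected");

// Other functors still reach the generic mylib overload.
struct my_max {
  constexpr int operator()(int a, int b) const { return a > b ? a : b; }
};
static_assert(mylib::warpReduce(1, my_max{}) == 1, "generic overload selected");
```

Note that in this sketch the call issued from inside `vendor` lands in `mylib`'s overload rather than in `vendor`'s own function.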

cpp/include/raft/util/reduction.cuh — explicit raft::add_op overload for warpReduce
cpp/tests/util/reduction.cu — regression test (WARP_REDUCE_WITH_ADD_OP) to prevent future regressions

Repro / context

Observed on DGX Spark (SM 121) with CUDA 13.2. The upstream CI does not test CUDA 13.2 / SM 121; the new test compiles and passes on all CUDA versions but will catch regressions if CUDA 13.2 support is added to CI.

Error seen without fix:

error: more than one instance of overloaded function "warpReduce" matches the argument list

Test plan

WARP_REDUCE_WITH_ADD_OP regression test added in cpp/tests/util/reduction.cu
Verified fix compiles and tests pass on CUDA 13.2 (SM 121, DGX Spark)
No changes to existing warpReduce behavior — explicit overload only activates for raft::add_op

🤖 Generated with Claude Code


copy-pr-bot · 2026-06-08T11:23:33Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

achirkin

Thank you for reporting the issue! Could you please provide a reference or example where the bug is triggered? Normally, I'd assume the fix should be on the user side: just call the function with an explicit namespace.

achirkin · 2026-06-08T13:27:12Z

+{
+  assert(gridDim.x == 1);
+  int th_val = input[threadIdx.x];
+  th_val     = raft::warpReduce(th_val, raft::add_op{});


Does this really cause an ambiguity without the extra overload in util/reduction.cuh? It's called here with the raft namespace, so I doubt the CUB overload is ever picked up here.
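The point about namespace qualification can be illustrated with a small host-only C++ sketch. It is an assumption-laden analogue, not the raft or CUB sources: `mylib` and `vendor` are hypothetical stand-ins, and the functions return tags rather than reducing anything. A qualified call names one function set directly and disables argument-dependent lookup, so only an unqualified call can see both namespaces.

```cpp
// Host-only sketch; `mylib` and `vendor` are hypothetical stand-ins.
namespace mylib {
struct add_op {};

template <typename T, typename Op>
constexpr int warpReduce(T, Op) { return 1; }
}  // namespace mylib

namespace vendor {
template <typename T, typename Op>
constexpr int warpReduce(T, Op) { return 3; }

// Qualified call: only vendor::warpReduce is considered.
template <typename T, typename Op>
constexpr int qualified_scan(T val, Op op) { return vendor::warpReduce(val, op); }

// An unqualified `warpReduce(val, op)` here would also find
// mylib::warpReduce through argument-dependent lookup when Op is
// mylib::add_op, and the two identical signatures would be ambiguous.
}  // namespace vendor

// A caller that spells out the namespace never sees the other overload.
static_assert(mylib::warpReduce(1, mylib::add_op{}) == 1, "mylib overload");
static_assert(vendor::qualified_scan(1, mylib::add_op{}) == 3, "vendor overload");
```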

zbrad

It showed up for me during a full source rebuild of cuvs on the DGX Spark. When I finally debugged the build failure, I traced it to raft, but it only showed up when building for arm64.

zbrad

Oh, I also have a regression test for it that I'm submitting to cuvs; it's at cpp/tests/regression/warp_reduce_add_op.cu

divyegala · 2026-06-08T16:40:02Z

@zbrad can you provide steps to reproduce this bug? I have a DGX Spark, and I have been building CUDA 13.2 without any failures.

zbrad · 2026-06-08T19:43:29Z

Some other folks had asked for more background, so I went back and re-created the original failure from compiling cuvs. I've attached the doc and the repro.
warp reduce ambiguity doc
repro

achirkin · 2026-06-09T07:00:52Z

Thanks for the reproducer documentation! I feel uneasy about both suggested workarounds:

raft extra overload: it doesn't protect us from someone else failing exactly the same way on another device operation.
cuvs swap to thrust::plus: because we have a best practice guideline to use the raft operations where possible.

Maybe we could make raft's warpReduce template itself a bit more restrictive, so that it doesn't get in the way of thrust+cub primitives?
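One possible reading of keeping the template out of the way is an argument-dependent lookup barrier. The thread does not propose this specific technique; the sketch below is a host-only C++ illustration with hypothetical names (`mylib`, `mylib::ops`, `vendor`) and tag return values. The operator types move into a nested namespace that contains no `warpReduce`, so an unqualified call made elsewhere with such an operator no longer finds the library's function.

```cpp
// Host-only sketch of an ADL barrier; all names are hypothetical.
namespace mylib {
namespace ops {
// Operator types live in a namespace that declares no warpReduce.
struct add_op {
  constexpr int operator()(int a, int b) const { return a + b; }
};
}  // namespace ops

// mylib::add_op still names the type, but its associated namespace
// for argument-dependent lookup is mylib::ops, not mylib.
using ops::add_op;

template <typename T, typename Op>
constexpr int warpReduce(T, Op) { return 1; }
}  // namespace mylib

namespace vendor {
template <typename T, typename Op>
constexpr int warpReduce(T, Op) { return 3; }

// Unqualified call: lookup searches vendor and mylib::ops only.
template <typename T, typename Op>
constexpr int scan(T val, Op op) { return warpReduce(val, op); }
}  // namespace vendor

// The vendor-internal call is no longer ambiguous and stays in vendor.
static_assert(vendor::scan(1, mylib::add_op{}) == 3, "vendor overload");
// Qualified calls into mylib keep working for every operator.
static_assert(mylib::warpReduce(1, mylib::add_op{}) == 1, "mylib overload");
```

Unlike a per-operator overload, this covers every operator placed in the nested namespace, at the cost of changing the operators' fully qualified names.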

zbrad requested a review from a team as a code owner June 8, 2026 11:23

github-project-automation Bot added this to Unstructured Data Processing Jun 8, 2026

filter artifacts

23bb5ac

achirkin requested changes Jun 8, 2026

zbrad wants to merge 2 commits into NVIDIA:main from zbrad:fix/warp-reduce-cub-ambiguity


3 participants
