[WIP][VL] Support GPU async native shuffle read #12370

Draft

marin-ma wants to merge 2 commits into apache:main from marin-ma:gpu-async-native-shuffle-read

marin-ma (Contributor) commented Jun 25, 2026 (edited)

The parallelism of GPU stages is limited by GPU concurrency, but the shuffle read process, which includes data fetching, decompression, and deserialization, still runs on the CPU. We can therefore parallelize these CPU-side steps and produce the output data asynchronously.

This PR adopts a producer–consumer design. Producer threads asynchronously perform shuffle reads, including data fetching, decompression, and deserialization, and produce decoded data. The consumer (the main thread) retrieves the prepared data as it becomes available and creates the corresponding device buffers.
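For illustration only, the producer-consumer flow described above could look roughly like the following C++17 sketch. It is not the code in this PR: `DecodedBatch`, `BatchQueue`, `readAndDecode`, and `runAsyncShuffleRead` are hypothetical names, the fetch/decompress/deserialize work is replaced by a placeholder, and the device-buffer creation is reduced to counting values on the host.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one decoded shuffle block held in host memory.
struct DecodedBatch {
  int partitionId;
  std::vector<int32_t> values;
};

// Bounded queue shared by the producer threads and the single consumer.
class BatchQueue {
 public:
  BatchQueue(size_t capacity, int numProducers)
      : capacity_(capacity), activeProducers_(numProducers) {
    assert(capacity > 0);
  }

  // Blocks while the queue is full so producers cannot run far ahead.
  void push(DecodedBatch batch) {
    std::unique_lock<std::mutex> lock(mu_);
    notFull_.wait(lock, [this] { return queue_.size() < capacity_; });
    queue_.push(std::move(batch));
    notEmpty_.notify_one();
  }

  // Called exactly once by each producer when it has no more data.
  void producerDone() {
    std::lock_guard<std::mutex> lock(mu_);
    --activeProducers_;
    notEmpty_.notify_all();
  }

  // Returns std::nullopt once all producers finished and the queue is drained.
  std::optional<DecodedBatch> pop() {
    std::unique_lock<std::mutex> lock(mu_);
    notEmpty_.wait(lock, [this] { return !queue_.empty() || activeProducers_ == 0; });
    if (queue_.empty()) {
      return std::nullopt;
    }
    DecodedBatch batch = std::move(queue_.front());
    queue_.pop();
    notFull_.notify_one();
    return batch;
  }

 private:
  std::mutex mu_;
  std::condition_variable notFull_;
  std::condition_variable notEmpty_;
  std::queue<DecodedBatch> queue_;
  size_t capacity_;
  int activeProducers_;
};

// Placeholder for fetching, decompressing, and deserializing one block.
DecodedBatch readAndDecode(int partitionId, int blockId) {
  return DecodedBatch{partitionId, std::vector<int32_t>(4, partitionId * 100 + blockId)};
}

// Runs the producers, consumes on the calling thread, and returns the number
// of values the consumer received.
size_t runAsyncShuffleRead(int numProducers, int blocksPerProducer) {
  BatchQueue queue(/*capacity=*/2, numProducers);
  std::vector<std::thread> producers;
  for (int p = 0; p < numProducers; ++p) {
    producers.emplace_back([&queue, p, blocksPerProducer] {
      for (int b = 0; b < blocksPerProducer; ++b) {
        queue.push(readAndDecode(p, b));
      }
      queue.producerDone();
    });
  }

  size_t received = 0;
  while (auto batch = queue.pop()) {
    // A real consumer would create the device buffer from the batch here.
    received += batch->values.size();
  }
  for (auto& t : producers) {
    t.join();
  }
  return received;
}
```

The bounded capacity is a design choice in this sketch, not something the PR states: it keeps the amount of decoded host data waiting for the consumer limited when the producers are faster than the GPU side.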

github-actions Bot added CORE VELOX labels

github-actions Bot commented Jun 25, 2026

Run Gluten Clickhouse CI on x86

marin-ma force-pushed the gpu-async-native-shuffle-read branch from e5cc3aa to 0713a26 on June 25, 2026 17:05

marin-ma force-pushed the gpu-async-native-shuffle-read branch from 0713a26 to bebb887 on June 25, 2026 17:11

marin-ma force-pushed the gpu-async-native-shuffle-read branch from bebb887 to 5ecc27c on June 25, 2026 17:13

marin-ma force-pushed the gpu-async-native-shuffle-read branch from f840d7a to c1d2691 on June 25, 2026 18:09

marin-ma added 2 commits on June 26, 2026 09:29

support async shuffle read (9c2997c)

cpp (7c4a796)

marin-ma force-pushed the gpu-async-native-shuffle-read branch from b716f71 to 7c4a796 on June 26, 2026 08:34
