
[WIP][VL] Support GPU async native shuffle read#12370

Draft
marin-ma wants to merge 2 commits into
apache:mainfrom
marin-ma:gpu-async-native-shuffle-read

Conversation


@marin-ma marin-ma commented Jun 25, 2026

Contributor

The parallelism of GPU stages is limited by GPU concurrency, but the shuffle read process, which includes data fetching, decompression, and deserialization, still runs on the CPU. We can therefore parallelize these CPU-side steps and produce the output data asynchronously.

This PR adopts a producer–consumer design. Producer threads asynchronously perform shuffle reads, including data fetching, decompression, and deserialization, and produce decoded data. The consumer (the main thread) retrieves the prepared data as it becomes available and creates the corresponding device buffers.
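For readers unfamiliar with the pattern, here is a minimal, language-agnostic sketch of the producer-consumer flow described above, written in Python for brevity. It is not the code in this PR: `encode_block`, `producer`, `to_device_buffer`, `async_shuffle_read`, and the `max_pending` parameter are all hypothetical names, `zlib` and `pickle` merely stand in for the real shuffle codec and serializer, and the "device buffer" is a plain list.

```python
import pickle
import queue
import threading
import zlib

_DONE = object()  # sentinel: one producer has finished its partition


def encode_block(rows):
    """Hypothetical helper: build a compressed, serialized shuffle block."""
    return zlib.compress(pickle.dumps(rows))


def producer(blocks, out_queue):
    """Hypothetical producer: decompress and deserialize blocks on a CPU thread."""
    try:
        for block in blocks:
            decoded = pickle.loads(zlib.decompress(block))
            out_queue.put(decoded)  # blocks when the queue is full (backpressure)
    finally:
        out_queue.put(_DONE)  # always signal completion, even on error


def to_device_buffer(rows):
    """Stand-in for creating a device buffer; here it only copies the rows."""
    return list(rows)


def async_shuffle_read(partitions, max_pending=4):
    """Consumer (main thread): yield buffers as soon as decoded data is ready."""
    out_queue = queue.Queue(maxsize=max_pending)
    threads = [
        threading.Thread(target=producer, args=(blocks, out_queue), daemon=True)
        for blocks in partitions
    ]
    for t in threads:
        t.start()
    remaining = len(threads)
    while remaining:
        item = out_queue.get()
        if item is _DONE:
            remaining -= 1
            continue
        yield to_device_buffer(item)
```

In this sketch the bounded queue caps how much decoded data can sit in host memory ahead of the consumer, and batches from different producers may arrive in any order. A real implementation would also need to propagate producer errors to the consumer, which the sketch omits.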

@github-actions github-actions Bot added CORE works for Gluten Core VELOX labels Jun 25, 2026
@github-actions


Run Gluten Clickhouse CI on x86

@marin-ma marin-ma force-pushed the gpu-async-native-shuffle-read branch from e5cc3aa to 0713a26 Compare June 25, 2026 17:05

@marin-ma marin-ma force-pushed the gpu-async-native-shuffle-read branch from 0713a26 to bebb887 Compare June 25, 2026 17:11

@marin-ma marin-ma force-pushed the gpu-async-native-shuffle-read branch from bebb887 to 5ecc27c Compare June 25, 2026 17:13

@marin-ma marin-ma force-pushed the gpu-async-native-shuffle-read branch from f840d7a to c1d2691 Compare June 25, 2026 18:09

@marin-ma marin-ma force-pushed the gpu-async-native-shuffle-read branch from b716f71 to 7c4a796 Compare June 26, 2026 08:34

