Pull requests: ROCm/aiter

Pull requests list

[Tune] Add qwen3.5-397B MXFP4 a16w16 GEMM tuning configs
#3974 opened Jun 28, 2026 by yichiche Contributor
2 of 3 tasks
[CK] Fix MoE 2-stage dispatch for non-128-divisible inter_dim
#3973 opened Jun 27, 2026 by jonahbernard
1 task done
Add gelu_tanh activation to no-quant CK 2-stage fused MoE
#3972 opened Jun 27, 2026 by jonahbernard
1 task done
[Triton] Add gluon attn reduce
#3971 opened Jun 27, 2026 by vgokhale Contributor
[Triton]Add gluon moe reduce
#3970 opened Jun 27, 2026 by vgokhale Contributor
[gfx1250][FlyDSL] Optimize a8w8 ptpc gemm kernel
#3969 opened Jun 27, 2026 by aoli26 Contributor Draft
1 task done
[Triton] [gfx12] Tunning of A8W8 blockscale GEMM
#3967 opened Jun 27, 2026 by k50112113 Contributor
[Kernel][Perf] split-K long-context decode for shuffled fp8 SWA path
#3962 opened Jun 26, 2026 by reger-men
3 tasks done
[Kernel][Triton] sliding-window decode over shuffled fp8 paged KV
#3959 opened Jun 26, 2026 by reger-men
2 of 3 tasks
bf16 asm mha: add mask=0 kernel
#3957 opened Jun 26, 2026 by tingchen988 Contributor
1 task
CI: avoid fp8 KV cache in Kimi vLLM gate (label: ci:kimi)
#3954 opened Jun 26, 2026 by gyohuangxin Member
[Perf]update moe tuned config
#3946 opened Jun 26, 2026 by lalala-sh Contributor
1 task
Dev/fly pa reduce jit build
#3944 opened Jun 26, 2026 by Bernard-Liu Contributor
1 task
Feat/flydsl mxfp4 gemm
#3941 opened Jun 26, 2026 by lizamd
2 of 4 tasks
[Triton] Add fused_gemm_a16w16_split_cat
#3940 opened Jun 25, 2026 by rbrugaro-amd Contributor
Map top-left map to bottom-right for self-attn
#3939 opened Jun 25, 2026 by Micky774 Contributor
1 task
gate custom all-reduce on XGMI topology
#3938 opened Jun 25, 2026 by skysnow2001 Contributor
1 task done
Gluon MXFP4 Fuse Reduce Quant
#3937 opened Jun 25, 2026 by amd-jrosas
1 task done
[DO NOT MERGE] [TESTING CI] (label: ci:all)
#3935 opened Jun 25, 2026 by Boss2002n Contributor
[test] test_topk_plain: parametrize sweep to fix collection-time OOM
#3934 opened Jun 25, 2026 by JohnQinAMD Contributor
1 task
edit aiter_opus_plus.h using opus api instead of asm code
#3932 opened Jun 25, 2026 by junhaha666 Contributor
1 task
fix(quick_all_reduce): make flag sync CUDA-graph-safe
#3928 opened Jun 25, 2026 by Jasen2201 Contributor