Pull requests: ROCm/TransformerEngine
Add Tealite: pure-Python TransformerEngine for ROCm/AMD GPUs
#581
opened May 7, 2026 by
jayfurmanek
Contributor
7 of 8 tasks
NVFP4: Work around intermittent incorrect results for backward GEMMs
ci-level 3
CI test level 3
#580
opened May 7, 2026 by
matthiasdiener
Contributor
13 tasks
CK Tile MXFP8 Group GEMM gfx1250
ci-level 1
CI test level 1
#578
opened May 6, 2026 by
aris134
Contributor
1 of 13 tasks
Update AITER CK dependency for gfx1250 grouped GEMM
ci-level 1
CI test level 1
#577
opened May 6, 2026 by
aris134
Contributor
13 tasks
CK Tile Group GEMM gfx1250
ci-level 1
CI test level 1
#576
opened May 6, 2026 by
aris134
Contributor
1 of 13 tasks
ck_tile grouped gemm: more padding
#574
opened May 5, 2026 by
matthiasdiener
Contributor
Draft
1 of 13 tasks
[ROCm] Allow bf16/bf16/fp32 in nvte_multi_tensor_gemm dispatcher
ci-level 1
CI test level 1
#573
opened May 4, 2026 by
lizamd
13 tasks
gfx1250 swizzle_xor changes for FP4
ci-level 1
CI test level 1
#571
opened May 1, 2026 by
matthiasdiener
Contributor
1 of 13 tasks
[No Merge][No Review] testing aiter auto trigger on gh action
ci-level 2
CI test level 2
#570
opened May 1, 2026 by
VeeraRajasekhar
Contributor
Draft
13 tasks
[proof-of-concept] add MXFP8 pre-swizzling for gfx1250
ci-level 1
CI test level 1
#568
opened Apr 29, 2026 by
matthiasdiener
Contributor
Draft
13 tasks
HipKittens MXFP8 GEMM Support
ci-level 1
CI test level 1
#566
opened Apr 28, 2026 by
alextmagro
Contributor
Update QoLA reducing [compile time, kernel count, lib size] by ~2x (Diet QoLA)
ci-level 3
CI test level 3
#563
opened Apr 27, 2026 by
Micky774
Contributor
1 of 13 tasks
Enable CI lint gh action on ROCm
ci-level 3
CI test level 3
#547
opened Apr 17, 2026 by
VeeraRajasekhar
Contributor
13 tasks
CI: auto-trigger AITER prebuilt upload when 3rdparty/aiter updates on dev
#543
opened Apr 15, 2026 by
VeeraRajasekhar
Contributor
8 of 13 tasks
[TE] Phase 2 of small-seq cross-attn integration: a separate cpp backend and a new jax api
ci-level 3
CI test level 3
#542
opened Apr 15, 2026 by
VeeraRajasekhar
Contributor
13 tasks
Integrate AITER fused RoPE kernels with fallback to TE native
#541
opened Apr 15, 2026 by
suachong
Contributor
7 tasks done
NV upstream release 2.12 merge
ci-level 3
CI test level 3
#538
opened Apr 13, 2026 by
Micky774
Contributor
13 tasks
NVFP4: hadamard_transform_cast_fusion_columnwise
ci-level 1
CI test level 1
#515
opened Apr 1, 2026 by
matthiasdiener
Contributor
Draft
1 of 13 tasks
Add fsdp2 fp8 unit tests TE 2.10
ci-level 3
CI test level 3
#492
opened Mar 17, 2026 by
sudhu2k
Contributor
8 of 13 tasks
Add AITER fused RoPE dispatch to FusedRoPEFunc
#489
opened Mar 17, 2026 by
sarthak-amd
Contributor
ASV-format microbenchmark suite
#487
opened Mar 16, 2026 by
Micky774
Contributor
1 of 13 tasks
Add Claude to review PRs
#480
opened Mar 13, 2026 by
wenchenvincent
Collaborator
1 of 6 tasks
Microbenchmarking, CSV-based
#478
opened Mar 10, 2026 by
matthiasdiener
Contributor
Draft
4 of 16 tasks