Pull requests: deepseek-ai/DeepGEMM
fix: correct fused KV cache stride in paged MQA logits
#311
opened Apr 19, 2026 by
JasonOA888
fix: correct operator precedence in pack_ue8m0_to_int assertion
#310
opened Apr 19, 2026 by
kuishou68
change sm100_fp8_mqa_logits to 2cta, and change mma acc output to f16
#307
opened Apr 17, 2026 by
benzh-2025
Fix JIT cache race condition with multi-process compilation
#302
opened Apr 11, 2026 by
Gregory-Pereira
feat: support bf16 output and plain TMA writes in k_grouped_gemm on SM90
#298
opened Mar 26, 2026 by
fedorovgv
add caller location for util functions for better error message
#282
opened Jan 20, 2026 by
YouJiacheng
Draft
Add pyproject.toml for PEP 518 build system compliance
#271
opened Dec 31, 2025 by
yurekami
Contributor
3 tasks
Add split-k optimization for sm90, reduce through DSMEM.
#186
opened Sep 5, 2025 by
Insideyyy
[Feat] Single Batch Overlap (SBO): Overlapping of Down GEMM with Combine Send
#183
opened Sep 2, 2025 by
Sulfur6