feat: add Blackwell GPU (sm_120) CUDA support #401
mvanhorn wants to merge 1 commit into jamiepine:main
Set TORCH_CUDA_ARCH_LIST in the CUDA build step to include 12.0+PTX for forward compatibility with Blackwell GPUs (RTX 5070 Ti, 5080, etc.).

Pre-built PyTorch cu128 wheels only ship native kernels for sm_80/86/89/90. Without this, Blackwell GPU users get "no kernel image is available for execution on the device" at runtime.

Fixes jamiepine#386

Related: jamiepine#395, jamiepine#396, jamiepine#399, jamiepine#400
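To illustrate why adding 12.0+PTX matters, here is a minimal sketch of the general CUDA compatibility rule: native kernels run on devices of the same major version with an equal or higher minor version, while embedded PTX can be JIT-compiled for any equal or newer compute capability. The helpers parse_arch_list and can_run are hypothetical names, not part of PyTorch or this PR, and the parser only handles numeric entries such as 8.6 or 12.0+PTX (not named architectures or suffixed variants such as 9.0a). The first list in the usage example mirrors the architectures the description says the cu128 wheels ship.

```python
import re


def parse_arch_list(value):
    """Parse a TORCH_CUDA_ARCH_LIST-style string into (major, minor, has_ptx) tuples.

    Hypothetical helper: accepts numeric entries separated by spaces or semicolons.
    """
    entries = []
    for token in re.split(r"[;\s]+", value.strip()):
        if not token:
            continue
        match = re.fullmatch(r"(\d+)\.(\d+)(\+PTX)?", token)
        if match is None:
            raise ValueError(f"unrecognized arch entry: {token!r}")
        has_ptx = match.group(3) is not None
        entries.append((int(match.group(1)), int(match.group(2)), has_ptx))
    return entries


def can_run(arch_list, capability):
    """Return True if a device with the given (major, minor) capability is covered."""
    major, minor = capability
    for a_major, a_minor, has_ptx in parse_arch_list(arch_list):
        # Native code runs on the same major version with an equal or higher minor.
        if a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any equal or newer compute capability.
        if has_ptx and (a_major, a_minor) <= (major, minor):
            return True
    return False


# Blackwell consumer GPUs report compute capability 12.0 (sm_120).
print(can_run("8.0 8.6 8.9 9.0", (12, 0)))
print(can_run("8.0 8.6 8.9 9.0 12.0+PTX", (12, 0)))
```

Under these assumptions, a list without any PTX entry and without a 12.x entry leaves an sm_120 device with no usable kernel image, which matches the runtime error quoted above.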
CodeRabbit review: No actionable comments were generated in the recent review.
Walkthrough: The GitHub Actions release workflow for Windows now explicitly specifies CUDA GPU architectures.
Estimated code review effort: 1 (Trivial), ~3 minutes
Pre-merge checks: 5 passed
Adds TORCH_CUDA_ARCH_LIST to the CUDA build step in release.yml to include 12.0+PTX for Blackwell GPU forward compatibility.

Five reports (#386, #395, #396, #399, #400) came from RTX 50-series users hitting 'no kernel image' errors because pre-built PyTorch cu128 doesn't include sm_120.

Fixes #386.

This contribution was developed with AI assistance (Claude Code).