An OpenSpec-driven CUDA reference implementation of FlashAttention with FP32 and FP16 support and both forward and backward kernels. The stable v0.3.0 baseline is intended for long-term handoff and archival.

| Resource | Link |
|---|---|
| Repository | LessUp/cuflash-attn |
| Releases | GitHub Releases |
| OpenSpec specs | openspec/specs/ |