
CuFlash-Attn

OpenSpec-driven CUDA FlashAttention reference implementation with FP32/FP16 support, forward and backward kernels, and a stable v0.3.0 baseline aimed at long-term handoff and archival quality.

What this site is for

  • explain the supported API and build surface
  • point to the canonical OpenSpec design and verification sources
  • provide a clean entry point for integration, review, and handoff work

Resource        Link
Repository      LessUp/cuflash-attn
Releases        GitHub Releases
OpenSpec specs  openspec/specs/

