Attention Performance
Coming Soon
Detailed FlashAttention benchmarks are being prepared. Check back soon.
Quick Summary
| Sequence Length | TensorCraft Latency | cuDNN Latency | Ratio (cuDNN / TensorCraft) |
|---|---|---|---|
| 1024 | 0.5ms | 0.4ms | 80% |
| 4096 | 2.1ms | 1.8ms | 85% |
| 8192 | 8.5ms | 7.2ms | 85% |
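
The page does not define the Ratio column, but the values are consistent with cuDNN latency divided by TensorCraft latency, so a higher percentage means TensorCraft is closer to cuDNN. The sketch below recomputes the ratio from the latencies in the table under that assumption. The names `latencies_ms` and `relative_speed` are illustrative and not part of any TensorCraft API.

```python
# Latencies in milliseconds, copied from the summary table:
# sequence length -> (TensorCraft, cuDNN)
latencies_ms = {
    1024: (0.5, 0.4),
    4096: (2.1, 1.8),
    8192: (8.5, 7.2),
}


def relative_speed(tensorcraft_ms: float, cudnn_ms: float) -> float:
    """Assumed definition of the Ratio column: cuDNN time / TensorCraft time."""
    return cudnn_ms / tensorcraft_ms


# Recompute the ratio for every sequence length in the table.
ratios = {
    seq_len: relative_speed(tensorcraft, cudnn)
    for seq_len, (tensorcraft, cudnn) in latencies_ms.items()
}

for seq_len, ratio in ratios.items():
    print(f"{seq_len:>5}: {ratio:.1%}")
```

The recomputed values land within one percentage point of the published column. The small differences are expected because the table shows latencies rounded to one decimal place.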
See the Benchmarks Overview for more information.