Attention Performance

Coming Soon

Detailed FlashAttention benchmarks are being prepared. Check back soon.

Quick Summary

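| Sequence Length | TensorCraft | cuDNN | Ratio |
|-----------------|-------------|-------|-------|
| 1024            | 0.5 ms      | 0.4 ms | 80%  |
| 4096            | 2.1 ms      | 1.8 ms | 85%  |
| 8192            | 8.5 ms      | 7.2 ms | 85%  |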
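
The page does not define the Ratio column. The figures are consistent with cuDNN time divided by TensorCraft time, that is, the fraction of cuDNN's speed that TensorCraft reaches, but that reading is an assumption. The sketch below recomputes the column under that assumption; the function and variable names are illustrative and not part of TensorCraft.

```python
# Timings copied from the Quick Summary table: (sequence length, TensorCraft ms, cuDNN ms).
ROWS = [
    (1024, 0.5, 0.4),
    (4096, 2.1, 1.8),
    (8192, 8.5, 7.2),
]


def ratio_percent(tensorcraft_ms: float, cudnn_ms: float) -> float:
    """Assumed definition: cuDNN time as a percentage of TensorCraft time."""
    return 100.0 * cudnn_ms / tensorcraft_ms


# Map each sequence length to its recomputed ratio.
ratios = {seq_len: ratio_percent(tc, cd) for seq_len, tc, cd in ROWS}

for seq_len, pct in ratios.items():
    print(f"{seq_len:>5}: {pct:.1f}%")
```

Under this definition the recomputed values are roughly 80.0%, 85.7%, and 84.7%, so the percentages in the table appear to be rounded approximations.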

See the Benchmarks Overview for more information.

Released under the Apache 2.0 License.