Evidence

This section is the proof surface for TensorCraft-HPC.

Performance claims are paired with method, caveats, and source pages.
Method: Benchmark summaries, methodology notes, and cross-links to references.
Source: Benchmarks, whitepaper, and references routes.
TensorCraft-HPC performance evidence chart

Performance summary

Performance Benchmarks

Relative performance compared to NVIDIA libraries on A100 80GB (FP16 Tensor Core)

GEMM (FP16) vs cuBLAS, Tensor Core enabled: 92% (baseline 100%)
FlashAttention vs cuDNN, memory-efficient tiling: 85% (baseline 100%)
LayerNorm vs cuDNN, fused kernel: 95% (baseline 100%)
Conv2D vs cuDNN, Im2Col optimization: 78% (baseline 100%)
SpMV (CSR) vs cuSPARSE, CSR format: 88% (baseline 100%)
Average: 88%
Best: 95%
Kernels: 5
📊 Benchmarks run on A100 80GB, CUDA 12.4, Tensor Core enabled.
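
The summary figures can be checked against the per-kernel numbers. The sketch below is illustrative only and is not part of TensorCraft-HPC. It assumes the reported average is an unweighted arithmetic mean of the five relative-performance percentages, rounded to the nearest whole percent, and the dictionary name `relative_pct` is hypothetical.

```python
from statistics import mean

# Relative performance (% of the NVIDIA library baseline), copied from the
# benchmark summary on this page. The dictionary name is hypothetical.
relative_pct = {
    "GEMM (FP16) vs cuBLAS": 92,
    "FlashAttention vs cuDNN": 85,
    "LayerNorm vs cuDNN": 95,
    "Conv2D vs cuDNN": 78,
    "SpMV (CSR) vs cuSPARSE": 88,
}

# Assumption: the page's "Average" is an unweighted mean over all kernels.
average = mean(list(relative_pct.values()))

# The "Best" figure is the highest relative performance among the kernels.
best_kernel = max(relative_pct, key=relative_pct.get)

print(f"Kernels: {len(relative_pct)}")                             # Kernels: 5
print(f"Average: {average:.1f}% (rounds to {round(average)}%)")    # Average: 87.6% (rounds to 88%)
print(f"Best: {relative_pct[best_kernel]}% ({best_kernel})")       # Best: 95% (LayerNorm vs cuDNN)
```

Under that assumption the unrounded mean is 87.6%, which matches the reported 88% after rounding, and the best result corresponds to the fused LayerNorm kernel.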

What belongs here

Released under the MIT License.