GPU SpMV: Read the project as an engineering artifact

Why this project deserves a whitepaper

Sparse matrix-vector multiplication (SpMV) is a classic memory-bandwidth-bound workload, so performance depends more on memory access patterns than on raw arithmetic throughput.
The interesting part is not only which kernels exist, but why a given kernel is chosen, when it is chosen, and how that choice is justified.
This project combines CUDA performance work with RAII resource management, explicit error handling, and readable documentation.
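
To make the memory-bound claim concrete, here is a minimal host-side sketch in C++. It is not taken from the project: `CsrMatrix`, `spmv_reference`, and `arithmetic_intensity` are hypothetical names, and the estimate assumes single-precision values, 32-bit indices, and ideal caching in which every element of x is read from memory exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not the project's actual type.
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<int> row_ptr;   // rows + 1 offsets into col_idx / values
    std::vector<int> col_idx;   // column index of each nonzero
    std::vector<float> values;  // value of each nonzero
};

// CPU reference for y = A * x, useful as a correctness baseline for GPU kernels.
std::vector<float> spmv_reference(const CsrMatrix& a, const std::vector<float>& x) {
    assert(a.row_ptr.size() == a.rows + 1);
    assert(a.col_idx.size() == a.values.size());
    assert(x.size() == a.cols);

    std::vector<float> y(a.rows, 0.0f);
    for (std::size_t r = 0; r < a.rows; ++r) {
        float sum = 0.0f;
        for (int k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];  // one multiply and one add per nonzero
        }
        y[r] = sum;
    }
    return y;
}

// Rough arithmetic intensity in FLOP per byte.
// Assumption: x is read once and y is written once (ideal caching).
double arithmetic_intensity(const CsrMatrix& a) {
    const double nnz = static_cast<double>(a.values.size());
    const double flops = 2.0 * nnz;
    const double bytes = nnz * (sizeof(float) + sizeof(int))  // values + col_idx
                       + (a.rows + 1) * sizeof(int)           // row_ptr
                       + a.cols * sizeof(float)               // x
                       + a.rows * sizeof(float);              // y
    return flops / bytes;
}
```

Under these assumptions each nonzero costs 2 floating-point operations but at least 8 bytes of matrix traffic, so the intensity stays below 0.25 FLOP per byte regardless of matrix size. That is far below what a GPU needs to become compute-bound, which is why access patterns dominate.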

This whitepaper covers four things:
Why the problem matters and where the real bottlenecks are.
What each optimized kernel and the kernel selector are responsible for.
How performance, engineering discipline, and explainability are tied together.
Where to continue reading for architecture, API usage, performance interpretation, and references.

Page	Role
Design Philosophy	See the architectural priorities and trade-offs
Performance Analysis	Learn how to interpret the benchmark evidence
Architecture Overview	Understand the execution pipeline and module boundaries
API Reference	Inspect the external interface
References	Review papers, projects, and further reading