Getting Started
This document helps you quickly set up the environment and run your first benchmark.
Requirements
| Component | Version | Check Command |
|---|---|---|
| CUDA Toolkit | 12.x | nvcc --version |
| CMake | 3.18+ | cmake --version |
| C++ Compiler | C++17 support | g++ --version |
| GPU | SM 7.0+ (compute capability 7.0, Volta or newer) | nvidia-smi |
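
The check commands in the table can be run one by one, or a quick pre-flight pass can first confirm that each tool is on PATH. The following is a minimal sketch, not part of the repository: the helper name missing_tools is hypothetical, and it only tests for presence, not for the versions listed above.

```bash
# Report which of the given commands are missing from PATH.
# Prints one missing name per line; returns 0 only if all were found.
missing_tools() {
  local status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Tools named in the Requirements table.
if missing="$(missing_tools nvcc cmake g++ nvidia-smi)"; then
  echo "All required tools found on PATH."
else
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
fi
```

If a tool is reported missing, install it or fix PATH before moving on to the build steps.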
Step 1: Clone Repository
bash
git clone https://github.com/LessUp/mini-inference-engine.git
cd mini-inference-engine
Step 2: Debug Build + Tests
Use the system GCC 12 / G++ 12 preset when your shell has Conda or another custom C++ toolchain active.
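To see whether this applies to your shell, one rough check is whether g++ resolves to a path inside the active Conda environment. The sketch below assumes that CONDA_PREFIX is set (as conda activate normally does) and that the compiler is invoked as g++; the helper name path_is_under_prefix is hypothetical, and a Conda toolchain exposed only through CXX or a prefixed binary name would not be caught.

```bash
# Return 0 if the given path lives under the given prefix directory.
# Both arguments are plain strings, so this works without Conda installed.
path_is_under_prefix() {
  local candidate="$1" prefix="$2"
  [ -n "$prefix" ] || return 1
  case "$candidate" in
    "$prefix"/*) return 0 ;;
    *) return 1 ;;
  esac
}

cxx_path="$(command -v g++ 2>/dev/null || true)"
if path_is_under_prefix "$cxx_path" "${CONDA_PREFIX:-}"; then
  echo "g++ resolves to a Conda toolchain ($cxx_path); consider the gcc-cuda preset."
else
  echo "g++ resolves to: ${cxx_path:-not found}"
fi
```
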
bash
# Configure Debug build with system GCC 12 / G++ 12
cmake --preset gcc-cuda
# Build
cmake --build --preset gcc-cuda
# Run tests
ctest --preset gcc-cuda
Expected Output:
Test project /path/to/mini-inference-engine/build-gcc-cuda
Start 1: test_config
1/8 Test #1: test_config ..................... Passed 0.01 sec
Start 2: test_logger
2/8 Test #2: test_logger ..................... Passed 0.01 sec
...
8/8 Test #8: test_gemm ....................... Passed 0.52 sec
100% tests passed, 0 tests failed out of 8
Step 3: Release Build + Benchmark
bash
# Configure Release build with system GCC 12 / G++ 12
cmake --preset release-gcc-cuda
# Build
cmake --build --preset release-gcc-cuda
# Run benchmark
./build-release-gcc-cuda/benchmark
Expected Output:
=== Mini-Inference Engine Benchmark ===
GPU: NVIDIA GeForce RTX 3080
Matrix size: 1024 x 1024
Kernel              Time (ms)    TFLOPS    vs cuBLAS
----------------------------------------------------
Naive                   15.23      0.14        10.2%
Tiled                    7.61      0.28        20.4%
Coalesced                6.12      0.35        25.3%
Double Buffer            3.85      0.56        40.8%
Register Blocked         1.82      1.19        86.5%
Vectorized               1.71      1.25        91.2%
cuBLAS                   1.56      1.37       100.0%
Step 4: Run MNIST Demo
bash
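./build-release/mnist_demo
This validates correctness by running the inference engine on the MNIST dataset. The path above assumes a build directory named build-release; if you built with the release-gcc-cuda preset in Step 3, the binary may be under build-release-gcc-cuda instead.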
Common Issues
Q: Cannot find CUDA?
Ensure the CUDA_PATH environment variable is set correctly:
bash
export CUDA_PATH=/usr/local/cuda
export PATH=$CUDA_PATH/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH
Q: Tests are skipped?
If no NVIDIA GPU is available, GPU tests are skipped automatically. This is expected behavior.
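To confirm whether a GPU is actually visible from your shell, you can count the devices listed by nvidia-smi -L, which prints one "GPU N: ..." line per device. This is an independent sanity check and only a sketch: how the project's tests detect a GPU is not described here, and the helper name count_gpus is hypothetical.

```bash
# Count devices in the text printed by `nvidia-smi -L`
# (one line per GPU, each starting with "GPU <index>:").
count_gpus() {
  printf '%s\n' "$1" | grep -c '^GPU [0-9][0-9]*:'
}

if command -v nvidia-smi >/dev/null 2>&1; then
  gpu_list="$(nvidia-smi -L 2>/dev/null || true)"
else
  gpu_list=""
fi
echo "Visible NVIDIA GPUs: $(count_gpus "$gpu_list")"
```

A count of 0 on a machine that does have an NVIDIA GPU usually points to a driver or container-passthrough problem rather than to the project itself.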
Q: Build errors?
- Check that your CUDA Toolkit version is 12.x (nvcc --version)
- Check that your C++ compiler supports C++17
- If Conda is active, prefer the system GCC 12 / G++ 12 preset: cmake --preset gcc-cuda
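
The version checks above can be scripted. The sketch below parses the "release X.Y" text from nvcc --version and compares the CMake version against 3.18 from the Requirements table. The helper names nvcc_release and version_ge are hypothetical, and version_ge assumes a sort that supports -V (as GNU coreutils does).

```bash
# Extract "MAJOR.MINOR" from the "release X.Y" text in `nvcc --version` output.
nvcc_release() {
  printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Return 0 if version $1 >= version $2 (uses version-aware `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v nvcc >/dev/null 2>&1; then
  cuda_ver="$(nvcc_release "$(nvcc --version)")"
  case "$cuda_ver" in
    12.*) echo "CUDA $cuda_ver matches the 12.x requirement." ;;
    *)    echo "CUDA ${cuda_ver:-unknown} is outside the 12.x requirement." ;;
  esac
else
  echo "nvcc not found on PATH."
fi

if command -v cmake >/dev/null 2>&1; then
  # The first line of `cmake --version` reads "cmake version X.Y.Z".
  cmake_ver="$(cmake --version | head -n 1 | awk '{print $3}')"
  if version_ge "$cmake_ver" 3.18; then
    echo "CMake $cmake_ver satisfies 3.18+."
  else
    echo "CMake $cmake_ver is older than 3.18."
  fi
else
  echo "cmake not found on PATH."
fi
```
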
Next Steps
- Architecture - Understand system design
- GEMM Optimization - Deep dive into optimization
- Learning Path - Systematic study plan