Getting Started

This document helps you quickly set up the environment and run your first benchmark.

Requirements

Component	Version	Check Command
CUDA Toolkit	12.x	`nvcc --version`
CMake	3.18+	`cmake --version`
C++ Compiler	C++17 support	`g++ --version`
GPU	SM 7.0+ (compute capability)	`nvidia-smi --query-gpu=name,compute_cap --format=csv`
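
The table above can be checked in one pass before configuring. The following is a minimal sketch, not part of the repository: `check_tool` is a hypothetical helper name, and the script only confirms that each tool is on `PATH`. It does not compare versions, so still run the check commands above if a build fails.

```bash
# Hypothetical helper: report whether a tool is on PATH.
# Returns 0 if the tool is found, 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check every tool from the requirements table.
# Keep going after a failure so all gaps are reported at once.
missing=0
for tool in nvcc cmake g++ nvidia-smi; do
    check_tool "$tool" || missing=1
done

if [ "$missing" -eq 0 ]; then
    echo "All required tools are on PATH."
else
    echo "Install the missing tools before configuring."
fi
```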

Step 1: Clone Repository

bash

git clone https://github.com/LessUp/mini-inference-engine.git
cd mini-inference-engine

Step 2: Debug Build + Tests

Use the system GCC 12 / G++ 12 preset when your shell has Conda or another custom C++ toolchain active.

bash

# Configure Debug build with system GCC 12 / G++ 12
cmake --preset gcc-cuda

# Build
cmake --build --preset gcc-cuda

# Run tests
ctest --preset gcc-cuda

Expected Output:

Test project /path/to/mini-inference-engine/build-gcc-cuda
    Start 1: test_config
1/8 Test #1: test_config .....................   Passed    0.01 sec
    Start 2: test_logger
2/8 Test #2: test_logger .....................   Passed    0.01 sec
...
8/8 Test #8: test_gemm .......................   Passed    0.52 sec

100% tests passed, 0 tests failed out of 8

Step 3: Release Build + Benchmark

bash

# Configure Release build with system GCC 12 / G++ 12
cmake --preset release-gcc-cuda

# Build
cmake --build --preset release-gcc-cuda

# Run benchmark
./build-release-gcc-cuda/benchmark

Expected Output (sample from an RTX 3080; timings vary by GPU):

=== Mini-Inference Engine Benchmark ===
GPU: NVIDIA GeForce RTX 3080
Matrix size: 1024 x 1024

Kernel              Time (ms)   TFLOPS    vs cuBLAS
----------------------------------------------------
Naive               15.23       0.14      10.2%
Tiled               7.61        0.28      20.4%
Coalesced           6.12        0.35      25.3%
Double Buffer       3.85        0.56      40.8%
Register Blocked    1.82        1.19      86.5%
Vectorized          1.71        1.25      91.2%
cuBLAS              1.56        1.37      100.0%
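
The TFLOPS column can be sanity-checked by hand. The sketch below assumes the benchmark counts 2·N³ floating-point operations for an N x N matrix multiply, which is the conventional count for GEMM; the source does not state the formula it uses. `tflops` is a hypothetical helper name, not a tool shipped with the project.

```bash
# Hypothetical helper: TFLOPS for a square N x N GEMM.
# Assumes 2 * N^3 floating-point operations (one multiply and one add per term).
#   $1 = matrix dimension N
#   $2 = kernel time in milliseconds
tflops() {
    awk -v n="$1" -v ms="$2" \
        'BEGIN { printf "%.2f\n", (2 * n * n * n) / (ms * 1e-3) / 1e12 }'
}

tflops 1024 15.23   # Naive row: prints 0.14
tflops 1024 7.61    # Tiled row: prints 0.28
```

Under that assumption, the results agree with the Naive and Tiled rows of the sample output. Small differences on other rows can come from rounding of the displayed times.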

Step 4: Run MNIST Demo

bash

./build-release-gcc-cuda/mnist_demo

This runs the inference engine on the MNIST dataset to validate correctness.

Common Issues

Q: Cannot find CUDA?

Ensure the `CUDA_PATH` environment variable is set correctly:

bash

export CUDA_PATH=/usr/local/cuda
export PATH=$CUDA_PATH/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH
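
To confirm that the exported path actually points at a CUDA install, a quick check along these lines may help. It is a sketch under one assumption: that a valid install has an executable `bin/nvcc` under `CUDA_PATH`. `cuda_path_ok` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed only if the given directory looks like a CUDA install,
# meaning it is non-empty and contains an executable bin/nvcc.
cuda_path_ok() {
    local dir="$1"
    [ -n "$dir" ] && [ -x "$dir/bin/nvcc" ]
}

if cuda_path_ok "${CUDA_PATH:-}"; then
    echo "CUDA_PATH looks valid: $CUDA_PATH"
else
    echo "CUDA_PATH is unset or has no bin/nvcc: '${CUDA_PATH:-}'" >&2
fi
```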

Q: Tests are skipped?

If no NVIDIA GPU is available, the GPU tests are skipped automatically. This is expected behavior.
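
If you expected the GPU tests to run, it can help to check whether the driver sees a GPU at all. This is a minimal sketch that relies on `nvidia-smi -L` (list GPUs); `gpu_visible` is a hypothetical helper name, and the project's own tests may detect the GPU differently.

```bash
# Hypothetical helper: return 0 if nvidia-smi is installed and can list at least
# one GPU, 1 otherwise (tool missing, driver not loaded, or no device).
gpu_visible() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

if gpu_visible; then
    echo "GPU detected: GPU tests should run."
else
    echo "No GPU detected: skipped GPU tests are expected."
fi
```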

Q: Build errors?

Check that the installed CUDA Toolkit is version 12.x (`nvcc --version`)
Check that the C++ compiler supports C++17 (`g++ --version`)
If Conda is active, prefer `cmake --preset gcc-cuda`

Next Steps

Architecture - Understand system design
GEMM Optimization - Deep dive into optimization
Learning Path - Systematic study plan
