
Architecture Overview

Mini-OpenCV uses a three-layer architecture designed for performance, modularity, and ease of use.

Three-Layer Design

Layer Responsibilities

1. Application Layer

The top-level API that users interact with:

| Component | Purpose |
| --- | --- |
| ImageProcessor | Main entry point for image operations |
| PipelineProcessor | Chain multiple operations with async execution |

2. Operator Layer

CUDA kernels implementing image processing algorithms:

| Category | Operations | CUDA Technique |
| --- | --- | --- |
| Pixel | Invert, grayscale, brightness | Per-pixel parallelism |
| Convolution | Gaussian blur, Sobel, custom kernels | Shared memory tiling |
| Histogram | Calculation, equalization | Atomic operations + reduction |
| Geometric | Resize, rotate, flip, affine | Bilinear interpolation |
| Morphology | Erosion, dilation, open/close | Custom structuring elements |
| Threshold | Global, adaptive, Otsu | Histogram-driven |
| Color Space | RGB/HSV/YUV conversion | Matrix operations |
| Filters | Median, bilateral, sharpen | Edge-preserving filters |
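
As a rough illustration of the per-pixel parallelism listed for pixel operations, the sketch below inverts an 8-bit single-channel image with one thread per pixel. The names `invert_kernel` and `invert_on_gpu` are hypothetical and are not taken from Mini-OpenCV, and the 16x16 block size is an arbitrary but common choice.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// One thread per pixel: each thread inverts a single 8-bit value.
__global__ void invert_kernel(const unsigned char* in, unsigned char* out,
                              int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {  // guard threads in partial edge blocks
        int idx = y * width + x;
        out[idx] = 255 - in[idx];
    }
}

// Hypothetical host helper: uploads the image, inverts it, downloads the result.
// Returns false if the arguments are inconsistent or any CUDA call fails.
bool invert_on_gpu(const std::vector<unsigned char>& src,
                   std::vector<unsigned char>& dst, int width, int height) {
    if (width <= 0 || height <= 0) return false;
    const std::size_t bytes = static_cast<std::size_t>(width) * height;
    if (src.size() != bytes) return false;
    dst.resize(bytes);

    unsigned char* d_in = nullptr;
    unsigned char* d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaError_t err = cudaMemcpy(d_in, src.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(16, 16);
        // Round up so the grid covers images whose size is not a multiple of 16.
        dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
        invert_kernel<<<grid, block>>>(d_in, d_out, width, height);
        err = cudaGetLastError();  // reports launch configuration errors
    }
    // The blocking copy on the default stream waits for the kernel to finish.
    if (err == cudaSuccess) {
        err = cudaMemcpy(dst.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Because no thread reads another thread's pixel, this pattern needs no shared memory or synchronization, which is what separates the pixel category from the convolution and histogram rows.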

3. Infrastructure Layer

Core utilities for GPU computing:

| Component | Purpose |
| --- | --- |
| DeviceBuffer | RAII GPU memory management |
| GpuImage | Image container with GPU memory |
| CudaError | Error handling and checking |
| ImageIO | Image file I/O (JPEG, PNG, BMP) |
| StreamManager | CUDA stream pool for async execution |
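
To make the RAII and error-checking ideas concrete, here is a minimal sketch of how such utilities are commonly written on top of the CUDA runtime. It is not the library's `DeviceBuffer` or `CudaError`; the names `check_cuda` and `ScopedDeviceBuffer` are hypothetical and the real interfaces may differ.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: converts a failed CUDA call into a C++ exception.
inline void check_cuda(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        throw std::runtime_error(std::string(what) + ": " + cudaGetErrorString(err));
    }
}

// Illustrative RAII wrapper: owns exactly one device allocation.
template <typename T>
class ScopedDeviceBuffer {
public:
    explicit ScopedDeviceBuffer(std::size_t count) : count_(count) {
        check_cuda(cudaMalloc(&ptr_, count * sizeof(T)), "cudaMalloc");
    }
    ~ScopedDeviceBuffer() { cudaFree(ptr_); }  // freeing nullptr is a no-op

    // Non-copyable: two owners would free the same pointer twice.
    ScopedDeviceBuffer(const ScopedDeviceBuffer&) = delete;
    ScopedDeviceBuffer& operator=(const ScopedDeviceBuffer&) = delete;

    // Movable: ownership transfers and the source is left empty.
    ScopedDeviceBuffer(ScopedDeviceBuffer&& other) noexcept
        : ptr_(std::exchange(other.ptr_, nullptr)),
          count_(std::exchange(other.count_, 0)) {}
    ScopedDeviceBuffer& operator=(ScopedDeviceBuffer&& other) noexcept {
        if (this != &other) {
            cudaFree(ptr_);
            ptr_ = std::exchange(other.ptr_, nullptr);
            count_ = std::exchange(other.count_, 0);
        }
        return *this;
    }

    T* data() const { return ptr_; }
    std::size_t size() const { return count_; }

private:
    T* ptr_ = nullptr;
    std::size_t count_ = 0;
};
```

With a wrapper of this kind, an early return or an exception between allocation and use cannot leak device memory, because the destructor always runs when the buffer goes out of scope.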

Data Flow

Memory Model

Zero-Copy Optimization

Key optimizations:

  1. Lazy Allocation: Memory allocated on first use
  2. Buffer Reuse: Memory pool for temporary buffers
  3. Async Transfer: Overlap compute and transfer using CUDA streams
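
The first two items can be sketched together as a scratch buffer that allocates nothing until it is first asked for memory and then hands back the same allocation for any later request that fits. This is an illustration under assumptions, not Mini-OpenCV's memory pool: `LazyScratch` and its methods are hypothetical names, and a real pool would typically manage several buffers per stream.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical scratch buffer combining lazy allocation and reuse.
class LazyScratch {
public:
    LazyScratch() = default;                   // no device memory touched here
    ~LazyScratch() { cudaFree(ptr_); }         // freeing nullptr is a no-op
    LazyScratch(const LazyScratch&) = delete;
    LazyScratch& operator=(const LazyScratch&) = delete;

    // Returns a device pointer with at least `bytes` of capacity.
    // Returns nullptr if allocation fails (or if bytes == 0 before any allocation).
    // Contents are NOT preserved when the buffer has to grow.
    void* acquire(std::size_t bytes) {
        if (bytes <= capacity_) return ptr_;   // reuse: no cudaMalloc call
        void* grown = nullptr;
        if (cudaMalloc(&grown, bytes) != cudaSuccess) return nullptr;
        cudaFree(ptr_);                        // release the smaller buffer
        ptr_ = grown;
        capacity_ = bytes;
        ++allocations_;
        return ptr_;
    }

    std::size_t capacity() const { return capacity_; }
    int allocations() const { return allocations_; }  // number of cudaMalloc calls made

private:
    void* ptr_ = nullptr;
    std::size_t capacity_ = 0;
    int allocations_ = 0;
};
```

Reuse matters because `cudaMalloc` and `cudaFree` are comparatively expensive and can synchronize the device, so operators that need temporary storage on every call benefit from keeping it around.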

CUDA Stream Pipeline

Multi-stream execution lets memory transfers and kernels issued to different CUDA streams overlap.
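
A minimal sketch of this pattern, assuming the host buffer is pinned (for example allocated with `cudaMallocHost`), splits the data into chunks and issues each chunk's upload, kernel, and download on its own stream. The names `brighten_kernel` and `brighten_pipelined` and the choice of four streams are illustrative assumptions, not the library's `StreamManager` API.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Adds `delta` to every byte, clamping the result to [0, 255].
__global__ void brighten_kernel(unsigned char* data, std::size_t n, int delta) {
    std::size_t i = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
    if (i < n) {
        int v = data[i] + delta;
        data[i] = static_cast<unsigned char>(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
}

// Hypothetical helper: processes `n` bytes in place, one chunk per stream, so that
// one chunk's copies can overlap another chunk's kernel.
// `pinned_host` should be page-locked; with pageable memory the copies still work
// but are not truly asynchronous.
bool brighten_pipelined(unsigned char* pinned_host, std::size_t n, int delta) {
    constexpr int kStreams = 4;
    cudaStream_t streams[kStreams];
    int created = 0;
    unsigned char* d_buf = nullptr;

    cudaError_t err = cudaMalloc(&d_buf, n);
    while (err == cudaSuccess && created < kStreams) {
        err = cudaStreamCreate(&streams[created]);
        if (err == cudaSuccess) ++created;
    }

    const std::size_t chunk = (n + kStreams - 1) / kStreams;  // ceil(n / kStreams)
    for (int k = 0; k < kStreams && err == cudaSuccess; ++k) {
        const std::size_t offset = k * chunk;
        if (offset >= n) break;
        const std::size_t len = (n - offset < chunk) ? n - offset : chunk;
        unsigned char* d = d_buf + offset;
        unsigned char* h = pinned_host + offset;
        // Work within one stream runs in order; different streams may overlap.
        cudaMemcpyAsync(d, h, len, cudaMemcpyHostToDevice, streams[k]);
        brighten_kernel<<<static_cast<unsigned>((len + 255) / 256), 256, 0, streams[k]>>>(
            d, len, delta);
        cudaMemcpyAsync(h, d, len, cudaMemcpyDeviceToHost, streams[k]);
        err = cudaGetLastError();
    }

    cudaError_t sync_err = cudaDeviceSynchronize();  // wait for all streams
    if (err == cudaSuccess) err = sync_err;
    for (int k = 0; k < created; ++k) cudaStreamDestroy(streams[k]);
    cudaFree(d_buf);
    return err == cudaSuccess;
}
```

How much overlap actually occurs depends on the device's copy engines and on the chunks being large enough to hide launch overhead, so the stream count is worth tuning per workload.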

Supported GPU Architectures

| Architecture | Compute Capability | Example GPUs |
| --- | --- | --- |
| Turing | SM 75 | RTX 20 series, T4 |
| Ampere | SM 80/86 | A100, RTX 30 series |
| Ada Lovelace | SM 89 | RTX 40 series, L4 |
| Hopper | SM 90 | H100 |
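
One way an application might check at runtime whether the current GPU matches the table above is to query its compute capability through the CUDA runtime. The helpers below are a hypothetical sketch (`sm_version`, `is_listed_architecture`, and `query_sm` are not library functions), and they only test membership in the listed set rather than assuming anything about architectures the table omits.

```cuda
#include <cuda_runtime.h>

// Packs a compute capability as major * 10 + minor, e.g. 7.5 -> 75,
// matching the "SM 75" notation used in the table.
inline int sm_version(int major, int minor) { return major * 10 + minor; }

// True only for the compute capabilities listed in the table.
inline bool is_listed_architecture(int sm) {
    return sm == 75 || sm == 80 || sm == 86 || sm == 89 || sm == 90;
}

// Returns the packed compute capability of `device`, or -1 if the query fails
// (for example when no CUDA device is present).
inline int query_sm(int device) {
    cudaDeviceProp prop{};
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    return sm_version(prop.major, prop.minor);
}
```

A caller could run `is_listed_architecture(query_sm(0))` at startup and warn or fall back when it returns false.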


Released under the MIT License.