Mini-OpenCV
v3.0.0 · CUDA 14 · C++17
A high-performance CUDA image processing library that achieves a 30-50× speedup over CPU OpenCV. It supports 9+ operator categories through a three-layer architecture and a clean C++17 API, and ships with a comprehensive GoogleTest suite and Google Benchmark baselines.
30-50× Speedup · 9+ Operators · MIT License

Core Features

⚡ High Performance
CUDA kernel optimizations: shared memory tiling, atomic operations, warp-level primitives
🧠 Smart Memory
Zero-copy optimization minimizes host-device transfers, and memory pool reuse reduces allocation overhead
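
To make the memory pool idea concrete, here is a minimal sketch of a size-bucketed buffer pool. It is not the Mini-OpenCV implementation: the class name `BufferPool` and its methods are hypothetical, and `std::malloc`/`std::free` stand in for a device allocator (for example `cudaMalloc`/`cudaFree`) so the sketch runs on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Hypothetical size-bucketed buffer pool (illustration only, not the library's code).
// std::malloc/std::free are placeholders for a device allocator.
class BufferPool {
public:
    BufferPool() = default;
    BufferPool(const BufferPool&) = delete;
    BufferPool& operator=(const BufferPool&) = delete;

    // Free every buffer that is currently cached in the pool.
    ~BufferPool() {
        for (auto& bucket : free_) {
            for (void* p : bucket.second) std::free(p);
        }
    }

    // Return a cached buffer of exactly `bytes` if one is free; otherwise allocate.
    void* acquire(std::size_t bytes) {
        assert(bytes > 0);
        auto it = free_.find(bytes);
        if (it != free_.end() && !it->second.empty()) {
            void* p = it->second.back();
            it->second.pop_back();
            ++reuses_;
            return p;
        }
        ++allocations_;
        return std::malloc(bytes);
    }

    // Hand a buffer back to the pool instead of freeing it.
    // `bytes` must match the size passed to acquire().
    void release(void* p, std::size_t bytes) {
        assert(p != nullptr);
        free_[bytes].push_back(p);
    }

    std::size_t allocations() const { return allocations_; }
    std::size_t reuses() const { return reuses_; }

private:
    std::unordered_map<std::size_t, std::vector<void*>> free_;
    std::size_t allocations_ = 0;  // calls that reached the underlying allocator
    std::size_t reuses_ = 0;       // calls served from the cache
};
```

Acquiring a 1024-byte buffer, releasing it, and acquiring 1024 bytes again returns the same pointer with only one underlying allocation. Exact-size buckets keep the sketch short; a real pool would more likely round sizes up to reduce fragmentation, and buffers that are never released are not reclaimed by this sketch.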
🏗️ Three-Layer Architecture
Application → Operator → Infrastructure layers, with a clear separation of concerns
📊 9+ Operators
Convolution, morphology, geometric transforms, histogram, threshold, color space, filters, etc.
🧪 Well Tested
GoogleTest unit tests + Google Benchmark performance baselines ensure correctness and performance
📖 Bilingual Docs
Complete API reference, architecture guides, and tutorials in English and Chinese

Performance Comparison

| Operation | OpenCV CPU | Mini-OpenCV GPU | Speedup |
| --- | --- | --- | --- |
| Gaussian Blur (4K) | 45.2 ms | 1.2 ms | 37.7× |
| Sobel Edge (4K) | 38.1 ms | 0.9 ms | 42.3× |
| Bilateral Filter (4K) | 180.5 ms | 4.8 ms | 37.6× |
| Histogram Equalization (4K) | 12.3 ms | 0.3 ms | 41.0× |

Measured on an RTX 4090 (GPU) against an Intel i9-13900K (CPU), using 3840×2160 images
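
The speedup column is the CPU time divided by the GPU time, rounded to one decimal place. The helpers below are a small sketch of that arithmetic; the function names are illustrative and not part of the library.

```cpp
#include <cassert>
#include <cmath>

// Speedup as reported in the table: CPU time divided by GPU time.
inline double speedup(double cpu_ms, double gpu_ms) {
    assert(gpu_ms > 0.0);
    return cpu_ms / gpu_ms;
}

// Round to one decimal place, matching the table's precision.
inline double roundToTenth(double x) {
    return std::round(x * 10.0) / 10.0;
}
```

For the Gaussian Blur row, `roundToTenth(speedup(45.2, 1.2))` gives 37.7, and the other three rows reproduce the same way (42.3, 37.6, and 41.0).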

Quick Start

```cpp
#include "gpu_image/gpu_image_processing.hpp"
using namespace gpu_image;

// Create processor and upload an image to the GPU.
// hostImage is image data the caller has already loaded into host memory.
ImageProcessor processor;
GpuImage gpu = processor.loadFromHost(hostImage);

// Apply operations (all GPU-accelerated)
GpuImage blurred = processor.gaussianBlur(gpu, 5, 1.5f);
GpuImage edges = processor.sobelEdgeDetection(gpu);

// Download the edge map back to host memory
HostImage result = processor.downloadImage(edges);
```
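
The Quick Start passes `5` and `1.5f` to `gaussianBlur` without naming them. Assuming they are the kernel size and the standard deviation (sigma), which the source does not confirm, the sketch below shows how the corresponding normalized 1D Gaussian weights could be computed on the host. `gaussianWeights` is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: normalized 1D Gaussian weights for an odd kernel size.
// Assumes the two gaussianBlur arguments mean (kernelSize, sigma).
inline std::vector<float> gaussianWeights(int kernelSize, float sigma) {
    assert(kernelSize > 0 && kernelSize % 2 == 1);
    assert(sigma > 0.0f);
    const int radius = kernelSize / 2;
    std::vector<float> w(kernelSize);
    float sum = 0.0f;
    for (int i = 0; i < kernelSize; ++i) {
        const float x = static_cast<float>(i - radius);  // offset from the center tap
        w[i] = std::exp(-(x * x) / (2.0f * sigma * sigma));
        sum += w[i];
    }
    for (float& v : w) v /= sum;  // normalize so the weights sum to 1
    return w;
}
```

Calling `gaussianWeights(5, 1.5f)` yields five symmetric weights that peak at the center and sum to 1. Because a 2D Gaussian is separable, a blur can apply these weights along rows and then along columns rather than using a full 5×5 kernel; whether Mini-OpenCV does so is not stated in the source.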


Released under the MIT License.