Skip to content

Memory Layout

Detailed memory layout of CSR and ELL sparse matrix formats.

CSR (Compressed Sparse Row)

Storage Structure

Original Matrix:     CSR Storage:
┌─────┬─────┐        values:     [ 1, 2, 3, 4, 5 ]
│ 1 0 2 │        col_indices: [ 0, 2, 1, 2, 3 ]
│ 0 3 4 │   =>   row_ptrs:    [ 0, 2, 4, 5 ]
│ 0 0 5 │       
└─────┴─────┘        row_ptrs[i] indicates the starting position of row i

Storage Cost: O(nnz + num_rows)

Array Description

ArraySizeDescription
valuesnnzNon-zero values, row-major order
col_indicesnnzColumn index for each non-zero
row_ptrsnum_rows + 1Starting offset for each row

Access Pattern

Non-zeros in row i:
  values[row_ptrs[i]] ... values[row_ptrs[i+1] - 1]
  col_indices[row_ptrs[i]] ... col_indices[row_ptrs[i+1] - 1]

Characteristics

  • General format, memory efficient
  • Suitable for irregular sparsity patterns
  • Efficient row traversal
  • May cause non-coalesced GPU memory access

ELL (ELLPACK)

Storage Structure

Original Matrix:     ELL Storage (column-major):
┌─────┬─────┐        values:     [ 1, 3, 5,   ← column 0
│ 1 0 2 │                      2, 4, 0 ]  ← column 1
│ 0 3 4 │   =>   col_indices: [ 0, 1, 3,   ← column 0
│ 0 0 5 │                      2, 2, - ]  ← column 1 (-1 = padding)
└─────┴─────┘        
                     Each row padded to max length

Storage Cost: O(num_rows × max_nnz_per_row)

Column-Major Storage

Characteristics

  • Column-major storage: GPU threads access consecutive memory, fully coalesced
  • No row pointers: Reduces one memory access
  • Aligned padding: Suitable for SIMD execution
  • Drawback: Wastes storage when row lengths vary significantly

CSR vs ELL Selection

FactorCSRELL
Storage efficiencyHigh (only non-zeros)Low (padding needed)
GPU memory accessMay be non-coalescedFully coalesced
Best forIrregular sparsityUniform row lengths
Conversion costO(nnz)

References

MIT License