#include <cuda_runtime.h>
Adaptive Data Mapping (ADM) stage — remaps u16/u32 integer streams into a compact 8-bit symbol domain...
rANS entropy coding stage (GPU, via vendored dietGPU kernel templates).
Bit-packing stage: packs N-bit integers into a dense byte stream.
GPU bit-matrix transpose stage (W × N bit shuffle over fixed-size chunks).
Pipeline builder and execution API.
Compression DAG wiring, execution, and memory strategy types.
First-order difference coding stage with optional negabinary fusion.
Huffman entropy coding stage with selectable encode mode.
Logging infrastructure and macros.
Fused Lorenzo predictor and quantizer stage.
Plain integer Lorenzo predictor (delta coding / prefix sum). Lossless.
Negabinary (base -2) integer encoding helpers.
Element-wise negabinary encode/decode stage (TIn[] ↔ TOut[]).
Direct-value quantizer stage with error-bounded coding and lossless outlier fallback.
Run-Length Encoding stage (lossless, stream-ordered).
Recursive Zero-byte Elimination stage — lossless byte-stream compressor.
Base class interface for all compression stages.
Reconstruction quality metrics (MSE, PSNR, max error, NRMSE).
Element-wise zigzag encode/decode stage (TIn[] ↔ TOut[]).
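
The zigzag, negabinary, and first-order difference entries above all describe reversible integer mappings that fold signed values into small unsigned codes. The host-side sketch below illustrates the arithmetic only. It is not the library's implementation: every function name here (`zigzag_encode`, `negabinary_encode`, `delta_negabinary_encode`, `check_round_trip`, and so on) is hypothetical, the types are fixed to `int32_t`/`uint32_t` rather than templated over `TIn`/`TOut`, and the "fusion" is assumed to mean computing the difference and the negabinary code in one pass.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Zigzag: interleave signed values as 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
inline uint32_t zigzag_encode(int32_t v) {
    const uint32_t sign_mask = (v < 0) ? 0xFFFFFFFFu : 0u;
    return (static_cast<uint32_t>(v) << 1) ^ sign_mask;
}

inline int32_t zigzag_decode(uint32_t u) {
    // (0u - (u & 1u)) is all ones when the low bit is set, else zero.
    return static_cast<int32_t>((u >> 1) ^ (0u - (u & 1u)));
}

// Negabinary (base -2): odd bit positions carry negative weight.
// Adding and then XOR-ing the 0xAAAAAAAA mask converts from two's complement.
constexpr uint32_t kNegabinaryMask = 0xAAAAAAAAu;

inline uint32_t negabinary_encode(int32_t v) {
    return (static_cast<uint32_t>(v) + kNegabinaryMask) ^ kNegabinaryMask;
}

inline int32_t negabinary_decode(uint32_t u) {
    return static_cast<int32_t>((u ^ kNegabinaryMask) - kNegabinaryMask);
}

// First-order difference fused with negabinary (assumed meaning of "fusion"):
// each output is the negabinary code of x[i] - x[i-1], with x[-1] taken as 0.
// Unsigned arithmetic is used so that large differences wrap instead of overflowing.
inline std::vector<uint32_t> delta_negabinary_encode(const std::vector<int32_t>& xs) {
    std::vector<uint32_t> codes;
    codes.reserve(xs.size());
    uint32_t prev = 0;
    for (int32_t x : xs) {
        const uint32_t cur = static_cast<uint32_t>(x);
        const uint32_t diff = cur - prev;
        codes.push_back((diff + kNegabinaryMask) ^ kNegabinaryMask);
        prev = cur;
    }
    return codes;
}

// Inverse: undo the negabinary mapping, then take a running (prefix) sum.
inline std::vector<int32_t> delta_negabinary_decode(const std::vector<uint32_t>& codes) {
    std::vector<int32_t> xs;
    xs.reserve(codes.size());
    uint32_t prev = 0;
    for (uint32_t c : codes) {
        prev += (c ^ kNegabinaryMask) - kNegabinaryMask;
        xs.push_back(static_cast<int32_t>(prev));
    }
    return xs;
}

// Self-check: every mapping must invert exactly on the given samples.
inline void check_round_trip(const std::vector<int32_t>& xs) {
    for (int32_t x : xs) {
        assert(zigzag_decode(zigzag_encode(x)) == x);
        assert(negabinary_decode(negabinary_encode(x)) == x);
    }
    assert(delta_negabinary_decode(delta_negabinary_encode(xs)) == xs);
}
```

Both mappings send values near zero to codes with few set high bits, which is what makes them useful ahead of a bit-packing or entropy coding stage. Zigzag keeps magnitude order, while negabinary avoids the long run of leading ones that two's complement produces for small negative values.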