FZGPUModules 1.0
GPU-accelerated modular compression pipeline
perf.h
1 #pragma once
2
3 #include <cstddef>
4 #include <iosfwd>
5 #include <string>
6 #include <vector>
7
8 namespace fz {
9
19 struct StageTimingResult {
20     std::string name;
21     int level;
22
26     float elapsed_ms;
27
28     size_t input_bytes;
29     size_t output_bytes;
30
32     float throughput_gbs() const noexcept;
33 };
34
40 struct LevelTimingResult {
41     int level = 0;
42     int parallelism = 0;
43     float elapsed_ms = 0.0f;
44 };
45
70 struct … {
71     bool is_compress;
72     float host_elapsed_ms;
73     float dag_elapsed_ms;
74     size_t input_bytes;
75     size_t output_bytes;
76
77     std::vector<StageTimingResult> stages;
78     std::vector<LevelTimingResult> levels;
79
82     float throughput_gbs() const noexcept;
83
86     float pipeline_throughput_gbs() const noexcept;
87
89     void print(std::ostream& os) const;
90 };
91
92 } // namespace fz
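The listing declares `throughput_gbs()` without showing its definition. As a rough illustration of the arithmetic behind an `input_bytes / elapsed_ms` ratio reported in GB/s, the sketch below uses a hypothetical free function, `throughput_gbs_sketch`, which is not part of FZGPUModules. It assumes 1 GB = 10^9 bytes and returns 0 when the elapsed time is not positive; the library may handle either point differently.

```cpp
#include <cstddef>

// Hypothetical helper, not part of FZGPUModules.
// Converts a byte count and an elapsed time in milliseconds to GB/s,
// assuming 1 GB = 1e9 bytes.
inline float throughput_gbs_sketch(std::size_t input_bytes, float elapsed_ms) noexcept {
    if (elapsed_ms <= 0.0f) {
        return 0.0f;  // assumption: report 0 instead of dividing by zero
    }
    // bytes per millisecond equals 1e3 bytes per second,
    // so dividing by 1e6 converts bytes/ms to GB/s.
    const double gbs = static_cast<double>(input_bytes) /
                       (static_cast<double>(elapsed_ms) * 1.0e6);
    return static_cast<float>(gbs);
}
```

Under these assumptions, 10^9 bytes processed in 1000 ms corresponds to 1 GB/s. The same conversion could be applied to either `host_elapsed_ms` or `dag_elapsed_ms`, depending on whether setup time should count toward the figure.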
Definitions referenced in this listing

fz
  Definition fzm_format.h:25

StageTimingResult
  Definition perf.h:19
  std::string name: Stage name (e.g. "lorenzo", "rle"). Definition perf.h:20
  int level: DAG execution level (0 = source stages). Definition perf.h:21
  float elapsed_ms. Definition perf.h:26
  size_t input_bytes: Total bytes across all input buffers. Definition perf.h:28
  size_t output_bytes: Total bytes across all output buffers. Definition perf.h:29
  float throughput_gbs() const noexcept: Input throughput in GB/s (input_bytes / elapsed_ms, not host time).

LevelTimingResult
  Definition perf.h:40

Pipeline result struct
  Definition perf.h:70
  bool is_compress: true = compress pass, false = decompress pass. Definition perf.h:71
  float host_elapsed_ms: Total host-side wall time including setup (ms). Definition perf.h:72
  float dag_elapsed_ms: GPU compute time only, i.e. dag->execute() (ms). Definition perf.h:73
  size_t input_bytes: Bytes fed into the pipeline. Definition perf.h:74
  size_t output_bytes: Bytes produced by the pipeline. Definition perf.h:75
  std::vector< StageTimingResult > stages: Per-stage results in topological order. Definition perf.h:77
  std::vector< LevelTimingResult > levels: Per-level aggregates in level order. Definition perf.h:78
  float throughput_gbs() const noexcept
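The listing does not show how the per-level aggregates in `levels` are derived from the per-stage results in `stages`. One plausible reading is sketched below, purely as an illustration. The types `StageTimingSketch` and `LevelTimingSketch` and the function `aggregate_levels_sketch` are hypothetical stand-ins that mirror the field names above; they are not the library's types. Two assumptions are made: `parallelism` counts the stages sharing a level, and a level's `elapsed_ms` is the longest stage time on that level (as if its stages ran concurrently). The actual library may define both differently.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins mirroring field names from the listing.
struct StageTimingSketch {
    std::string name;
    int level = 0;
    float elapsed_ms = 0.0f;
};

struct LevelTimingSketch {
    int level = 0;
    int parallelism = 0;
    float elapsed_ms = 0.0f;
};

// Groups stages by DAG level and returns one aggregate per level,
// ordered by ascending level.
inline std::vector<LevelTimingSketch> aggregate_levels_sketch(
        const std::vector<StageTimingSketch>& stages) {
    std::map<int, LevelTimingSketch> by_level;  // std::map iterates keys in sorted order
    for (const auto& s : stages) {
        LevelTimingSketch& agg = by_level[s.level];
        agg.level = s.level;
        agg.parallelism += 1;  // assumption: number of stages on this level
        // assumption: concurrent stages, so the slowest one bounds the level
        agg.elapsed_ms = std::max(agg.elapsed_ms, s.elapsed_ms);
    }
    std::vector<LevelTimingSketch> out;
    out.reserve(by_level.size());
    for (const auto& kv : by_level) {
        out.push_back(kv.second);
    }
    return out;
}
```

If stages on one level were instead executed back to back, summing their times rather than taking the maximum would be the more appropriate aggregate.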