Benchmark Methodology
=====================

Fast-BOCPD treats performance as a first-class feature. The benchmarking suite
under ``benchmarks/`` is designed to answer three questions:

1. *How fast is each observation model in both online and offline modes?*
2. *How does performance evolve as we optimize the code base?*
3. *How do we compare against other open-source BOCPD implementations?*

This page summarizes the methodology documented in ``benchmarks/README.md``.

Data Generation
---------------

Synthetic datasets are produced by ``benchmarks/scripts/generate_data.py``.
Each file encodes known changepoint locations so we can validate both accuracy
and speed. Key parameters:

* **Distribution** – ``gaussian``, ``student_t_fixed``, ``student_t_grid``,
  ``poisson``, ``gamma``, ``bernoulli``, ``binomial``
* **Expected run length** – λ = 150 observations between changepoints
* **Sizes** – 1,000 / 10,000 / 100,000 observations

Generated arrays are versioned under ``benchmarks/data/``. The benchmark runner
checks for missing files and regenerates them automatically.

Execution Protocol
------------------

All benchmark scripts (``benchmark_fast_bocpd.py``,
``benchmark_competitors.py``) follow the same protocol:

1. **Warm-up runs** – ``--warmup-runs`` (default 2) to prime instruction caches
   and avoid measuring import/initialization overhead.
2. **Timed runs** – ``--runs`` (default 10) executed back-to-back using
   ``time.perf_counter`` with microsecond precision.
3. **Aggregation** – We record the median runtime across the timed runs, the
   throughput, and the coefficient of variation (CV%) to characterise both
   performance and stability.

Benchmarks are compiled with ``-O3 -march=native -fomit-frame-pointer`` and are
typically executed on a modern laptop/workstation. Absolute timings vary from
machine to machine, but relative differences are expected to remain consistent.

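
The three protocol steps can be sketched in a few lines of Python. This is an
illustrative sketch, not the code used by the benchmark scripts: the names
``run_benchmark`` and ``summarize`` are hypothetical, and two details are
assumptions because this page does not state them (throughput is derived from
the median runtime, and CV% uses the sample standard deviation).

.. code-block:: python

   import statistics
   import time


   def summarize(timings, n_obs):
       """Aggregate per-run timings (seconds) into the reported metrics."""
       median = statistics.median(timings)
       mean = statistics.mean(timings)
       # Assumption: sample standard deviation; needs at least two runs.
       std = statistics.stdev(timings) if len(timings) > 1 else 0.0
       return {
           "median_s": median,
           # Assumption: throughput is computed from the median runtime.
           "throughput_obs_per_s": n_obs / median,
           "cv_percent": (std / mean) * 100.0,
       }


   def run_benchmark(workload, n_obs, runs=10, warmup_runs=2):
       """Run ``workload`` with warm-up, then time ``runs`` repetitions."""
       # Step 1: warm-up runs, whose timings are discarded.
       for _ in range(warmup_runs):
           workload()

       # Step 2: timed runs, executed back-to-back.
       timings = []
       for _ in range(runs):
           start = time.perf_counter()
           workload()
           timings.append(time.perf_counter() - start)

       # Step 3: aggregation into median, throughput, and CV%.
       return summarize(timings, n_obs), timings

Keeping ``summarize`` separate from the timing loop makes the aggregation easy
to check against hand-computed values without running a real workload.
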
Performance Metrics
-------------------

Three metrics are reported everywhere (internal and competitor benchmarks):

* **Median runtime (seconds)** – robust central tendency (less sensitive to
  outliers than mean).
* **Throughput (obs/sec)** – number of observations processed per second;
  higher is better.
* **Coefficient of variation (CV%)** – ``(std / mean) × 100`` indicating
  run-to-run stability. CV% < 1% means results are highly reproducible.

Quick Start
-----------

``benchmarks/benchmark.sh`` orchestrates the entire suite:

.. code-block:: bash

   # All Fast-BOCPD models
   ./benchmark.sh fbocpd

   # Specific model
   ./benchmark.sh gaussian

   # Competitors
   ./benchmark.sh competitors

   # Everything
   ./benchmark.sh .

The script handles data generation, invokes the appropriate Python runner, and
prints formatted summaries. Also see
``benchmarks/competitors/requirements.txt`` for installation instructions for
third-party libraries.

For more control, call the Python scripts directly:

.. code-block:: bash

   # Fast-BOCPD
   cd benchmarks/scripts
   python benchmark_fast_bocpd.py --distribution gaussian --runs 20 --warmup-runs 3
   python benchmark_fast_bocpd.py --distribution poisson --size 10000

   # Competitors
   python benchmark_competitors.py --lib ruptures --runs 10
   python benchmark_competitors.py --lib dtolpin --size 1000

Historical Tracking
-------------------

Optimization progress is documented in ``benchmarks/Benchmark_tracking.md``.
Every major iteration captures:

* Raw benchmark outputs for all models
* Compiler flags and environment notes
* Narrative explaining observed regressions or improvements

Refer to that log when evaluating long-term trends or validating that new
optimizations maintain performance guarantees.
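
When comparing a new iteration against an earlier entry in the log, a small
helper can flag models whose median runtime has grown. The sketch below is
hypothetical and not part of the benchmark suite: the function name
``find_regressions``, the dictionary layout (model name mapped to median
runtime in seconds), and the 5% default tolerance are all assumptions chosen
for illustration.

.. code-block:: python

   def find_regressions(baseline, candidate, tolerance=0.05):
       """Return models whose median runtime grew by more than ``tolerance``.

       ``baseline`` and ``candidate`` map model names to median runtimes in
       seconds. The result maps each regressed model to its relative change,
       where 0.20 means the candidate is 20% slower.
       """
       regressions = {}
       for model, old_median in baseline.items():
           new_median = candidate.get(model)
           if new_median is None:
               # Model missing from the candidate run; nothing to compare.
               continue
           change = (new_median - old_median) / old_median
           if change > tolerance:
               regressions[model] = change
       return regressions

The tolerance should sit comfortably above the run-to-run noise indicated by
CV%, otherwise ordinary timing jitter will be reported as a regression.
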