Competitor Comparison
To validate Fast-BOCPD’s claims, we benchmark it against other Bayesian changepoint libraries using the same datasets and methodology.
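The tables in this section report a median runtime, a throughput, and a CV% per configuration. As a minimal sketch of how such columns could be derived from raw timings: the helper names `time_runs` and `summarize` are hypothetical, and the CV% definition (sample standard deviation divided by the mean) is an assumption, since the exact harness is not shown here. Throughput is taken as observations divided by the median runtime, which matches the reported figures.

```python
import statistics
import time


def time_runs(fn, data, repeats=5):
    """Call fn(data) `repeats` times and return the wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings, n_obs):
    """Reduce raw timings to median (s), throughput (obs/s), and CV%."""
    median_s = statistics.median(timings)
    mean_s = statistics.mean(timings)
    # Assumed definition: CV% = 100 * sample standard deviation / mean.
    cv_pct = 100.0 * statistics.stdev(timings) / mean_s if len(timings) > 1 else 0.0
    return {
        "median_s": median_s,
        "throughput_obs_per_s": n_obs / median_s,
        "cv_pct": cv_pct,
    }


# Example with made-up timings for a 10-observation series.
print(summarize([1.0, 2.0, 3.0], n_obs=10))
```

Using the median rather than the mean keeps a single slow run (for example, a cold cache) from skewing the headline number, while CV% still exposes run-to-run noise.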
Competitor Overview

| Library | Language | Mode Support | Models | Repository |
|---|---|---|---|---|
| Fast-BOCPD | C + Python | Online + Offline | 7 conjugate priors | |
| dtolpin/bocd | Pure Python | Online | Student-t | |
| ruptures | Python/Cython/C | Offline | Gaussian (CostNormal) | |
| promised-ai/changepoint | Rust + Python | Online | 6 conjugate priors | |
| hildensia/bayesian_changepoint_detection | PyTorch | Online + Offline | Student-t | |

dtolpin/bocd (Pure Python)
Reference implementation of Adams & MacKay (2007); valuable for education, but limited by interpreter overhead.

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
|---|---|---|---|---|
| 1k | Online | 0.1660 | 6,023 | 0.3% |
| 10k | Online | 6.1030 | 1,639 | 1.0% |
| 100k | Online | 614.8018 | 163 | 0.6% |

40–165× slower than Fast-BOCPD (Student-t fixed).
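Predictable scaling, but runtime grows roughly 100× from 10k to 100k points (consistent with O(n) work per observation), so runtimes become untenable beyond ~10k points.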
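For readers unfamiliar with the algorithm being benchmarked, here is a minimal sketch of the run-length recursion from Adams & MacKay (2007). It is not code from dtolpin/bocd or Fast-BOCPD. To stay short it assumes a Gaussian likelihood with known variance and a constant hazard rate instead of the Student-t model benchmarked above, and the function name and default parameters are hypothetical.

```python
import math


def bocpd_map_run_lengths(data, hazard=0.01, mu0=0.0, var0=100.0, obs_var=1.0):
    """Return the most probable run length after each observation."""
    run_probs = [1.0]   # before any data, the run length is 0 with certainty
    means = [mu0]       # posterior mean of the segment mean, per run length
    variances = [var0]  # posterior variance of the segment mean, per run length
    map_run_lengths = []
    for x in data:
        # Predictive density of x under every run-length hypothesis.
        preds = []
        for m, v in zip(means, variances):
            pv = v + obs_var
            preds.append(math.exp(-0.5 * (x - m) ** 2 / pv) / math.sqrt(2 * math.pi * pv))
        joint = [p * q for p, q in zip(run_probs, preds)]
        # Either the current run grows by one, or a changepoint resets it to zero.
        growth = [(1.0 - hazard) * j for j in joint]
        changepoint = hazard * sum(joint)
        run_probs = [changepoint] + growth
        total = sum(run_probs)
        run_probs = [p / total for p in run_probs]
        # Conjugate Normal update for each grown run; index 0 restarts from the prior.
        new_vars = [1.0 / (1.0 / v + 1.0 / obs_var) for v in variances]
        new_means = [nv * (m / v + x / obs_var)
                     for nv, m, v in zip(new_vars, means, variances)]
        means = [mu0] + new_means
        variances = [var0] + new_vars
        map_run_lengths.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return map_run_lengths


# A noiseless series whose mean jumps from 0 to 10 at index 50.
data = [0.0] * 50 + [10.0] * 50
run_lengths = bocpd_map_run_lengths(data)
print(run_lengths[49], run_lengths[50])  # prints: 50 1
```

The hypothesis lists gain one entry per observation, so step t touches t + 1 entries. That per-step growth is what makes an untruncated implementation slow down as the series gets longer.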
ruptures (Offline Gaussian)
Industry-standard offline segmentation library (dynamic programming plus Gaussian cost). Included to show how a highly optimized Cython/C system fares against our offline mode.

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
|---|---|---|---|---|
| 1k | Offline | 0.0441 | 22,676 | 4.8% |
| 10k | Offline | 0.7634 | 13,099 | 0.6% |
| 100k | Offline | 38.9962 | 2,564 | 1.0% |

3–11× slower than Fast-BOCPD (Gaussian offline).
Throughput drops sharply at scale (22k → 2.5k obs/s).
promised-ai/changepoint (Rust)
Rust implementation with PyO3 bindings. Supports NormalGamma, BetaBernoulli, and PoissonGamma priors.

| Size | Model | Median (s) | Throughput (obs/s) | CV% |
|---|---|---|---|---|
| 1k | Gaussian | 0.0367 | 27,227 | 0.2% |
| 10k | Gaussian | 1.4060 | 7,112 | 1.8% |
| 100k | Gaussian | 109.2458 | 915 | 0.7% |

| Size | Model | Median (s) | Throughput (obs/s) | CV% |
|---|---|---|---|---|
| 1k | Bernoulli | 0.0180 | 55,595 | 0.7% |
| 10k | Bernoulli | 1.4018 | 7,134 | 0.3% |
| 100k | Bernoulli | 83.3343 | 1,200 | 0.5% |

| Size | Model | Median (s) | Throughput (obs/s) | CV% |
|---|---|---|---|---|
| 1k | Poisson | 0.0316 | 31,643 | 0.3% |
| 10k | Poisson | 1.1186 | 8,940 | 1.4% |
| 100k | Poisson | 86.6610 | 1,154 | 0.9% |

Excellent performance for n ≤ 1k.
Severe throughput collapse by 100k observations (≈27–46× lower than at 1k, depending on the model).
Fast-BOCPD remains 18–28× faster at large scale.
hildensia/bayesian_changepoint_detection (PyTorch)
PyTorch implementation intended for GPU acceleration. Maintains the full run-length distribution ⇒ O(n²) complexity.
CPU (n = 1k):

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
|---|---|---|---|---|
| 1k | Online | 58.1504 | 17 | 0.2% |
| 1k | Offline | 340.0730 | 3 | 0.1% |

GPU (T4, n = 1k):

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
|---|---|---|---|---|
| 1k | Online | 317.39 | 3 | 0.3% |
| 1k | Offline | 1808.03 | 1 | 0.5% |

1,500–12,000× slower than Fast-BOCPD (Student-t) even on GPU.
Scaling beyond 1k observations is impractical (estimated 169 hours for 100k).
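To illustrate why keeping the full run-length distribution leads to O(n²) work, the sketch below counts how many run-length hypotheses are touched over a stream, with and without a cap on the run length. The function name and the cap of 200 are hypothetical choices for illustration; they are not parameters taken from any of the benchmarked libraries.

```python
def hypothesis_updates(n_obs, max_run_length=None):
    """Count run-length hypotheses evaluated over a stream of n_obs points."""
    total = 0
    for t in range(n_obs):
        active = t + 1  # one hypothesis per possible run length so far
        if max_run_length is not None:
            # Truncation: keep at most max_run_length hypotheses alive.
            active = min(active, max_run_length)
        total += active
    return total


for n in (1_000, 100_000):
    full = hypothesis_updates(n)
    capped = hypothesis_updates(n, max_run_length=200)
    print(f"n={n}: full={full}, capped={capped}")
```

Without a cap the count is n(n + 1)/2, so going from 1k to 100k observations multiplies the work by roughly 10,000. With a fixed cap the count grows linearly in n once the cap is reached.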
Cross-Library Snapshot (100k Observations)

| Library | Throughput (obs/s) | Relative to Fast-BOCPD |
|---|---|---|
| Fast-BOCPD (Gaussian offline) | 25,952 | 1.0× |
| promised-ai (Gaussian online) | 915 | 28.3× slower |
| ruptures (offline) | 2,564 | 10.1× slower |
| dtolpin (online) | 163 | 159× slower |
| hildensia (online, extrapolated) | 17 | 1,500× slower |

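The "Relative to Fast-BOCPD" figures are throughput ratios against the Fast-BOCPD baseline. A small sketch of that arithmetic, using the rounded throughputs from the snapshot, is shown below. The function name is hypothetical, and because the inputs are rounded, the computed ratios can differ from the published ones in the last digit.

```python
throughput_obs_per_s = {
    "Fast-BOCPD (Gaussian offline)": 25952,
    "promised-ai (Gaussian online)": 915,
    "ruptures (offline)": 2564,
    "dtolpin (online)": 163,
    "hildensia (online, extrapolated)": 17,
}


def slowdown_vs_baseline(throughputs, baseline):
    """Return baseline throughput divided by each library's throughput."""
    base = throughputs[baseline]
    return {name: base / value for name, value in throughputs.items()}


ratios = slowdown_vs_baseline(throughput_obs_per_s, "Fast-BOCPD (Gaussian offline)")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

Note that the snapshot mixes modes (offline Fast-BOCPD against online competitors in three rows), so the ratios are best read as order-of-magnitude comparisons.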
Key Takeaways
Implementation dominates language. Rust, Cython, and PyTorch are all capable toolchains, but algorithmic details (memory allocation, truncation, O(n) vs. O(n²)) determine real-world throughput.
Fast-BOCPD maintains O(n) scaling across all models while sustaining 20k–35k obs/s at 100k observations.
Competitor sweet spots:
- ruptures for its rich offline ecosystem.
- promised-ai when already invested in Rust and working with ≤10k samples.
- dtolpin for educational/reference purposes.
- hildensia for experimental GPU research (not production).
When performance matters—production streaming, large offline batches, embedded deployments—Fast-BOCPD delivers 10–1,500× speedups while staying dependency-light and API-compatible with standard Python workflows.