Competitor Comparison

To validate Fast-BOCPD’s performance claims, we benchmark it against four other changepoint detection libraries using the same datasets and methodology.
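The tables in this section report a median runtime, a throughput, and a coefficient of variation (CV%) per configuration. The sketch below shows one way such numbers could be produced. It is not the harness used for these benchmarks: the function name `benchmark`, the `repeats` parameter, and the definitions (throughput as observations divided by median time, CV% as sample standard deviation over mean) are assumptions made for illustration.

```python
import statistics
import time


def benchmark(fn, data, repeats=5):
    """Time fn(data) several times and summarise the runs.

    Hypothetical helper: returns median seconds, throughput in
    observations per second, and CV% across the repeats.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        timings.append(time.perf_counter() - start)

    median_s = statistics.median(timings)
    mean_s = statistics.fmean(timings)
    # Assumed definition: CV% = 100 * sample standard deviation / mean.
    cv_pct = 100.0 * statistics.stdev(timings) / mean_s
    return {
        "median_s": median_s,
        # Assumed definition: observations processed per median second.
        "throughput_obs_s": len(data) / median_s,
        "cv_pct": cv_pct,
    }


# Stand-in workload; a real run would call a detector on the dataset.
result = benchmark(sorted, list(range(100_000, 0, -1)))
print(result)
```

Using the median rather than the mean keeps a single slow run (for example, a cold cache) from distorting the headline figure, while CV% still exposes how noisy the repeats were.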

Competitor Overview

| Library | Language | Mode Support | Models | Repository |
| --- | --- | --- | --- | --- |
| Fast-BOCPD | C + Python | Online + Offline | 7 conjugate priors | https://github.com/TiaanViviers/Fast_BOCPD |
| dtolpin/bocd | Pure Python | Online | Student-t | https://github.com/dtolpin/bocd |
| ruptures | Python/Cython/C | Offline | Gaussian (CostNormal) | https://github.com/deepcharles/ruptures |
| promised-ai/changepoint | Rust + Python | Online | 6 conjugate priors | https://github.com/promised-ai/changepoint |
| hildensia/bayesian_changepoint_detection | PyTorch | Online + Offline | Student-t | https://github.com/hildensia/bayesian_changepoint_detection |

dtolpin/bocd (Pure Python)

Reference implementation of Adams & MacKay (2007); valuable for education, but limited by interpreter overhead.

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
| --- | --- | --- | --- | --- |
| 1k | Online | 0.1660 | 6,023 | 0.3% |
| 10k | Online | 6.1030 | 1,639 | 1.0% |
| 100k | Online | 614.8018 | 163 | 0.6% |

  • 40–165× slower than Fast-BOCPD (Student-t fixed).

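  • Runtime grows superlinearly rather than O(n): 10× more data costs about 37× more time from 1k to 10k and about 100× more from 10k to 100k, so runtimes become untenable beyond ~10k points.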
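A quick way to check how a library scales is to estimate the empirical exponent k in t ∝ n^k from two measured points. The sketch below does this for the dtolpin/bocd medians from the table above; the helper name `scaling_exponent` is hypothetical. An exponent near 1 indicates linear scaling and an exponent near 2 indicates quadratic scaling.

```python
import math


def scaling_exponent(n1, t1, n2, t2):
    """Empirical exponent k such that t is proportional to n**k between two points."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Median runtimes in seconds for dtolpin/bocd, copied from the table above.
medians = {1_000: 0.1660, 10_000: 6.1030, 100_000: 614.8018}

sizes = sorted(medians)
exponents = [
    scaling_exponent(a, medians[a], b, medians[b])
    for a, b in zip(sizes, sizes[1:])
]
for (a, b), k in zip(zip(sizes, sizes[1:]), exponents):
    print(f"{a} -> {b}: k = {k:.2f}")
```

The same function can be applied to any of the other tables in this section by swapping in their median runtimes.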

ruptures (Offline Gaussian)

Industry-standard offline segmentation library (dynamic programming plus Gaussian cost). Included to show how a highly optimized Cython/C system fares against our offline mode.

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
| --- | --- | --- | --- | --- |
| 1k | Offline | 0.0441 | 22,676 | 4.8% |
| 10k | Offline | 0.7634 | 13,099 | 0.6% |
| 100k | Offline | 38.9962 | 2,564 | 1.0% |

  • 3–11× slower than Fast-BOCPD (Gaussian offline).

  • Throughput drops sharply at scale (22.7k → 2.6k obs/s between 1k and 100k observations).

promised-ai/changepoint (Rust)

Rust implementation with PyO3 bindings. Supports NormalGamma, BetaBernoulli, and PoissonGamma priors.

| Size | Model | Median (s) | Throughput (obs/s) | CV% |
| --- | --- | --- | --- | --- |
| 1k | Gaussian | 0.0367 | 27,227 | 0.2% |
| 10k | Gaussian | 1.4060 | 7,112 | 1.8% |
| 100k | Gaussian | 109.2458 | 915 | 0.7% |
| 1k | Bernoulli | 0.0180 | 55,595 | 0.7% |
| 10k | Bernoulli | 1.4018 | 7,134 | 0.3% |
| 100k | Bernoulli | 83.3343 | 1,200 | 0.5% |
| 1k | Poisson | 0.0316 | 31,643 | 0.3% |
| 10k | Poisson | 1.1186 | 8,940 | 1.4% |
| 100k | Poisson | 86.6610 | 1,154 | 0.9% |

  • Excellent performance for n ≤ 1k.

  • Severe throughput collapse by 100k observations: throughput is ≈27–46× lower than at 1k, depending on the model.

  • Fast-BOCPD remains 18–28× faster at large scale.

hildensia/bayesian_changepoint_detection (PyTorch)

PyTorch implementation intended for GPU acceleration. Maintains the full run-length distribution ⇒ O(n²) complexity.

CPU (n = 1k):

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
| --- | --- | --- | --- | --- |
| 1k | Online | 58.1504 | 17 | 0.2% |
| 1k | Offline | 340.0730 | 3 | 0.1% |

GPU (T4, n = 1k):

| Size | Mode | Median (s) | Throughput (obs/s) | CV% |
| --- | --- | --- | --- | --- |
| 1k | Online | 317.39 | 3 | 0.3% |
| 1k | Offline | 1808.03 | 1 | 0.5% |

  • 1,500–12,000× slower than Fast-BOCPD (Student-t) even on GPU.

  • Scaling beyond 1k observations is impractical (estimated 169 hours for 100k).
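To make the O(n²) cost of keeping the full run-length distribution concrete, here is a minimal BOCPD sketch for a Beta-Bernoulli model that counts how many run-length hypotheses it updates. It is an illustration only and does not reproduce the internals of hildensia's library or of Fast-BOCPD. The function name `bocpd_bernoulli`, the constant hazard rate, the Beta(1, 1) prior, and the simple `max_run` truncation rule are all assumptions.

```python
import numpy as np


def bocpd_bernoulli(x, hazard=0.01, max_run=None):
    """Toy BOCPD for 0/1 data; returns final run-length posterior and work count."""
    alpha = np.array([1.0])      # Beta(1, 1) prior per run-length hypothesis
    beta = np.array([1.0])
    run_probs = np.array([1.0])  # P(run length = 0) = 1 before any data
    work = 0                     # number of hypotheses updated in total

    for xt in x:
        # Predictive probability of xt under each run-length hypothesis.
        p_one = alpha / (alpha + beta)
        pred = p_one if xt == 1 else 1.0 - p_one
        work += run_probs.size

        growth = run_probs * pred * (1.0 - hazard)       # run continues
        changepoint = np.sum(run_probs * pred * hazard)  # run resets to 0
        run_probs = np.concatenate(([changepoint], growth))
        run_probs /= run_probs.sum()

        # Fresh prior for run length 0, updated statistics for the rest.
        alpha = np.concatenate(([1.0], alpha + xt))
        beta = np.concatenate(([1.0], beta + (1 - xt)))

        # Optional truncation (an approximation): drop the longest runs.
        if max_run is not None and run_probs.size > max_run:
            run_probs = run_probs[:max_run]
            run_probs /= run_probs.sum()
            alpha, beta = alpha[:max_run], beta[:max_run]

    return run_probs, work


rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=1000)
_, full = bocpd_bernoulli(x)
_, capped = bocpd_bernoulli(x, max_run=50)
print(full, capped)  # 500500 48775
```

Without truncation the work is 1 + 2 + … + n = n(n + 1)/2, which is quadratic in n. Capping the number of tracked run lengths bounds the per-observation cost and gives linear total work, at the price of discarding probability mass on long runs.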

Cross-Library Snapshot (100k Observations)

| Library | Throughput (obs/s) | Relative to Fast-BOCPD |
| --- | --- | --- |
| Fast-BOCPD (Gaussian offline) | 25,952 | 1.0× |
| promised-ai (Gaussian online) | 915 | 28.3× slower |
| ruptures (offline) | 2,564 | 10.1× slower |
| dtolpin (online) | 163 | 159× slower |
| hildensia (online, extrapolated) | 17 | 1,500× slower |
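The "Relative to Fast-BOCPD" column is the ratio of the Fast-BOCPD throughput to each competitor's throughput. The short sketch below recomputes it from the rounded throughputs listed in the table; the variable names are illustrative, and because the inputs are rounded, the last digit of a ratio may differ slightly from the published value.

```python
# Throughput at 100k observations (obs/s), copied from the table above.
throughput_100k = {
    "Fast-BOCPD (Gaussian offline)": 25_952,
    "promised-ai (Gaussian online)": 915,
    "ruptures (offline)": 2_564,
    "dtolpin (online)": 163,
    "hildensia (online, extrapolated)": 17,
}

baseline = throughput_100k["Fast-BOCPD (Gaussian offline)"]

# Slowdown factor = baseline throughput / library throughput.
slowdown = {name: baseline / obs_s for name, obs_s in throughput_100k.items()}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.1f}x")
```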

Key Takeaways

  • Implementation dominates language. Rust, Cython, and PyTorch are all capable toolchains, but algorithmic details (memory allocation, truncation, O(n) vs O(n²)) determine real-world throughput.

  • Fast-BOCPD maintains O(n) scaling across all models while sustaining 20k–35k obs/s at 100k observations.

  • Competitor sweet spots:

    • ruptures for its rich offline ecosystem.

    • promised-ai when already invested in Rust and working with ≤10k samples.

    • dtolpin for educational/reference purposes.

    • hildensia for experimental GPU research (not production).

  • When performance matters (production streaming, large offline batches, embedded deployments), Fast-BOCPD delivers 10–1,500× speedups while staying dependency-light and API-compatible with standard Python workflows.