Adding New Models

Fast-BOCPD supports multiple conjugate observation models. Each model is encapsulated in its own .c and .h file and plugged into the BOCPD engine via a virtual table. This guide walks through the steps required to add another model (for example, a Negative-Binomial variant).

1. Design the Sufficient Statistics

Decide which minimal statistics you need to maintain per run length. Typical patterns include:

  • n (count), sum_x, sum_x2 for Gaussian-like models.

  • sum_k or sum_log_x for discrete count models.

Define the struct in a new header (e.g., fast_bocpd/_c/negative_binomial_gamma.h) and provide a matching typedef for the parameter struct.

2. Implement the Model in C

Each model must expose the following functions:

  • size_t <model>_stats_size(void);

  • void <model>_prior_stats(<Stats>* stats);

  • void <model>_update_stats(<Stats>* stats, const <Params>* params, double x);

  • double <model>_predictive_logpdf(const <Params>* params, const <Stats>* stats, double x);

  • void <model>_copy_stats(void* dst, const void* src);

Follow the established style: validate inputs aggressively, document edge cases, and keep the predictive calculation numerically stable (use lgamma, log1p, etc.). Refer to existing files such as poisson_gamma.c for patterns.

3. Register the Model with BOCPD

  • Add an enum entry to ObsModelType (bocpd_core.h).

  • Extend ObsModelParams and the copy_obs_params helper in bocpd_core.c.

  • Update init_obs_vtable so the new model’s functions are wired into the virtual table.

  • Add Python ctypes definitions in fast_bocpd/_bindings.py and expose the model through the high-level API (usually by extending fast_bocpd/models.py).

4. Write Tests

  • Add a focused C test in tests/c_tests to cover prior stats, update logic, predictive values, and validation.

  • Mirror that test coverage in tests/python/models and tests/python/integration so the Python surface area is also checked.

5. Update the Documentation

  • Document the new model in API Reference.

  • Mention any new parameters in the user guides (installation, tuning, interpreting results).

  • If the model requires special setup (e.g., grids, priors), describe it here so future contributors know how it fits into the architecture.

Tips and Best Practices

  • Keep statistics structs plain old data—no pointers or heap allocations. Their size should be small so copying between run lengths is cheap.

  • Use the BOCPD_DEBUG_CHECKS macro to guard invariants during development. Add #ifdef blocks if you need extra diagnostics that should stay out of release builds.

  • Benchmark early. Even small allocations inside predictive_logpdf can negate the benefits of the C implementation.

With these steps the new model will behave like any built-in option, benefiting from the same high-level Python API while leveraging the C runtime for speed.