C Backend Architecture
This page dives into fast_bocpd/_c, the performance-critical portion of
the project. If you plan to modify the BOCPD core, add a new observation
model, or debug numerical issues, start here.
Source Layout
- bocpd_core.c
Owns the BOCPD state machine, allocation logic, run-length recursion, and hazard dispatch.
- *_model.c
Each observation model (Gaussian-NIG, Student-t, Poisson-Gamma, Bernoulli-Beta, Binomial-Beta, Gamma-Gamma) implements a shared interface: stats_size, prior_stats, update_stats, predictive_logpdf, and copy_stats. The implementations live in separate files so they can be optimized independently.
- hazard.c
Currently implements only the constant hazard, but it is structured so new hazard families can be added later.
- student_t_ng_grid.c
A special case that manages a mixture over degree-of-freedom values. It keeps an aligned byte blob per run length containing log-weights and per-grid statistics.
Key Concepts
- Virtual Tables
bocpd_core uses function pointers (ObsModelVTable) so that the hot loop does not need switch statements. Adding a new observation model simply means filling in another vtable and wiring it into the init_obs_vtable function.
- Aligned Statistics
Each run length owns a blob of bytes sized by stats_size = round_up_align(model->stats_size, STATS_ALIGNMENT). We use max_align_t when available and fall back to 16 bytes to stay portable across Linux, macOS, and Windows.
- Zero-Copy Buffer Swaps
bocpd_update writes into new_log_joint/new_stats while reading from the old buffers, then swaps the pointers. No data is memcpy’d for normal updates, which keeps cache pressure minimal.
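The aligned sizing and the pointer swap described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: round_up_align and STATS_ALIGNMENT are named in the text, but the body shown here (which presumes a power-of-two alignment) is a guess, and the DoubleBuffer struct, swap_buffers, and stats_at are hypothetical names rather than fields or functions of the real BOCPDState.

```c
#include <stddef.h>

/* Assumed 16-byte fallback from the text; a real build may derive this from max_align_t. */
#define STATS_ALIGNMENT 16

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up_align(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical buffer pair: the "old" side is read, the "new" side is written. */
typedef struct {
    double        *log_joint, *new_log_joint;
    unsigned char *stats,     *new_stats;
} DoubleBuffer;

/* Exchange old and new buffers by swapping pointers; no bytes are copied. */
static void swap_buffers(DoubleBuffer *b) {
    double *lj = b->log_joint;
    b->log_joint = b->new_log_joint;
    b->new_log_joint = lj;

    unsigned char *st = b->stats;
    b->stats = b->new_stats;
    b->new_stats = st;
}

/* Address of the statistics blob for run length r, given the aligned stride. */
static unsigned char *stats_at(unsigned char *base, size_t stride, size_t r) {
    return base + r * stride;
}
```

With a 24-byte statistics struct and 16-byte alignment, round_up_align yields a 32-byte stride, so every run length's blob starts on an aligned boundary as long as the base pointer is aligned.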
Defensive Programming
The C layer validates everything:
- bocpd_init rejects invalid hyperparameters once, up front.
- Observation models guard against NaN, negative counts, or corrupted statistics.
- Optional BOCPD_DEBUG_CHECKS zeroes buffers after reset/update to expose uninitialized reads during development.
If you ever see -INFINITY returned from bocpd_update, it usually
means an observation or stats buffer failed validation; check the relevant
model file.
Adding New Functionality
- Observation models
Implement the shared interface, add an enum value to ObsModelType, update ObsModelParams/ctypes bindings, and register the model inside init_obs_vtable. See Adding New Models for a detailed guide.
- Hazard functions
Follow the pattern in hazard.c: keep parameters in a dedicated struct, provide *_init plus log_transition_cp and log_transition_cont helpers, then extend the hazard switch in bocpd_core.c.
- Memory ownership
bocpd_init always zeroes the BOCPDState struct before use and bocpd_free handles every pointer (including optional grids). When you add new allocations, ensure they are set to NULL in the failure paths and freed inside bocpd_free.
Testing
Every model ships with focused C unit tests in tests/c_tests and
matching Python tests in tests/python. The C tests can be run via
make test; they exercise parameter validation, numerical stability,
and change-point recursion. Whenever you modify the C layer, run the tests
and consider enabling BOCPD_DEBUG_CHECKS to catch latent bugs.