Tuning Parameters

Observation models expose statistical hyperparameters, while the hazard controls changepoint frequency. This guide explains how each knob maps to the mathematics in Conjugate Priors and how to pick reasonable defaults in practice.

General Strategy

Standardise your data (subtract mean, divide by scale) whenever possible. This keeps prior hyperparameters near order-of-one values.
Start with weak priors so the model learns primarily from data.
Tighten priors only after you understand typical regimes. Strong priors accelerate convergence but can mask real shifts if mis-specified.
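The standardisation step above can be sketched with the standard library alone. The helper name `standardise` and the sample values are illustrative assumptions, not part of fast_bocpd; the robust branch uses the median and the scaled median absolute deviation, which is one common choice when outliers are present.

```python
import statistics

def standardise(values, robust=False):
    """Return (scaled_values, centre, scale) for a sequence of floats."""
    values = list(values)
    if robust:
        centre = statistics.median(values)
        # Median absolute deviation; the factor 1.4826 makes it comparable
        # to a standard deviation when the data are Gaussian.
        scale = 1.4826 * statistics.median([abs(v - centre) for v in values])
    else:
        centre = statistics.fmean(values)
        scale = statistics.pstdev(values)
    if scale == 0:
        raise ValueError("data have zero spread; cannot standardise")
    return [(v - centre) / scale for v in values], centre, scale

history = [102.0, 98.5, 101.2, 99.7, 100.4, 97.9, 100.3]  # illustrative values
scaled, centre, scale = standardise(history)
print(f"centre={centre:.2f} scale={scale:.2f}")
```

In an online setting the centre and scale would typically be estimated once from a historical window and then applied unchanged to incoming samples, so that a real shift is not normalised away.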

Hazard Parameter (`lambda_`)

fast_bocpd.hazard.ConstantHazard uses a single number:

\[H = \frac{1}{\lambda}, \quad \lambda = \texttt{lambda\_}.\]

lambda_=50 means “expect a changepoint roughly every 50 samples”.
Small lambda_ ⇒ aggressive reset behaviour.
Large lambda_ ⇒ long memory; use robust observation models to avoid spurious resets when outliers appear.
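To make the effect of `lambda_` concrete, the sketch below evaluates H = 1 / lambda_ and the prior probability that no changepoint occurs within a given number of samples. It does not call fast_bocpd; the function names `constant_hazard` and `survival` are hypothetical helpers for illustration only.

```python
def constant_hazard(lambda_):
    """Per-step changepoint probability H = 1 / lambda_."""
    if lambda_ < 1:
        raise ValueError("lambda_ must be >= 1 so that H is a valid probability")
    return 1.0 / lambda_

def survival(lambda_, n_steps):
    """Prior probability of seeing no changepoint in n_steps consecutive samples."""
    return (1.0 - constant_hazard(lambda_)) ** n_steps

# Compare an aggressive, a moderate, and a long-memory setting.
for lam in (10, 50, 500):
    print(lam, constant_hazard(lam), survival(lam, 50))
```

Under a constant hazard the prior run length is geometric with mean lambda_, which is why lambda_=50 corresponds to a changepoint roughly every 50 samples.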

Observation Model Hyperparameters

GaussianNIG

Parameter	Controls	Heuristic
`mu0`	Prior mean	Set to historical average or 0 if data are centred
`kappa0`	Confidence in `mu0`	Use 1.0 for weak prior; scale up if you know the baseline
`alpha0`/`beta0`	Prior on variance	Start with 1.0 for both; larger values concentrate the variance near its prior mean `beta0 / (alpha0 - 1)` (finite only for `alpha0 > 1`)
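The table can be read against the textbook Normal-Inverse-Gamma update, sketched below. This assumes GaussianNIG follows the standard parameterisation described in Conjugate Priors; the functions `nig_posterior` and `prior_variance_mean` are illustrative and are not library calls.

```python
import math

def nig_posterior(xs, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Standard Normal-Inverse-Gamma update after observing the batch xs."""
    n = len(xs)
    if n == 0:
        return mu0, kappa0, alpha0, beta0
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n

def prior_variance_mean(alpha0, beta0):
    """Mean of the inverse-gamma prior on the variance (infinite if alpha0 <= 1)."""
    return beta0 / (alpha0 - 1) if alpha0 > 1 else math.inf

xs = [0.9, 1.1, 1.0, 1.2, 0.8]              # illustrative data with mean 1.0
weak = nig_posterior(xs, kappa0=1.0)        # posterior mean moves most of the way to 1.0
strong = nig_posterior(xs, kappa0=50.0)     # posterior mean stays close to mu0 = 0
print(weak[0], strong[0])
```

The comparison shows why a large `kappa0` should be reserved for cases where the baseline is genuinely known: five observations barely move the posterior mean.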

StudentTNG

Shares mu0, kappa0, alpha0, beta0 with GaussianNIG. The additional nu or nu grid controls tail heaviness:

nu between 2 and 5 covers most heavy-tailed scenarios.
Grid mode: choose 3–5 values spanning plausible tails (e.g., [2, 3, 5, 10, 20]) and default to a uniform nu_prior.
If your data rarely contain outliers, set nu ≥ 10 to approach the Gaussian limit while preserving some robustness.
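The effect of `nu` on outliers can be illustrated by applying Bayes' rule over a grid of Student-t densities. This is a sketch of the idea only, assuming a standard (location 0, scale 1) Student-t; it is not how StudentTNG is implemented internally, and `student_t_logpdf` and `nu_posterior` are hypothetical helper names.

```python
import math

def student_t_logpdf(x, nu, loc=0.0, scale=1.0):
    """Log-density of a Student-t with nu degrees of freedom."""
    z = (x - loc) / scale
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(scale)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def nu_posterior(x, nu_grid, nu_prior):
    """Posterior weights over the nu grid after a single observation x."""
    logw = [math.log(p) + student_t_logpdf(x, nu) for nu, p in zip(nu_grid, nu_prior)]
    top = max(logw)                      # subtract the max for numerical stability
    w = [math.exp(v - top) for v in logw]
    total = sum(w)
    return [v / total for v in w]

nu_grid = [2, 3, 5, 10, 20]
nu_prior = [1 / len(nu_grid)] * len(nu_grid)   # uniform prior over the grid

typical = nu_posterior(0.3, nu_grid, nu_prior)  # weights stay close to uniform
outlier = nu_posterior(6.0, nu_grid, nu_prior)  # weight shifts towards small nu
print(typical)
print(outlier)
```

A single extreme observation moves most of the weight to the heavy-tailed end of the grid, which is the behaviour that lets a grid adapt to tail heaviness instead of fixing it in advance.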

PoissonGamma

Hyperparameters have a direct interpretation:

alpha0 – pseudo-count of events.
beta0 – pseudo-time (exposure).
Prior mean rate = alpha0 / beta0.

Guidelines:

If you expect ~5 events per interval, set alpha0=5, beta0=1.
For vague priors use alpha0=beta0=1.
Lower beta0 makes the prior mean rate higher; increasing both by the same factor keeps the mean but adds more “pseudo-observations”, which shrinks the prior variance alpha0 / beta0².
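The guidelines above follow from the Gamma-Poisson conjugate update, sketched here with plain Python. The sketch assumes one unit of exposure per observed count; `gamma_mean_var` and `poisson_gamma_posterior` are illustrative names, not fast_bocpd functions.

```python
def gamma_mean_var(alpha, beta):
    """Mean and variance of a Gamma(alpha, rate=beta) distribution."""
    return alpha / beta, alpha / beta ** 2

def poisson_gamma_posterior(counts, alpha0, beta0):
    """Add observed events to alpha0 and observed intervals to beta0."""
    return alpha0 + sum(counts), beta0 + len(counts)

# Same prior mean (5 events per interval), different prior strength.
print(gamma_mean_var(5, 1))     # weak: large variance
print(gamma_mean_var(50, 10))   # strong: same mean, smaller variance

counts = [9, 11, 10]            # illustrative data with a higher rate than the prior
weak_post = poisson_gamma_posterior(counts, 5, 1)
strong_post = poisson_gamma_posterior(counts, 50, 10)
print(gamma_mean_var(*weak_post)[0], gamma_mean_var(*strong_post)[0])
```

With only three observations the weak prior already tracks the new rate closely, while the stronger prior still sits near its original mean.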

Bernoulli/Binomial Beta

alpha0 = prior successes + 1.
beta0 = prior failures + 1.

Examples:

Uniform prior (no preference) ⇒ alpha0 = beta0 = 1.
Historical conversion rate 20% with strong belief ⇒ alpha0=20, beta0=80 (prior mean 20 / (20 + 80) = 0.2; behaves roughly as if you had already observed 100 trials).
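The two example priors can be compared with the standard Beta-Bernoulli update. The helper names below are illustrative assumptions and the trial counts are made up for the demonstration.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

def beta_posterior(successes, failures, alpha0, beta0):
    """Conjugate update: add observed successes and failures to the prior counts."""
    return alpha0 + successes, beta0 + failures

# Illustrative data: 50 trials with 25 successes (observed rate 0.5).
uniform_post = beta_posterior(25, 25, alpha0=1, beta0=1)
strong_post = beta_posterior(25, 25, alpha0=20, beta0=80)

print(beta_mean(*uniform_post))  # follows the data
print(beta_mean(*strong_post))   # pulled towards the prior mean of 0.2
```

The strong prior carries about twice the weight of the 50 observed trials, so the posterior mean lands closer to 0.2 than to 0.5.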

Binomial-Beta additionally requires n_trials (renamed N in the C layer). Keep it small (≤ 1,000) to avoid extreme binomial coefficients; if you aggregate more trials, consider rescaling or using multiple time steps.
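One way to see why large trial counts are awkward: binomial coefficients outgrow double precision shortly after n = 1,000. Whether the C layer evaluates them in double precision is an assumption here; the sketch only demonstrates the magnitude problem and the usual log-space workaround, with `log_binom` as a hypothetical helper.

```python
import math

def log_binom(n, k):
    """Log of the binomial coefficient C(n, k), computed in log space."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

# C(1000, 500) still fits in a double.
largest_ok = float(math.comb(1000, 500))

# C(1100, 550) does not: converting the exact integer to float overflows.
try:
    float(math.comb(1100, 550))
    overflowed = False
except OverflowError:
    overflowed = True

print(largest_ok, overflowed, log_binom(1100, 550))
```

Working with log-coefficients avoids the overflow, but if the model API only accepts raw counts, splitting a large aggregate into several time steps keeps each coefficient in range.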

Gamma-Gamma (Fixed Shape)

shape (k) – known shape of the Gamma likelihood. shape=1 reduces to the exponential distribution; higher values reduce the relative spread of the distribution (its coefficient of variation is 1/√k).
alpha0 / beta0 – same interpretation as PoissonGamma but applied to the rate parameter of the Gamma likelihood.
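For a Gamma likelihood with known shape k and a Gamma(alpha0, beta0) prior on its rate, the textbook conjugate update adds k per observation to alpha0 and the observed total to beta0. The sketch below assumes the library uses this standard form; `gamma_rate_posterior` is an illustrative name.

```python
def gamma_rate_posterior(xs, shape, alpha0, beta0):
    """Posterior (alpha, beta) for the rate of a Gamma likelihood with fixed shape."""
    return alpha0 + shape * len(xs), beta0 + sum(xs)

xs = [0.5, 1.5, 1.0, 1.0]                     # illustrative positive observations
alpha_n, beta_n = gamma_rate_posterior(xs, shape=2, alpha0=1, beta0=1)

rate_mean = alpha_n / beta_n                  # posterior mean of the rate
print(alpha_n, beta_n, rate_mean)
```

Setting shape=1 turns this into the familiar exponential case, where each observation adds exactly one pseudo-count to alpha0.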

Practical Workflow

Prototype with weak priors (mu0=0, kappa0=1, alpha0=beta0=1, lambda_=100). Inspect changepoint probabilities via BOCPD.update or OnlineChangeDetector.
Adjust hazard so changepoint probability matches intuition (e.g., if you expect weekly shifts in hourly data, lambda_=24*7).
Tune observation priors if the model is too sensitive (increase kappa0, or increase alpha0 and beta0 together) or too sluggish (decrease them).
Validate by replaying historical data and checking that known events trigger spikes in P(r_t = 0).
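The validation step can be scripted once the changepoint probabilities from a replay have been collected into a list. The sketch below does not call fast_bocpd; the function names, the 0.5 threshold, the 3-step tolerance, and the probability values are all hypothetical and would need adjusting to your data.

```python
def detected_events(cp_probs, known_events, threshold=0.5, tolerance=3):
    """Map each known event index to True if a spike occurs at or shortly after it."""
    hits = {}
    for t in known_events:
        window = cp_probs[t:t + tolerance + 1]
        hits[t] = any(p >= threshold for p in window)
    return hits

def false_alarms(cp_probs, known_events, threshold=0.5, tolerance=3):
    """Indices whose probability crosses the threshold outside every event window."""
    covered = set()
    for t in known_events:
        covered.update(range(t, t + tolerance + 1))
    return [i for i, p in enumerate(cp_probs) if p >= threshold and i not in covered]

# Made-up replay output: one spike just after a known event, one unexplained spike.
cp_probs = [0.01] * 20
cp_probs[8] = 0.9
cp_probs[15] = 0.7
known_events = [7]

print(detected_events(cp_probs, known_events))
print(false_alarms(cp_probs, known_events))
```

Missed events suggest lowering lambda_ or weakening the priors, while frequent false alarms point the other way.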

For a deeper statistical explanation of each hyperparameter, refer back to Conjugate Priors.