PELT Algorithm
The PELT (Pruned Exact Linear Time) algorithm is a powerful and efficient method for detecting multiple change points within time series data. It identifies the points in time where the statistical properties of the signal change significantly. PELT scales well with the size of the data due to its pruning technique, making it suitable for various applications.
How PELT Works
PELT uses dynamic programming to find the optimal segmentation of a time series. It aims to partition the data into segments where the statistical properties are homogeneous within each segment but differ between segments.
The algorithm minimizes a total cost, which is the sum of the costs for each segment plus a penalty for each change point introduced. The objective function to minimize is:
\(\text{TotalCost} = \min_{t_1, \dots, t_m} \sum_{i=0}^{m} \left( \mathcal{C}(y_{t_i+1 : t_{i+1}}) + \beta \right)\)
Where:
- \(y\) is the time series data of length \(N\).
- \(t_0 = 0 < t_1 < \dots < t_m < t_{m+1} = N\) are the change point indices. \(m\) is the total number of change points detected.
- \(y_{a:b}\) represents the segment of the time series from index \(a\) up to (and including) index \(b-1\). Note: in the formula \(y_{t_i+1 : t_{i+1}}\), the indices refer to the data range \(y[t_i], \dots, y[t_{i+1}-1]\) in 0-based indexing, covering a length of \(t_{i+1} - t_i\). The ComputeCost method uses start (inclusive) and end (exclusive) indices, so ComputeCost(t_i, t_{i+1}) would compute the cost for this segment.
- \(\mathcal{C}(y_{a:b})\) is the cost of the segment \(y_{a:b}\), calculated using a chosen cost function. This measures how well the segment fits a model (e.g., how close points are to the segment mean for the L2 cost).
- \(\beta\) is the penalty value applied for introducing each change point.
PELT efficiently finds the optimal set of change points \((t_1, \dots, t_m)\) by iteratively calculating the minimum cost to segment the series up to each time point \(t\), using a pruning step to keep the computation time close to linear in \(N\).
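To make the recursion concrete, here is a minimal sketch of the PELT dynamic program in C#. It is illustrative only, not SignalSharp's internal implementation: it assumes a hypothetical segmentCost(start, end) delegate, uses a pruning constant of zero (valid for costs such as L2), and ignores MinSize and Jump for brevity.
using System;
using System.Collections.Generic;
using System.Linq;
List<int> PeltSketch(int n, Func<int, int, double> segmentCost, double beta)
{
    var f = new double[n + 1];     // f[t]: optimal cost of segmenting the first t points
    var prev = new int[n + 1];     // back-pointer to the last change point before t
    f[0] = -beta;
    var candidates = new List<int> { 0 };
    for (int t = 1; t <= n; t++)
    {
        double best = double.PositiveInfinity;
        int bestStart = 0;
        foreach (int s in candidates)
        {
            double total = f[s] + segmentCost(s, t) + beta;
            if (total < best) { best = total; bestStart = s; }
        }
        f[t] = best;
        prev[t] = bestStart;
        // Pruning: keep only candidates that could still start an optimal last segment.
        candidates = candidates.Where(s => f[s] + segmentCost(s, t) <= f[t]).ToList();
        candidates.Add(t);
    }
    // Recover the change points by walking the back-pointers from the end.
    var changePoints = new List<int>();
    for (int t = n; prev[t] > 0; t = prev[t]) changePoints.Add(prev[t]);
    changePoints.Reverse();
    return changePoints;
}
In practice you would not implement this yourself; PELTAlgorithm encapsulates the recursion, the pruning rule, and the MinSize/Jump handling.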
Configuration (PELTOptions)
The behavior of the PELT algorithm is controlled via the PELTOptions record:
- CostFunction: An instance implementing IPELTCostFunction. This determines how the cost of each potential segment is calculated. The choice depends on the data characteristics (see Cost Functions below). Default: L2CostFunction.
- MinSize: The minimum allowable number of data points within a segment. Ensures that segments are not too small. Must be greater than or equal to 1 (cost functions may impose larger minimums, e.g., ARCostFunction). Default: 1.
- Jump: Controls the step size when considering previous potential change points. Default: 1. (A short configuration sketch follows this list.)
  - If Jump = 1, PELT checks every possible prior change point, guaranteeing an exact solution (finding the true minimum of the objective function).
  - If Jump > 1, PELT only considers previous change points at intervals of Jump. This significantly speeds up computation, especially for large datasets, but the solution becomes approximate. The quality of the approximation is generally good but not guaranteed to be optimal.
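For instance, a configuration contrasting the exact and approximate modes might look like the following (illustrative values; the types are those used throughout the examples below):
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
// Exact search: every prior candidate is considered, guaranteeing the optimal segmentation.
var exactOptions = new PELTOptions
{
    CostFunction = new L1CostFunction(), // robust choice for data with outliers
    MinSize = 2,
    Jump = 1
};
// Approximate search: only every 5th candidate is considered, trading optimality for speed.
var fastOptions = new PELTOptions
{
    CostFunction = new L2CostFunction(),
    MinSize = 5,
    Jump = 5
};
var pelt = new PELTAlgorithm(exactOptions);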
How Penalty (\(\beta\)) Affects the Algorithm
The penalty term (\(\beta\)) controls the trade-off between the goodness of fit (low segment costs) and the number of change points (model complexity):
- High Penalty: Fewer change points are detected. The algorithm requires stronger evidence (larger changes in the data) to justify adding a segment boundary. This helps prevent overfitting and detects only major shifts.
- Low Penalty: More change points are detected. The algorithm becomes more sensitive to smaller fluctuations in the data. This can be useful for capturing subtle changes but risks identifying noise as change points (overfitting).
Choosing the appropriate penalty value is crucial and often requires domain knowledge or experimentation (e.g., using methods like BIC/AIC, elbow plots, or cross-validation if applicable). See the Automatic Penalty Selection section below for methods to help automate this.
Cost Functions
The choice of cost function is critical as it defines what constitutes a "change" in the data. SignalSharp provides several options, each suited for different types of data characteristics and change patterns.
L1
The L1 cost function measures the sum of absolute deviations from the segment median:
\(\mathcal{C}(y_{a:b}) = \sum_{i=a}^{b-1} \sum_{d=0}^{D-1} | y_{d,i} - \text{median}(y_{d, a:b}) |\)
where \(D\) is the number of dimensions, \(y_{d,i}\) is the value at dimension \(d\) and time \(i\), and \(y_{d, a:b}\) represents the segment data for dimension \(d\).
This function detects changes in central tendency, specifically the median. It focuses on the bulk of the data and is less influenced by extreme values compared to L2. The L1 cost function is highly robust to outliers and non-Gaussian noise, making it suitable for time series with known or suspected outliers, spikes, or heavy-tailed noise distributions such as Laplace or Cauchy.
The current implementation precomputes medians for all possible segments, which has a high precomputation cost (roughly O(N² * D * log N)). The segment cost calculation then involves lookup and summation over the segment length (O(segment_length * D)).
L1 is ideal for financial data with jumps or sensor data with transient errors where the median provides a more stable representation of the segment's typical value than the mean. It supports multidimensional data and requires a minimum segment size of at least 1.
L2
The L2 cost function measures the sum of squared deviations from the segment mean:
\(\mathcal{C}(y_{a:b}) = \sum_{d=0}^{D-1} \sum_{i=a}^{b-1} ( y_{d,i} - \text{mean}(y_{d, a:b}) )^2\)
This is equivalent to minimizing the within-segment sum of squares (variance scaled by length), assuming variance is constant across segments. The L2 function is highly sensitive to changes in the mean, as well as to outliers since large deviations are squared. It performs optimally when data within segments is approximately normally distributed with constant variance.
The implementation uses prefix sums of the data and the squared data, calculated during the Fit phase (O(N * D) precomputation). This allows subsequent segment cost calculations in constant time per dimension (O(D)), making it computationally efficient.
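As a rough illustration of this prefix-sum trick (a sketch of the idea, not SignalSharp's actual code), the univariate L2 segment cost can be evaluated in O(1) after a single O(N) pass:
using System;
double[] y = { 1, 1, 1, 5, 5, 5 };
int n = y.Length;
var sum = new double[n + 1];
var sumSq = new double[n + 1];
for (int i = 0; i < n; i++)
{
    sum[i + 1] = sum[i] + y[i];
    sumSq[i + 1] = sumSq[i] + y[i] * y[i];
}
// Cost of y[a..b-1]: sum of squared deviations from the segment mean,
// computed as sum(y^2) - (sum(y))^2 / length.
double L2Cost(int a, int b)
{
    double s = sum[b] - sum[a];
    double s2 = sumSq[b] - sumSq[a];
    return s2 - s * s / (b - a);
}
Console.WriteLine(L2Cost(0, 3)); // 0  (constant segment)
Console.WriteLine(L2Cost(0, 6)); // 24 (mixes the two levels, so the cost is high)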
L2 is best suited for relatively clean signals where changes in the average level are the primary focus, and the variance is assumed to be stable across different regimes. It works well for data that resembles Gaussian noise around potentially different means. L2 supports multidimensional data and requires a minimum segment size of at least 1.
Gaussian Likelihood
The Gaussian likelihood cost function measures the Gaussian negative log-likelihood. The cost/metric is primarily driven by the term:
\(\mathcal{C}(y_{a:b}) \approx \sum_{d=0}^{D-1} (b-a) \times \log(\hat{\sigma}^2_{d, a:b})\)
where \(\hat{\sigma}^2_{d, a:b}\) is the Maximum Likelihood Estimate (MLE) of the variance for the segment \(y_{a:b}\) in dimension \(d\). Constant terms related to \((b-a)\) and \(\pi\) are often omitted as they don't affect the location of change points when using a penalty or information criteria.
This function detects changes in both mean and variance. Changes in either will affect the estimated variance and thus the cost. It's sensitive to shifts in the mean (which affects the variance calculation relative to the segment mean) and directly sensitive to changes in the spread or volatility of the data. Like L2, it's sensitive to outliers and assumes data within segments follows a Gaussian distribution (independently per dimension).
The implementation is similar to L2, using prefix sums of data and squared data (O(N * D) precomputation). Segment cost/metric calculation is O(D). It includes a small epsilon floor for variance to handle segments with zero variance numerically.
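A corresponding sketch of the per-dimension metric (illustrative rather than the library's exact code), reusing prefix sums built exactly as in the L2 sketch above:
using System;
// n * log(variance) for the segment y[a..b-1], with an epsilon floor for
// numerically zero variance, as described above.
double GaussianCost(double[] sum, double[] sumSq, int a, int b)
{
    int len = b - a;
    double mean = (sum[b] - sum[a]) / len;
    double variance = (sumSq[b] - sumSq[a]) / len - mean * mean; // MLE (biased) variance
    const double epsilon = 1e-10;
    return len * Math.Log(Math.Max(variance, epsilon));
}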
This cost function is best for situations where the signal might exhibit changes in both its average level and its variability (volatility). It's commonly used in financial time series (market regimes), biological signals, or process control data where both setpoint and noise levels might change. It supports multidimensional data and requires a minimum segment size of at least 1. It implements ILikelihoodCostFunction and supports BIC/AIC/AICc penalty selection.
Poisson Likelihood
The Poisson likelihood cost function is based on the Poisson negative log-likelihood, suitable for count data. The cost/metric is derived from:
\(\mathcal{C}(y_{a:b}) = \sum_{d=0}^{D-1} 2 \left[ S_d - S_d \log(S_d) + S_d \log(n) \right]\)
where \(S_d = \sum_{i=a}^{b-1} y_{d,i}\) is the sum of counts in the segment for dimension \(d\), \(n = b-a\) is the segment length, and it assumes \(0 \log(0) = 0\). Constant terms like \(\sum \log(y!)\) are omitted.
This function detects changes in the underlying rate (\(\lambda_d\)) of events for each dimension. It assumes that counts within a segment follow a Poisson distribution with a constant rate, estimated by the sample mean \(\hat{\lambda}_d = S_d/n\). It is sensitive to changes in the average count level.
The implementation uses prefix sums of the data (O(N * D) precomputation). Segment cost/metric calculation is O(D). Input data must be non-negative counts (small negative values close to zero might be tolerated by clamping to 0).
This cost function is ideal for data representing counts of events per interval, such as website hits per day, defects per batch, or calls per hour, where you expect the average rate of events to change over time. It supports multidimensional data and requires a minimum segment size of at least 1. It implements ILikelihoodCostFunction and supports BIC/AIC/AICc penalty selection.
Bernoulli Likelihood
The Bernoulli likelihood cost function is derived from the Binomial negative log-likelihood for binary (0/1) data. The cost/metric is:
\(\mathcal{C}(y_{a:b}) = \sum_{d=0}^{D-1} -2 \left[ S_d \log(S_d) + (n-S_d) \log(n-S_d) - n \log(n) \right]\)
where \(S_d = \sum_{i=a}^{b-1} y_{d,i}\) is the number of successes (sum of 1s) in dimension \(d\), \(n = b-a\) is the segment length, and \(n-S_d\) is the number of failures (sum of 0s). It assumes \(0 \log(0) = 0\). Combinatorial terms are omitted.
This function detects changes in the probability of success (\(p_d\)) in a sequence of Bernoulli trials for each dimension. It assumes data within a segment consists of independent 0/1 outcomes with a constant success probability, estimated by the sample proportion \(\hat{p}_d = S_d/n\).
The implementation uses prefix sums of the data (O(N * D) precomputation). Segment cost/metric calculation is O(D). Input data must be strictly 0 or 1 (or numerically very close within a small tolerance).
This cost function is suited for binary time series data, such as machine status (up/down), test results (pass/fail), or presence/absence indicators, where the underlying probability of the '1' outcome is expected to change. It supports multidimensional data and requires a minimum segment size of at least 1. It implements ILikelihoodCostFunction and supports BIC/AIC/AICc penalty selection.
Binomial Likelihood
The Binomial likelihood cost function is used for data where each time point represents \(k\) successes out of \(n\) trials. The cost/metric function is:
\(\mathcal{C}(y_{a:b}) = - \left[ K \log(K) + (N-K) \log(N-K) - N \log(N) \right]\)
where \(K = \sum_{i=a}^{b-1} k_i\) is the total number of successes, \(N = \sum_{i=a}^{b-1} n_i\) is the total number of trials in the segment. It assumes \(0 \log(0) = 0\). Combinatorial terms are omitted.
This function detects changes in the underlying success probability (\(p\)) when the number of trials (\(n_i\)) might vary at each time point. It assumes data within a segment follows a Binomial distribution with a constant success probability, estimated by \(\hat{p} = K/N\).
The implementation requires a 2D input array where row 0 contains the successes (\(k_i\)) and row 1 contains the trials (\(n_i\)). It uses prefix sums for both \(k\) and \(n\) (O(N) precomputation, as D=2 is fixed). Segment cost/metric calculation is O(1). Input \(k_i\) and \(n_i\) must be non-negative integers (within tolerance) with \(0 \le k_i \le n_i\) and \(n_i \ge 1\).
This cost function is ideal when your data consists of success counts out of a known (possibly varying) number of trials at each time point, such as the number of successful conversions per marketing campaign or the number of defective items per batch of varying size. It supports only this specific 2D input format and requires a minimum segment size of at least 1. It implements ILikelihoodCostFunction and supports BIC/AIC/AICc penalty selection.
RBF
The RBF cost function measures segment (in)homogeneity using a Radial Basis Function (RBF) kernel. The cost reflects how dissimilar points within the segment are from each other in the feature space induced by the kernel:
\(\mathcal{K}(x_i, x_j) = \exp(-\gamma \| x_i - x_j \|^2)\) \(\mathcal{C}(y_{a:b}) = \sum_{d=0}^{D-1} \left( (b-a) - \frac{1}{b-a} \sum_{i=a}^{b-1} \sum_{j=a}^{b-1} \mathcal{K}(y_{d,i}, y_{d,j}) \right)\)
A low cost indicates points are 'close' in the kernel space, suggesting homogeneity. This function can detect more complex patterns beyond simple mean/variance shifts, including potential changes in the shape of the data distribution, clustering, or non-linear dynamics. The exact nature depends heavily on the gamma parameter.
The sensitivity is highly dependent on the choice of the gamma parameter, which controls the locality of the kernel. A small gamma considers broader relationships, while a large gamma focuses on local density. If gamma is not provided, it's estimated using the median heuristic (based on pairwise distances).
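For reference, one common form of the median heuristic looks like the sketch below. Conventions vary (for example, whether the median of squared distances or twice that value is used), so treat this as indicative of the idea rather than SignalSharp's exact estimator:
using System.Collections.Generic;
// gamma = 1 / median of the pairwise squared distances (univariate case).
double MedianHeuristicGamma(double[] x)
{
    var squaredDistances = new List<double>();
    for (int i = 0; i < x.Length; i++)
        for (int j = i + 1; j < x.Length; j++)
            squaredDistances.Add((x[i] - x[j]) * (x[i] - x[j]));
    squaredDistances.Sort();
    double median = squaredDistances[squaredDistances.Count / 2];
    return median > 0 ? 1.0 / median : 1.0; // fall back to 1.0 for degenerate data
}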
The implementation requires computing a pairwise distance matrix and the corresponding kernel (Gram) matrix (O(N² * D)). Then, prefix sums are computed on the kernel matrix (O(N² * D)). Precomputation is computationally expensive, but segment cost calculation uses the prefix sums and is O(D).
RBF is best for exploratory analysis when the type of change is unknown, or when changes involve complex, non-linear shifts in data structure or distribution, such as identifying different phases in physical systems or changes in signal texture. It supports multidimensional data and requires a minimum segment size of at least 1. It does not support likelihood-based penalty selection.
Autoregressive
The AR cost function measures the goodness-of-fit of an Autoregressive (AR) model to the segment. The cost is the Residual Sum of Squares (RSS) after fitting the model:
\(y_t = c + \sum_{k=1}^{p} a_k y_{t-k} + \epsilon_t\) \(\mathcal{C}(y_{a:b}) = \sum_{t=a+p}^{b-1} \hat{\epsilon}_t^2\)
where \(p\) is the AR order, \(c\) is an optional intercept, \(a_k\) are the fitted coefficients, and \(\hat{\epsilon}_t\) are the residuals.
This function detects changes in the underlying linear dynamics or autocorrelation structure of the time series, as captured by the AR coefficients (\(a_k\)) and intercept (\(c\)). It's sensitive to changes in how past values predict the current value. Changes in periodicity, decay rates, or feedback mechanisms captured by the AR model will increase the cost if a single model is fit across the change.
The implementation fits an AR model using Ordinary Least Squares (OLS) via QR decomposition for each segment evaluation. It doesn't involve significant precomputation, but the segment cost calculation is computationally more intensive than other methods, roughly O(segment_length * order²).
The AR cost function is best for univariate time series where the phenomenon of interest is a change in the temporal dependencies or dynamics. Examples include detecting shifts in economic regimes, changes in process feedback control, or identifying different states in physiological signals based on their predictability. Currently, it supports univariate data only and requires a minimum segment length sufficient to estimate the AR parameters reliably: MinSize >= max(order + 1, 2 * order + k), where k = 1 if an intercept is included and k = 0 otherwise. It implements ILikelihoodCostFunction (calculating likelihood based on RSS and Gaussian error assumptions) and supports BIC/AIC/AICc penalty selection.
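To show where the RSS cost comes from, here is a toy sketch for order 1 with an intercept, using closed-form simple regression rather than the QR-based OLS described above (illustrative only):
// Fit y[t] = c + a1 * y[t-1] on the segment y[a..b-1] and return the residual sum of squares.
double Ar1Rss(double[] y, int a, int b)
{
    int pairs = b - a - 1;                        // number of (y[t-1], y[t]) pairs
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int t = a + 1; t < b; t++)
    {
        double x = y[t - 1], yt = y[t];
        sx += x; sy += yt; sxx += x * x; sxy += x * yt;
    }
    double denom = pairs * sxx - sx * sx;
    double a1 = denom != 0 ? (pairs * sxy - sx * sy) / denom : 0.0; // AR(1) coefficient
    double c = (sy - a1 * sx) / pairs;                              // intercept
    double rss = 0;
    for (int t = a + 1; t < b; t++)
    {
        double e = y[t] - (c + a1 * y[t - 1]);
        rss += e * e;
    }
    return rss;
}
A segment whose dynamics are well described by a single AR(1) model yields a small RSS; fitting one model across a change in dynamics inflates it, which is what PELT penalizes.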
Choosing the Right Cost Function
Selecting the most appropriate cost function is crucial for obtaining meaningful results from PELT. Here are key considerations for your decision:
Data Type: What kind of data do you have?
- Continuous measurements? (L1, L2, Gaussian, RBF, AR)
- Counts of events? (Poisson)
- Binary outcomes (0/1)? (Bernoulli)
- Successes out of N trials? (Binomial)
Nature of Expected Change: Consider what type of change you're looking for. Are you interested in shifts in average level (mean/median)? Changes in volatility (variance)? Alterations in both? Changes in event rate or success probability? Changes in the underlying process dynamics (autocorrelation)? Or perhaps more complex pattern changes?
Data Characteristics: Evaluate your data's properties. Is it noisy? Does it contain outliers or spikes? Can the data within segments be reasonably approximated by a specific distribution (Normal, Poisson, Bernoulli/Binomial)? Is the data univariate or multivariate?
Computational Budget: Assess the size of your dataset and available computing resources. Can you afford O(N²) precomputation (L1, RBF) or per-segment OLS fitting (AR), or do you need the speed of O(N) precomputation and O(1) or O(D) segment costs (L2, Gaussian, Poisson, Bernoulli, Binomial)?
Need for Automatic Penalty Selection: If you plan to use BIC, AIC, or AICc for automatic penalty selection (see below), you must choose a cost function that implements ILikelihoodCostFunction and sets SupportsInformationCriteria to true (Gaussian, Poisson, Bernoulli, Binomial, AR).
The following table provides a comparison of the key features of each cost function:
Feature | L1 | L2 | Gaussian-Likelihood | Poisson-Likelihood | Bernoulli-Likelihood | Binomial-Likelihood | RBF | AR |
---|---|---|---|---|---|---|---|---|
Data Type | Continuous | Continuous | Continuous | Non-neg Counts | Binary (0/1) | Success/Trial Counts (k/n) | Continuous | Continuous |
Detects Changes In | Median | Mean (assuming const. variance) | Mean AND Variance | Event Rate (\(\lambda\)) | Success Probability (\(p\)) | Success Probability (\(p\)) | Distribution shape, complex patterns | Autocorrelation, dynamics |
Robustness | Robust to outliers | Sensitive to outliers | Sensitive to outliers | Model assumptions | Model assumptions | Model assumptions | Moderate (depends on data/gamma) | Sensitive to model misspecification |
Precomputation Cost | O(N² * D * log N) (approx) | O(N*D) (fast) | O(N*D) (fast) | O(N*D) (fast) | O(N*D) (fast) | O(N) (fast) | O(N²*D) (slow) | O(1) (very fast) |
Segment Cost/Metric | O(seg_len*D) (approx) | O(D) (very fast) | O(D) (very fast) | O(D) (very fast) | O(D) (very fast) | O(1) (very fast) | O(D) (fast, after precomp) | ~O(seg_len*order²) (relatively slow) |
Data Assumptions | Outliers okay, non-Gaussian | Approx. Normal, const. variance | Approx. Normal | Poisson distribution | Bernoulli trials | Binomial distribution | Non-linear patterns, distribution changes | Stationarity within segment (for AR model) |
Dimensionality | Multi | Multi | Multi | Multi | Multi | Specific 2D (k/n) | Multi | Univariate ONLY |
Supports BIC/AIC/AICc | No | No | Yes | Yes | Yes | Yes | No | Yes |
Typical Use Case | Noisy signals, financial spikes | Clean signals, mean level shifts | Signals with varying volatility | Event counts, web traffic | Binary state changes | Conversion rates, defect rates | Regime shifts, complex system state changes | Economic time series, process control data |
Minimum Segment Size | >= 1 | >= 1 | >= 1 | >= 1 | >= 1 | >= 1 | >= 1 | max(p+1, 2p+k) where p = order, k = 0/1 |
Practical Guide to Choosing a Cost Function
Here's how to approach choosing the right cost function for your specific needs:
Start with the data type:
- If you have counts (website hits, error counts), use Poisson Likelihood.
- If you have binary data (success/failure, on/off), use Bernoulli Likelihood.
- If you have successes out of N trials (conversions/visitors), use Binomial Likelihood.
For continuous data:
- For everyday change detection, start with L2. It's fast, intuitive, and works well on clean data with mean shifts. When in doubt, this is your go-to option. You'll get results quickly and can always try alternatives if needed.
- When your data is messy with outliers, switch to L1. Those spikes in sensor data, network traffic anomalies, or financial flash crashes won't throw off your results. L1 focuses on the median, giving you more reliable change points in noisy real-world data.
- For financial or environmental data where both the level and volatility matter, use Gaussian-Likelihood. It captures those periods where not just the average changes, but also how wildly the values fluctuate – perfect for detecting market regime shifts or climate pattern changes. It also supports automatic penalty selection.
- Working with time series that show patterns or cycles? If you're analyzing a univariate series where the relationship between consecutive values matters more than their absolute levels, AR is your best bet. Remember to choose an order that matches your data's memory. It also supports automatic penalty selection.
- When dealing with complex, non-linear changes that you can't quite define, RBF offers flexibility. It's computationally heavier but can detect subtle pattern shifts that other functions might miss.
Need automatic penalty selection? If you plan to use BIC, AIC, or AICc, ensure you pick Gaussian-Likelihood, Poisson-Likelihood, Bernoulli-Likelihood, Binomial-Likelihood, or AR.
Don't be afraid to experiment. The best approach often involves trying 2-3 candidate cost functions (appropriate for your data type) on a subset of your data and comparing the results. Remember that penalty values need adjusting for each cost function – what works for L2 won't necessarily work for a likelihood-based cost function due to different cost scales.
In my experience, I end up using L2 about 60% of the time for its speed and simplicity, Gaussian Likelihood 25% of the time (especially with financial data), and the others for specific cases. Let the characteristics of your data and what changes you care about guide your choice.
Usage Examples
Here are practical examples demonstrating how to use PELT:
(Note: The specific penalty values below are illustrative; optimal values depend on the data scale and desired sensitivity. For likelihood-based costs, consider using automatic penalty selection.)
Example 1: Basic Usage with L2 Cost (Default)
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
// Sample signal with clear mean shifts
double[] signal = { 1, 1, 1, 5, 5, 5, 1, 1, 1 };
// Configure PELT (using default L2 cost, MinSize=1, Jump=1)
var options = new PELTOptions(); // Defaults: L2CostFunction, MinSize=1, Jump=1
var pelt = new PELTAlgorithm(options);
// Fit the algorithm to the data and detect change points with a penalty
// Using FitAndDetect convenience method
int[] changePoints = pelt.FitAndDetect(signal, penalty: 2.0);
Console.WriteLine("Change Points (L2 Cost, Default Opts): " + string.Join(", ", changePoints));
// Expected: [3, 6] (Indices are the first point *after* the change)
Example 2: Detecting Change Points in Network Traffic (L1 Cost Function)
Network traffic data might have spikes (outliers). L1 cost is more robust.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
double[] networkTraffic = {100, 102, 101, 500, 105, 98, 99, 300, 310, 100}; // Spikes at index 3 and shift starting at 7
var options = new PELTOptions
{
CostFunction = new L1CostFunction(),
MinSize = 2, // L1 often needs slightly larger min size for stable median
Jump = 1 // Use exact PELT for potentially better outlier handling
};
var pelt = new PELTAlgorithm(options);
int[] changePoints = pelt.FitAndDetect(networkTraffic, penalty: 50.0); // Penalty needs tuning based on data scale
Console.WriteLine("Change Points in Network Traffic (L1 Cost): " + string.Join(", ", changePoints));
// Example result (depends heavily on penalty): [3, 7] or similar
Example 3: Detecting Regime Shifts with Variance Change (Gaussian Likelihood Cost)
This cost function is ideal when both mean and variance might shift.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
// Signal where first half has low variance, second half has high variance
double[] signalWithVarianceChange = { 0, 0.1, -0.1, 0, 0.1, 3.0, -2.0, 1.0, -3.0, 2.5 };
var options = new PELTOptions
{
CostFunction = new GaussianLikelihoodCostFunction(),
MinSize = 3 // Ensure enough points for variance estimation
};
var pelt = new PELTAlgorithm(options);
int[] changePoints = pelt.FitAndDetect(signalWithVarianceChange, penalty: 3.0); // Penalty adjusted for likelihood scale
Console.WriteLine("Change Points (Gaussian Likelihood Cost): " + string.Join(", ", changePoints));
// Expected: Around index 5 where variance increases significantly
Example 4: Detecting Changes in Event Rate (Poisson Likelihood Cost)
Useful for count data like website hits or error logs.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
// Number of errors per hour
double[] errorCounts = { 2, 1, 3, 2, 1, 0, 1, 8, 10, 9, 12, 7, 2, 1, 3 }; // Rate increases around index 7
var options = new PELTOptions
{
CostFunction = new PoissonLikelihoodCostFunction(),
MinSize = 3
};
var pelt = new PELTAlgorithm(options);
int[] changePoints = pelt.FitAndDetect(errorCounts, penalty: 4.0); // Adjust penalty for likelihood scale
Console.WriteLine("Change Points in Error Counts (Poisson Cost): " + string.Join(", ", changePoints));
// Expected: Around index 7 where the average count increases, possibly another around index 12
Example 5: Detecting Changes in Binary State Probability (Bernoulli Likelihood Cost)
Suitable for 0/1 data like machine uptime/downtime.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
// Machine status (1=up, 0=down)
double[] machineStatus = { 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0 }; // Change points around 4, 7, 12
var options = new PELTOptions
{
CostFunction = new BernoulliLikelihoodCostFunction(),
MinSize = 2
};
var pelt = new PELTAlgorithm(options);
int[] changePoints = pelt.FitAndDetect(machineStatus, penalty: 1.5); // Adjust penalty
Console.WriteLine("Change Points in Machine Status (Bernoulli Cost): " + string.Join(", ", changePoints));
// Expected: Around indices 4, 7, 12
Example 6: Detecting Changes in Conversion Rate (Binomial Likelihood Cost)
When you have successes (k) out of trials (n) at each point.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
// Conversion data: Row 0 = conversions (k), Row 1 = visitors (n)
double[,] conversionData = {
{ 5, 6, 7, 6, 15, 18, 16, 17, 8, 9, 7, 8 }, // Conversions (k)
{ 100, 105, 98, 102, 100, 95, 103, 101, 100, 105, 98, 102 } // Visitors (n)
}; // Rate increases around index 4, decreases around index 8
var options = new PELTOptions
{
CostFunction = new BinomialLikelihoodCostFunction(),
MinSize = 3
};
var pelt = new PELTAlgorithm(options);
int[] changePoints = pelt.FitAndDetect(conversionData, penalty: 10.0); // Adjust penalty
Console.WriteLine("Change Points in Conversion Rate (Binomial Cost): " + string.Join(", ", changePoints));
// Expected: Around indices 4, 8
Example 7: Detecting Complex Patterns in Financial Data (RBF Cost Function)
Financial data can have complex patterns. RBF might capture non-linear shifts.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
double[] financialData = {100, 102, 101, 105, 110, 112, 108, 95, 90, 88}; // Gradual rise then fall
var options = new PELTOptions
{
CostFunction = new RBFCostFunction(gamma: 0.1), // Specify gamma or let it be auto-calculated
MinSize = 3
};
var pelt = new PELTAlgorithm(options);
pelt.Fit(financialData); // Separate Fit and Detect calls
int[] changePoints = pelt.Detect(penalty: 1.0); // RBF costs might be smaller scale
Console.WriteLine("Change Points in Financial Data (RBF Cost): " + string.Join(", ", changePoints));
// Example result (depends on gamma and penalty): Possibly around index 3 or 7
Example 8: Detecting Changes in Autocorrelation (AR Cost Function)
The AR cost function is useful when the underlying dynamics (like autocorrelation) change.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
using System.Linq; // Required for Concat
// Helper function to generate AR(1) data (replace with actual generation if needed)
/*
public static double[] GenerateAR1Data(int n, double phi, double c, double sigma, int seed=42)
{
var random = new Random(seed);
var data = new double[n];
data[0] = c / (1 - phi); // Start near expected mean
for (int t = 1; t < n; t++)
{
data[t] = c + phi * data[t - 1] + random.NextGaussian(0, sigma); // Assuming NextGaussian exists
}
return data;
}
*/
// Generate AR(1) data with a change in coefficient
// Example placeholder using simpler manual data
double[] signalPart1 = { 1.0, 0.9, 0.85, 0.82, 0.81 }; // Segment 1: High positive correlation
double[] signalPart2 = { 0.2, -0.1, 0.05, -0.02, 0.01 }; // Segment 2: Lower/negative correlation
double[] arSignal = signalPart1.Concat(signalPart2).ToArray(); // Length 10
var options = new PELTOptions
{
CostFunction = new ARCostFunction(order: 1, includeIntercept: true),
MinSize = 3, // AR(1)+intercept requires min length 3 (check ARCostFunction docs)
Jump = 1 // Use exact for potentially better results
};
var pelt = new PELTAlgorithm(options);
int[] changePoints = pelt.FitAndDetect(arSignal, penalty: 0.05); // Penalty needs tuning based on RSS scale
Console.WriteLine("Change Points in AR Signal (AR Cost): " + string.Join(", ", changePoints));
// Expected: Around index 5 where the AR dynamics change
Example 9: Impact of Penalty on Sensor Data (L1 Cost)
Demonstrating how penalty changes sensitivity.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
double[] sensorData = {15.0, 15.2, 15.3, 20.0, 19.5, 20.1, 25.0, 24.5, 25.2};
var options = new PELTOptions { CostFunction = new L1CostFunction(), MinSize = 2 };
var pelt = new PELTAlgorithm(options);
pelt.Fit(sensorData);
// High Penalty: Less sensitive, detects major shifts
int[] changePointsHighPenalty = pelt.Detect(penalty: 8.0);
Console.WriteLine("Change Points with High Penalty (L1): " + string.Join(", ", changePointsHighPenalty));
// Expected: [3, 6] (Detecting the two major level shifts)
// Low Penalty: More sensitive, detects smaller shifts
int[] changePointsLowPenalty = pelt.Detect(penalty: 1.0);
Console.WriteLine("Change Points with Low Penalty (L1): " + string.Join(", ", changePointsLowPenalty));
// Expected: Potentially more points like [3, 4, 6, 7] if small dips/rises are considered significant with low penalty
Example 10: Detecting Change Points in Multidimensional Time Series Data (Gaussian Cost)
PELT supports multidimensional data. Gaussian cost handles mean/variance changes in multiple dimensions simultaneously.
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
// 2 dimensions, 9 time points
double[,] multiSeriesSignal = {
{ 1.0, 1.1, 1.0, 5.0, 5.1, 4.9, 1.0, 1.1, 0.9 }, // Dimension 1: Mean shifts
{ 10.0, 10.1, 9.9, 10.0, 9.8, 10.1, 20.0, 19.8, 20.2 } // Dimension 2: Mean shifts at different points
}; // Combined change points expected around index 3 and 6
var options = new PELTOptions
{
CostFunction = new GaussianLikelihoodCostFunction(), // Or L2CostFunction if variance assumed constant
MinSize = 2
};
var pelt = new PELTAlgorithm(options);
int[] changePoints = pelt.FitAndDetect(multiSeriesSignal, penalty: 5.0); // Adjust penalty based on combined cost scale
Console.WriteLine("Change Points in Multidimensional Signal (Gaussian Cost): " + string.Join(", ", changePoints));
// Expected: [3, 6] (Reflecting the points where properties change in either dimension)
Automatic Penalty Selection
Choosing the right penalty (\(\beta\)) is often challenging. SignalSharp provides a PELTPenaltySelector to help automate this process using information criteria like BIC, AIC, or AICc.
How it Works
The penalty selector tests a range of penalty values. For each penalty, it runs the PELT algorithm to find the corresponding change points. Then, it calculates a score based on the chosen information criterion (e.g., BIC). The penalty value that results in the optimal (usually minimum) score is selected.
- BIC (Bayesian Information Criterion): Tends to favor simpler models (fewer change points). Formula: -2*logL + P*log(N)
- AIC (Akaike Information Criterion): May select slightly more complex models than BIC. Formula: -2*logL + 2*P
- AICc (Corrected AIC): AIC with a correction for small sample sizes. Recommended when N is not much larger than P. Formula: AIC + (2*P*(P+1))/(N-P-1)
Where:
- logL is the maximized log-likelihood of the data given the segmentation.
- P is the total number of effective parameters in the model (sum of parameters within each segment + number of change points).
- N is the total number of data points.
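Expressed directly in code, the three scores are a literal transcription of the formulas above (the helper names here are ours, not part of the SignalSharp API):
using System;
double Bic(double logL, int p, int n) => -2 * logL + p * Math.Log(n);
double Aic(double logL, int p) => -2 * logL + 2 * p;
double Aicc(double logL, int p, int n) => Aic(logL, p) + (2.0 * p * (p + 1)) / (n - p - 1);
The selector evaluates one of these scores for each candidate penalty's segmentation and keeps the penalty with the lowest score.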
Requirements
Automatic penalty selection using BIC, AIC, or AICc requires the PELT algorithm to be configured with a cost function that implements ILikelihoodCostFunction and has SupportsInformationCriteria = true. Suitable cost functions include:
- GaussianLikelihoodCostFunction
- PoissonLikelihoodCostFunction
- BernoulliLikelihoodCostFunction
- BinomialLikelihoodCostFunction
- ARCostFunction
Configuration (PELTPenaltySelectionOptions)
The selection process is configured using PELTPenaltySelectionOptions:
- Method: The criterion to use (BIC, AIC, AICc). Required.
- MinPenalty: The minimum penalty value to test. Optional (heuristic used if null).
- MaxPenalty: The maximum penalty value to test. Optional (heuristic used if null).
- NumPenaltySteps: Approximate number of penalties to test between Min and Max (logarithmically spaced). Default: 50.
Usage Example: Automatic Penalty Selection with BIC
using SignalSharp.Detection.PELT;
using SignalSharp.CostFunctions.Cost;
using System;
// Sample signal where Gaussian Likelihood is appropriate
double[] signal = { 0, 0.1, -0.1, 0, 0.1, 3.0, -2.0, 1.0, -3.0, 2.5 };
// 1. Configure PELT with a likelihood-based cost function
var peltOptions = new PELTOptions
{
CostFunction = new GaussianLikelihoodCostFunction(),
MinSize = 3 // Gaussian needs enough points for mean/variance
};
var peltAlgorithm = new PELTAlgorithm(peltOptions);
// 2. Create the Penalty Selector
var selector = new PELTPenaltySelector(peltAlgorithm);
// 3. Configure the selection options (e.g., use BIC)
var selectionOptions = new PELTPenaltySelectionOptions(PELTPenaltySelectionMethod.BIC)
{
// Optionally specify Min/Max penalty or steps, otherwise defaults/heuristics are used
// MinPenalty = 0.5,
// MaxPenalty = 10.0,
// NumPenaltySteps = 100
};
// 4. Fit the data and select the penalty (this also fits the internal PELT algorithm)
PELTPenaltySelectionResult result = selector.FitAndSelect(signal, selectionOptions);
// 5. Use the results
Console.WriteLine($"Selected Penalty (BIC): {result.SelectedPenalty:F4}");
Console.WriteLine("Optimal Change Points: " + string.Join(", ", result.OptimalBreakpoints));
// Expected: Around index 5, penalty value will vary based on data and BIC calculation.
// Access diagnostics if needed (list of (Penalty, Score, NumChangePoints))
if (result.Diagnostics != null)
{
Console.WriteLine("\nDiagnostics:");
foreach (var diag in result.Diagnostics)
{
Console.WriteLine($" Penalty: {diag.Penalty:F4}, Score: {diag.Score:F4}, CPs: {diag.ChangePoints}");
}
}
// Alternatively, if the PELT algorithm has already been fitted:
// peltAlgorithm.Fit(signal); // Ensure it's fitted
// PELTPenaltySelectionResult result = selector.SelectPenalty(selectionOptions); // Selects a penalty for the pre-fitted algorithm (shown for illustration; prefer FitAndSelect)
Advantages and Limitations
PELT is a widely used algorithm for change point detection, offering significant benefits but also requiring careful consideration of its parameters and assumptions.
Advantages
Computational Efficiency for Large Datasets:
- PELT employs a pruning strategy that significantly reduces computational complexity, often achieving near-linear time performance (O(N)) relative to the time series length (N), especially with optimized cost functions like L2 or likelihood-based ones using prefix sums. This makes it feasible for analyzing very long time series where quadratic or even N*log(N) algorithms might be prohibitively slow.
Guaranteed Optimality (Exact Mode):
- When configured with Jump = 1, PELT guarantees finding the segmentation that globally minimizes the defined objective function (sum of segment costs plus penalties). This provides mathematically rigorous results, which is essential in applications demanding verifiable optimality.
Flexibility in Defining "Change" (Cost Functions):
- The algorithm's behavior can be adapted to detect different types of statistical changes by selecting an appropriate cost function tailored to the data type and expected change pattern (see table above).
Simultaneous Analysis of Multiple Time Series (Multidimensional Support):
- PELT can process multivariate time series (provided as double[,], where rows represent dimensions/channels and columns represent time points), provided the chosen cost function supports it. It identifies common breakpoints where the statistical properties change across all dimensions considered together, providing insights into systemic shifts. (Note: ARCostFunction is currently univariate only.)
Support for Automatic Penalty Selection:
- Using PELTPenaltySelector with likelihood-based cost functions allows for automated, data-driven selection of the penalty parameter using established information criteria (BIC, AIC, AICc).
Limitations and Practical Considerations
Sensitivity to Parameter Selection (Penalty and MinSize):
- The quality of the segmentation is highly dependent on the chosen penalty value. This parameter directly controls the trade-off between fitting the data closely (more change points) and model simplicity (fewer change points). Selecting an optimal penalty often requires experimentation, domain expertise, or the use of methods like the PELTPenaltySelector. An inappropriate penalty can lead to significant under- or over-segmentation.
- The MinSize parameter, defining the minimum segment length, also influences results. It prevents trivial segmentations but can mask short-lived events if set too high. Some cost functions (like AR) have specific minimum size requirements.
Dependence on Appropriate Cost Function Choice:
- The algorithm detects changes as defined by the cost function. Selecting a cost function that does not align with the true nature of the changes in the data will lead to suboptimal or misleading results. For instance, using an L2 cost function might fail to effectively detect changes primarily characterized by outliers if an L1 cost would have been more suitable. Careful consideration of the data's properties and expected change patterns is necessary.
Approximation Introduced by Jump > 1:
- While using Jump > 1 significantly improves computational speed, it means the algorithm only evaluates a subset of potential previous change points. This introduces an approximation; the resulting segmentation is not guaranteed to be globally optimal according to the objective function. While often providing good results in practice, this loss of guaranteed optimality must be acceptable for the specific application. Use Jump = 1 if exactness is required.
Interpretation Requires Post-Processing:
- PELT identifies the locations of change points but does not inherently characterize the nature of the change (e.g., magnitude of mean shift, change in variance). Further analysis of the data within the identified segments is typically required to understand what changed and why.
Underlying Model Assumptions (Additive Costs):
- The algorithm optimizes a cost function based on the sum of costs of independent segments plus penalties. It does not explicitly model complex dependencies or transitions between segment states. While sufficient for many standard segmentation tasks, this assumption might be limiting for systems with strong temporal dependencies between regimes.
Cost Function Computational Cost:
- Overall runtime depends on PELT's pruning and the segment cost calculation:
- O(1) or O(D) segment cost/metric (fast): L2, Gaussian, Poisson, Bernoulli, Binomial, RBF (after precomputation)
- Slower segment cost/metric: L1 (O(seg_len * D)), AR (O(seg_len * order²))
- Slow precomputation: L1 (O(N² * D * log N)), RBF (O(N² * D))
Penalty Selection Requirements:
- Automatic penalty selection via BIC/AIC/AICc requires specific likelihood-based cost functions. Other cost functions (L1, L2, RBF) require manual penalty tuning.
API References
- Core PELT Algorithm:
- Cost Functions:
- Automatic Penalty Selection: