Class BinomialLikelihoodCostFunction
- Namespace
- SignalSharp.CostFunctions.Cost
- Assembly
- SignalSharp.dll
Represents a cost function based on the Binomial negative log-likelihood for the PELT algorithm. This cost function is suitable for detecting changes in the success probability of binomial data (k successes out of n trials).
public class BinomialLikelihoodCostFunction : CostFunctionBase, ILikelihoodCostFunction, IPELTCostFunction
- Inheritance
- CostFunctionBase
BinomialLikelihoodCostFunction
- Implements
- ILikelihoodCostFunction
- IPELTCostFunction
Remarks
This cost function assumes that the data within each segment follows a Binomial distribution,
characterized by a number of trials (n) and a number of successes (k) for each data point,
with a constant success probability (p) within the segment.
The input data is expected as a 2D array signalMatrix where:
- signalMatrix[0, i] represents the number of successes (k_i) at time point i.
- signalMatrix[1, i] represents the number of trials (n_i) at time point i.
The data must satisfy 0 <= k_i <= n_i and n_i >= 1, and both k_i and n_i must be effectively non-negative integers (within tolerance) for all i.
The cost for a segment [start, end) is derived from the maximized negative log-likelihood.
Let K = Sum(k_i) and N = Sum(n_i) be the total successes and trials in the segment, respectively.
The Maximum Likelihood Estimate (MLE) for the success probability p is p_hat = K / N.
The likelihood metric used for BIC/AIC calculations, equal to the negative log-likelihood evaluated at p_hat up to an additive constant, is:
Metric(start, end) = -[ K * log(K) + (N - K) * log(N - K) - N * log(N) ]
(derived from -[ K * log(p_hat) + (N - K) * log(1 - p_hat) ] and simplifying, dropping combinatorial terms).
The convention 0 * log(0) = 0 is used via the internal XLogX helper. This metric is sensitive to changes in the underlying success rate p.
A metric of 0 indicates a perfect fit (p_hat = 0 or p_hat = 1).
The ComputeCost(int?, int?) method returns this same metric value.
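The simplification from the log-likelihood to the metric can be written out step by step. This is a sketch using the symbols defined above (K, N, p_hat); the name \ell for the log-likelihood without its combinatorial terms is introduced here only for illustration.

```latex
\begin{aligned}
-\ell(\hat p) &= -\left[ K \log \hat p + (N-K)\log(1-\hat p) \right], \qquad \hat p = \frac{K}{N} \\
&= -\left[ K \log\frac{K}{N} + (N-K)\log\frac{N-K}{N} \right] \\
&= -\left[ K\log K + (N-K)\log(N-K) - \bigl(K + (N-K)\bigr)\log N \right] \\
&= -\left[ K\log K + (N-K)\log(N-K) - N\log N \right]
\end{aligned}
```

The dropped combinatorial terms depend only on the individual k_i and n_i, so their total over the whole series is the same for every segmentation and does not affect comparisons between candidate segmentations.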
This cost/metric is calculated efficiently using precomputed prefix sums of successes (k) and trials (n),
allowing O(1) calculation per segment after an O(M) precomputation step during Fit(double[,]), where M is the number of data points.
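The prefix-sum idea described above can be sketched as follows. This is a minimal stand-alone illustration, not the SignalSharp implementation: the class name BinomialPrefixSumSketch and the methods PrefixSums and Metric are hypothetical, and the XLogX helper here is only an assumed reading of the 0 * log(0) = 0 convention.

```csharp
using System;

// Hypothetical stand-alone sketch; not the SignalSharp source.
public static class BinomialPrefixSumSketch
{
    // Assumed form of the 0 * log(0) = 0 convention (natural logarithm).
    private static double XLogX(double x) => x > 0.0 ? x * Math.Log(x) : 0.0;

    // O(M) precomputation: sums[i] holds the sum of values[0..i-1],
    // so the returned array has length M + 1 and sums[0] == 0.
    public static double[] PrefixSums(double[] values)
    {
        var sums = new double[values.Length + 1];
        for (int i = 0; i < values.Length; i++)
        {
            sums[i + 1] = sums[i] + values[i];
        }
        return sums;
    }

    // O(1) metric for the half-open segment [start, end):
    // two subtractions recover K and N, then the closed-form metric is applied.
    public static double Metric(double[] prefixK, double[] prefixN, int start, int end)
    {
        double totalK = prefixK[end] - prefixK[start];
        double totalN = prefixN[end] - prefixN[start];
        return -(XLogX(totalK) + XLogX(totalN - totalK) - XLogX(totalN));
    }
}
```

With prefix arrays built once from the k row and the n row, any segment's metric needs only the four array reads shown in Metric, which is what makes the per-segment cost independent of segment length.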
Consider using the Binomial Likelihood cost function when:
- Your data represents counts of successes out of a known number of trials at each time point.
- You want to detect time points where the underlying success probability changes.
- The assumption of a constant success probability within each segment is reasonable.
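As a worked illustration of how the metric responds to a change in success rate, take the four-point series k = (1, 2, 8, 9) with n = 10 at every point. Natural logarithms are assumed here; the values are rounded to three decimals.

```latex
\begin{aligned}
\text{one segment } [0,4): \quad & K = 20,\ N = 40, & \text{Metric} &= 40\log 40 - 20\log 20 - 20\log 20 = 40\log 2 \approx 27.726 \\
\text{left } [0,2): \quad & K = 3,\ N = 20, & \text{Metric} &= 20\log 20 - 3\log 3 - 17\log 17 \approx 8.454 \\
\text{right } [2,4): \quad & K = 17,\ N = 20, & \text{Metric} &= 20\log 20 - 17\log 17 - 3\log 3 \approx 8.454
\end{aligned}
```

Splitting at index 2 lowers the total from about 27.726 to about 16.908, a reduction of roughly 10.818. In a penalized search such as PELT, a split like this is favoured when the penalty per change point is smaller than the reduction it buys.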
Constructors
BinomialLikelihoodCostFunction()
Initializes a new instance of the BinomialLikelihoodCostFunction class.
public BinomialLikelihoodCostFunction()
Properties
SupportsInformationCriteria
Indicates that this cost function provides likelihood metrics suitable for BIC/AIC.
public bool SupportsInformationCriteria { get; }
Property Value
- bool
Methods
ComputeCost(int?, int?)
Computes the cost for a segment [start, end) based on the Binomial negative log-likelihood.
Cost = -[ K * log(K) + (N - K) * log(N - K) - N * log(N) ] where K = total successes, N = total trials.
public override double ComputeCost(int? start = null, int? end = null)
Parameters
- start int?
The start index of the segment (inclusive). If null, defaults to 0.
- end int?
The end index of the segment (exclusive). If null, defaults to the length of the data.
Returns
- double
The computed cost for the segment (0 if p_hat=0 or p_hat=1, positive otherwise).
Remarks
Calculates the cost in O(1) time using precomputed prefix sums.
Correctly handles the edge cases where the estimated success probability is 0 or 1 (cost is 0).
Must be called after Fit(double[,]). This returns the same value as ComputeLikelihoodMetric(int, int).
// Assuming 'binomialCost' is an instance fitted with data
double costSegment = binomialCost.ComputeCost(0, 10);
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
ComputeLikelihoodMetric(int, int)
Computes the likelihood metric for a segment [start, end) based on the Binomial negative log-likelihood.
Metric = -[ K * log(K) + (N - K) * log(N - K) - N * log(N) ] where K = total successes, N = total trials.
public double ComputeLikelihoodMetric(int start, int end)
Parameters
- start int
The start index of the segment (inclusive).
- end int
The end index of the segment (exclusive).
Returns
- double
The computed likelihood metric for the segment (0 if p_hat=0 or p_hat=1, positive otherwise).
Remarks
Calculates the metric in O(1) time using precomputed prefix sums.
Correctly handles the edge cases where the estimated success probability is 0 or 1 (metric is 0).
Must be called after Fit(double[,]). This returns the same value as ComputeCost(int?, int?).
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
Fit(double[,])
Fits the cost function to the provided binomial data by precomputing prefix sums.
public override IPELTCostFunction Fit(double[,] signalMatrix)
Parameters
- signalMatrix double[,]
The data array to fit (rows = dimensions, columns = time points). Must have exactly 2 rows: row 0 for successes (k), row 1 for trials (n). Values for k and n must be non-negative integers (within tolerance), with n >= 1 and k <= n.
Returns
- IPELTCostFunction
The fitted BinomialLikelihoodCostFunction instance.
Remarks
This method performs O(M) computation (where M is the number of time points) to calculate prefix sums,
enabling O(1) cost/metric calculation per segment later. It must be called before cost computation.
Input data must satisfy 0 <= k_i <= n_i, n_i >= 1, and be effectively integers (within tolerance).
// Data: k = [1, 2, 8, 9], n = [10, 10, 10, 10]
double[,] data = {
{ 1.0, 2.0, 8.0, 9.0 }, // Successes (k)
{ 10.0, 10.0, 10.0, 10.0 } // Trials (n)
};
var binomialCost = new BinomialLikelihoodCostFunction();
binomialCost.Fit(data);
// Data: k = [5, 8], n = [20, 15] (varying n)
double[,] dataVaryingN = {
{ 5.0, 8.0 }, // Successes (k)
{ 20.0, 15.0 } // Trials (n)
};
binomialCost.Fit(dataVaryingN);
Exceptions
- ArgumentNullException
Thrown if signalMatrix is null.
- ArgumentException
Thrown if signalMatrix does not have exactly 2 rows, or if data is invalid (k < 0, n < 1, k > n, not effectively integers, NaN, or Infinity).
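The input rules listed above can be illustrated with a small stand-alone check. This is only a sketch of the documented conditions, not the library's code: the class BinomialInputCheckSketch, the method Validate, and the tolerance value 1e-9 are all assumptions made for the example, since the actual tolerance is not stated here.

```csharp
using System;

// Hypothetical stand-alone sketch of the documented input checks; not the SignalSharp source.
public static class BinomialInputCheckSketch
{
    // Assumed tolerance for "effectively an integer"; the library's real value is not documented here.
    private const double Tolerance = 1e-9;

    // True when x is finite and within Tolerance of a whole number.
    private static bool IsEffectivelyInteger(double x) =>
        !double.IsNaN(x) && !double.IsInfinity(x) && Math.Abs(x - Math.Round(x)) <= Tolerance;

    public static void Validate(double[,] signalMatrix)
    {
        if (signalMatrix is null)
            throw new ArgumentNullException(nameof(signalMatrix));
        if (signalMatrix.GetLength(0) != 2)
            throw new ArgumentException("Expected exactly 2 rows: successes (k) and trials (n).", nameof(signalMatrix));

        for (int i = 0; i < signalMatrix.GetLength(1); i++)
        {
            double k = signalMatrix[0, i];
            double n = signalMatrix[1, i];

            if (!IsEffectivelyInteger(k) || !IsEffectivelyInteger(n))
                throw new ArgumentException($"Non-integer, NaN, or infinite value at index {i}.", nameof(signalMatrix));

            // Compare the rounded values so that tiny floating-point noise does not reject valid counts.
            double kRounded = Math.Round(k);
            double nRounded = Math.Round(n);
            if (kRounded < 0 || nRounded < 1 || kRounded > nRounded)
                throw new ArgumentException($"Require 0 <= k <= n and n >= 1 at index {i}.", nameof(signalMatrix));
        }
    }
}
```

Running a check of this kind before building prefix sums means every later segment query can assume 0 <= K <= N and N >= 1, so the metric formula never sees a negative argument to log.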
Fit(double[])
Fitting with a 1D signal is not supported for the general Binomial likelihood cost function, as it requires both successes (k) and trials (n) information per point.
public IPELTCostFunction Fit(double[] signal)
Parameters
- signal double[]
The one-dimensional time series data.
Returns
- IPELTCostFunction
Never returns normally; this method always throws NotSupportedException.
Exceptions
- NotSupportedException
Always thrown, explaining the need for 2D input with specific row definitions.
GetSegmentParameterCount(int)
Gets the number of parameters estimated for a Binomial model segment. This is 1 parameter (the success probability 'p').
public int GetSegmentParameterCount(int segmentLength)
Parameters
- segmentLength int
The length of the segment (unused).
Returns
- int
Number of parameters: 1.