Class BinomialLikelihoodCostFunction
- Namespace
- SignalSharp.CostFunctions.Cost
- Assembly
- SignalSharp.dll
Represents a cost function based on the Binomial negative log-likelihood for the PELT algorithm. This cost function is suitable for detecting changes in the success probability of binomial data (k successes out of n trials).
public class BinomialLikelihoodCostFunction : CostFunctionBase, ILikelihoodCostFunction, IPELTCostFunction
- Inheritance
- CostFunctionBase → BinomialLikelihoodCostFunction
- Implements
- ILikelihoodCostFunction, IPELTCostFunction
Remarks
This cost function assumes that the data within each segment follows a Binomial distribution, characterized by a number of trials (n) and a number of successes (k) for each data point, with a constant success probability (p) within the segment.
The input data is expected as a 2D array signalMatrix where:
- signalMatrix[0, i] represents the number of successes (k_i) at time point i.
- signalMatrix[1, i] represents the number of trials (n_i) at time point i.
The data must satisfy 0 <= k_i <= n_i and n_i >= 1, and both k_i and n_i must be effectively non-negative integers (within tolerance) for all i.
The cost for a segment [start, end) is derived from the maximized negative log-likelihood. Let K = Sum(k_i) and N = Sum(n_i) be the total successes and trials in the segment, respectively. The Maximum Likelihood Estimate (MLE) for the success probability p is p_hat = K / N.
The likelihood metric used for BIC/AIC calculations is the negative log-likelihood evaluated at p_hat, with the combinatorial terms dropped because they do not depend on p:
Metric(start, end) = -[ K * log(K) + (N - K) * log(N - K) - N * log(N) ]
It follows from -[ K * log(p_hat) + (N - K) * log(1 - p_hat) ] by substituting p_hat = K / N and simplifying.
The convention 0 * log(0) = 0 is applied through the internal XLogX helper. The metric is sensitive to changes in the underlying success rate p. A metric of 0 indicates a perfect fit, which occurs when p_hat = 0 or p_hat = 1.
The ComputeCost(int?, int?) method returns this same metric value.
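The simplification from the p_hat form to the K, N form can be written out step by step. The block below is only a restatement of the formulas above in LaTeX; the symbol \ell for the log-likelihood without combinatorial terms is introduced here for convenience.

```latex
\begin{aligned}
-\ell(\hat p) &= -\left[ K \log \hat p + (N-K)\log(1-\hat p) \right], \qquad \hat p = \frac{K}{N} \\
&= -\left[ K \log\frac{K}{N} + (N-K)\log\frac{N-K}{N} \right] \\
&= -\left[ K\log K + (N-K)\log(N-K) - \bigl(K + (N-K)\bigr)\log N \right] \\
&= -\left[ K\log K + (N-K)\log(N-K) - N\log N \right].
\end{aligned}
```

As a worked illustration, assuming log denotes the natural logarithm: for k = [1, 2, 8, 9] with n = 10 at every point, the whole series has K = 20 and N = 40, giving a metric of 40 * log(2), roughly 27.73. Splitting after the second point gives two segments with (K, N) = (3, 20) and (17, 20), each with a metric of roughly 8.45, so the split total of about 16.91 is lower than the unsplit value.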
This cost/metric is calculated efficiently using precomputed prefix sums of successes (k) and trials (n), allowing O(1) calculation per segment after an O(M) precomputation step during Fit(double[,]), where M is the number of data points.
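The prefix-sum idea can be sketched in a few lines. The class below is a hypothetical stand-in written for illustration, not the SignalSharp implementation: the names BinomialPrefixSumSketch and Metric are assumptions, input validation is omitted, and log is assumed to be the natural logarithm.

```csharp
using System;

// Hypothetical illustration only; not the SignalSharp source.
public sealed class BinomialPrefixSumSketch
{
    // _prefixK[i] holds k_0 + ... + k_(i-1); _prefixN likewise for trials.
    private readonly double[] _prefixK;
    private readonly double[] _prefixN;

    // O(M) precomputation over a 2-row matrix (row 0 = successes, row 1 = trials).
    public BinomialPrefixSumSketch(double[,] signalMatrix)
    {
        int m = signalMatrix.GetLength(1);
        _prefixK = new double[m + 1];
        _prefixN = new double[m + 1];
        for (int i = 0; i < m; i++)
        {
            _prefixK[i + 1] = _prefixK[i] + signalMatrix[0, i];
            _prefixN[i + 1] = _prefixN[i] + signalMatrix[1, i];
        }
    }

    // x * log(x) with the convention 0 * log(0) = 0.
    private static double XLogX(double x) => x > 0.0 ? x * Math.Log(x) : 0.0;

    // O(1) metric for the half-open segment [start, end).
    public double Metric(int start, int end)
    {
        double k = _prefixK[end] - _prefixK[start];
        double n = _prefixN[end] - _prefixN[start];
        return -(XLogX(k) + XLogX(n - k) - XLogX(n));
    }
}
```

Each segment query needs only two subtractions and three XLogX evaluations, which is why the per-segment cost does not grow with the segment length.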
Consider using the Binomial Likelihood cost function when:
- Your data represents counts of successes out of a known number of trials at each time point.
- You want to detect time points where the underlying success probability changes.
- The assumption of a constant success probability within each segment is reasonable.
Constructors
BinomialLikelihoodCostFunction()
Initializes a new instance of the BinomialLikelihoodCostFunction class.
public BinomialLikelihoodCostFunction()
Properties
SupportsInformationCriteria
Indicates that this cost function provides likelihood metrics suitable for BIC/AIC.
public bool SupportsInformationCriteria { get; }
Property Value
- bool
Methods
ComputeCost(int?, int?)
Computes the cost for a segment [start, end) based on the Binomial negative log-likelihood.
Cost = -[ K*log(K) + (N-K)*log(N-K) - N*log(N) ]
where K = total successes and N = total trials.
public override double ComputeCost(int? start = null, int? end = null)
Parameters
start
int? The start index of the segment (inclusive). If null, defaults to 0.
end
int? The end index of the segment (exclusive). If null, defaults to the length of the data.
Returns
- double
The computed cost for the segment (0 if p_hat=0 or p_hat=1, positive otherwise).
Remarks
Calculates the cost in O(1) time using precomputed prefix sums.
Correctly handles the edge cases where the estimated success probability is 0 or 1 (cost is 0).
Must be called after Fit(double[,]). This returns the same value as ComputeLikelihoodMetric(int, int).
// Assuming 'binomialCost' is an instance fitted with data
double costSegment = binomialCost.ComputeCost(0, 10);
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
ComputeLikelihoodMetric(int, int)
Computes the likelihood metric for a segment [start, end) based on the Binomial negative log-likelihood.
Metric = -[ K*log(K) + (N-K)*log(N-K) - N*log(N) ]
where K = total successes and N = total trials.
public double ComputeLikelihoodMetric(int start, int end)
Parameters
start
int The start index of the segment (inclusive).
end
int The end index of the segment (exclusive).
Returns
- double
The computed likelihood metric for the segment (0 if p_hat=0 or p_hat=1, positive otherwise).
Remarks
Calculates the metric in O(1) time using precomputed prefix sums.
Correctly handles the edge cases where the estimated success probability is 0 or 1 (metric is 0).
Must be called after Fit(double[,]). This returns the same value as ComputeCost(int?, int?).
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
Fit(double[,])
Fits the cost function to the provided binomial data by precomputing prefix sums.
public override IPELTCostFunction Fit(double[,] signalMatrix)
Parameters
signalMatrix
double[,] The data array to fit (rows=dimensions, columns=time points). Must have exactly 2 rows: row 0 for successes (k), row 1 for trials (n). Values for k and n must be non-negative integers (within tolerance), with n >= 1 and k <= n.
Returns
- IPELTCostFunction
The fitted BinomialLikelihoodCostFunction instance.
Remarks
This method performs O(M) computation (where M is the number of time points) to calculate prefix sums, enabling O(1) cost/metric calculation per segment later. It must be called before cost computation.
Input data must satisfy 0 <= k_i <= n_i and n_i >= 1, and be effectively integers (within tolerance).
// Data: k = [1, 2, 8, 9], n = [10, 10, 10, 10]
double[,] data = {
{ 1.0, 2.0, 8.0, 9.0 }, // Successes (k)
{ 10.0, 10.0, 10.0, 10.0 } // Trials (n)
};
var binomialCost = new BinomialLikelihoodCostFunction();
binomialCost.Fit(data);
// Data: k = [5, 8], n = [20, 15] (varying n)
double[,] dataVaryingN = {
{ 5.0, 8.0 }, // Successes (k)
{ 20.0, 15.0 } // Trials (n)
};
binomialCost.Fit(dataVaryingN);
Exceptions
- ArgumentNullException
Thrown if signalMatrix is null.
- ArgumentException
Thrown if signalMatrix does not have exactly 2 rows, or if data is invalid (k < 0, n < 1, k > n, not effectively integers, NaN, or Infinity).
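To avoid an ArgumentException at fit time, the documented constraints can be checked up front. The helper below is a hypothetical sketch, not part of SignalSharp: the class name, the method name, and in particular the tolerance of 1e-9 are assumptions, since this page does not state the tolerance the library uses.

```csharp
using System;

// Hypothetical pre-check; mirrors the constraints listed for Fit(double[,]).
public static class BinomialInputCheckSketch
{
    // Assumed tolerance for "effectively integer"; the library's value may differ.
    private const double Tolerance = 1e-9;

    public static bool IsValid(double[,] signalMatrix)
    {
        // Must be non-null with exactly 2 rows (successes, trials).
        if (signalMatrix is null || signalMatrix.GetLength(0) != 2) return false;

        for (int i = 0; i < signalMatrix.GetLength(1); i++)
        {
            double k = signalMatrix[0, i];
            double n = signalMatrix[1, i];

            // Reject NaN and Infinity.
            if (double.IsNaN(k) || double.IsInfinity(k)) return false;
            if (double.IsNaN(n) || double.IsInfinity(n)) return false;

            // Values must be integers within the assumed tolerance.
            double kRounded = Math.Round(k);
            double nRounded = Math.Round(n);
            if (Math.Abs(k - kRounded) > Tolerance) return false;
            if (Math.Abs(n - nRounded) > Tolerance) return false;

            // 0 <= k <= n and n >= 1.
            if (kRounded < 0 || nRounded < 1 || kRounded > nRounded) return false;
        }
        return true;
    }
}
```

A caller could run IsValid(data) before binomialCost.Fit(data) and report bad input in its own terms instead of catching the exception.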
Fit(double[])
Fitting with a 1D signal is not supported for the general Binomial likelihood cost function, as it requires both successes (k) and trials (n) information per point.
public IPELTCostFunction Fit(double[] signal)
Parameters
signal
double[] The one-dimensional time series data.
Returns
- IPELTCostFunction
No value is returned; this method always throws NotSupportedException.
Exceptions
- NotSupportedException
Always thrown, explaining the need for 2D input with specific row definitions.
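When the 1D signal is a sequence of single-trial (Bernoulli) outcomes, one possible workaround is to build the two-row matrix yourself with every trial count set to 1. The helper below is hypothetical and not part of SignalSharp; it assumes each value is exactly 0 or 1 and relies only on the row layout documented for Fit(double[,]).

```csharp
using System;

// Hypothetical adapter for 0/1 outcome series; not a SignalSharp API.
public static class BernoulliAdapterSketch
{
    // Row 0 = outcomes (k_i in {0, 1}), row 1 = trials (n_i = 1 for every point).
    public static double[,] ToBinomialMatrix(double[] outcomes)
    {
        if (outcomes is null) throw new ArgumentNullException(nameof(outcomes));

        var matrix = new double[2, outcomes.Length];
        for (int i = 0; i < outcomes.Length; i++)
        {
            if (outcomes[i] != 0.0 && outcomes[i] != 1.0)
                throw new ArgumentException("Each outcome must be 0 or 1.", nameof(outcomes));

            matrix[0, i] = outcomes[i]; // successes
            matrix[1, i] = 1.0;         // one trial per point
        }
        return matrix;
    }
}
```

The result can then be passed to the 2D overload, for example binomialCost.Fit(BernoulliAdapterSketch.ToBinomialMatrix(outcomes)).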
GetSegmentParameterCount(int)
Gets the number of parameters estimated for a Binomial model segment. This is 1 parameter (the success probability 'p').
public int GetSegmentParameterCount(int segmentLength)
Parameters
segmentLength
int The length of the segment (unused).
Returns
- int
Number of parameters: 1.