Class BinomialLikelihoodCostFunction

Namespace
SignalSharp.CostFunctions.Cost
Assembly
SignalSharp.dll

Represents a cost function based on the Binomial negative log-likelihood for the PELT algorithm. This cost function is suitable for detecting changes in the success probability of binomial data (k successes out of n trials).

public class BinomialLikelihoodCostFunction : CostFunctionBase, ILikelihoodCostFunction, IPELTCostFunction
Inheritance
BinomialLikelihoodCostFunction

Remarks

This cost function assumes that the data within each segment follows a Binomial distribution, characterized by a number of trials (n) and a number of successes (k) for each data point, with a constant success probability (p) within the segment.

The input data is expected as a 2D array signalMatrix where:

  • signalMatrix[0, i] represents the number of successes (k_i) at time point i.
  • signalMatrix[1, i] represents the number of trials (n_i) at time point i.

Every data point must satisfy 0 <= k_i <= n_i and n_i >= 1, and both k_i and n_i must be integer-valued (within a numerical tolerance).
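
The constraints above could be checked along the following lines before calling Fit(double[,]). This is only a sketch: the class BinomialInputCheckSketch and its members are hypothetical and not part of SignalSharp, and the tolerance of 1e-9 is an assumption, since the tolerance the library applies is not stated here.

```csharp
using System;

// Hypothetical helper for illustration; not part of SignalSharp.
public static class BinomialInputCheckSketch
{
    // Assumed tolerance; the library's actual value is not documented here.
    private const double Tolerance = 1e-9;

    // True when x is finite and within Tolerance of a whole number.
    private static bool IsEffectivelyInteger(double x) =>
        !double.IsNaN(x) && !double.IsInfinity(x) && Math.Abs(x - Math.Round(x)) <= Tolerance;

    // True when the matrix has exactly 2 rows and every column satisfies
    // 0 <= k_i <= n_i and n_i >= 1, with both values effectively integers.
    public static bool IsValid(double[,] signalMatrix)
    {
        if (signalMatrix == null || signalMatrix.GetLength(0) != 2)
        {
            return false;
        }

        for (int i = 0; i < signalMatrix.GetLength(1); i++)
        {
            double rawK = signalMatrix[0, i];
            double rawN = signalMatrix[1, i];
            if (!IsEffectivelyInteger(rawK) || !IsEffectivelyInteger(rawN))
            {
                return false;
            }

            // Compare the rounded values so that tiny floating-point noise
            // around an integer does not trip the range checks.
            double k = Math.Round(rawK);
            double n = Math.Round(rawN);
            if (k < 0 || n < 1 || k > n)
            {
                return false;
            }
        }

        return true;
    }
}
```

Returning a bool keeps the sketch short; Fit(double[,]) itself is documented to throw ArgumentException for invalid data instead.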

The cost for a segment [start, end) is derived from the maximized negative log-likelihood. Let K = Sum(k_i) and N = Sum(n_i) be the total successes and trials in the segment, respectively. The Maximum Likelihood Estimate (MLE) for the success probability p is p_hat = K / N.

The likelihood metric used for BIC/AIC calculations is the negative log-likelihood evaluated at p_hat, with the combinatorial terms dropped:

Metric(start, end) = -[ K * log(K) + (N - K) * log(N - K) - N * log(N) ]

It follows from -[ K * log(p_hat) + (N - K) * log(1 - p_hat) ] by substituting p_hat = K / N and simplifying. The convention 0 * log(0) = 0 is applied via the internal XLogX helper. The metric is sensitive to changes in the underlying success rate p; a value of 0 indicates a perfect fit (p_hat = 0 or p_hat = 1). The ComputeCost(int?, int?) method returns this same metric value.
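
The simplification from the log-likelihood to the metric can be written out step by step. The derivation below uses the standard binomial log-likelihood for independent observations sharing one success probability p, with K, N, and p_hat as defined above; it is included as a reading aid and is not taken from the library source.

```latex
\begin{aligned}
-\log L(p) &= -\sum_i \log\binom{n_i}{k_i} - \bigl[ K \log p + (N-K)\log(1-p) \bigr], \\
K\log\hat p + (N-K)\log(1-\hat p)
  &= K\log\tfrac{K}{N} + (N-K)\log\tfrac{N-K}{N} \\
  &= K\log K + (N-K)\log(N-K) - N\log N, \\
\mathrm{Metric}(\mathrm{start},\mathrm{end})
  &= -\bigl[ K\log K + (N-K)\log(N-K) - N\log N \bigr].
\end{aligned}
```

The binomial-coefficient sum does not depend on p and is simply the sum of one term per data point, so it adds the same total to every segmentation of the same data. Dropping it therefore does not change which segmentation has the lowest total cost.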

This cost/metric is calculated efficiently using precomputed prefix sums of successes (k) and trials (n), allowing O(1) calculation per segment after an O(M) precomputation step during Fit(double[,]), where M is the number of data points.
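
The prefix-sum scheme described above can be sketched as follows. This is a stand-alone illustration under stated assumptions, not SignalSharp's implementation: the class BinomialPrefixSumSketch and its methods are hypothetical, and XLogX here merely mirrors the documented convention 0 * log(0) = 0.

```csharp
using System;

// Hypothetical stand-alone sketch; not the SignalSharp implementation.
public static class BinomialPrefixSumSketch
{
    // x * log(x) with the convention 0 * log(0) = 0.
    public static double XLogX(double x) => x > 0.0 ? x * Math.Log(x) : 0.0;

    // O(M) precomputation: prefix[j] - prefix[i] is the sum over [i, j).
    public static (double[] K, double[] N) BuildPrefixSums(double[,] signalMatrix)
    {
        int m = signalMatrix.GetLength(1);
        var prefixK = new double[m + 1];
        var prefixN = new double[m + 1];
        for (int i = 0; i < m; i++)
        {
            prefixK[i + 1] = prefixK[i] + signalMatrix[0, i]; // successes
            prefixN[i + 1] = prefixN[i] + signalMatrix[1, i]; // trials
        }

        return (prefixK, prefixN);
    }

    // O(1) metric for the segment [start, end).
    public static double Metric(double[] prefixK, double[] prefixN, int start, int end)
    {
        double k = prefixK[end] - prefixK[start]; // K = total successes
        double n = prefixN[end] - prefixN[start]; // N = total trials
        return -(XLogX(k) + XLogX(n - k) - XLogX(n));
    }
}
```

Each prefix array has M + 1 entries with a leading zero, so a segment sum is a single subtraction and no special case is needed for start = 0.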

Consider using the Binomial Likelihood cost function when:

  • Your data represents counts of successes out of a known number of trials at each time point.
  • You want to detect time points where the underlying success probability changes.
  • The assumption of a constant success probability within each segment is reasonable.

Constructors

BinomialLikelihoodCostFunction()

Initializes a new instance of the BinomialLikelihoodCostFunction class.

public BinomialLikelihoodCostFunction()

Properties

SupportsInformationCriteria

Indicates that this cost function provides likelihood metrics suitable for BIC/AIC.

public bool SupportsInformationCriteria { get; }

Property Value

bool

Methods

ComputeCost(int?, int?)

Computes the cost for a segment [start, end) based on the Binomial negative log-likelihood: Cost = -[ K * log(K) + (N - K) * log(N - K) - N * log(N) ], where K is the total number of successes and N is the total number of trials in the segment.

public override double ComputeCost(int? start = null, int? end = null)

Parameters

start int?

The start index of the segment (inclusive). If null, defaults to 0.

end int?

The end index of the segment (exclusive). If null, defaults to the length of the data.

Returns

double

The computed cost for the segment (0 if p_hat=0 or p_hat=1, positive otherwise).

Remarks

Calculates the cost in O(1) time using precomputed prefix sums. Correctly handles the edge cases where the estimated success probability is 0 or 1 (cost is 0). Must be called after Fit(double[,]). This returns the same value as ComputeLikelihoodMetric(int, int).

// Assuming 'binomialCost' is an instance fitted with data
double costSegment = binomialCost.ComputeCost(0, 10);

Exceptions

UninitializedDataException

Thrown when prefix sums are not initialized (Fit(double[,]) not called).

ArgumentOutOfRangeException

Thrown when the segment indices (start, end) are out of bounds.

SegmentLengthException

Thrown when the segment length (end - start) is less than 1.

ComputeLikelihoodMetric(int, int)

Computes the likelihood metric for a segment [start, end) based on the Binomial negative log-likelihood: Metric = -[ K * log(K) + (N - K) * log(N - K) - N * log(N) ], where K is the total number of successes and N is the total number of trials in the segment.

public double ComputeLikelihoodMetric(int start, int end)

Parameters

start int

The start index of the segment (inclusive).

end int

The end index of the segment (exclusive).

Returns

double

The computed likelihood metric for the segment (0 if p_hat=0 or p_hat=1, positive otherwise).

Remarks

Calculates the metric in O(1) time using precomputed prefix sums. Correctly handles the edge cases where the estimated success probability is 0 or 1 (metric is 0). Must be called after Fit(double[,]). This returns the same value as ComputeCost(int?, int?).
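
As an illustration of how the metric reacts to a change in success rate, the sketch below compares one segment covering all the data with the same data split at index 2. It uses only the members documented on this page and assumes the SignalSharp package is referenced; the data matches the first example given for Fit(double[,]).

```csharp
using System;
using SignalSharp.CostFunctions.Cost;

double[,] data =
{
    { 1.0, 2.0, 8.0, 9.0 },     // successes (k)
    { 10.0, 10.0, 10.0, 10.0 }, // trials (n)
};

var binomialCost = new BinomialLikelihoodCostFunction();
binomialCost.Fit(data);

// One segment over everything: K = 20, N = 40.
double whole = binomialCost.ComputeLikelihoodMetric(0, 4);

// Two segments split where the success rate jumps: K = 3 and K = 17, each with N = 20.
double left = binomialCost.ComputeLikelihoodMetric(0, 2);
double right = binomialCost.ComputeLikelihoodMetric(2, 4);

Console.WriteLine($"whole = {whole:F4}, split = {left + right:F4}");
```

Working the documented formula by hand with natural logarithms gives 40 * log(2), about 27.73, for the single segment and about 8.45 for each half, so roughly 16.91 for the split. The lower total is what makes index 2 a candidate change point.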

Exceptions

UninitializedDataException

Thrown when prefix sums are not initialized (Fit(double[,]) not called).

ArgumentOutOfRangeException

Thrown when the segment indices (start, end) are out of bounds.

SegmentLengthException

Thrown when the segment length (end - start) is less than 1.

Fit(double[,])

Fits the cost function to the provided binomial data by precomputing prefix sums.

public override IPELTCostFunction Fit(double[,] signalMatrix)

Parameters

signalMatrix double[,]

The data array to fit (rows=dimensions, columns=time points). Must have exactly 2 rows: row 0 for successes (k), row 1 for trials (n). Values for k and n must be non-negative integers (within tolerance), with n >= 1 and k <= n.

Returns

IPELTCostFunction

The fitted BinomialLikelihoodCostFunction instance.

Remarks

This method performs O(M) computation (where M is the number of time points) to calculate prefix sums, enabling O(1) cost/metric calculation per segment later. It must be called before cost computation. Input data must satisfy 0 <= k_i <= n_i, n_i >= 1, and be effectively integers (within tolerance).

// Data: k = [1, 2, 8, 9], n = [10, 10, 10, 10]
double[,] data = {
    { 1.0, 2.0, 8.0, 9.0 }, // Successes (k)
    { 10.0, 10.0, 10.0, 10.0 } // Trials (n)
};
var binomialCost = new BinomialLikelihoodCostFunction();
binomialCost.Fit(data);

// Data: k = [5, 8], n = [20, 15] (varying n)
double[,] dataVaryingN = {
    { 5.0, 8.0 }, // Successes (k)
    { 20.0, 15.0 } // Trials (n)
};
binomialCost.Fit(dataVaryingN);

Exceptions

ArgumentNullException

Thrown if signalMatrix is null.

ArgumentException

Thrown if signalMatrix does not have exactly 2 rows, or if data is invalid (k < 0, n < 1, k > n, not effectively integers, NaN, or Infinity).

Fit(double[])

Fitting with a 1D signal is not supported for the general Binomial likelihood cost function, as it requires both successes (k) and trials (n) information per point.

public IPELTCostFunction Fit(double[] signal)

Parameters

signal double[]

The one-dimensional time series data.

Returns

IPELTCostFunction

This overload never returns a value; it always throws NotSupportedException.

Exceptions

NotSupportedException

Always thrown, explaining the need for 2D input with specific row definitions.
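
When the data starts out as a one-dimensional array of success counts with a known, shared number of trials per point, it can be packed into the two-row layout that Fit(double[,]) expects. The helper below is a hypothetical sketch, not part of SignalSharp; the names BinomialMatrixSketch and ToSignalMatrix are illustrative only.

```csharp
using System;

// Hypothetical helper for illustration; not part of SignalSharp.
public static class BinomialMatrixSketch
{
    // Packs per-point successes and a shared trial count into the layout
    // described for Fit(double[,]): row 0 = successes (k), row 1 = trials (n).
    public static double[,] ToSignalMatrix(double[] successes, double trialsPerPoint)
    {
        if (successes == null)
        {
            throw new ArgumentNullException(nameof(successes));
        }

        var matrix = new double[2, successes.Length];
        for (int i = 0; i < successes.Length; i++)
        {
            matrix[0, i] = successes[i];
            matrix[1, i] = trialsPerPoint;
        }

        return matrix;
    }
}
```

For a 0/1 outcome series, for example, calling ToSignalMatrix with trialsPerPoint set to 1.0 yields a matrix that meets the documented constraints (n_i = 1 and k_i equal to 0 or 1), and the result can be passed to Fit(double[,]).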

GetSegmentParameterCount(int)

Gets the number of parameters estimated for a Binomial model segment. This is 1 parameter (the success probability 'p').

public int GetSegmentParameterCount(int segmentLength)

Parameters

segmentLength int

The length of the segment (unused).

Returns

int

Number of parameters: 1.