Class BernoulliLikelihoodCostFunction
- Namespace
- SignalSharp.CostFunctions.Cost
- Assembly
- SignalSharp.dll
Represents a cost function based on the Binomial (specifically Bernoulli) negative log-likelihood for the PELT algorithm. This cost function is sensitive to changes in the probability of success in binary (0/1) data.
public class BernoulliLikelihoodCostFunction : CostFunctionBase, ILikelihoodCostFunction, IPELTCostFunction
- Inheritance
- CostFunctionBase
BernoulliLikelihoodCostFunction
- Implements
- ILikelihoodCostFunction
IPELTCostFunction
Remarks
This cost function assumes that the data within each segment represents a sequence of independent Bernoulli trials (outcomes are 0 or 1, e.g., failure/success) with a constant probability of success (p) for that segment.
It calculates the cost based on the negative log-likelihood of the segment data given its estimated success probability (maximum likelihood estimate, MLE).
The MLE for the success probability p in a segment [start, end) of length n = end - start is the sample proportion of successes:
p_hat = (Sum_{i=start}^{end-1} signal[i]) / n = S / n
where S is the number of successes (sum of 1s).
The likelihood metric used for BIC/AIC calculations, typically proportional to -2 * log-likelihood, is calculated as:
Metric(start, end) = -2 * [ S * log(S) + (n-S) * log(n-S) - n * log(n) ]
where S is the number of successes (sum of 1s), n is the segment length, and n-S is the number of failures (count of 0s).
This formula uses the convention 0 * log(0) = 0, which the implementation handles explicitly for the edge cases S=0 (all failures) and S=n (all successes). In these edge cases, the metric is 0, representing a perfect fit.
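The step from the MLE to the metric can be written out as a short derivation. This is a worked sketch for a single dimension, using the symbols S, n and p_hat defined above and writing \ell for the segment log-likelihood (a symbol introduced here for illustration):

```latex
\begin{aligned}
\ell(p) &= S \log p + (n - S) \log(1 - p) \\
\ell(\hat{p}) &= S \log\frac{S}{n} + (n - S) \log\frac{n - S}{n}
  \qquad \text{with } \hat{p} = \frac{S}{n} \\
&= S \log S + (n - S) \log(n - S) - n \log n \\
\mathrm{Metric} &= -2\,\ell(\hat{p})
  = -2 \left[ S \log S + (n - S) \log(n - S) - n \log n \right]
\end{aligned}
```

When S = 0 or S = n, the bracket reduces to n log n - n log n = 0 under the 0 * log(0) = 0 convention, which is why the metric is 0 in those cases.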
The ComputeCost(int?, int?) method returns this same metric value.
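The formula and its edge-case convention can be sketched independently of the library. The following is a minimal illustration for a single dimension, not SignalSharp's implementation: the helper names `XLogX` and `BernoulliMetric` are hypothetical, and the natural logarithm is assumed.

```csharp
using System;

// Hypothetical helper: x * ln(x) with the convention 0 * ln(0) = 0.
static double XLogX(double x) => x > 0.0 ? x * Math.Log(x) : 0.0;

// Hypothetical helper: -2 * [ S*ln(S) + (n-S)*ln(n-S) - n*ln(n) ] for one dimension.
static double BernoulliMetric(int successes, int length)
{
    if (length < 1) throw new ArgumentOutOfRangeException(nameof(length));
    if (successes < 0 || successes > length) throw new ArgumentOutOfRangeException(nameof(successes));

    // All failures or all successes: perfect fit, so the metric is exactly 0.
    if (successes == 0 || successes == length) return 0.0;

    return -2.0 * (XLogX(successes) + XLogX(length - successes) - XLogX(length));
}

Console.WriteLine(BernoulliMetric(4, 4)); // all successes
Console.WriteLine(BernoulliMetric(0, 3)); // all failures
Console.WriteLine(BernoulliMetric(4, 7)); // mixed segment, roughly 9.56
```

The early return for S=0 and S=n mirrors the explicit edge-case handling described above and avoids evaluating log(0).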
This cost/metric is calculated efficiently using precomputed prefix sums of the signal (running counts of 1s), allowing O(D) calculation per segment after an O(N*D) precomputation step during Fit(double[,]), where D is the number of dimensions and N is the number of time points.
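The prefix-sum idea can be illustrated with a short sketch. This is not the library's internal code: `BuildPrefixSums` and `CountSuccesses` are hypothetical names, and the input is assumed to be already validated 0/1 data laid out as rows = dimensions, columns = time points.

```csharp
using System;

// Hypothetical helper: prefix[d, t] = number of 1s in row d among the first t columns.
// Building the table costs O(N*D).
static int[,] BuildPrefixSums(double[,] signal)
{
    int dims = signal.GetLength(0);
    int n = signal.GetLength(1);
    var prefix = new int[dims, n + 1];
    for (int d = 0; d < dims; d++)
    {
        for (int t = 0; t < n; t++)
        {
            // Round so that values such as 0.9999999999 count as 1.
            prefix[d, t + 1] = prefix[d, t] + (int)Math.Round(signal[d, t]);
        }
    }
    return prefix;
}

// Hypothetical helper: successes in [start, end) for dimension d, in O(1).
static int CountSuccesses(int[,] prefix, int d, int start, int end) =>
    prefix[d, end] - prefix[d, start];

double[,] status = { { 1.0, 1.0, 1.0, 0.9999999999, 0.0, 0.0000000001, 0.0, 1.0, 1.0, 1.0 } };
int[,] sums = BuildPrefixSums(status);
int onesInFirstSeven = CountSuccesses(sums, 0, 0, 7);
Console.WriteLine(onesInFirstSeven); // four 1s among the first seven points
```

Each segment query is a single subtraction per dimension, which is where the O(D) per-segment cost comes from.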
Consider using the Bernoulli Likelihood cost function when:
- Your data consists of binary outcomes (0s and 1s) over time (e.g., machine status up/down, test pass/fail, presence/absence).
- You expect changes in the underlying probability of the '1' outcome.
- The data within segments can be reasonably approximated by a sequence of independent Bernoulli trials with a constant success probability.
- Your input data contains only values numerically close to 0 or 1.
Note: This implementation requires every input value to be 0 or 1, up to the default double-precision tolerance. Any other value causes Fit(double[,]) to throw an exception.
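The kind of check described in the note can be sketched as follows. This is only an illustration under assumptions: the helper `ToBinary` is hypothetical, and the tolerance of 1e-9 is a placeholder, since the library's actual default tolerance is not stated here.

```csharp
using System;

// Placeholder tolerance for illustration; the library's default may differ.
const double Tolerance = 1e-9;

// Hypothetical helper: maps a value to 0 or 1, or throws if it is close to neither.
static int ToBinary(double value, double tolerance)
{
    if (Math.Abs(value) <= tolerance) return 0;
    if (Math.Abs(value - 1.0) <= tolerance) return 1;
    throw new ArgumentException($"Value {value} is not within {tolerance} of 0 or 1.");
}

Console.WriteLine(ToBinary(0.9999999999, Tolerance)); // treated as 1
Console.WriteLine(ToBinary(0.0000000001, Tolerance)); // treated as 0

try
{
    ToBinary(0.5, Tolerance);
}
catch (ArgumentException ex)
{
    Console.WriteLine(ex.Message); // 0.5 is rejected
}
```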
Constructors
BernoulliLikelihoodCostFunction()
Initializes a new instance of the BernoulliLikelihoodCostFunction class.
public BernoulliLikelihoodCostFunction()
Properties
SupportsInformationCriteria
Indicates that this cost function provides likelihood metrics suitable for BIC/AIC.
public bool SupportsInformationCriteria { get; }
Property Value
- bool
Methods
ComputeCost(int?, int?)
Computes the cost for a segment [start, end) based on the Bernoulli negative log-likelihood.
Cost = -2 * [ S*log(S) + (n-S)*log(n-S) - n*log(n) ]
where S = successes, n = length.
public override double ComputeCost(int? start = null, int? end = null)
Parameters
start int?
The start index of the segment (inclusive). If null, defaults to 0.
end int?
The end index of the segment (exclusive). If null, defaults to the length of the data.
Returns
- double
The computed cost for the segment (0 if all successes or all failures, positive otherwise).
Remarks
Calculates the cost in O(D) time using precomputed prefix sums, where D is the number of dimensions.
Correctly handles the edge cases where the segment contains only 0s or only 1s (cost is 0).
Must be called after Fit(double[,]). This method returns the same value as ComputeLikelihoodMetric(int, int).
// Assuming the 'status' data from the Fit example
var bernoulliCost = new BernoulliLikelihoodCostFunction().Fit(status);
double costSegment1 = bernoulliCost.ComputeCost(0, 4); // Indices 0-3 are all 1s after clamping, so the cost is 0
double costSegment2 = bernoulliCost.ComputeCost(4, 7); // Indices 4-6 are all 0s after clamping, so the cost is 0
double costSegmentMix = bernoulliCost.ComputeCost(0, 7); // Mixed segment (four 1s, three 0s), so the cost is positive
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
ComputeLikelihoodMetric(int, int)
Computes the likelihood metric for a segment [start, end) based on the Bernoulli negative log-likelihood.
Metric = -2 * [ S*log(S) + (n-S)*log(n-S) - n*log(n) ]
where S = successes, n = length.
public double ComputeLikelihoodMetric(int start, int end)
Parameters
start int
The start index of the segment (inclusive).
end int
The end index of the segment (exclusive).
Returns
- double
The computed likelihood metric for the segment (0 if all successes or all failures, positive otherwise).
Remarks
Calculates the metric in O(D) time using precomputed prefix sums, where D is the number of dimensions.
Correctly handles the edge cases where the segment contains only 0s or only 1s (metric is 0).
Must be called after Fit(double[,]). This method returns the same value as ComputeCost(int?, int?).
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
Fit(double[,])
Fits the cost function to the provided binary (0/1) data by precomputing prefix sums of successes.
public override IPELTCostFunction Fit(double[,] signalMatrix)
Parameters
signalMatrix double[,]
The binary data array to fit (rows=dimensions, columns=time points). Values must be effectively 0 or 1 (within default tolerance).
Returns
- IPELTCostFunction
The fitted BernoulliLikelihoodCostFunction instance.
Remarks
This method performs O(N*D) computation to calculate prefix sums, enabling O(D) cost/metric calculation per segment later via ComputeCost(int?, int?) and ComputeLikelihoodMetric(int, int).
It must be called before cost/metric computation methods.
It validates that all input data points are close to either 0 or 1 using the default double-precision epsilon from NumericUtils.
Values are effectively clamped to 0 or 1 for the internal sum calculation.
// Example: Machine status (1=up, 0=down)
double[,] status = { { 1.0, 1.0, 1.0, 0.9999999999, 0.0, 0.0000000001, 0.0, 1.0, 1.0, 1.0 } };
var bernoulliCost = new BernoulliLikelihoodCostFunction();
bernoulliCost.Fit(status);
Exceptions
- ArgumentNullException
Thrown if signalMatrix is null.
- ArgumentException
Thrown if any data point in signalMatrix is not close to 0 or 1 within the default tolerance.
GetSegmentParameterCount(int)
Gets the number of parameters estimated for a Bernoulli model segment. This is 1 parameter (the success probability 'p') per dimension.
public int GetSegmentParameterCount(int segmentLength)
Parameters
segmentLength int
The length of the segment (unused in this implementation as parameter count is constant per dimension).
Returns
- int
The number of parameters, equal to the number of dimensions (one success probability per dimension).
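To show how a per-segment parameter count typically enters an information criterion, here is a small sketch using the textbook definitions BIC = -2 * log-likelihood + k * ln(n) and AIC = -2 * log-likelihood + 2k. The helper names `Bic` and `Aic` are hypothetical, and how SignalSharp itself combines the metric with this count is not stated here, so treat this only as an illustration of the general idea.

```csharp
using System;

// Textbook definitions, with 'metric' standing in for -2 * log-likelihood.
// Hypothetical helper names; not part of the SignalSharp API.
static double Bic(double metric, int parameterCount, int sampleCount) =>
    metric + parameterCount * Math.Log(sampleCount);

static double Aic(double metric, int parameterCount) =>
    metric + 2.0 * parameterCount;

int dimensions = 1;                         // a single binary series
int parametersPerSegment = dimensions * 1;  // one success probability per dimension
double segmentMetric = 9.56;                // illustrative metric value for a 7-point segment

Console.WriteLine(Bic(segmentMetric, parametersPerSegment, 7));
Console.WriteLine(Aic(segmentMetric, parametersPerSegment));
```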