Class PoissonLikelihoodCostFunction
- Namespace
- SignalSharp.CostFunctions.Cost
- Assembly
- SignalSharp.dll
Represents a cost function based on the Poisson negative log-likelihood for the PELT algorithm. This cost function is sensitive to changes in the rate (mean) of events in count data.
public class PoissonLikelihoodCostFunction : CostFunctionBase, ILikelihoodCostFunction, IPELTCostFunction
- Inheritance
- CostFunctionBase
- PoissonLikelihoodCostFunction
- Implements
- ILikelihoodCostFunction
- IPELTCostFunction
Remarks
This cost function assumes that the data within each segment represents counts following a Poisson distribution
independently for each dimension, with a constant rate parameter (λ) for that segment and dimension.
The cost is the negative log-likelihood of the segment data, evaluated at the maximum likelihood estimate (MLE) of the rate.
The MLE for the rate λ in a segment [start, end) of length n = end - start for a given dimension is the sample mean:
λ_hat = (Sum_{i=start}^{end-1} signal[dim, i]) / n = S / n, where S is the sum of counts for that dimension.
The likelihood metric used for BIC/AIC calculations, proportional to -2 * log-likelihood, is calculated as:
Metric(start, end) = Sum_dimensions [ 2 * ( S - S * log(S) + S * log(n) ) ]
where S = Sum_{i=start}^{end-1} signal[dim, i] is the sum of counts in the segment for that dimension, and n = end - start is the segment length.
This formula assumes the convention 0 * log(0) = 0, which is handled by setting the metric contribution to 0 when S=0 for a dimension.
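The term Sum log(signal[dim, i]!) from the full likelihood is omitted because it depends only on the data points themselves, not on λ or on where the segment boundaries fall, so it is the same constant for every candidate segmentation.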
The ComputeCost(int?, int?) method returns this same metric value.
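For reference, the metric above can be derived from the Poisson likelihood in a few steps. This is a sketch for a single dimension, writing x_i for signal[dim, i] and assuming log denotes the natural logarithm:

```latex
\begin{aligned}
-\log L(\lambda) &= \sum_{i=\mathrm{start}}^{\mathrm{end}-1} \bigl(\lambda - x_i \log\lambda + \log x_i!\bigr)
                  = n\lambda - S\log\lambda + \sum_{i} \log x_i! \\
\frac{\partial}{\partial\lambda}\bigl(-\log L\bigr) &= n - \frac{S}{\lambda} = 0
  \quad\Longrightarrow\quad \hat{\lambda} = \frac{S}{n} \\
-\log L(\hat{\lambda}) &= S - S\log\frac{S}{n} + \sum_{i} \log x_i!
                        = S - S\log S + S\log n + \sum_{i} \log x_i! \\
\mathrm{Metric} &= 2\,\bigl(S - S\log S + S\log n\bigr)
  \quad\text{(dropping the constant } \textstyle\sum_i \log x_i! \text{ and doubling)}
\end{aligned}
```

Summing the last line over dimensions gives the formula stated above.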
This cost/metric is calculated efficiently using precomputed prefix sums of the signal,
allowing O(D) calculation per segment after an O(N*D) precomputation step during Fit(double[,]),
where D is the number of dimensions and N is the number of time points.
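The prefix-sum approach described above can be sketched as follows. This is a minimal standalone illustration, not SignalSharp's actual implementation: the helper names BuildPrefixSums and PoissonMetric are hypothetical, and log is assumed to be the natural logarithm.

```csharp
using System;

// Hypothetical standalone helpers; these are NOT SignalSharp's internals.

// Builds per-dimension prefix sums: prefix[d, i] = sum of signal[d, 0..i-1].
// This is the O(N*D) precomputation step.
static double[,] BuildPrefixSums(double[,] signal)
{
    int dims = signal.GetLength(0);
    int n = signal.GetLength(1);
    var prefix = new double[dims, n + 1];
    for (int d = 0; d < dims; d++)
        for (int i = 0; i < n; i++)
            prefix[d, i + 1] = prefix[d, i] + signal[d, i];
    return prefix;
}

// Evaluates Sum_dimensions [ 2 * (S - S*log(S) + S*log(n)) ] for [start, end)
// in O(D) time, using two prefix-sum lookups per dimension.
static double PoissonMetric(double[,] prefix, int start, int end)
{
    int dims = prefix.GetLength(0);
    double n = end - start;
    double total = 0.0;
    for (int d = 0; d < dims; d++)
    {
        double s = prefix[d, end] - prefix[d, start];
        if (s <= 0.0) continue; // convention 0 * log(0) = 0
        total += 2.0 * (s - s * Math.Log(s) + s * Math.Log(n));
    }
    return total;
}

double[,] counts = { { 5, 8, 6, 7, 25, 30, 28, 10, 9, 12 } };
double[,] prefixSums = BuildPrefixSums(counts);

// Compare one segment over [0, 7) with the same range split at index 4.
double whole = PoissonMetric(prefixSums, 0, 7);
double split = PoissonMetric(prefixSums, 0, 4) + PoissonMetric(prefixSums, 4, 7);
Console.WriteLine($"one segment: {whole:F2}, split at 4: {split:F2}");
```

In this example the split total comes out lower than the single-segment value, which is the signal a change point search relies on when the rate shifts at index 4.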
Consider using the Poisson Likelihood cost function when:
- Your data represents counts of events per interval (e.g., website hits per day, defects per batch, calls per hour) for one or more dimensions.
- You expect changes in the average rate of these events.
- The data within segments can be reasonably approximated by a Poisson distribution (variance roughly equals mean).
- The input data contains non-negative values (counts cannot be negative).
Note: While the function accepts double inputs, Poisson counts are theoretically non-negative integers. This implementation requires input data to be effectively non-negative (values >= -Epsilon). Values slightly below zero but within tolerance will be clamped to zero. Significantly negative values will cause an exception during Fit(double[,]).
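The validation behaviour in the note above might look roughly like this. It is a sketch under stated assumptions: the helper ValidateCounts is hypothetical, and the tolerance of 1e-9 is a placeholder because the library's actual Epsilon value is not given here.

```csharp
using System;

// Hypothetical tolerance; the library's actual Epsilon value is not stated in this page.
const double Tolerance = 1e-9;

// Returns a copy of the input with values in [-tolerance, 0) clamped to 0.
// Throws ArgumentException for any value below -tolerance.
static double[,] ValidateCounts(double[,] signal, double tolerance)
{
    if (signal is null) throw new ArgumentNullException(nameof(signal));
    int dims = signal.GetLength(0);
    int n = signal.GetLength(1);
    var result = new double[dims, n];
    for (int d = 0; d < dims; d++)
    {
        for (int i = 0; i < n; i++)
        {
            double v = signal[d, i];
            if (v < -tolerance)
                throw new ArgumentException($"Value {v} at [{d}, {i}] is negative.", nameof(signal));
            result[d, i] = Math.Max(v, 0.0); // clamp slightly negative values to zero
        }
    }
    return result;
}

double[,] nearZero = { { 5, 8, 1e-10, 7, -1e-11 } };
double[,] cleaned = ValidateCounts(nearZero, Tolerance);
Console.WriteLine(cleaned[0, 4]); // the -1e-11 entry has been clamped to 0
```

Passing the tolerance as a parameter keeps the sketch independent of whichever constant the library uses.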
Constructors
PoissonLikelihoodCostFunction()
Initializes a new instance of the PoissonLikelihoodCostFunction class.
public PoissonLikelihoodCostFunction()
Properties
SupportsInformationCriteria
Indicates that this cost function provides likelihood metrics suitable for BIC/AIC.
public bool SupportsInformationCriteria { get; }
Property Value
- bool
Methods
ComputeCost(int?, int?)
Computes the cost for a segment [start, end) based on the Poisson negative log-likelihood.
The cost is Sum_dimensions [ 2 * ( S - S * log(S) + S * log(n) ) ], where S is the sum of counts and n is the length.
public override double ComputeCost(int? start = null, int? end = null)
Parameters
- start int?
The start index of the segment (inclusive). If null, defaults to 0.
- end int?
The end index of the segment (exclusive). If null, defaults to the length of the data.
Returns
- double
The computed cost for the segment.
Remarks
Calculates the cost in O(D) time using precomputed prefix sums, where D is the number of dimensions.
Handles the segmentSum = 0 case correctly based on the limit x*log(x) -> 0 as x -> 0, resulting in zero cost contribution for dimensions with zero total count.
Must be called after Fit(double[,]). This method returns the same value as ComputeLikelihoodMetric(int, int).
// Assuming 'counts' data from Fit example
var poissonCost = new PoissonLikelihoodCostFunction().Fit(counts);
double costSegment1 = poissonCost.ComputeCost(0, 4); // Cost for segment with lower counts
double costSegment2 = poissonCost.ComputeCost(4, 7); // Cost for the segment with higher counts
// Example with zero-sum segment
double[,] zeroCounts = { { 0, 0, 0, 5, 5 } };
var zeroCost = new PoissonLikelihoodCostFunction().Fit(zeroCounts);
double costZeroSeg = zeroCost.ComputeCost(0, 3); // Should be 0.0
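As a hand check of the first two calls, the formula can be evaluated directly for the one-dimensional counts { 5, 8, 6, 7, 25, 30, 28, 10, 9, 12 } from the Fit example. This assumes log is the natural logarithm; the values are rounded and come from the formula, not from running the library:

```latex
\begin{aligned}
[0,4):\quad & S = 5+8+6+7 = 26,\; n = 4, &
  2\,(26 - 26\ln 26 + 26\ln 4) &= 52\,(1 - \ln 6.5) \approx -45.33 \\
[4,7):\quad & S = 25+30+28 = 83,\; n = 3, &
  2\,(83 - 83\ln 83 + 83\ln 3) &= 166\,\bigl(1 - \ln\tfrac{83}{3}\bigr) \approx -385.16
\end{aligned}
```

The values are negative because the constant term Sum log(signal[dim, i]!) has been dropped, so only differences between costs are meaningful.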
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
ComputeLikelihoodMetric(int, int)
Computes the likelihood metric for a segment [start, end) based on the Poisson negative log-likelihood.
The metric is Sum_dimensions [ 2 * ( S - S * log(S) + S * log(n) ) ], where S is the sum of counts and n is the length.
public double ComputeLikelihoodMetric(int start, int end)
Parameters
- start int
The start index of the segment (inclusive).
- end int
The end index of the segment (exclusive).
Returns
- double
The computed likelihood metric for the segment.
Remarks
Calculates the metric in O(D) time using precomputed prefix sums.
Dimensions whose segment sum is 0 contribute 0 to the metric, following the convention 0 * log(0) = 0.
Must be called after Fit(double[,]). This method returns the same value as ComputeCost(int?, int?).
Exceptions
- UninitializedDataException
Thrown when prefix sums are not initialized (Fit(double[,]) not called).
- ArgumentOutOfRangeException
Thrown when the segment indices (start, end) are out of bounds.
- SegmentLengthException
Thrown when the segment length (end - start) is less than 1.
Fit(double[,])
Fits the cost function to the provided count data by precomputing prefix sums.
public override IPELTCostFunction Fit(double[,] signalMatrix)
Parameters
- signalMatrix double[,]
The count data array to fit (rows=dimensions, columns=time points). Values must be effectively non-negative (>= -Epsilon).
Returns
- IPELTCostFunction
The fitted PoissonLikelihoodCostFunction instance.
Remarks
This method performs O(N*D) computation to calculate prefix sums, enabling O(D) cost/metric calculation per segment later.
It must be called before cost/metric computation methods.
It validates that all input data points are non-negative within a small tolerance (Epsilon). Values slightly below zero but within tolerance will be clamped to zero for the sum.
// Example: Number of website hits per hour
double[,] counts = { { 5, 8, 6, 7, 25, 30, 28, 10, 9, 12 } };
var poissonCost = new PoissonLikelihoodCostFunction();
poissonCost.Fit(counts);
// Example with near-zero value
double[,] countsNearZero = { { 5, 8, 1e-10, 7, 25, 30, -1e-11, 10, 9, 12 } };
poissonCost.Fit(countsNearZero); // Should work
Exceptions
- ArgumentNullException
Thrown if signalMatrix is null.
- ArgumentException
Thrown if any data point in signalMatrix is less than -Epsilon.
GetSegmentParameterCount(int)
Gets the number of parameters estimated for a Poisson model segment. This is 1 parameter (the rate 'λ') per dimension.
public int GetSegmentParameterCount(int segmentLength)
Parameters
- segmentLength int
The length of the segment (unused).
Returns
- int
The number of parameters, equal to the number of dimensions (one rate per dimension).
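To show where the parameter count enters, here is a sketch of the textbook AIC and BIC definitions applied to a likelihood metric of the form -2 * log-likelihood. The helpers Aic and Bic are hypothetical, the input values are illustrative only, and how SignalSharp itself combines these quantities (for example, whether change point locations are also counted as parameters) is not specified in this page.

```csharp
using System;

// Standard textbook definitions; how SignalSharp combines these values is an assumption.
// metric:         sum of per-segment likelihood metrics (proportional to -2 * log-likelihood).
// parameterCount: total number of estimated parameters across all segments.
// sampleCount:    total number of time points N.
static double Aic(double metric, int parameterCount) => metric + 2.0 * parameterCount;

static double Bic(double metric, int parameterCount, int sampleCount) =>
    metric + parameterCount * Math.Log(sampleCount);

// Illustrative numbers only: two segments of a one-dimensional signal,
// so one rate per segment gives 2 parameters in total.
double totalMetric = 120.0;
int totalParameters = 2;
int timePoints = 100;

Console.WriteLine($"AIC = {Aic(totalMetric, totalParameters):F2}");
Console.WriteLine($"BIC = {Bic(totalMetric, totalParameters, timePoints):F2}");
```

Both criteria penalise segmentations with more segments, since each extra segment adds one rate parameter per dimension.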