Class PoissonLikelihoodCostFunction

Namespace
SignalSharp.CostFunctions.Cost
Assembly
SignalSharp.dll

Represents a cost function based on the Poisson negative log-likelihood for the PELT algorithm. This cost function is sensitive to changes in the rate (mean) of events in count data.

public class PoissonLikelihoodCostFunction : CostFunctionBase, ILikelihoodCostFunction, IPELTCostFunction
Inheritance
object
CostFunctionBase
PoissonLikelihoodCostFunction
Implements
ILikelihoodCostFunction
IPELTCostFunction

Remarks

This cost function assumes that the data within each segment are counts drawn from a Poisson distribution, independently for each dimension, with a constant rate parameter (λ) for that segment and dimension. The cost is the negative log-likelihood of the segment data evaluated at the estimated rate, which is the maximum likelihood estimate (MLE).

The MLE for the rate λ in a segment [start, end) of length n = end - start for a given dimension is the sample mean: λ_hat = (Sum_{i=start}^{end-1} signal[dim, i]) / n = S / n, where S is the sum of counts for that dimension.

The likelihood metric used for BIC/AIC calculations is -2 * log-likelihood evaluated at λ_hat, without the factorial term. It is calculated as:

Metric(start, end) = Sum_dimensions [ 2 * ( S - S * log(S) + S * log(n) ) ]

where S = Sum_{i=start}^{end-1} signal[dim, i] is the sum of counts in the segment for that dimension, and n = end - start is the segment length. The formula uses the convention 0 * log(0) = 0: a dimension with S = 0 contributes 0 to the metric. The term Sum log(signal[dim, i]!) from the full likelihood is omitted because it depends only on the data points themselves, so its total over all segments is the same for every segmentation and does not affect which change points are chosen. The ComputeCost(int?, int?) method returns this same metric value.
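For reference, the metric follows from the Poisson log-likelihood in a few steps. The sketch below covers a single dimension, writes x_i for signal[dim, i], and assumes log is the natural logarithm; the symbols L and x_i are introduced here for illustration only.

```latex
\begin{aligned}
\log L(\lambda) &= \sum_{i=\mathrm{start}}^{\mathrm{end}-1} \bigl( x_i \log\lambda - \lambda - \log x_i! \bigr)
                 = S \log\lambda - n\lambda - \sum_{i} \log x_i! \\
\hat{\lambda} &= \frac{S}{n} \\
-2 \log L(\hat{\lambda}) &= 2\,\bigl( S - S\log S + S\log n \bigr) + 2 \sum_{i} \log x_i!
\end{aligned}
```

Dropping the last sum gives the per-dimension contribution 2 * ( S - S * log(S) + S * log(n) ).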

This cost/metric is calculated efficiently using precomputed prefix sums of the signal, allowing O(D) calculation per segment after an O(N*D) precomputation step during Fit(double[,]), where D is the number of dimensions and N is the number of time points.
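The prefix-sum idea can be sketched in a few lines. This is a stand-alone illustration written for this page, not the SignalSharp source: the class name PoissonPrefixSumSketch and both method names are hypothetical, log is assumed to be the natural logarithm, and index validation is left out for brevity.

```csharp
using System;

// Hypothetical stand-alone sketch; not the SignalSharp implementation.
public static class PoissonPrefixSumSketch
{
    // O(N*D) precomputation: prefix[d, i] is the sum of signal[d, 0..i-1],
    // so prefix[d, 0] = 0 and the sum over [start, end) is a single subtraction.
    public static double[,] BuildPrefixSums(double[,] signal)
    {
        int dims = signal.GetLength(0);
        int n = signal.GetLength(1);
        var prefix = new double[dims, n + 1];
        for (int d = 0; d < dims; d++)
            for (int i = 0; i < n; i++)
                prefix[d, i + 1] = prefix[d, i] + signal[d, i];
        return prefix;
    }

    // O(D) per segment: sum over dimensions of 2 * (S - S*log(S) + S*log(n)).
    public static double Metric(double[,] prefix, int start, int end)
    {
        int dims = prefix.GetLength(0);
        int length = end - start;
        double total = 0.0;
        for (int d = 0; d < dims; d++)
        {
            double s = prefix[d, end] - prefix[d, start];
            if (s <= 0.0) continue; // convention 0 * log(0) = 0
            total += 2.0 * (s - s * Math.Log(s) + s * Math.Log(length));
        }
        return total;
    }
}
```

Storing N + 1 entries per dimension avoids a special case for segments that start at index 0.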

Consider using the Poisson Likelihood cost function when:

  • Your data represents counts of events per interval (e.g., website hits per day, defects per batch, calls per hour) for one or more dimensions.
  • You expect changes in the average rate of these events.
  • The data within segments can be reasonably approximated by a Poisson distribution (variance roughly equals mean).
  • The input data contains non-negative values (counts cannot be negative).

Note: While the function accepts double inputs, Poisson counts are theoretically non-negative integers. This implementation requires input data to be effectively non-negative (values >= -Epsilon). Values slightly below zero but within tolerance will be clamped to zero. Significantly negative values will cause an exception during Fit(double[,]).
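The tolerance rule in the note can be illustrated as follows. This is a minimal sketch under stated assumptions, not the library's validation code: the class and method names are hypothetical, and the tolerance is passed in as a parameter because the actual value of Epsilon is not given on this page.

```csharp
using System;

// Hypothetical stand-alone sketch; not the SignalSharp implementation.
public static class NonNegativeInputSketch
{
    // 'tolerance' plays the role of Epsilon; its real value is an assumption of the caller.
    public static double ClampOrThrow(double value, double tolerance)
    {
        // Significantly negative values are rejected.
        if (value < -tolerance)
            throw new ArgumentException($"Value {value} is below -{tolerance}.", nameof(value));

        // Values in [-tolerance, 0) are treated as zero counts.
        return value < 0.0 ? 0.0 : value;
    }
}
```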

Constructors

PoissonLikelihoodCostFunction()

Initializes a new instance of the PoissonLikelihoodCostFunction class.

public PoissonLikelihoodCostFunction()

Properties

SupportsInformationCriteria

Indicates that this cost function provides likelihood metrics suitable for BIC/AIC.

public bool SupportsInformationCriteria { get; }

Property Value

bool

Methods

ComputeCost(int?, int?)

Computes the cost for a segment [start, end) based on the Poisson negative log-likelihood. The cost is Sum_dimensions [ 2 * ( S - S * log(S) + S * log(n) ) ], where S is the sum of counts and n is the length.

public override double ComputeCost(int? start = null, int? end = null)

Parameters

start int?

The start index of the segment (inclusive). If null, defaults to 0.

end int?

The end index of the segment (exclusive). If null, defaults to the length of the data.

Returns

double

The computed cost for the segment.

Remarks

Calculates the cost in O(D) time using precomputed prefix sums, where D is the number of dimensions. Handles the segmentSum = 0 case correctly based on the limit x*log(x) -> 0 as x -> 0, resulting in zero cost contribution for dimensions with zero total count. Must be called after Fit(double[,]). This method returns the same value as ComputeLikelihoodMetric(int, int).

// Assuming 'counts' data from Fit example
var poissonCost = new PoissonLikelihoodCostFunction().Fit(counts);
double costSegment1 = poissonCost.ComputeCost(0, 4); // Cost for the segment with lower counts (5, 8, 6, 7)
double costSegment2 = poissonCost.ComputeCost(4, 7); // Cost for the segment with higher counts (25, 30, 28)

// Example with zero-sum segment
double[,] zeroCounts = { { 0, 0, 0, 5, 5 } };
var zeroCost = new PoissonLikelihoodCostFunction().Fit(zeroCounts);
double costZeroSeg = zeroCost.ComputeCost(0, 3); // Should be 0.0
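As a hand check of the documented formula, the two segments of the 'counts' array from the Fit example work out as follows. This assumes log is the natural logarithm and applies the formula directly; the figures are rounded and are not output captured from the library.

```latex
\begin{aligned}
[0,4):\quad & S = 5+8+6+7 = 26,\; n = 4, & 2\,(26 - 26\ln 26 + 26\ln 4) &= 52\,(1 - \ln 6.5) \approx -45.33 \\
[4,7):\quad & S = 25+30+28 = 83,\; n = 3, & 2\,(83 - 83\ln 83 + 83\ln 3) &= 166\,\bigl(1 - \ln(83/3)\bigr) \approx -385.16
\end{aligned}
```

Negative values are possible because the factorial term of the full likelihood is omitted; only differences between candidate segmentations matter.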

Exceptions

UninitializedDataException

Thrown when prefix sums are not initialized (Fit(double[,]) not called).

ArgumentOutOfRangeException

Thrown when the segment indices (start, end) are out of bounds.

SegmentLengthException

Thrown when the segment length (end - start) is less than 1.

ComputeLikelihoodMetric(int, int)

Computes the likelihood metric for a segment [start, end) based on the Poisson negative log-likelihood. The metric is Sum_dimensions [ 2 * ( S - S * log(S) + S * log(n) ) ], where S is the sum of counts and n is the length.

public double ComputeLikelihoodMetric(int start, int end)

Parameters

start int

The start index of the segment (inclusive).

end int

The end index of the segment (exclusive).

Returns

double

The computed likelihood metric for the segment.

Remarks

Calculates the metric in O(D) time using precomputed prefix sums, where D is the number of dimensions. A dimension whose segment sum is 0 contributes 0 to the metric, following the convention 0 * log(0) = 0. Must be called after Fit(double[,]). This method returns the same value as ComputeCost(int?, int?).

Exceptions

UninitializedDataException

Thrown when prefix sums are not initialized (Fit(double[,]) not called).

ArgumentOutOfRangeException

Thrown when the segment indices (start, end) are out of bounds.

SegmentLengthException

Thrown when the segment length (end - start) is less than 1.

Fit(double[,])

Fits the cost function to the provided count data by precomputing prefix sums.

public override IPELTCostFunction Fit(double[,] signalMatrix)

Parameters

signalMatrix double[,]

The count data array to fit (rows=dimensions, columns=time points). Values must be effectively non-negative (>= -Epsilon).

Returns

IPELTCostFunction

The fitted PoissonLikelihoodCostFunction instance.

Remarks

This method performs O(N*D) computation to calculate prefix sums, enabling O(D) cost/metric calculation per segment later. It must be called before cost/metric computation methods. It validates that all input data points are non-negative within a small tolerance (Epsilon). Values slightly below zero but within tolerance will be clamped to zero for the sum.

// Example: Number of website hits per hour
double[,] counts = { { 5, 8, 6, 7, 25, 30, 28, 10, 9, 12 } };
var poissonCost = new PoissonLikelihoodCostFunction();
poissonCost.Fit(counts);

// Example with near-zero value
double[,] countsNearZero = { { 5, 8, 1e-10, 7, 25, 30, -1e-11, 10, 9, 12 } };
poissonCost.Fit(countsNearZero); // Expected to succeed: -1e-11 is within tolerance and is clamped to 0

Exceptions

ArgumentNullException

Thrown if signalMatrix is null.

ArgumentException

Thrown if any data point in signalMatrix is less than -Epsilon.

GetSegmentParameterCount(int)

Gets the number of parameters estimated for a Poisson model segment. This is 1 parameter (the rate 'λ') per dimension.

public int GetSegmentParameterCount(int segmentLength)

Parameters

segmentLength int

The length of the segment (unused).

Returns

int

Number of parameters: Number of dimensions * 1.
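To show where a per-segment parameter count typically enters, here is a sketch of the textbook BIC, k * log(N) - 2 * log-likelihood, applied to a set of segments. It is an illustration under assumptions, not SignalSharp's selection logic: the class and method names are hypothetical, and this page does not say whether the library also counts change point locations as parameters.

```csharp
using System;

// Hypothetical stand-alone sketch; not the SignalSharp implementation.
public static class BicSketch
{
    // metrics: one likelihood metric per segment (each -2 * log-likelihood up to the omitted term).
    // parametersPerSegment: for example, the value returned by GetSegmentParameterCount.
    // sampleCount: total number of time points N.
    public static double Bic(double[] metrics, int parametersPerSegment, int sampleCount)
    {
        double fit = 0.0;
        foreach (double m in metrics)
            fit += m;

        // Total parameter count k: one set of rate parameters per segment.
        int k = metrics.Length * parametersPerSegment;
        return fit + k * Math.Log(sampleCount);
    }
}
```

Lower values indicate a better trade-off between fit and the number of segments under this convention.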