Authors: Hamish McPhee, Jean-Yves Tourneret
Submitted to: Signal Processing (under review)
Abstract: Heavy-tailed noise models describe data contaminated with outlying values. If the assumed model does not account for the presence of outliers, the estimation performance for important parameters such as the mean and variance deteriorates. In this work, a misspecified Cramér-Rao bound is derived to show the reduced estimation performance when a Gaussian distribution is assumed although some portion of the data is generated by a Gaussian with inflated variance. This mixture of Gaussians is one heavy-tailed model; it is also compared with the Student's t-distribution, since one heavy-tailed assumption may be preferred over the other. The misspecified Cramér-Rao bound for joint estimation of the location and scale parameters of the Student's t-distribution is also derived. Analysis of the corresponding maximum likelihood estimators, and of an approximation of those estimators obtained with the Expectation-Maximization algorithm, reveals the estimation performance under misspecification when the contaminated data is not perfectly modeled by the chosen heavy-tailed distribution. Each assumption is tested on realistic data with labeled outliers to identify which of the two, a mixture of Gaussians or a Student's t-distribution, is more advantageous when the true distribution of the measurements is not necessarily a specific heavy-tailed model.
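The setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-component Gaussian mixture in which a fraction `eps` of samples has its standard deviation inflated by a factor `inflation`, and it fits a Student's t location and scale with the standard EM weight updates while keeping the degrees of freedom `nu` fixed. All function names and numerical values (`eps=0.1`, `inflation=10`, `nu=3`) are hypothetical choices made for illustration only.

```python
import numpy as np


def sample_contaminated_gaussian(n, mu, sigma, eps, inflation, rng):
    """Draw n samples from N(mu, sigma^2) with probability 1 - eps,
    and from N(mu, (inflation * sigma)^2) with probability eps."""
    is_outlier = rng.random(n) < eps
    scale = np.where(is_outlier, inflation * sigma, sigma)
    return mu + scale * rng.standard_normal(n), is_outlier


def gaussian_mle(x):
    """Gaussian maximum likelihood estimates: sample mean and (biased) variance."""
    return x.mean(), x.var()


def student_t_em(x, nu, n_iter=200, tol=1e-10):
    """EM estimates of the Student's t location and squared scale.

    Assumption of this sketch: the degrees of freedom nu are known and fixed.
    """
    mu = np.median(x)      # robust starting point for the location
    sigma2 = x.var()       # crude starting point for the squared scale
    for _ in range(n_iter):
        # E-step: expected latent precision weights; outliers get small weights.
        w = (nu + 1.0) / (nu + (x - mu) ** 2 / sigma2)
        # M-step: weighted mean, then weighted mean squared deviation.
        mu_new = np.sum(w * x) / np.sum(w)
        sigma2_new = np.mean(w * (x - mu_new) ** 2)
        converged = abs(mu_new - mu) < tol and abs(sigma2_new - sigma2) < tol
        mu, sigma2 = mu_new, sigma2_new
        if converged:
            break
    return mu, sigma2


rng = np.random.default_rng(0)
x, is_outlier = sample_contaminated_gaussian(
    n=5000, mu=1.0, sigma=1.0, eps=0.1, inflation=10.0, rng=rng
)

mu_g, var_g = gaussian_mle(x)                 # Gaussian assumption (misspecified)
mu_t, scale2_t = student_t_em(x, nu=3.0)      # Student's t assumption (also misspecified)

print(f"Gaussian MLE:   mean = {mu_g:.3f}, variance     = {var_g:.3f}")
print(f"Student's t EM: loc  = {mu_t:.3f}, scale^2      = {scale2_t:.3f}")
```

With these illustrative settings, the Gaussian variance estimate is pulled toward the mixture variance (1 - eps) * sigma^2 + eps * (inflation * sigma)^2 rather than the nominal sigma^2, while the Student's t weights reduce the influence of the inflated-variance samples. Note that the Student's t squared scale is not the variance of the data, so the two numbers are not directly comparable without a conversion.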