The asymptotic distribution of the likelihood ratio test statistic in two-peak discovery experiments
Abstract
Likelihood ratio tests are widely used in high-energy physics, where the test statistic is usually assumed to follow a chi-squared distribution with a number of degrees of freedom specified by Wilks' theorem. This assumption breaks down when parameters such as signal or coupling strengths are restricted to be non-negative and their values under the null hypothesis lie on the boundary of the parameter space. Based on a recent clarification concerning the correct asymptotic distribution of the likelihood ratio test statistic for cases where two of the parameters are on the boundary, we revisit the question of significance estimation for two-peak signal-plus-background counting experiments. In the high-energy physics literature, such experiments are commonly analyzed using Wilks' chi-squared distribution or the one-parameter Chernoff limit. We demonstrate that these approaches can lead to strongly miscalibrated significances, and that the test statistic distribution is instead well described by a chi-squared mixture with weights determined by the Fisher information matrix. Our results highlight the need for boundary-aware asymptotics in the analysis of two-peak counting experiments.
Summary
This paper addresses the problem of accurately estimating the significance of signals in high-energy physics experiments, specifically focusing on "two-peak discovery experiments" where the goal is to identify two distinct signal peaks above a background. The authors demonstrate that the commonly used assumption of a chi-squared distribution for the likelihood ratio test (LRT) statistic, based on Wilks' theorem, can lead to significant miscalibration when signal strengths are constrained to be non-negative and lie on the boundary of the parameter space under the null hypothesis (no signal). They show that the correct asymptotic distribution for the LRT statistic in such cases is a mixture of chi-squared distributions with weights determined by the Fisher information matrix. They also highlight the importance of using a proper parametrization based on a Poisson point process (PPP) that models the experiment as observations over exposure intervals, rather than treating individual events as independent units. The authors employ Monte Carlo simulations of a two-peak signal-plus-background experiment to compare the empirical distribution of the LRT statistic with various theoretical distributions, including the standard chi-squared, the Chernoff limit (for one parameter on the boundary), and the chi-squared mixture they advocate. They analyze scenarios with and without nuisance parameters (background shape parameters) and demonstrate the impact of profiling these nuisance parameters on the Fisher information matrix and the resulting chi-squared mixture weights. The key finding is that using the incorrect asymptotic reference distribution can lead to either overly optimistic (false discoveries) or overly conservative (missed discoveries) significance estimates. 
The paper contributes by providing a clear explanation of the correct asymptotic theory for two-peak discovery experiments and highlighting the importance of boundary-aware asymptotics and proper parametrization in statistical inference.
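The comparison between an empirical distribution and the chi-squared mixture can be illustrated with a toy Monte Carlo of the limiting problem alone. The sketch below is not the authors' simulation of a two-peak counting experiment. It assumes a hypothetical 2x2 Fisher information matrix `info`, draws the score vector S ~ N(0, info), and maximizes the quadratic approximation of the log-likelihood ratio over the quadrant of non-negative signal strengths. It also assumes the convention that ρ is the normalized off-diagonal element of the Fisher information matrix, for which the standard cone-projection argument gives a weight of arccos(ρ)/(2π) on the two-degree-of-freedom component. The paper's own convention for ρ should be checked before reuse.

```python
import math
import random


def sample_score(info, rng):
    """Draw S ~ N(0, info) for a 2x2 matrix via its Cholesky factor."""
    a, c, b = info[0][0], info[0][1], info[1][1]
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    s1 = math.sqrt(a) * g1
    s2 = (c / math.sqrt(a)) * g1 + math.sqrt(b - c * c / a) * g2
    return s1, s2


def limiting_statistic(info, s1, s2):
    """Maximize 2*theta.S - theta' I theta over theta >= 0.

    The maximum is attained on one of four active sets: both parameters
    free, only the first free, only the second free, or both at zero.
    """
    a, c, b = info[0][0], info[0][1], info[1][1]
    det = a * b - c * c
    z1 = (b * s1 - c * s2) / det  # unconstrained solution I^{-1} S
    z2 = (a * s2 - c * s1) / det
    best = 0.0  # both parameters at the boundary
    if z1 >= 0.0 and z2 >= 0.0:
        best = max(best, s1 * z1 + s2 * z2)
    if s1 > 0.0:
        best = max(best, s1 * s1 / a)
    if s2 > 0.0:
        best = max(best, s2 * s2 / b)
    return best


def mixture_sf(t, rho):
    """Tail probability of the chi-squared mixture for t > 0."""
    w2 = math.acos(rho) / (2.0 * math.pi)
    return 0.5 * math.erfc(math.sqrt(t / 2.0)) + w2 * math.exp(-t / 2.0)


def simulate(info, n_draws=20000, seed=1):
    rng = random.Random(seed)
    return [limiting_statistic(info, *sample_score(info, rng))
            for _ in range(n_draws)]


# Hypothetical information matrix with rho = 0.5 (illustration only).
info = [[1.0, 0.5], [0.5, 1.0]]
rho = info[0][1] / math.sqrt(info[0][0] * info[1][1])
stats = simulate(info)
frac_zero = sum(s == 0.0 for s in stats) / len(stats)
frac_tail = sum(s > 4.0 for s in stats) / len(stats)
print("P(T = 0): simulated", frac_zero,
      "mixture", 0.5 - math.acos(rho) / (2.0 * math.pi))
print("P(T > 4): simulated", frac_tail, "mixture", mixture_sf(4.0, rho))
```

Checking the four active sets explicitly keeps the simulation independent of the weight formula, so agreement between the two printed pairs is a genuine cross-check rather than a restatement.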
Key Insights
- Wilks' theorem, commonly used in high-energy physics for approximating the distribution of the likelihood ratio test statistic, breaks down when parameters are constrained to be non-negative and lie on the boundary of the parameter space under the null hypothesis.
- The correct asymptotic distribution for the LRT statistic in two-peak discovery experiments is a mixture of chi-squared distributions (χ²₀, χ²₁, χ²₂) with weights determined by the Fisher information matrix.
- A statistically consistent formulation is obtained by treating the exposure interval as the observational unit and modelling the experiment as a Poisson point process (PPP). Treating single events as the observational unit leads to a vanishing Fisher information in the large-sample limit, violating Bartlett's identities.
- Profiling nuisance parameters (e.g., background shape) significantly impacts the Fisher information matrix and, consequently, the weights of the chi-squared mixture distribution. The effective Fisher information matrix, obtained via the Schur complement, must be used to account for the effect of profiling.
- The correlation between the signal strengths of the two peaks (ρ) influences the form of the asymptotic distribution. When both signal peaks are parameters of interest, the analysis requires a chi-squared mixture with weights determined by the profiled Fisher information.
- The sign of the correlation between the parameter of interest and the nuisance parameter determines the mixture. For negative correlation, the asymptotic distribution of λ_LR can no longer be expressed as a chi-squared mixture.
- Using the incorrect asymptotic reference distribution can lead to either overly optimistic or overly conservative significance estimates, potentially resulting in false discoveries or missed discoveries.
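The role of the Schur complement and of the mixture weights can be made concrete with a minimal sketch. It assumes that the first two parameters are the signal strengths, that all remaining parameters are nuisance parameters in the interior of the parameter space, and that ρ denotes the normalized off-diagonal element of the profiled Fisher information matrix. Under that convention the cone-projection argument gives weights (1/2 - arccos(ρ)/(2π), 1/2, arccos(ρ)/(2π)) for (χ²₀, χ²₁, χ²₂). The function names and the example matrix are hypothetical, and the sign convention for ρ should be checked against the paper.

```python
import math


def profiled_information(info, n_interest=2):
    """Schur complement of the nuisance block of a Fisher information matrix.

    Nuisance parameters are eliminated one at a time, starting from the last
    row and column. Successive one-step Schur complements are equivalent to
    I_pp - I_pn * inv(I_nn) * I_np for the full nuisance block.
    """
    m = [row[:] for row in info]
    while len(m) > n_interest:
        k = len(m) - 1
        pivot = m[k][k]
        m = [[m[i][j] - m[i][k] * m[k][j] / pivot for j in range(k)]
             for i in range(k)]
    return m


def mixture_weights(info2):
    """Weights of (chi2_0, chi2_1, chi2_2) for two parameters on the boundary."""
    rho = info2[0][1] / math.sqrt(info2[0][0] * info2[1][1])
    w2 = math.acos(rho) / (2.0 * math.pi)
    return 0.5 - w2, 0.5, w2


# Hypothetical 3x3 information matrix: two signal strengths, one nuisance.
full_info = [[4.0, 1.0, 2.0],
             [1.0, 3.0, 1.0],
             [2.0, 1.0, 2.0]]

unprofiled = [row[:2] for row in full_info[:2]]  # ignores the nuisance
profiled = profiled_information(full_info)

print("weights ignoring the nuisance parameter:", mixture_weights(unprofiled))
print("weights after profiling:", mixture_weights(profiled))
```

In this example the signal block alone has a positive off-diagonal element, while the profiled matrix is diagonal, so the two sets of weights differ. Using the unprofiled block would therefore give a different reference distribution than the one appropriate after profiling.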
Practical Implications
- Researchers analyzing two-peak signal-plus-background experiments in high-energy physics and related fields (e.g., gamma-ray astronomy, collider spectroscopy) should use the chi-squared mixture distribution with weights determined by the (profiled) Fisher information matrix to accurately estimate the significance of their findings.
- Practitioners should adopt the exposure-based parametrization (PPP) to ensure the validity of asymptotic theory and avoid issues with vanishing Fisher information in the large-sample limit.
- When performing per-peak tests, where one peak is treated as a nuisance parameter, it is crucial to account for the fact that the nuisance parameter is still constrained to be non-negative and lies on the boundary of the parameter space. The appropriate reference distribution depends on the sign of the correlation between the parameter of interest and the nuisance parameter.
- In situations where the correlation between the boundary parameters is negative, the analysis requires a chi-squared mixture with weights determined by the profiled Fisher information.
- Future research could explore methods for approximating the asymptotic distribution of the LRT statistic in more complex scenarios with multiple parameters on the boundary and/or non-standard correlation structures.
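For the joint test in which both signal strengths are parameters of interest, the effect of the reference distribution on a quoted significance can be sketched as follows. The observed statistic `t_obs` and the value of `rho` are hypothetical, and ρ is assumed to be the normalized off-diagonal element of the profiled Fisher information matrix, so that the weight on χ²₂ is arccos(ρ)/(2π). The sketch does not cover the per-peak test with a boundary nuisance parameter.

```python
import math
from statistics import NormalDist


def chi2_sf(t, dof):
    """Chi-squared tail probability for 1 or 2 degrees of freedom."""
    if dof == 1:
        return math.erfc(math.sqrt(t / 2.0))
    if dof == 2:
        return math.exp(-t / 2.0)
    raise ValueError("only dof = 1 or 2 are supported in this sketch")


def p_wilks(t):
    """Wilks reference: chi-squared with two degrees of freedom."""
    return chi2_sf(t, 2)


def p_chernoff(t):
    """One-parameter Chernoff limit: equal mixture of chi2_0 and chi2_1."""
    return 0.5 * chi2_sf(t, 1)


def p_mixture(t, rho):
    """Two-parameter boundary mixture of chi2_0, chi2_1 and chi2_2."""
    w2 = math.acos(rho) / (2.0 * math.pi)
    return 0.5 * chi2_sf(t, 1) + w2 * chi2_sf(t, 2)


def z_score(p):
    """One-sided Gaussian significance; p must lie strictly between 0 and 1."""
    return -NormalDist().inv_cdf(p)


t_obs = 25.0  # hypothetical observed value of the test statistic
rho = 0.0     # hypothetical correlation from the profiled information

for name, p in [("Wilks chi2_2", p_wilks(t_obs)),
                ("Chernoff", p_chernoff(t_obs)),
                ("mixture", p_mixture(t_obs, rho))]:
    print(f"{name:13s} p = {p:.3e}  Z = {z_score(p):.2f}")
```

Within this sketch the mixture p-value always lies between the Chernoff and Wilks values for positive `t_obs`, so relative to the mixture the one-parameter Chernoff limit overstates the significance and the two-degree-of-freedom chi-squared understates it.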