Applying Feldman-Cousins Unified Approach to Multivariate Neutrino Oscillation Model

Multivariate F-C Formalism

The notation here partially follows that used in the Feldman-Cousins “unified” confidence interval construction (F-C/UA). A model $N_{pred}(\vec{p})$ is defined which predicts the expected value of a dataset at a given point $\vec{p}$ in the model parameter space. The prediction takes the same form as a single measured and binned dataset, given in equation eqn:nmeas, where $n_i$ counts the number of events in bin $i$.

\begin{equation} N_{meas} ≡ \{n_i\},\ i ∈ \{0, \ldots, n_{bins}-1\} \end{equation}

The “unified approach” of Feldman-Cousins (FC) for constructing confidence regions defines an ordering principle based on the likelihood ratio in eqn:rk.

\begin{equation} R_k(\vec{p}) = \frac{\mathcal{P}(\ N_k\ |\ N_{pred}(\vec{p})\ )}{\mathcal{P}(\ N_k\ |\ N_{pred}(\vec{q}_{best})\ )} \end{equation}

There, $\vec{q} = \vec{q}_{best}$ maximizes the likelihood $\mathcal{P}(\ N_k\ |\ N_{pred}(\vec{q})\ )$ for dataset $N_k$ to be produced given the prediction $N_{pred}(\vec{q})$. The maximizing $\vec{q}_{best}$ is found over the allowed model parameter space or, in practice, over a predefined subset of the possibly infinite parameter space. Each $N_k,\ k ∈ [1,K]$, is one of $K$ results of a “toy simulation” performed at $\vec{p}$ which produces data in the same form as $N_{meas}$; that is, each “toy” is a fluctuation applied to $N_{pred}(\vec{p})$. Note that the same $N_k$, generated at parameter point $\vec{p}$, appears in both the numerator and the denominator of $R_k$.

In the Gaussian regime, FC note that ordering by the likelihood ratio is equivalent to ordering by a $Δ χ^2$, since $-2 \ln R_k$ is approximated by a difference of $χ^2$ values as in equation eqn:rkchi.

\begin{equation} -2 \ln R_k(\vec{p}) ≈ Δ χ_k^2(\vec{p}) = χ^2(N_k, \vec{p}) - χ^2(N_k,\vec{q}_{best}) \end{equation} Each of the two terms in the difference is a $χ^2$ evaluated at a point $\vec{q}$ in parameter space, as defined in equation eqn:chi.

\begin{equation} χ^2(N_k, \vec{q}) = (N_k - N_{pred}(\vec{q}))^\intercal ⋅ Σ^{-1} ⋅ (N_k - N_{pred}(\vec{q})) \end{equation} Here, $Σ$ is a covariance matrix which may include terms for statistical and systematic uncertainty. It may be a function of the parameter space point $\vec{p}$ under consideration or of the fluctuated toy dataset $N_k$ or, in the case of $χ^2(N_k, \vec{q}_{best})$, of every point $\vec{q}$ in parameter space tested in the search for $\vec{q}_{best}$.
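The link between the likelihood ratio and the $Δ χ^2$ can be made explicit under a simplifying assumption that the text does not require: a Gaussian likelihood whose covariance $Σ$ is the same at $\vec{p}$ and at $\vec{q}_{best}$. The normalization then cancels in the ratio, leaving only the exponents.

\begin{equation} \mathcal{P}(\ N_k\ |\ N_{pred}(\vec{q})\ ) ∝ \exp\left( -\tfrac{1}{2}\, χ^2(N_k, \vec{q}) \right) \quad ⇒ \quad -2 \ln R_k(\vec{p}) = χ^2(N_k, \vec{p}) - χ^2(N_k, \vec{q}_{best}) = Δ χ_k^2(\vec{p}) \end{equation}

If $Σ$ does depend on the parameter point, the ratio also carries a term in $\ln\det Σ$ that does not cancel, and the $Δ χ^2$ is then only an approximation to $-2 \ln R_k$.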

Next, a critical $Δ χ_c^2(\vec{p})$ is calculated such that it is greater than exactly a fraction $α$ (e.g. 90%) of the entries in the set $\{Δ χ_k^2(\vec{p})\}$. A $Δ χ^2$ for the measurement is also calculated, as in equation eqn:chim.

\begin{equation} Δ χ^2_{meas}(\vec{p}) = χ^2(N_{meas}, \vec{p}) - χ^2(N_{meas}, \vec{q}_{best}) \end{equation} Again, $\vec{q}_{best}$ is found by maximizing the likelihood as was done above with the $Δ χ_k^2(\vec{p})$. Finally, the set of points $\{\vec{p}\}$ spanning the confidence region (CR) is defined as in equation eqn:cr.

\begin{equation} \{\ \vec{p}\ |\ Δ χ^2_{meas}(\vec{p}) < Δ χ^2_c(\vec{p})\ \} \end{equation}

Algorithms

This section summarizes the multivariate F-C/UA in terms of functional algorithm pseudocode. Portions of functions which must be provided by a specific application are elided. Where stated, some functions implement a particular choice among a set of valid options. Simple functions whose purpose is understood from context may be called without being explicitly defined.

The first function, \textsc{Predict} (alg:predict), represents the parameterized model of the observed data. It must transform an arbitrary point $\vec q$ in parameter space into a point in measurement space. The output is the expectation value of the measurement given the parameter point and does not include any random fluctuations.
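The actual model is application specific and is not given in the text. As a purely illustrative stand-in, the sketch below assumes a two-flavor disappearance model in which each energy bin of a no-oscillation expectation is scaled by the standard survival probability. The function signature, the bin energies, the baseline and the flux values are all hypothetical.

```python
import numpy as np

def predict(q, energies_gev, baseline_km, unoscillated):
    """Expected counts per bin for q = (sin^2(2 theta), dm2 in eV^2).

    Assumed toy model, not the document's: two-flavor survival
    probability applied bin by bin. No random fluctuation is applied.
    """
    sin2_2theta, dm2 = q
    # Standard two-flavor phase: 1.267 * dm2[eV^2] * L[km] / E[GeV].
    phase = 1.267 * dm2 * baseline_km / np.asarray(energies_gev, dtype=float)
    survival = 1.0 - sin2_2theta * np.sin(phase) ** 2
    return np.asarray(unoscillated, dtype=float) * survival

energies = np.array([0.5, 1.0, 2.0, 4.0])     # hypothetical bin-center energies
flux = np.array([100.0, 200.0, 150.0, 50.0])  # hypothetical no-oscillation counts
n_null = predict((0.0, 2.5e-3), energies, 295.0, flux)  # zero mixing
n_osc = predict((1.0, 2.5e-3), energies, 295.0, flux)   # maximal mixing
```

With zero mixing the prediction reduces to the unoscillated expectation, which is a convenient sanity check for any concrete model.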

\textsc{MostLikely} (alg:mostlikely) takes an expectation value $N_{pred}$ or a measurement (be it a fluctuation of the expectation $N_k$ or measured data $N_{meas}$) and finds the one point $\vec{q}_{best}$ in the parameter space which is most likely to produce it. This point is found by maximizing the likelihood or, in the case shown below, by minimizing a corresponding $χ^2$. The minimization strategy is typically chosen to be a grid search over the full parameter space. This may be augmented or replaced by more sophisticated optimization strategies. The accuracy and precision with which $\vec{q}_{best} ↔ χ^2_{min}$ is found directly affects the correctness of the final confidence regions. This function is called frequently and its performance optimization is critical.
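A grid search of this kind can be sketched as follows. The linear two-template model, the purely statistical Pearson-style $χ^2$ and the grid axes are assumptions chosen to keep the example small; they are not taken from the text.

```python
import itertools
import numpy as np

# Hypothetical model: two parameters scale two fixed templates.
TEMPLATES = np.array([[10.0, 20.0, 30.0], [30.0, 20.0, 10.0]])

def predict(q):
    """Expected counts at parameter point q (illustrative model)."""
    return np.asarray(q, dtype=float) @ TEMPLATES

def chi2(n, q):
    """Chi-square with a diagonal, statistics-only covariance (assumed)."""
    n_pred = predict(q)
    return float(np.sum((n - n_pred) ** 2 / n_pred))

def most_likely(n, axes):
    """Scan every grid point and return (q_best, chi2_min)."""
    best_q, best_chi2 = None, np.inf
    for q in itertools.product(*axes):
        value = chi2(n, q)
        if value < best_chi2:
            best_q, best_chi2 = q, value
    return np.array(best_q), best_chi2

axes = [np.linspace(0.5, 1.5, 11), np.linspace(0.5, 1.5, 11)]
n_obs = predict((1.2, 0.8))   # noise-free dataset generated at a grid point
q_best, chi2_min = most_likely(n_obs, axes)
```

The search cost grows with the product of the axis lengths, which is why the text flags this function as performance critical.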

The $χ^2$ is constructed from a covariance matrix in alg:chi2. The covariance is the sum of a systematic and a statistical term. The systematic term is defined in \textsc{SystVariance} (alg:systvariance) and is composed from a fractional systematic matrix, which is independent of the parameterized model, multiplied by an expected measurement.

The remaining algorithms are presented in more fully defined forms, though some may still allow application-specific modifications. The statistical term in the covariance is constructed in \textsc{StatVariance} (alg:statvariance). The form of this covariance matrix term follows what is described in the combined Neyman-Pearson chi-square (CNP) construction. Finally, the full covariance matrix is simply the sum of the statistical and systematic terms and its construction is shown in \textsc{Covariance} (alg:covariance).
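The three covariance functions might look like the sketch below. It assumes the systematic term is the element-wise product of the fractional matrix with the outer product of the expectation, and that the statistical term is the diagonal CNP variance $3 / (1/M_i + 2/P_i)$ for measured counts $M_i$ and predicted counts $P_i$. The fallback to $P_i/2$ for bins with zero measured counts is an assumption of this sketch, as are all function signatures.

```python
import numpy as np

def syst_variance(n_pred, frac_cov):
    """Scale a fractional covariance matrix by the expected counts."""
    n_pred = np.asarray(n_pred, dtype=float)
    return frac_cov * np.outer(n_pred, n_pred)

def stat_variance(n_meas, n_pred):
    """Diagonal CNP statistical term, 3 / (1/M + 2/P) per bin."""
    n_meas = np.asarray(n_meas, dtype=float)
    n_pred = np.asarray(n_pred, dtype=float)
    # Avoid dividing by zero; those bins are overwritten below.
    safe_meas = np.where(n_meas > 0, n_meas, 1.0)
    cnp = 3.0 / (1.0 / safe_meas + 2.0 / n_pred)
    # Assumption: bins with zero measured counts fall back to P/2.
    return np.diag(np.where(n_meas > 0, cnp, n_pred / 2.0))

def covariance(n_meas, n_pred, frac_cov):
    """Full covariance: statistical plus systematic term."""
    return stat_variance(n_meas, n_pred) + syst_variance(n_pred, frac_cov)

n_pred = np.array([100.0, 50.0])
n_meas = np.array([100.0, 0.0])
frac_cov = np.array([[0.01, 0.005], [0.005, 0.04]])  # hypothetical values
cov = covariance(n_meas, n_pred, frac_cov)
```

When the measured and predicted counts agree in a bin, the CNP variance reduces to the predicted count, as in the first bin of this example.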

To finally construct the confidence region over a parameter space of multiple dimensions we must resort to Monte Carlo integration to determine the $Δ χ^2$ distribution at any given point in that parameter space. Each call to \textsc{Fluctuate} (alg:fluctuate) produces a single measurement from a “toy” experiment. The “toy” measurement is produced by statistically and systematically fluctuating the expectation value at a given point in parameter space. The systematic uncertainty encoded by the \textsc{SystVariance} (alg:systvariance) matrix is fluctuated assuming its eigenvalues are Gaussian distributed, and this result is added to the expectation value of the measure at that point in parameter space. The sum is then fluctuated by interpreting each element of this systematically biased expectation as a Poisson mean.
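One possible reading of this procedure is sketched below. It interprets the systematic fluctuation as independent Gaussian draws along the eigenvectors of the systematic covariance, each with variance equal to the corresponding eigenvalue; that interpretation, the clipping of negative means and the example inputs are assumptions.

```python
import numpy as np

def fluctuate(n_pred, syst_cov, rng):
    """Return one toy measurement: Gaussian systematic shift, then Poisson."""
    n_pred = np.asarray(n_pred, dtype=float)
    # Eigen-decompose the symmetric systematic covariance.
    eigvals, eigvecs = np.linalg.eigh(syst_cov)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against round-off negatives
    # Independent Gaussian draw along each eigenvector, variance = eigenvalue.
    shift = eigvecs @ (np.sqrt(eigvals) * rng.standard_normal(eigvals.size))
    # A Poisson mean cannot be negative (clipping is an assumption).
    biased = np.clip(n_pred + shift, 0.0, None)
    return rng.poisson(biased)

rng = np.random.default_rng(12345)
n_pred = np.array([100.0, 50.0, 25.0])
# Hypothetical fully correlated 5% normalization uncertainty.
syst_cov = 0.05 ** 2 * np.outer(n_pred, n_pred)
toys = np.array([fluctuate(n_pred, syst_cov, rng) for _ in range(5000)])
```

The eigen-decomposition depends only on the parameter point, so in a real implementation it could be computed once per point and reused across toys.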

The \textsc{Chi2} (alg:chi2) function produces a scalar value that scores how consistent a measure $N$ is with a point in parameter space $\vec{q}$, specifically with the expectation value $N_{pred}$ at $\vec{q}$. It takes the usual form of the vector difference between the two measures contracted on the inverse of the covariance matrix. The \textsc{Chi2} function is at the center of many iterations, so optimizing its performance, and particularly that of the matrix inversion, is very important for an overall fast calculation.
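A minimal NumPy sketch of the contraction is given below. It solves a linear system rather than forming the explicit inverse, and adds a batched variant that factors the covariance once for many datasets. The batched form is only valid under the assumption that the covariance does not depend on the dataset; the function names are hypothetical.

```python
import numpy as np

def chi2(n, n_pred, cov):
    """Contract the residual on the inverse covariance via a linear solve."""
    resid = np.asarray(n, dtype=float) - np.asarray(n_pred, dtype=float)
    return float(resid @ np.linalg.solve(cov, resid))

def chi2_batch(toys, n_pred, cov):
    """Chi-square of many datasets against one prediction.

    Assumes cov is shared by all datasets, so it is factored only once.
    """
    chol = np.linalg.cholesky(cov)  # cov = chol @ chol.T
    resid = (np.asarray(toys, dtype=float) - n_pred).T  # (nbins, ntoys)
    # A triangular solver would be faster; np.linalg.solve keeps this
    # sketch NumPy-only.
    whitened = np.linalg.solve(chol, resid)
    return np.sum(whitened ** 2, axis=0)

n_pred = np.array([100.0, 50.0])
cov = np.array([[200.0, 25.0], [25.0, 125.0]])
toys = np.array([[110.0, 45.0], [100.0, 50.0]])
single = chi2(toys[0], n_pred, cov)
batch = chi2_batch(toys, n_pred, cov)
```

When the covariance depends on the toy dataset, as the statistical term may, the factorization has to be repeated per toy and the batched shortcut does not apply.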

The \textsc{DeltaChi2} (alg:deltachi2) function provides a $Δ χ^2$ value comparing two $χ^2$ values as a function of a measure $N$ and a point in parameter space $\vec{p}$. The first, $χ^2_{null}$, is between the measure and the predicted expectation of the measure $N_{pred}$ at $\vec p$. The second, $χ^2_{min}$, is between the measure and the predicted expectation of the measure at $\vec{q}_{best}$ as found by alg:mostlikely.

The \textsc{SampleDeltaChi2} (alg:sampledeltachi2) function applies the Monte Carlo integration method to estimate the $Δ χ^2$ distribution at a given point in parameter space $\vec p$. The MC evaluates the $Δ χ^2$ over $n_{toys}$ “toy” experiments, each generated with the function in alg:fluctuate.
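The sampling loop, together with the $Δ χ^2$ it evaluates, can be sketched end to end on a small example. Everything specific here is an assumption: the linear two-template model, the statistics-only $χ^2$, the coarse grid, and the use of a plain Poisson draw in place of the full statistical plus systematic fluctuation.

```python
import numpy as np

# Hypothetical model and search grid.
TEMPLATES = np.array([[10.0, 20.0, 30.0], [30.0, 20.0, 10.0]])
GRID = [np.array([a, b])
        for a in np.linspace(0.5, 1.5, 6)
        for b in np.linspace(0.5, 1.5, 6)]

def predict(q):
    """Expected counts at parameter point q (illustrative model)."""
    return q @ TEMPLATES

def chi2(n, q):
    """Statistics-only chi-square (assumed form)."""
    n_pred = predict(q)
    return float(np.sum((n - n_pred) ** 2 / n_pred))

def delta_chi2(n, p):
    """chi2 at the tested point minus the minimum chi2 over the grid."""
    chi2_null = chi2(n, p)
    chi2_min = min(chi2(n, q) for q in GRID)
    return chi2_null - chi2_min

def sample_delta_chi2(p, ntoys, rng):
    """Delta chi2 of ntoys toy datasets generated at p."""
    n_pred = predict(p)
    # Poisson-only toys; systematic fluctuation is omitted in this sketch.
    return np.array([delta_chi2(rng.poisson(n_pred), p) for _ in range(ntoys)])

rng = np.random.default_rng(2024)
p = GRID[14]
samples = sample_delta_chi2(p, 500, rng)
```

Because $\vec p$ is itself a grid point here, every sampled $Δ χ^2$ is non-negative. If $\vec p$ lies off the search grid, a finite grid can miss the true minimum and return slightly negative values.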

When one wishes to draw a single confidence region boundary, a critical $Δ χ^2_c$ for a specific confidence level $α$ can be found from a set of sampled $Δ χ^2$ values at a given point $\vec p$ in parameter space. This is shown in \textsc{CriticalDeltaChi2} (alg:criticaldeltachi2). When the $Δ χ^2$ value constructed from a measure from a real experiment for a given point $\vec p$ is less than the $Δ χ^2_c$ at that point, then that point is included in the confidence region at the confidence level $α$. This is illustrated in \textsc{ConfidenceRegion} (alg:confidenceregion). A benefit of eagerly applying a determined value of $α$ is that the array $samples$ of $Δ χ^2$ values over the toys at one point in parameter space $\vec p$ may be discarded. Once the comparison against $Δ χ_c^2$ is made, the only data retained for the point $\vec p$ is Boolean (to be in the CR or not to be in the CR).
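The two steps can be sketched as below, taking the critical value as the $α$ quantile of the sampled values. The choice of `np.quantile` with its default linear interpolation is one option among several, and the sample arrays are evenly spaced stand-in numbers rather than simulation output.

```python
import numpy as np

def critical_delta_chi2(samples, alpha):
    """Delta chi2 value that exceeds a fraction alpha of the toy samples."""
    return float(np.quantile(samples, alpha))

def in_confidence_region(dchi2_meas, samples, alpha):
    """True if the measured delta chi2 is below the critical value."""
    return bool(dchi2_meas < critical_delta_chi2(samples, alpha))

# Stand-in toy samples at two hypothetical parameter points.
base = np.arange(1.0, 101.0) / 10.0  # 0.1, 0.2, ..., 10.0
toy_samples = {(0.9, 0.9): base, (1.1, 0.9): 0.5 * base}
dchi2_meas = {(0.9, 0.9): 2.0, (1.1, 0.9): 6.0}  # stand-in measured values

crit = critical_delta_chi2(base, 0.9)
# Only one Boolean per point is kept; the samples can then be discarded.
region = {p: in_confidence_region(dchi2_meas[p], toy_samples[p], 0.9)
          for p in toy_samples}
```

The resulting map holds one Boolean per parameter point, which is the storage saving described above.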

For large $n_{toys}$ or a large parameter space, this may provide a substantial reduction in required storage. However, it also reduces the flexibility to draw confidence regions at new confidence levels determined in the future. The \textsc{ConfidenceManifold} (alg:confidencemanifold) is a variant on alg:criticaldeltachi2 and alg:confidenceregion which requires all “toy” results and produces at each point in parameter space the level of confidence that the point is consistent with the measurement. Any confidence level may then be selected, and its confidence region boundary may be drawn via interpolation on the scalar field.
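One way to realize such a scalar field is to store, at each point, the fraction of toy $Δ χ^2$ values that fall below the measured one. The sketch below assumes that definition and uses stand-in numbers; the interpolation step for drawing a smooth boundary is not shown.

```python
import numpy as np

def confidence_level(dchi2_meas, samples):
    """Fraction of toy delta chi2 values below the measured one."""
    return float(np.mean(np.asarray(samples) < dchi2_meas))

def confidence_manifold(dchi2_meas_map, samples_map):
    """Confidence value at every parameter point that has toy samples."""
    return {p: confidence_level(dchi2_meas_map[p], samples_map[p])
            for p in samples_map}

# Stand-in toy samples and measured values at two hypothetical points.
base = np.arange(1.0, 101.0) / 10.0  # 0.1, 0.2, ..., 10.0
samples_map = {(0.9, 0.9): base, (1.1, 0.9): 0.5 * base}
dchi2_meas_map = {(0.9, 0.9): 2.05, (1.1, 0.9): 6.0}

manifold = confidence_manifold(dchi2_meas_map, samples_map)
# Any confidence level can be applied afterwards without new toys.
region_90 = {p for p, c in manifold.items() if c < 0.9}
region_10 = {p for p, c in manifold.items() if c < 0.1}
```

A point belongs to the region at level $α$ when its stored value is below $α$, so tightening or loosening $α$ only requires a new threshold on the same field.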

Application

t.b.d. “old” to “new” world. application of oscillation. Initial unbinned event approach and binned approach.