Class 0x0D: Confidence regions

What is a confidence region?

Given a parameter or parameters fit to data according to a model:


Upper limit:
"The hard disk failure rate is less than 0.01/year (95%CL)."
Lower limit:
"The expected probability of failure is greater than 0.01/mission (95%CL)."
Two-sided limit:
"The allowed 1\sigma\ (3\sigma) CL is \theta_{12}=34.4\pm 1.0 \left(\begin{array}{l}+3.2\\-2.9\end{array}\right)^\circ."

Multi-dimensional confidence region:


Two-parameter confidence regions for fitted parameters of neutrino oscillations (A) and CP violation (B), taken from [KamLAND2008] and [DZero2010], respectively.

General features of confidence regions

Desirable features

Illustration of confidence region coverage

Consider an estimator \hat{\theta} for some constant \theta. Suppose our model tells us that \hat{\theta} is a Gaussian random variable with mean \theta and standard deviation \sqrt{\theta}.

What is the difference between the rms of the estimator and the region with 68% coverage?

(Draw picture.)
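
As a rough numerical companion to the picture, the sketch below builds the set of \theta values for which an observed \hat{\theta} lies within one standard deviation, |\hat{\theta}-\theta| \leq \sqrt{\theta}, and checks its coverage by Monte Carlo. The function names, the true value \theta=100, and the number of trials are illustrative assumptions, not part of the original notes.

```python
import math
import random

def interval_for_theta(theta_hat, z=1.0):
    """Set of theta with |theta_hat - theta| <= z*sqrt(theta).

    Squaring gives theta^2 - (2*theta_hat + z^2)*theta + theta_hat^2 <= 0,
    a quadratic in theta whose two roots are the interval edges.
    """
    disc = z * z * (4.0 * theta_hat + z * z)
    if disc < 0:
        raise ValueError("no theta is compatible with this theta_hat")
    centre = theta_hat + 0.5 * z * z
    half_width = 0.5 * math.sqrt(disc)
    return centre - half_width, centre + half_width

def coverage(theta_true, n_trials=20000, seed=1):
    """Fraction of simulated experiments whose interval contains theta_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        theta_hat = rng.gauss(theta_true, math.sqrt(theta_true))
        lo, hi = interval_for_theta(theta_hat)
        hits += lo <= theta_true <= hi
    return hits / n_trials

lo, hi = interval_for_theta(100.0)
print("interval for theta_hat = 100:", lo, hi)
print("naive theta_hat +/- sqrt(theta_hat):", 100.0 - 10.0, 100.0 + 10.0)
print("simulated coverage:", coverage(100.0))
```

The interval is not symmetric about \hat{\theta}, because the width of the estimator's distribution depends on the unknown \theta rather than on \hat{\theta}.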

Confidence intervals are not unique

The following intervals could all have the same coverage and contain the best fit:

How do we choose confidence intervals (Neyman construction)

(Following section in [PDG-Stat].)

Example 1: coin toss

Suppose we toss a coin N times. The coin may or may not be fair. We have an unknown probability p of the coin coming up heads, 1-p tails.

The maximum likelihood estimator for p is

\hat{p} = \frac{k}{N},

where k is the number of times the coin came up heads.

The probability distribution of k is known:

P_{k} = \frac{N!}{k!(N-k)!} p^{k} (1-p)^{N-k}.
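
The binomial formula above can be evaluated directly. This is a minimal sketch using only the Python standard library; the function name binom_pmf and the values N=10, p=0.5 are chosen here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k heads in n tosses when P(heads) = p."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

n, p = 10, 0.5
probs = [binom_pmf(k, n, p) for k in range(n + 1)]

print("sum over k:", sum(probs))  # should be 1 up to rounding
print("most probable k:", max(range(n + 1), key=lambda k: probs[k]))
```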

Example 1a: upper limit

We can find the sum

P(k \leq k_2) = \sum_{k'=0}^{k_2} P_{k'}.

Use that to define the upper limit according to the Neyman procedure. (The lower limit of the interval is set to 0.)
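
One way to carry this out numerically is sketched below: for an observed count, find the value of p at which the probability of seeing that count or fewer drops to \alpha = 1 - CL. The bisection solver, the function names, and the convention \alpha=0.05 for a 95% CL limit are assumptions made for this example.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(K <= k) for n tosses with P(heads) = p."""
    return sum(comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k + 1))

def upper_limit(k_obs, n, alpha=0.05, tol=1e-10):
    """Largest p for which P(K <= k_obs | p) is still at least alpha."""
    if k_obs >= n:
        return 1.0  # every toss was heads: no upper limit below 1
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # P(K <= k_obs | p) decreases as p grows
        if binom_cdf(k_obs, n, mid) > alpha:
            lo = mid  # mid is still compatible with the data
        else:
            hi = mid
    return 0.5 * (lo + hi)

print("95% CL upper limit for k=0, N=10:", upper_limit(0, 10))
print("closed form for k=0:", 1.0 - 0.05 ** (1.0 / 10))
```

For k=0 the sum has a single term, (1-p)^N = \alpha, which gives the closed form used as a cross-check in the last line.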


Example 1b: lower limit

We can find the sum

P(k \geq k_1) = \sum_{k'=k_1}^{N} P_{k'}.

Use that to define the lower limit according to the Neyman procedure. (The upper limit of the interval is set to 1.)


Dangers of one-sided limits

Suppose our procedure (perhaps not consciously decided) were to report a "95% CL" upper limit for p if \hat{p} turned out very small, and a "95% CL" lower limit for p if \hat{p} turned out near 1.

What's wrong with that? Suppose the coin is fair, and p=0.5 really.

Up to 10% of the time (5% from each tail), we make a report implying that the fair coin is unfair "with 95% CL".
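
The size of the effect can be checked by summing the binomial probabilities of every outcome for which one of the two one-sided limits would exclude the true value. This is a sketch under the assumption that the limits are built as in Examples 1a and 1b; the function names and N=100 are illustrative. Because k is discrete, the rate comes out somewhat below the nominal 10%.

```python
from math import comb

def pmf(k, n, p):
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def false_claim_rate(n, p_true=0.5, alpha=0.05):
    """Probability that a one-sided (1 - alpha) limit excludes p_true
    when we are free to pick the upper or the lower limit after the fact."""
    probs = [pmf(k, n, p_true) for k in range(n + 1)]
    rate = 0.0
    for k in range(n + 1):
        p_le = sum(probs[: k + 1])  # P(K <= k | p_true)
        p_ge = sum(probs[k:])       # P(K >= k | p_true)
        # The upper limit lies below p_true exactly when p_le < alpha;
        # the lower limit lies above p_true exactly when p_ge < alpha.
        if p_le < alpha or p_ge < alpha:
            rate += probs[k]
    return rate

print("false-claim rate for a fair coin, N=100:", false_claim_rate(100))
```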

Example 1c: two-sided limit

One common approach: use P(k\geq k_1)=\alpha/2 and P(k\leq k_2)=\alpha/2 to define the intervals.

The only problem is that sometimes this isn't possible. E.g., if we get tails all N times (k=0), the probability at one end is constrained. Then we have to adjust somehow.
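
A minimal sketch of the central-interval construction follows. The adjustment at the boundaries is one possible convention, assumed here for illustration: when k=0 the lower edge is simply set to 0, and when k=N the upper edge is set to 1. The helper names and the bisection solver are likewise not from the original notes.

```python
from math import comb

def pmf(k, n, p):
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def tail_le(k, n, p):
    """P(K <= k | p)."""
    return sum(pmf(j, n, p) for j in range(k + 1))

def tail_ge(k, n, p):
    """P(K >= k | p)."""
    return sum(pmf(j, n, p) for j in range(k, n + 1))

def solve(f, target, tol=1e-10):
    """Bisection for f(p) = target on [0, 1]; f must be monotonic in p."""
    lo, hi = 0.0, 1.0
    rising = f(1.0) > f(0.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) < target) == rising:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def central_interval(k_obs, n, alpha=0.05):
    """Edges where each tail probability equals alpha/2."""
    # Assumed boundary convention: clamp the edge that cannot be solved for.
    p_lo = 0.0 if k_obs == 0 else solve(lambda p: tail_ge(k_obs, n, p), alpha / 2)
    p_hi = 1.0 if k_obs == n else solve(lambda p: tail_le(k_obs, n, p), alpha / 2)
    return p_lo, p_hi

print("k=5, N=10:", central_interval(5, 10))
print("k=0, N=10:", central_interval(0, 10))
```

With this convention the k=0 interval starts at 0 and its upper edge uses \alpha/2 rather than \alpha, so it is wider than the one-sided upper limit of Example 1a.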

The "unified" method (aka "Feldman-Cousins")

Very similar to the two-sided limit, except we pick the edges of the region to have a given log-likelihood. If the parameter space has a boundary, no problem, just adjust the log-likelihood level to encompass enough space.

In general, the p.d.f. of the log-likelihood is generated via MC for each value of the parameter(s).
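
For the coin-toss example the construction can be sketched without Monte Carlo, because the outcomes k=0..N can be enumerated exactly. For each trial value of p, outcomes are ranked by the likelihood ratio P(k|p)/P(k|\hat{p}), which is equivalent to ranking by the difference in log-likelihood from the best fit, and accepted in that order until the requested coverage is reached. The grid spacing, function names, and the choice to report only the smallest and largest accepted p are assumptions of this example.

```python
from math import comb

def pmf(k, n, p):
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def acceptance_region(n, p, cl):
    """Outcomes k accepted for this p, ranked by likelihood ratio."""
    probs = [pmf(k, n, p) for k in range(n + 1)]
    # Ratio of the likelihood at p to the likelihood at the best fit k/n.
    ratio = [probs[k] / pmf(k, n, k / n) for k in range(n + 1)]
    order = sorted(range(n + 1), key=lambda k: ratio[k], reverse=True)
    accepted, total = set(), 0.0
    for k in order:
        accepted.add(k)
        total += probs[k]
        if total >= cl:
            break
    return accepted

def unified_interval(k_obs, n, cl=0.95, n_grid=1001):
    """Smallest and largest grid values of p whose region contains k_obs."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    allowed = [p for p in grid if k_obs in acceptance_region(n, p, cl)]
    # With discrete data the allowed set may have small gaps; only the
    # outer edges are reported here.
    return min(allowed), max(allowed)

print("k=0, N=10:", unified_interval(0, 10))
print("k=5, N=10:", unified_interval(5, 10))
```

The boundary case k=0 needs no special handling: the same ranking rule yields an interval that starts at p=0.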


Read sections and in [PDG-Stat].

Next Assignment

Build the 90%-CL and 99%-CL confidence regions for the same exponential + background of the assignment from class 11 (aka class 0x0B), with the restriction that the background parameter b must be in the range 0 \leq b < 1 and the mean \mu must be positive.

[KamLAND2008] KamLAND Collaboration, "Precision Measurement of Neutrino Oscillation Parameters with KamLAND", Phys. Rev. Lett. 100:221803, 2008; arXiv:0801.4589v3 [hep-ex].
[DZero2010] D0 Collaboration, "Evidence for an anomalous like-sign dimuon charge asymmetry", Submitted to Phys. Rev. D, 2010; Fermilab-Pub-10/114-E; arXiv:1005.2757v1 [hep-ex].
[PDG-Stat] "Statistics", G. Cowan, in Review of Particle Physics, C. Amsler et al., PL B667, 1 (2008) and 2009 partial update for the 2010 edition ( ).