I'm going to assume you already know about the following:
See References.
Suppose the correct model for the distribution of light bulb lifetimes [*] is
The expectation value for is the mean,
The expectation value for the variance is
[*] | Note: this is almost certainly not a good model for light bulb lifetimes. |
A statistic is a quantity depending on random variables. A statistic is therefore itself a random variable with its own p.d.f.
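For a sample of $N$ values $x_1, \ldots, x_N$:
Mean:
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
Mean of squares:
$$\overline{x^2} = \frac{1}{N} \sum_{i=1}^{N} x_i^2$$
Cumulative distribution statistic:
$$F_N(x) = \frac{1}{N} \sum_{i=1}^{N} \Theta(x - x_i)$$
where $\Theta$ is the unit step function with $\Theta(0) = 1$, so $F_N(x)$ is the fraction of sample values not exceeding $x$.
The last is an example of a statistic that is also a function of a parameter $x$. It should approach the cumulative distribution function as $N \to \infty$.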
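The three statistics named above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are made up, and the cumulative distribution statistic is taken to be the fraction of sample values less than or equal to the chosen parameter value.

```python
import bisect


def sample_mean(xs):
    """Average of the sample values."""
    return sum(xs) / len(xs)


def mean_of_squares(xs):
    """Average of the squared sample values."""
    return sum(x * x for x in xs) / len(xs)


def empirical_cdf(xs, x):
    """Fraction of sample values less than or equal to x."""
    ordered = sorted(xs)
    return bisect.bisect_right(ordered, x) / len(ordered)


# Made-up lifetimes, in arbitrary units (not course data).
lifetimes = [1.0, 2.0, 2.0, 5.0]

mean = sample_mean(lifetimes)
mean_sq = mean_of_squares(lifetimes)
cdf_at_2 = empirical_cdf(lifetimes, 2.0)
```

Each function returns a single number computed from the random sample, so each is itself a random variable; `empirical_cdf` additionally depends on the point `x` at which it is evaluated.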
Using the same p.d.f. as in the earlier example, and assuming independent light bulb lifetimes,
The statistical moments are not necessarily the least biased, most efficient, or most robust estimators.
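Another statistic one can construct for a data set is the likelihood. For independent data points $x_1, \ldots, x_N$, each with p.d.f. $p(x_i \mid \theta)$ depending on parameters $\theta$, it is
$$L(\theta) = \prod_{i=1}^{N} p(x_i \mid \theta)$$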
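As a minimal sketch of evaluating a likelihood numerically, the code below assumes independent lifetimes and an exponential p.d.f. with mean `tau`. The exponential form, the function names, and the data values are all assumptions made for illustration; they are not taken from the course material.

```python
import math


def exponential_pdf(t, tau):
    """Assumed model: p(t | tau) = exp(-t / tau) / tau, for t >= 0."""
    return math.exp(-t / tau) / tau


def likelihood(ts, tau):
    """Product of the p.d.f. values of independent measurements."""
    product = 1.0
    for t in ts:
        product *= exponential_pdf(t, tau)
    return product


def log_likelihood(ts, tau):
    """Sum of log p.d.f. values; numerically safer than the raw product."""
    return sum(math.log(exponential_pdf(t, tau)) for t in ts)


# Made-up lifetimes, in arbitrary units.
ts = [1.0, 2.0, 2.0, 5.0]
```

For large samples the product underflows quickly, which is one practical reason to work with the logarithm.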
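The inverse of the covariance matrix $V_{ij} = \mathrm{cov}[\hat{\theta}_i, \hat{\theta}_j]$ of the estimators can be estimated using
$$(\widehat{V^{-1}})_{ij} = -\left. \frac{\partial^2 \ln L}{\partial \theta_i \, \partial \theta_j} \right|_{\theta = \hat{\theta}}$$
where $L$ is the likelihood and $\hat{\theta}$ is the maximum likelihood estimate of the parameters $\theta$.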
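For a single parameter, the inverse covariance reduces to minus the second derivative of the log-likelihood at its maximum. The sketch below estimates that derivative by a central finite difference. It assumes an exponential lifetime model (for which the maximum likelihood estimate of `tau` is the sample mean); the model, names, step size, and data are illustrative assumptions.

```python
import math


def log_likelihood(ts, tau):
    """ln L for independent exponential lifetimes, p(t | tau) = exp(-t/tau)/tau."""
    return sum(-math.log(tau) - t / tau for t in ts)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


# Made-up lifetimes, in arbitrary units.
ts = [1.0, 2.0, 2.0, 5.0]

# For this assumed model the maximum of ln L is at the sample mean.
tau_hat = sum(ts) / len(ts)

# Inverse variance estimate: minus the curvature of ln L at the maximum.
inv_var = -second_derivative(lambda tau: log_likelihood(ts, tau), tau_hat)
sigma_tau = 1.0 / math.sqrt(inv_var)
```

For this model the curvature can also be worked out by hand, giving `sigma_tau = tau_hat / sqrt(N)`, which is a useful check on the numerical result. With several parameters the same idea applies to the full matrix of second derivatives, which is then inverted.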
For large samples or perfectly Gaussian probabilities, the likelihood $L$ has a "Gaussian form", i.e. $\ln L$ becomes parabolic in the parameters.
Finding proper confidence intervals in the more general case will be discussed in a later class.
It is usually easier to maximize $\ln L$ than $L$ itself.
Equivalently, one minimizes the "effective chi-squared" defined as $-2 \ln L$. For Gaussian statistics, this is exactly the chi-squared (up to a constant that does not depend on the parameters), if the standard deviations are known.
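The Gaussian case can be written out explicitly. The notation here is an assumption made for illustration: $y_i$ are independent measurements with known standard deviations $\sigma_i$, and $f_i(\theta)$ are the model predictions.

```latex
-2 \ln L(\theta)
  = -2 \sum_{i=1}^{N} \ln\!\left[ \frac{1}{\sqrt{2\pi\sigma_i^2}}
      \exp\!\left( -\frac{(y_i - f_i(\theta))^2}{2\sigma_i^2} \right) \right]
  = \sum_{i=1}^{N} \frac{(y_i - f_i(\theta))^2}{\sigma_i^2}
    + \sum_{i=1}^{N} \ln\!\left( 2\pi\sigma_i^2 \right)
```

The first sum is the usual chi-squared. The second sum does not depend on $\theta$ when the $\sigma_i$ are known, so it does not move the minimum. If the $\sigma_i$ themselves depended on the parameters, the second term would no longer be constant and would have to be kept.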
It is important to include all dependence on the parameters in $\ln L$, including normalization factors.
For ,
Find the maximum likelihood estimator for .
For real ,
Find the maximum likelihood estimator for and
.
For integer ,
Find the maximum likelihood estimator for .
For real ,
Find the maximum likelihood estimator for and
.
Note the estimator for is asymptotically unbiased in the limit of large $N$, but not unbiased at finite $N$. The bias can be corrected without degrading the asymptotic RMS of the estimator.
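A small Monte Carlo experiment can make this kind of finite-sample bias visible. The sketch below assumes, purely for illustration, that the estimator in question is the maximum likelihood estimate of a Gaussian variance, which divides by $N$ rather than $N - 1$. The sample size, number of trials, and seed are arbitrary choices.

```python
import random


def ml_variance(xs):
    """Maximum likelihood variance estimate for a Gaussian: divides by N."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def average_estimates(n, trials, seed=12345):
    """Average the ML variance estimate over many samples of size n.

    Samples are drawn from a unit Gaussian, so the true variance is 1.
    Returns the average raw estimate and the average after multiplying
    each estimate by n / (n - 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += ml_variance(xs)
    raw = total / trials
    corrected = raw * n / (n - 1)
    return raw, corrected


raw, corrected = average_estimates(n=5, trials=20000)
```

Under these assumptions `raw` should come out near $(N-1)/N = 0.8$ and `corrected` near 1, up to statistical fluctuations. The correction factor $N/(N-1)$ tends to 1 for large $N$, which is why it does not change the asymptotic behaviour of the estimator.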
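Once you know how to minimize a function, and you know the p.d.f.s of the data in your model, then numerical implementation of the maximum likelihood method is easy. Just write the function to calculate:
$$-2 \ln L(\theta) = -2 \sum_{i=1}^{N} \ln p(x_i \mid \theta)$$
where $p(x_i \mid \theta)$ is the p.d.f. of the $i$-th data point given the parameters $\theta$.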
Then minimize it.
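The recipe above can be sketched end to end for a one-parameter model. The code assumes an exponential lifetime p.d.f. and uses a hand-written golden-section search as the minimizer; both are stand-ins chosen to keep the example self-contained, and the data values and search bounds are made up.

```python
import math


def neg2_log_likelihood(tau, ts):
    """-2 ln L for independent exponential lifetimes, p(t | tau) = exp(-t/tau)/tau."""
    return 2.0 * sum(math.log(tau) + t / tau for t in ts)


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


# Made-up lifetimes, in arbitrary units.
ts = [1.0, 2.0, 2.0, 5.0]

tau_fit = golden_section_minimize(
    lambda tau: neg2_log_likelihood(tau, ts), 0.1, 20.0
)
```

For this assumed model the fitted value should agree with the sample mean of `ts`, which gives a quick sanity check. For models with several parameters one would normally hand the same kind of function to a general-purpose minimizer, for example `scipy.optimize.minimize`, instead of writing a search by hand.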
Note: if you have too many parameters, you might need to simplify the problem, perhaps by pre-fitting some of the parameters with a faster method.
Make a maximum likelihood fit of the data in "dataset 1" provided on the course web page to the following model:
By "make a maximum likelihood fit", I mean "estimate the parameters
using the maximum likelihood method":
Make a maximum likelihood fit of the data in "dataset 2" provided on the course web page to the following model:
By "make a maximum likelihood fit", I mean "estimate the parameters
using the maximum likelihood method":
In the following, (R) indicates a review, (I) indicates an introductory text.
(R) "Statistics", G. Cowan, in Review of Particle Physics, C. Amsler et al., PL B667, 1 (2008) and 2009 partial update for the 2010 edition (http://pdg.lbl.gov).
See also general references cited in PDG-Stat.
(R) "Probability", G. Cowan, in Review of Particle Physics, C. Amsler et al., PL B667, 1 (2008) and 2009 partial update for the 2010 edition (http://pdg.lbl.gov).
See also general references cited in PDG-Prob.