Median Statistical Analysis of Non-Gaussian Astrophysical and Cosmological Data Compilations
Kansas State University
Physics and Mathematics Major
Mentored by Dr. Bharat Ratra
Scientific integrity relies on reproducible, accurate results. In cosmology, however, data are based almost exclusively on estimates (often stacked on other estimates), so ensuring accurate results can be challenging. A way to detect flaws in collected data on a large scale would be ideal. Statistical analysis lets us compare the distribution of errors in a collected sample with a Gaussian distribution, which random errors are conventionally expected to follow, and so quantify whether the assigned error margins are appropriate. In physics, the width of the error margin is often more important than the central reported value.
The standard model of Big Bang Nucleosynthesis (BBN) predicts the production of certain quantities of deuterium, ³He, ⁴He, and ⁷Li in the first 20 minutes after the Big Bang. Using observations of the Cosmic Microwave Background (CMB), we can compare our calculations with an observed measure of these elements roughly 300,000 years after the Big Bang. These observations agree with the calculated levels of deuterium, ³He, and ⁴He, but the ⁷Li measurement is discrepant by roughly a factor of three. This could indicate that something is fundamentally flawed in the BBN modeling of the early universe. ⁷Li has also been preserved in old main-sequence stars below 2.5 × 10⁶ K. By sampling these stars, we can obtain an alternative data set to get to the core of the lithium abundance problem. However, the ⁷Li abundance measured in these stars has a mean of roughly 2.2, compared to the CMB-based value of 2.7.
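The "factor of three" and the 2.2 versus 2.7 figures can be reconciled with a short calculation. This is a sketch that assumes both values are quoted on the customary logarithmic abundance scale A(Li), which the text above does not define explicitly; the symbols N_Li and N_H (number densities of lithium and hydrogen) are introduced here for illustration.

```latex
% Assumed definition of the logarithmic lithium abundance scale
A(\mathrm{Li}) = \log_{10}\!\left(\frac{N_{\mathrm{Li}}}{N_{\mathrm{H}}}\right) + 12

% Difference between the CMB-based and stellar values quoted above
\Delta A = 2.7 - 2.2 = 0.5

% A difference of 0.5 on a base-10 logarithmic scale is a linear ratio of
\frac{(N_{\mathrm{Li}}/N_{\mathrm{H}})_{\mathrm{CMB}}}{(N_{\mathrm{Li}}/N_{\mathrm{H}})_{\mathrm{stars}}}
  = 10^{0.5} \approx 3.16
```

Under that assumption, a gap of 0.5 in the quoted values corresponds to roughly a factor of three in the actual lithium abundance.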
Fig 1: M. Spite, F. Spite, and P. Bonifacio. The Cosmic Lithium Problem: An Observer's Perspective. Mem. S.A.It. Suppl., 22:9, 2012.
A previous chi-squared analysis of a data set of 66 stars revealed a tendency toward a non-Gaussian distribution in the error margins. This could mean that these margins are too narrow in aggregate and that the 0.5 discrepancy between the stellar and CMB observations is less significant than it first appears. This summer, I performed a Kolmogorov-Smirnov (KS) goodness-of-fit test on the same data set as the previous study to determine the probability of rejecting certain distribution functions found in nature (Gaussian, Cauchy, Laplace, and Student's t). I found that the Gaussian and Student's t distributions are the ones we are least able to reject as possible distributions, in agreement with the previous analysis.
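A one-sample KS test of this kind can be sketched in a few lines. The example below is a minimal illustration, not the analysis code used for this project: it uses synthetic stand-in measurements rather than the 66-star data set, takes the sample median as the central estimate, fixes every candidate distribution at unit scale, and omits Student's t because its CDF needs a special function outside the Python standard library. All function and variable names are hypothetical. The p-value uses the asymptotic Kolmogorov series with Stephens' small-sample correction.

```python
import math
import random
import statistics

def gaussian_cdf(x):
    # Standard normal CDF (mean 0, unit standard deviation)
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cauchy_cdf(x):
    # Standard Cauchy CDF (location 0, unit scale)
    return 0.5 + math.atan(x) / math.pi

def laplace_cdf(x):
    # Standard Laplace CDF (location 0, unit scale)
    return 0.5 * math.exp(x) if x < 0 else 1.0 - 0.5 * math.exp(-x)

def ks_statistic(sample, cdf):
    # D = largest vertical gap between the empirical CDF and the model CDF
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    # Asymptotic Kolmogorov series with Stephens' finite-n correction
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # series converges slowly here; p is effectively 1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))

# Synthetic stand-in data (NOT the real stellar measurements):
# 66 abundance values with a common quoted uncertainty of 0.1.
rng = random.Random(0)
values = [2.2 + rng.gauss(0.0, 0.1) for _ in range(66)]
sigmas = [0.1] * 66

# Normalized deviations from the median, in units of each quoted error
central = statistics.median(values)
n_sigma = [(v - central) / s for v, s in zip(values, sigmas)]

candidates = {"Gaussian": gaussian_cdf, "Cauchy": cauchy_cdf, "Laplace": laplace_cdf}
results = []
for name, cdf in candidates.items():
    d = ks_statistic(n_sigma, cdf)
    results.append((name, d, ks_pvalue(d, len(n_sigma))))

for name, d, p in results:
    print(f"{name:8s}  D = {d:.3f}  p = {p:.3f}")
```

A small p-value means the candidate distribution can be rejected at a high confidence level, while a large p-value means only that the data do not rule it out. A fuller treatment would also vary the width of each candidate distribution instead of fixing it at unit scale.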
Special thanks to my mentor, Dr. Bharat Ratra, and to Tia Camarillo, the graduate student without whom I would have been lost.