## Designing Quantitative Experiments

(or How to tell the truth with statistics)

#### A First Look

People who do science ask questions. Which causes produce which effects? How are phenomena related? How can we usefully describe what we observe? Although particular quantitative details of the relationships are also important, these are rarely the first stage of a scientific investigation. The experiment "Observing the Effects of Solar Ultraviolet Radiation on Yeast" showed unequivocally that exposure to sunlight damages a certain strain of yeast. No statistical evidence is needed to establish the effect, but lots of further investigation is suggested: could it be the rise in temperature that damages the cells? Would an incandescent light bulb hurt them in the same way? Why does a transparent glass plate protect the cells? Following up these leads, and others that occur to you, is real science, and it is science without statistics. Pursuit of question and answer, which should be the basis for science education, is accessible even to young students for whom "mean," "standard deviation," and "confidence level" would be merely jargon.

We've heard some science educators assume that if a statement is not quantified then it is not scientific, that only sentences like "The probability of 0.95 that ..." are scientific. Such a view can damage science education because it divorces thinking about science from all the rest of our thinking about the world. Scientific thinking is ordinary logic, ordinary reasoning, ordinary thinking; an answer is scientific if it can be tested by ordinary people.
Consider the questions raised by the experiment on the dose dependence of survival rates. Earlier experiments establish that UV radiation affects yeast cells, and now the question is how the dose affects damage. The first response should be qualitative, phrased in ordinary language, rather than with numbers or equations: the larger the dose of UV, the lower the surviving fraction of cells. This rough relationship is not foreordained and is the first important result of this experiment. Even young students can discuss this observation in scientific, qualitative terms. Slightly more advanced students may use the statistical ideas of "average" and "standard deviation" to present data simply and think more specifically about the results.
In thinking about how to design and run classroom experiments, some statistical thinking is called for. For example, you need to estimate how many cells are going to survive each dose of UV so that you use reasonable dilution factors. It is also useful to have a little knowledge of the "normal distribution," mainly that about 1/3 of the data points will typically lie more than one standard deviation from the mean, about 1/20 will lie more than two standard deviations away, and fewer than 3/1000 will lie more than three standard deviations away. This allows you to disregard points more than 3 or 4 standard deviations from the mean with some confidence. (But you ought also to be sensitive to the possibility that the kid with the wild plate may have run into something interesting.) Finally, it helps to appreciate the power of having lots of data: the larger your sample size, the smaller your relative error bars become and the more clear-cut your results.
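
The normal-distribution fractions quoted above, and the way error bars shrink with sample size, can be checked with a few lines of Python. This is only a sketch: the function names are our own, and the standard-error formula assumes independent measurements with similar scatter.

```python
import math

def fraction_outside(k):
    """Fraction of a normal distribution lying more than k standard
    deviations from the mean (both tails together)."""
    return math.erfc(k / math.sqrt(2))

def standard_error(sd, n):
    """Standard error of the mean of n independent measurements that
    each scatter with standard deviation sd (an assumed model)."""
    return sd / math.sqrt(n)

for k in (1, 2, 3):
    print(f"more than {k} SD from the mean: {fraction_outside(k):.4f}")

# Quadrupling the number of plates halves the error bar on the average.
for n in (4, 16, 64):
    print(f"n = {n:2d}: standard error = {standard_error(10.0, n):.2f}")
```

The first loop gives roughly 0.32, 0.05, and 0.003, in line with the rules of thumb in the text.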

#### A Second Look

Students of science try to find out what is happening, describe it usefully, and, in some sense, understand what is going on. What is happening in the experiments in this unit of study? We find that exposure to UV light damages some strains of yeast so that they can't reproduce, that this exposure can cause mutations, and that exposure to sunlight can induce some repair of the damage. We are doing the experiments to try to understand these phenomena better, in some cases even in a quantitative way. Although we are not testing any hypothesis, we may form one during our investigation. The statistical language of hypothesis testing is unnecessary for describing the experiments and their results. When we talk of "statistics" we usually refer only to concepts such as "average," "experimental error," or "standard deviation." Terms like "confidence levels," "t-tests," and even "chi-square" are rarely needed. When you do use statistical concepts, don't turn them into hard-and-fast rules. For "error," you may sometimes use the whole range of results from a class, or instead of reporting the "average," you may record the median.
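
As a rough illustration of why the median or the whole range can sometimes serve better than the average and standard deviation, here is a small sketch using Python's `statistics` module. The colony counts are made up for the example, with one deliberately "wild" plate; `summarize` is a hypothetical helper, not part of any procedure in this unit.

```python
import statistics

def summarize(counts):
    """Return several simple descriptions of a set of plate counts."""
    return {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "stdev": statistics.stdev(counts),    # sample standard deviation
        "range": (min(counts), max(counts)),  # whole spread of the class data
    }

# Made-up colony counts from six plates; the last one is the wild plate.
counts = [48, 52, 45, 55, 50, 120]
summary = summarize(counts)
for name, value in summary.items():
    print(name, value)
```

With these numbers the single wild plate pulls the mean well above the median, which is one reason to look at more than one summary before deciding what to report.
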
We could recast our experiments as hypothesis testing. Instead of measuring the surviving fraction of cells vs. dose, for example, we could try to discredit the hypothesis that the log of the surviving fraction varies linearly with exposure time. Such an approach would make all the statistics learned in ed psych classes applicable. Wouldn't this be a good thing? No. First, it would distort what scientists really do, namely, try to figure stuff out, to discover rather than hypothesize. Second, it would require talking and thinking about science differently from the way we talk and think in the rest of our lives, and this would be misleading: science uses ordinary logic and ordinary speech; it is not something esoteric reserved to a priesthood of the elect. Third, framing the work as hypothesis testing would be harder. Asking "what's going on?" is easier than asking "what is the probability that the data are inconsistent with the hypothesis?" And finally, hypothesis testing closes off other questions that the more open-ended "what's happening?" encourages.
Having thoroughly discredited the use of sophisticated statistics in experiments, let's soften the stance a bit. Sometimes we ARE "hypothesis testing." In the photoreactivation experiment, for example, one (but only one) of the questions we should ask is "Is there any difference between the cells exposed to sunlight and those kept in the dark?" Here a chi-square test or a t-test would be appropriate, although unnecessary if the experiment was well designed. You might also use chi-square to compare data to an existing theory such as Mendelian genetics. Chi-square will enable you to quantitatively compare segregation ratios obtained in crosses with the ratios predicted by a genetic model (e.g., unlinked genes).
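
For readers who want to see what such a comparison looks like, the following is a minimal sketch of a chi-square calculation for segregation ratios. The observed counts are invented, and the 1:1:1:1 expectation is simply an assumed model for four equally likely classes; the critical value 7.815 is the standard table entry for 3 degrees of freedom at the 0.05 level.

```python
def chi_square(observed, ratio):
    """Chi-square statistic comparing observed class counts with the
    counts expected from a predicted ratio (e.g. [1, 1, 1, 1])."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Invented counts for four classes of progeny (100 scored in all).
observed = [28, 22, 26, 24]
predicted_ratio = [1, 1, 1, 1]  # assumed model: four equally likely classes

chi2 = chi_square(observed, predicted_ratio)
degrees_of_freedom = len(observed) - 1
CRITICAL_3DF_05 = 7.815  # table value: 3 degrees of freedom, 0.05 level

print(f"chi-square = {chi2:.2f} with {degrees_of_freedom} degrees of freedom")
print("consistent with the model" if chi2 < CRITICAL_3DF_05
      else "differs from the model")
```

Here the statistic is far below the critical value, so these invented counts give no reason to doubt the assumed ratio.
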
Chi-square can also be used to evaluate how well a given curve fits a set of data, so in principle we could use the same data both to obtain the curve and to judge how good the fit is. Students at preparatory levels, however, would struggle to do this properly. They would find it especially hard to handle the effect of points with differing errors and to decide on the right number of "degrees of freedom."

#### Summary

There is no prescription for doing science, no infallible list of rules. Teach your students to use common sense, think hard and be honest. Encourage them to behave like scientists, beginning their work by finding out what happens and describing it in everyday language. As their knowledge builds, you may begin to incorporate quantitative elements. Use this process as a guide:

1. Observe effects and investigate causes.
2. Describe relationships qualitatively.
3. Plot all results on graphs. Experiment with linear, log-linear, and log-log graphs.
4. Make graphs of averages (or other estimates) using standard deviations (or another measure of error) as error bars.
5. Look for the "best fit" between a simple curve (often a straight line on some graph) and the data. First, by eye; then, perhaps, using "least squares."
6. If a theory exists and specifies parameters for the curve in #5, then you may try to determine a "confidence interval" for some of the parameters. Remind your students that scientists rarely determine "confidence intervals." Glance through Physical Review or Genetics to see how many error bars and how few confidence levels there are.
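
Step 5 of the list above can be sketched in a few lines. The example fits a straight line to the natural log of the surviving fraction against exposure time, which corresponds to a straight line on a log-linear graph. The data are contrived (survival halves every 10 seconds) so that the answer can be checked by hand, and `least_squares_line` is our own helper using the ordinary unweighted least-squares formulas, which treat every point as equally reliable.

```python
import math

def least_squares_line(xs, ys):
    """Slope and intercept of the unweighted least-squares line through
    the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Contrived data: exposure times in seconds and surviving fractions.
exposure = [0, 10, 20, 30, 40]
surviving = [1.0, 0.5, 0.25, 0.125, 0.0625]

# A straight line on a log-linear graph means ln(fraction) is linear in dose.
log_surviving = [math.log(s) for s in surviving]
slope, intercept = least_squares_line(exposure, log_surviving)

print(f"slope = {slope:.4f} per second, intercept = {intercept:.4f}")
print(f"exposure that halves survival: {-math.log(2) / slope:.1f} seconds")
```

Real classroom data will scatter around the line, and a fit by eye on graph paper is often just as informative as the computed one.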

#### A Last Note

We think the urge to quantify all scientific questions and answers arises from the knowledge that science is partly institutionalized doubt, that we are never absolutely sure about anything, and that therefore it is wonderful that we can go so far when we are always beset by uncertainty. Science students can benefit enormously when this philosophy becomes central to the classroom, because it stimulates questioning. But if we suggest to students that this uncertainty means that we don't really learn anything with all our work, or that we have to talk and think differently about the natural world than about anything else, then this preoccupation with quantifying our uncertainty becomes a barrier to science education.