Preface

This book evolved from the notes for a course of the same title that we've taught for the last thirty years at the University of Wisconsin to graduate students in cancer biology, genetics, molecular biology, and other biomedical programs. We began teaching that course to counter the notion held by (some) molecular biologists that, "if you need statistics to analyze your data, you should have done a better experiment." We hope to convince you that by considering the statistical issues inherent in your research, you will be able to design better experiments and develop a sense of the degree of confidence you can place in the conclusions you draw from your studies.

Part of the antipathy biologists have for statistics arises from the way statistical analysis is taught in many introductory courses. Generally, the focus of these courses is on statistical methods for analysis of continuous random variables that follow explicit, well-defined models. In contrast, the nature of the variation that underlies most of the experiments we do as biologists is difficult to specify in detail. We will concentrate on a class of statistical methods, so-called nonparametric statistics, which requires us to make very few assumptions regarding the model that gives rise to the data. These methods are also attractive because they are usually simple to apply and have considerable intuitive appeal.

The first section of the book will review basic probability theory and general considerations for hypothesis testing. Both of these topics are covered in detail in most introductory statistics books. An excellent recent statistics text with a biological focus is "Biometry," by Robert A. Sokal and F. James Rohlf (fourth edition, Freeman, 2012).

For the material on nonparametric statistics, we recommend the following texts or monographs:

"Practical Nonparametric Statistics," by W. J. Conover (Wiley, third ed., 1999), provides an accessible and reasonably comprehensive text for this subject.

"Nonparametric Statistical Methods," by Myles Hollander, Douglas A. Wolfe, and Eric Chicken (Wiley, third edition, 2013) is more methodologically oriented, with explicit instructions on performing a variety of statistical tests and many examples, but is harder to follow on the theory of the tests. This book also has extensive tables for the statistics described in this course.

"Nonparametrics:  Statistical Methods based on Ranks," by E. L. Lehman (Prentice Hall, 1998), is a more advanced and theoretically oriented monograph. Much of the material presented below on extension of the methods to the complications of multiple comparisons, multiple experiments and confounding variables is developed from Lehman's treatment of these subjects.

"Categorical Data Analysis," by Alan Agresti (Wiley, 2002), provides a comprehensive treatment of its subject. The same author has also published "An Introduction to Categorical Data Analysis," (Wiley, 2007), which is a more accessible introduction to the field.

More detailed discussions and the original references for the statistical tests described in Chapters 5 through 8 can be found in the Conover, Hollander et al., or Agresti texts. Nearly all of the methods described below can be performed using the Mstat program developed by one of us (ND). A copy of the User Manual for the most recent version (6.0) of the software is included as Appendix 7.

As noted above, much of the material in this book grew out of a course that I taught with Carter Denniston from 1994 to 2004. I enjoyed that decade of teaching with Carter enormously and learned a great deal from him about teaching well. I particularly miss both his dry sense of humor and his knack for analogies that stick in students' minds. We lost Carter too soon when he died at the age of 67 in 2005. His friend and colleague, James F. Crow, provides a tribute to Carter in the foreword to this book.

I want to express my gratitude to the many colleagues who provided critical advice, encouragement, and sample data over the years. I am particularly grateful to Bill Engels, who developed a predecessor to this course with me from 1984 to 1992, and to James F. Crow, Andrea Bilger, and Bill Sugden. Thanks are also due to the generations of students who made it very clear when I was, or wasn't, getting the point across.

A note on the second edition

This new edition contains some additional material on alternative approximations to the Wilcoxon rank sum distribution, the manual for a substantially revised version of Mstat, and a handful of typographical corrections.

I am saddened to report that Professor James F. Crow died on January 4, 2012, not long after he authored the tribute to Carter Denniston that appears as the foreword to this book. As noted by Daniel Hartl (Genetics, 190:1-2, 2012), he was both a remarkable geneticist and a remarkable man.

Norman Drinkwater
December, 2013

Foreword

Carter Denniston, a Tribute

The original intent for this book was that Norman Drinkwater and Carter Denniston be co-authors. Unfortunately, this was prevented by Denniston's premature death. Carter was greatly influential in the development of the book and it is appropriate that his contribution be appreciated and that he be remembered.

Carter Denniston was born in Milwaukee in 1938. I first met him when he was principal violist in the Wisconsin University orchestra. Yes, he was a gifted musician and went through the University on a music scholarship. He majored in anthropology, where his teaching skills, evident throughout his academic life, were first appreciated. He won his first teaching award as a graduate assistant in anthropology. Later he changed his major to genetics, became my student, and received his Ph.D. in 1968. Two years later he joined the Genetics faculty, where he spent the rest of his life.

It was as a graduate student that Carter's ability in logical analysis became apparent. He came under the influence of Charles Cotterman, from whom he acquired his taste for finite mathematics and combinatorics. His work was complex and difficult. He extended Cotterman's K coefficients to multiple alleles and multiple loci with linkage, a major feat. Although he was formally my student, his major influence was Cotterman, and he retained his interests in this area for the rest of his life.

Carter liked to teach, did it well, and did far more than his share. He also was active in committee work, both in the University and nationally. He collaborated in research projects, often as a statistical consultant, with researchers who appreciated his thoroughness, his logical rigor, and his depth of scholarship.

Carter and his wife, Glenda, enjoyed outdoor wilderness-type activities. As graduate students they jointly studied the Alaskan Inuits. For several years they volunteered to participate in an ophthalmological team in the Philippines. It is evident that Carter's interests and activities were broad and varied and that he had a social conscience.

He loved number theory and had looked forward to retirement as an opportunity to engage in this on a full-time basis. Alas, it was not to be. Shortly after retiring, he was struck down by cancer and died on September 27, 2005. He never had a chance to pursue his favorite subject.

James F. Crow
November, 2011