1  Introduction

This is a course in Modern Experimental Design. So what is experimental design, and why does it need to be modern?

Experimental design is often a neglected part of the curriculum for students in statistics and data science. The basics of the field were developed from the 1920s to the 1960s, so it is not new and exciting anymore; rather than being taught as an area of active statistical research, it is taught as a service course for psychologists, biologists, medical students, and others in the sciences who might actually conduct experiments of their own.

There the role of experimentation is clear. To determine whether this new treatment causes a reduction in tumor size, or whether providing cookies on the last day of class causes an increase in student ratings of professors, we need an experiment. After all, statisticians can repeat “correlation is not causation” better than anyone, and an experiment offers the best opportunity to prove a causal relationship.

That implies we’ll need to talk about causality to really understand experimental design, and so we will. We will apply causal principles to determine what an experiment can prove, and to choose the right experiment to test our research question.

Then consider the analysis of experimental data. To students who have taken a linear regression course (like mine), analyzing data from an experiment seems trivial: Just do a regression. Let \(Y\) be the response variable of interest and let \(X\) be the treatment variables we want to learn about, and interpret the coefficients as you would in any other regression. Indeed, many experimental design textbooks are full of this kind of analysis, except they usually call it ANOVA and present complicated tables of sums of squares, even though the model is ultimately ordinary linear regression.
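To make the "ANOVA is just regression" point concrete, here is a minimal sketch for the simplest case: one treatment with two arms. The response values are made up for illustration, and the helper `ols_simple` is a hypothetical name, not something from a particular library. The sketch checks that the regression slope on a 0/1 treatment indicator is the difference in group means, and that the one-way ANOVA \(F\) statistic built from sums of squares equals the square of the regression \(t\) statistic for that slope.

```python
from statistics import mean

# Made-up responses for illustration only (not data from any real experiment).
control = [4.1, 5.0, 4.6, 5.3]
treated = [5.9, 6.4, 5.2, 6.1]


def ols_simple(x, y):
    """Least-squares intercept and slope for the model y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Regression view: X is a 0/1 indicator of treatment, Y is the response.
x = [0] * len(control) + [1] * len(treated)
y = control + treated
intercept, slope = ols_simple(x, y)
diff_in_means = mean(treated) - mean(control)

# ANOVA view: between-group and within-group sums of squares.
grand_mean = mean(y)
ss_between = (len(control) * (mean(control) - grand_mean) ** 2
              + len(treated) * (mean(treated) - grand_mean) ** 2)
ss_within = (sum((v - mean(control)) ** 2 for v in control)
             + sum((v - mean(treated)) ** 2 for v in treated))
df_within = len(y) - 2  # n observations minus 2 group means
f_stat = (ss_between / 1) / (ss_within / df_within)

# Regression t statistic for the slope, using the residual variance estimate.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(r ** 2 for r in residuals) / df_within
sxx = sum((xi - mean(x)) ** 2 for xi in x)
t_stat = slope / (sigma2_hat / sxx) ** 0.5

print("slope:", slope, "difference in means:", diff_in_means)
print("F:", f_stat, "t squared:", t_stat ** 2)
```

The two printed pairs agree up to floating-point error. With more than two treatment levels the same idea holds with one indicator column per non-baseline level, though this sketch does not cover that case.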

As statisticians, we’re used to regression problems like this, and know all the usual theory: ordinary least squares, its role as a minimum variance unbiased estimator, and so on. We know we can use different estimators, like penalized regression, if we need different statistical properties.

But this is experimental design. We get to design experiments. Rather than analyzing data that someone else has collected, we get to choose what data to collect and how to go about collecting it. And that is where the interesting work of experimental design lies. An experiment allows us to choose who gets which treatments and determine how to measure the response. Choosing who gets which treatments allows us to answer causal questions, yes. But less obviously, designing an experiment lets us choose its statistical properties before we collect the data, and that’s all about practical constraints, not just causality.
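As a small illustration of choosing statistical properties before collecting data, consider a two-arm experiment analyzed by a difference in means. Assuming independent responses with a common variance \(\sigma^2\) in both arms (an assumption made here for the sketch, not a claim about any particular experiment), the variance of the estimated effect is \(\sigma^2 (1/n_1 + 1/n_0)\), where \(n_1\) and \(n_0\) are the numbers of treated and control units. The sketch below, with the hypothetical helper `variance_factor` and an arbitrary total of 20 units, compares every possible split.

```python
def variance_factor(n_treated, n_total):
    """Return 1/n1 + 1/n0, the multiple of sigma^2 in Var(difference in means).

    Assumes independent responses with equal variance in both arms.
    """
    n_control = n_total - n_treated
    return 1 / n_treated + 1 / n_control


n_total = 20  # arbitrary illustrative number of experimental units

# Variance factor for every allocation that leaves at least one unit per arm.
factors = {n1: variance_factor(n1, n_total) for n1 in range(1, n_total)}
best_n_treated = min(factors, key=factors.get)

for n1 in (2, 5, 10, 15, 18):
    print(n1, "treated:", round(factors[n1], 4))
print("allocation with the smallest variance factor:", best_n_treated)
```

Under these assumptions the balanced split gives the most precise estimate, and we know that before a single response is measured. If the arms had different variances or different costs per unit, the best split would generally differ.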

Traditional experimental design methods were developed from the 1920s to the 1960s or so, and focused on the kinds of experiments conducted in agriculture and industry. These experiments featured certain constraints:

These constraints reward experiments chosen to extract the most information from the least data, working around restrictions imposed by the physical setup. As we will see, there is much we can do as designers to choose treatment assignments that maximize the information we can gain, and there are many standard designs. These experiments still exist, of course, and are widely done in industry.

On the other hand, there are also what I’ll call “modern” experiments. Often these are experiments conducted on online platforms at massive scale, featuring different constraints:

Of course, there are many other kinds of experiments than industrial experiments and online experiments. There are experiments in medicine (clinical trials), in psychology, and in all branches of science; there are even policy experiments designed to inform public policy and legislation. They all have unique constraints, and sometimes very different statistical goals.

To do experimental design, then, is to understand the question well enough to choose the data that will best answer it, and to know which statistical analysis is appropriate for answering that scientific question.