In praise of exploratory statistics

There has been a lot of discussion of researcher degrees of freedom lately (e.g. Jeremy here or Andrew Gelman here). (PS: by my read, Gelman got the specific example wrong, because I think the authors really did have a genuine a priori hypothesis. But the general point remains true, and the specific example reveals how hard this is to sort out in the current research context.)

I would argue that this problem comes about because people fail to be clear about their goals in using statistics (mostly the researchers; this is not a critique of Jeremy’s or Andrew’s posts). When I teach a second-semester graduate stats class, I teach that there are three distinct goals for which one might use statistics:

  1. Hypothesis testing
  2. Prediction
  3. Exploration

These three goals are all pretty much mutually exclusive (although there is some overlap between prediction and exploration). Hypothesis testing is of…

