Prediction vs. explanation in statistical model building
Otto Koppius, Ph.D.
Predictive modeling, which aims to predict the outcome of a process or the occurrence of an event, is common in many research areas. In medical research, predictive models (often called prognostic models) are used, for instance, to predict patient outcomes or the effects of treatments. A common practice is to use the best explanatory statistical model, i.e. the model whose independent variables have statistically significant coefficients, for prognostic purposes. This practice is mistaken: the best explanatory model is almost always different from the best prognostic model, for a number of reasons that I will describe in this talk. Furthermore, building a predictive model is fundamentally different from building an explanatory model, with differences arising at every step of the modeling process. Predictive models also have roles to play alongside explanatory models in theory building and theory testing: generating new theory, developing measures, comparing competing theories, improving the conceptual structure of existing models, assessing relevance, and assessing the predictability of empirical phenomena. I will illustrate the differences between the two modeling approaches with examples from the literature on the adoption of new health technologies and on diffusion over networks.
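One way to see why in-sample explanatory fit and predictive accuracy can diverge is a small simulation. The following is a minimal sketch and is not taken from the talk: the data-generating process, the sample sizes, and the helper names (`fit_ols`, `mse`, `simulate`, `take`) are all assumptions chosen for illustration. It fits two nested least-squares models and reports their error on the training data and on fresh data.

```python
import random


def fit_ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def mse(X, y, beta):
    """Mean squared error of the linear predictions X @ beta."""
    errors = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    return sum(e * e for e in errors) / len(y)


def simulate(n, rng):
    """Hypothetical data: intercept, one real predictor, five irrelevant ones."""
    X, y = [], []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        irrelevant = [rng.gauss(0, 1) for _ in range(5)]
        X.append([1.0, x1] + irrelevant)
        y.append(1.0 + 0.5 * x1 + rng.gauss(0, 1))  # assumed true process
    return X, y


def take(X, k):
    """Keep only the first k columns of the design matrix."""
    return [row[:k] for row in X]


rng = random.Random(0)
X_train, y_train = simulate(30, rng)    # small sample used for fitting
X_test, y_test = simulate(1000, rng)    # fresh data used only for evaluation

results = {}
for name, k in [("small", 2), ("large", 7)]:
    beta = fit_ols(take(X_train, k), y_train)
    results[name] = {
        "train_mse": mse(take(X_train, k), y_train, beta),
        "test_mse": mse(take(X_test, k), y_test, beta),
    }

for name, scores in results.items():
    print(name, scores)
```

By construction of least squares, the larger model can never fit the training data worse than the smaller model nested within it. Whether it predicts fresh data better is an empirical question, which is why predictive accuracy is assessed on data not used for fitting.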