Archive ScienceWatch

Daniel E. Ho, Kosuke Imai, Gary King, & Elizabeth Stuart talk with ScienceWatch and answer a few questions about this month's Fast Breaking Paper in the field of Social Sciences, general.
Article Title: Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference
Authors: Ho, DE; Imai, K; King, G; Stuart, EA
Volume: 15
Issue: 3
Page: 199-236
Year: Summer 2007
* Harvard Univ, Dept Govt, 1737 Cambridge St, Cambridge, MA 02138 USA.
(addresses have been truncated)

Why do you think your paper is highly cited?

Researchers in many fields seeking to make causal inferences from observational data are worried about the common practice of relying heavily on statistical models with difficult-to-justify assumptions. Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the estimates to present from numerous trial runs readers never see.

Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author's favorite hypothesis? Our article offers a way around this problem of model dependence.

Does it describe a new discovery, methodology, or synthesis of knowledge?

Easy-to-use matching methods have been recognized as offering one way to avoid some statistical modeling assumptions, but crucial results in this fast-growing literature have been grossly misinterpreted. We show how to avoid these misinterpretations through a new unified approach to causal inference.

Our approach makes it possible to preprocess data with matching methods (such as with the open source software we offer) and then to apply the best model-based techniques researchers might have used anyway. The combination produces more accurate and considerably less model-dependent causal inferences. It is also easy to apply, since it adds little to scholars' existing workflow.
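The two-step idea described above (match first, then fit the usual parametric model) can be sketched as follows. This is a minimal illustration on simulated data, not the authors' software: the function names `nearest_neighbor_match` and `treatment_coef`, the greedy 1:1 nearest-neighbor matching on a single covariate, and the data-generating process are all assumptions made for the example. It compares how far a treatment-effect estimate moves between two model specifications, before and after matching.

```python
import numpy as np


def nearest_neighbor_match(x, treated):
    """Greedy 1:1 nearest-neighbor matching on one covariate, without replacement.

    Returns the indices of the matched sample: each treated unit that could be
    matched, followed by the control unit chosen for it.
    """
    treated_idx = np.flatnonzero(treated == 1)
    control_idx = np.flatnonzero(treated == 0)
    available = np.ones(control_idx.size, dtype=bool)
    keep = []
    for i in treated_idx:
        if not available.any():
            break  # no controls left to match
        dist = np.abs(x[control_idx] - x[i])
        dist[~available] = np.inf  # skip controls that are already used
        j = int(np.argmin(dist))
        available[j] = False
        keep.extend([i, control_idx[j]])
    return np.array(keep, dtype=int)


def treatment_coef(y, treated, covariates):
    """Least-squares coefficient on the treatment indicator."""
    design = np.column_stack([np.ones(len(y)), treated] + list(covariates))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[1])


# Simulated observational data (hypothetical): treatment is more likely at
# high x, and the outcome depends on x nonlinearly.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
treated = (rng.random(n) < 1.0 / (1.0 + np.exp(-(x - 2.0)))).astype(int)
true_effect = 2.0
y = true_effect * treated + x**2 + rng.normal(size=n)


def estimates(idx):
    """Treatment effect under two specifications: linear and quadratic in x."""
    linear = treatment_coef(y[idx], treated[idx], [x[idx]])
    quadratic = treatment_coef(y[idx], treated[idx], [x[idx], x[idx] ** 2])
    return linear, quadratic


all_idx = np.arange(n)
matched_idx = nearest_neighbor_match(x, treated)

raw_lin, raw_quad = estimates(all_idx)
m_lin, m_quad = estimates(matched_idx)

# Model dependence here means how much the estimate changes across specifications.
raw_spread = abs(raw_lin - raw_quad)
matched_spread = abs(m_lin - m_quad)

print(f"raw data:     linear={raw_lin:.2f}  quadratic={raw_quad:.2f}  spread={raw_spread:.2f}")
print(f"matched data: linear={m_lin:.2f}  quadratic={m_quad:.2f}  spread={matched_spread:.2f}")
```

In this sketch the matched sample discards controls that have no comparable treated unit, so the choice between the two specifications matters less for the estimate. Real applications would typically match on several covariates and check balance on each.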

Would you summarize the significance of your paper in layman's terms?

Randomized experimentation is often the best way to learn how the world works, and of course is widely used in many fields. Unfortunately, randomized experimentation is difficult, infeasible, unethical, or illegal in vast areas of academic terrain—especially in the social sciences where individual human subjects or large-scale public policies are involved. When randomization is not feasible, scholars turn to observational data, such as government statistics, survey research, continuous time monitoring, health records, and numerous other sources.

The lack of randomization makes learning how the world works in these situations more difficult. Our paper helps researchers avoid the assumptions used in drawing statistical inferences from observational data. It offers a framework, a methodology, and software that implements the results.

How did you become involved in this research, and were there any problems along the way?

We began this research at the Institute for Quantitative Social Science at Harvard University, where Daniel Ho (now Assistant Professor of Law at Stanford), Kosuke Imai (now Assistant Professor of Politics at Princeton), and Elizabeth Stuart (now Assistant Professor of Public Health at Johns Hopkins) were graduate students. With diverse applications in a variety of fields, but a common interest in advancing statistical methods, we joined to tackle problems we had encountered in our own research and saw commonly among our colleagues.

Where do you see your research leading in the future?

Our research has already led to a variety of related results. Imai, King, and Stuart have extended our framework to show how to avoid misinterpretations between experimentalists and observationalists about causal inference in "Misunderstandings among experimentalists and observationalists about causal inference," Journal of the Royal Statistical Society, Series A, 171, part 2: 481-502, 2008.

King, along with Stefano Iacus and Giuseppe Porro, has developed a method of matching that is considerably easier to use and more powerful than existing approaches in "Matching for causal inference without balance checking"; the method is available as open source software online.

Along the way, Ho and King, along with Lee Epstein and Jeff Segal, published an application of our matching framework in "The Supreme Court during crisis: how war affects only nonwar cases," New York University Law Review 80 (1): 1-116, April, 2005. This article outlines the causal effect of being at war on the degree to which the US Supreme Court curtails civil rights and liberties.

Do you foresee any social or political implications for your research?

Reducing model sensitivity invariably entails sticking closer to the data. Yet by sticking closer to the data, certain important questions of social science and policy may become very difficult to answer: e.g., what is the causal effect of democracy? Does capital punishment deter crime? Did the surge work? Our paper provides one way for researchers to easily verify how much of a socially or politically important finding is driven by empirics or faith. Navigating that boundary remains the fundamental challenge for empirical policy-relevant research.

Daniel E. Ho
Assistant Professor of Law
and Robert E. Paradise Faculty Fellow for Excellence in Teaching and Research
Stanford University
Stanford, CA, USA

Kosuke Imai
Assistant Professor
Department of Politics
Princeton University
Princeton, NJ, USA

Gary King
David Florence Professor of Government
and Director, Institute for Quantitative Social Science
Harvard University
Cambridge, MA, USA


Elizabeth Stuart
Assistant Professor
Department of Mental Health
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
Baltimore, MD, USA


Keywords: causal inferences from observational data, statistical models, model dependence, unified approach to causal inference, model-based techniques, model-dependent causal inferences, scholars' existing workflow, randomized experimentation, observational data, government statistics, survey research, continuous time monitoring, health records, lack of randomization, assumptions used in drawing statistical inferences from observational data.

2008 : October 2008 - Fast Breaking Papers : Daniel E. Ho, Kosuke Imai, Gary King, & Elizabeth Stuart