Daniel E. Ho, Kosuke Imai,
Gary King, & Elizabeth Stuart talk with
ScienceWatch.com and answer a few questions about
this month's Fast Breaking Paper in the field of Social
Sciences, general.
Article Title: Matching as nonparametric
preprocessing for reducing model dependence in parametric
causal inference
Authors: Ho, DE; Imai, K; King, G; Stuart, EA
Journal: POLIT ANAL
Volume: 15
Issue: 3
Page: 199-236
Year: SUM 2007
* Harvard Univ, Dept Govt, 1737 Cambridge St, Cambridge, MA
02138 USA.
(addresses have been truncated)
Why do you think your paper is highly
cited?
Researchers in many fields seeking to make causal inferences from
observational data are worried about the common practice of relying heavily
on statistical models with difficult-to-justify assumptions. Although
published works rarely include causal estimates from more than a few model
specifications, authors usually choose the estimates to present from
numerous trial runs readers never see.
Given the often large variation in estimates across choices of control
variables, functional forms, and other modeling assumptions, how can
researchers ensure that the few estimates presented are accurate or
representative? How do readers know that publications are not merely
demonstrations that it is possible to find a specification that fits the
author's favorite hypothesis? Our article offers a way around this problem
of model dependence.
Does it describe a new discovery, methodology, or
synthesis of knowledge?
Easy-to-use matching methods have been recognized as offering one way to
avoid some statistical modeling assumptions, but crucial results in this
fast-growing literature have been grossly misinterpreted. We show how to
avoid these misinterpretations through a new unified approach to causal
inference.
Our approach makes it possible to preprocess data with matching methods
(such as with the open source software we offer) and then to apply the
best model-based techniques researchers might have used anyway. The
combination produces more accurate and considerably less model-dependent
causal inferences. It is also easy to apply, since it adds little
to scholars' existing workflow.
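As a rough illustration of the two-step idea described above (match first, then fit the usual parametric model), here is a minimal Python sketch. It is not the authors' software: the simulated data, the one-covariate nearest-neighbor matching with replacement, and every function name below are assumptions chosen only to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated observational data (entirely hypothetical).
n = 500
x = rng.normal(size=n)                                 # one pre-treatment covariate
treated = (x + rng.normal(size=n) > 0).astype(float)   # treatment uptake depends on x
y = 2.0 * treated + np.exp(x) + rng.normal(scale=0.5, size=n)  # outcome, nonlinear in x


def nearest_neighbor_match(x, treated):
    """Step 1 (preprocessing): pair each treated unit with the control unit
    whose covariate value is closest, allowing controls to be reused.
    Returns row indices of the matched sample (treated rows, then controls)."""
    treated_idx = np.flatnonzero(treated == 1)
    control_idx = np.flatnonzero(treated == 0)
    # Absolute covariate distance between every treated/control pair.
    dist = np.abs(x[treated_idx][:, None] - x[control_idx][None, :])
    matched_controls = control_idx[dist.argmin(axis=1)]
    return np.concatenate([treated_idx, matched_controls])


def ols_effect(y, treated, x):
    """Step 2 (parametric model): regress y on an intercept, the treatment
    indicator, and x; return the coefficient on the treatment indicator."""
    design = np.column_stack([np.ones_like(x), treated, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def mean_gap(x, treated):
    """Absolute difference in covariate means between the two groups."""
    return abs(x[treated == 1].mean() - x[treated == 0].mean())


keep = nearest_neighbor_match(x, treated)

gap_before = mean_gap(x, treated)
gap_after = mean_gap(x[keep], treated[keep])

effect_full = ols_effect(y, treated, x)
effect_matched = ols_effect(y[keep], treated[keep], x[keep])

print(f"covariate gap before matching: {gap_before:.3f}")
print(f"covariate gap after matching:  {gap_after:.3f}")
print(f"linear-model estimate, full data:    {effect_full:.3f}")
print(f"linear-model estimate, matched data: {effect_matched:.3f}")
```

The same linear model is fit in both cases; only the data passed to it change. Because controls can be matched more than once, a real analysis would also need to account for the repeated rows when computing uncertainty, which this sketch leaves out.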
Would you summarize the significance of your paper in
layman's terms?
Randomized experimentation is often the best way to learn how the world
works, and of course is widely used in many fields. Unfortunately,
randomized experimentation is difficult, infeasible, unethical, or illegal
in vast areas of academic terrain—especially in the social sciences
where individual human subjects or large-scale public policies are
involved. When randomization is not feasible, scholars turn to
observational data, such as government statistics, survey research,
continuous time monitoring, health records, and numerous other sources.
The lack of randomization makes learning how the world works in these
situations more difficult. Our paper helps researchers avoid the
assumptions used in drawing statistical inferences from observational data.
It offers a framework, a methodology, and software that implements the
results.
How did you become involved in this research, and were
there any problems along the way?
We began this research at the Institute for Quantitative Social Science at
Harvard University, where Daniel Ho (now Assistant Professor of Law at
Stanford), Kosuke Imai (now Assistant Professor of Politics at Princeton),
and Elizabeth Stuart (now Assistant Professor of Public Health at Johns
Hopkins) were graduate students. With diverse applications in a variety of
fields, but a common interest in advancing statistical methods, we joined
forces to tackle problems that we had in our own research and that we saw
commonly among our colleagues.
Where do you see your research leading in the
future?
Our research has already led to a variety of related results. Imai, King,
and Stuart have extended our framework to show how to avoid
misinterpretations between experimentalists and observationalists about
causal inference in: "Misunderstandings among experimentalists and
observationalists about causal inference," Journal of the Royal
Statistical Society, Series A, 171, part 2: 481-502, 2008.
King, along with Stefano Iacus and Giuseppe Porro, has developed a method
of matching that is considerably easier to use and more powerful than
existing approaches in "Matching for causal inference without balance
checking," which is available, along with open source software,
online.
Along the way, Ho and King, along with Lee Epstein and Jeff Segal,
published an application of our matching framework in "The Supreme Court
during crisis: how war affects only nonwar cases," New York University
Law Review 80 (1): 1-116, April, 2005. This article outlines the
causal effect of being at war on the degree to which the US Supreme Court
curtails civil rights and liberties.
Do you foresee any social or political implications for
your research?
Reducing model sensitivity invariably entails sticking closer to the data.
Yet by sticking closer to the data, certain important questions of social
science and policy may become very difficult to answer: e.g., what is the
causal effect of democracy? Does capital punishment deter crime? Did the
surge work? Our paper provides one way for researchers to easily verify how
much of a socially or politically important finding is driven by empirics
and how much by faith. Navigating that boundary remains the fundamental challenge for
empirical policy-relevant research.
Daniel E. Ho
Assistant Professor of Law
and Robert E. Paradise Faculty Fellow for Excellence in Teaching and
Research
Stanford University
Stanford, CA, USA
Kosuke Imai
Assistant Professor
Department of Politics
Princeton, NJ, USA
Gary King
David Florence Professor of Government
and Director, Institute for Quantitative Social Science
Harvard University
Cambridge, MA, USA
Elizabeth Stuart
Assistant Professor
Department of Mental Health
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
Baltimore, MD, USA
Keywords: causal inferences from observational data,
statistical models, model dependence, unified approach to causal
inference, model-based techniques, model-dependent causal inferences,
scholars' existing workflow, randomized experimentation, observational
data, government statistics, survey research, continuous time
monitoring, health records, lack of randomization, assumptions used in
drawing statistical inferences from observational data.