Summary
First, the type of the variables (continuous, binary, categorical, etc.) determines the appropriate distance measure. For continuous variables, Euclidean or Manhattan distance can be used. For binary variables, Hamming distance is suitable. For categorical variables, a simple mismatch count (the number of variables on which two observations fall in different categories) can be used. Second, the scales of the variables matter. Variables may need to be standardized before distances are calculated, so that variables measured on large scales do not dominate the result. Third, the relative importance of the variables in predicting treatment assignment should be considered. Propensity score matching implicitly weights variables by how strongly they predict treatment; similarly, a distance metric can weight variables differently depending on their predictive power. Fourth, the correlation between variables influences the choice of distance metric. Highly correlated variables contain redundant information, so a metric that accounts for the covariance among variables, such as Mahalanobis distance, may be preferred. Fifth, the researcher's knowledge and theories about the causal relationship under study guide the selection of variables and hence of the distance measure. Subject-matter expertise is required to make these judgment calls. Sixth, the performance of different distance measures in reducing imbalance between treated and control groups should be compared. The measure that minimizes differences in covariate means and distributions between the matched groups may be the most suitable. Seventh, the choice depends on how many good matches are available. With a large and diverse pool of controls, varying the distance thresholds or the variable weights may yield better-quality matches for more treated units.
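As a minimal sketch of the distance measures named above, the following uses only the Python standard library. The function names and the age/income example are illustrative assumptions, not taken from any of the cited papers, and the Mahalanobis helper is deliberately restricted to two covariates so the covariance matrix can be inverted by hand.

```python
import math

def standardize(rows):
    """Rescale each column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    """Number of positions at which two binary/categorical vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def mahalanobis_2d(a, b, rows):
    """Mahalanobis distance for exactly two covariates (illustrative helper).

    The sample covariance matrix S is estimated from `rows`, and
    d^2 = diff^T S^{-1} diff is computed with the closed-form 2x2 inverse.
    """
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

# Hypothetical data: (age in years, income in dollars).
data = [[25, 30000], [35, 50000], [45, 40000], [55, 80000]]
raw = euclidean(data[0], data[1])     # driven almost entirely by income
z = standardize(data)
scaled = euclidean(z[0], z[1])        # both variables now contribute
```

On the raw data the income difference swamps the age difference, which is the scaling problem described in the second point; after standardization the two variables are on comparable footing.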
In summary, choosing a distance metric for matching in causal inference depends on a combination of statistical considerations related to the data and variables, as well as theoretical and subject-matter expertise. Checking the performance of different metrics in reducing group imbalance is important for finding the optimal measure for the study at hand.
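One common way to check group imbalance is the standardized mean difference of each covariate between treated and control units. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and it uses the pooled standard deviation of the two groups as the denominator, although other conventions (for example, the treated-group standard deviation) are also in use.

```python
import math

def standardized_mean_difference(treated, control):
    """Difference in covariate means divided by the pooled standard deviation.

    Assumption: pooled SD = sqrt((var_treated + var_control) / 2),
    with sample variances (n - 1 in the denominator).
    """
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    var_t = sum((x - mean_t) ** 2 for x in treated) / (len(treated) - 1)
    var_c = sum((x - mean_c) ** 2 for x in control) / (len(control) - 1)
    pooled_sd = math.sqrt((var_t + var_c) / 2)
    return (mean_t - mean_c) / pooled_sd

# Hypothetical covariate values before matching.
smd = standardized_mean_difference([2, 3, 4], [1, 2, 3])
```

Computing this quantity for every covariate under each candidate distance measure gives a simple basis for comparing how well the measures balance the matched groups.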
Matching has four key steps: defining "closeness" (the distance measure), implementing the matching method, assessing the quality of the matched samples, and analyzing the outcome. When there are many control individuals, ratio matching allows multiple good matches per treated individual.
Published By:
EA Stuart - Statistical science: a review journal of the Institute of …, 2010 - ncbi.nlm.nih.gov
Cited By:
0
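The ratio matching idea in the entry above can be sketched as a greedy k:1 nearest-neighbor match on a scalar score, without replacement. This is an illustrative simplification, not the procedure from the cited review: the function name, the optional caliper argument, and the example scores are all assumptions made for the sketch.

```python
def ratio_match(treated, controls, ratio=2, caliper=None):
    """Greedy k:1 nearest-neighbor matching on a scalar score.

    treated, controls: dicts mapping unit id -> score (e.g. a propensity score).
    Each control is used at most once; a control is accepted only if it lies
    within `caliper` of the treated unit's score (when a caliper is given).
    """
    available = dict(controls)
    matches = {}
    for t_id, t_score in treated.items():
        # Rank remaining controls by closeness to this treated unit.
        ranked = sorted(available, key=lambda c: abs(available[c] - t_score))
        chosen = [c for c in ranked[:ratio]
                  if caliper is None or abs(available[c] - t_score) <= caliper]
        for c in chosen:
            del available[c]   # matching without replacement
        matches[t_id] = chosen
    return matches

# Hypothetical scores for two treated units and five controls.
treated = {"t1": 0.30, "t2": 0.70}
controls = {"c1": 0.28, "c2": 0.33, "c3": 0.65, "c4": 0.90, "c5": 0.72}
matches = ratio_match(treated, controls, ratio=2, caliper=0.1)
```

Greedy matching depends on the order in which treated units are processed; optimal matching avoids that dependence at a higher computational cost.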
Researchers studied the effect of the National Supported Work (NSW) job training program through an observational study, matching treated and control groups on propensity scores. The estimated treatment effect closely replicated an experimental benchmark, though the error was large when a small sample was used.
Published By:
RH Dehejia, S Wahba - Review of Economics and statistics, 2002 - direct.mit.edu
Cited By:
0
Genetic algorithms that optimize balance metrics based on standardized test statistics outperform methods using descriptive statistics. Optimizing p-values recovered the benchmark; descriptive statistics did not.
Published By:
JS Sekhon - Survey Research Center, University of California …, 2007 - jsekhon.com
Cited By:
0
Published By:
JS Sekhon - Annual Review of Political Science, 2009 - annualreviews.org
Cited By:
0
This article provides a practical guide on multivariate matching and its applications in observational studies. It discusses various matching tools and techniques, as well as different matching structures and software packages for implementation.
Published By:
PR Rosenbaum - Annual Review of Statistics and Its Application, 2020 - annualreviews.org
Cited By:
0
The article proposes Bayesian Additive Regression Trees (BART) to estimate causal effects. Unlike regression, it detected nonlinear effects in a study of the effect of child care participation on age-3 IQ.
Published By:
JL Hill - Journal of Computational and Graphical Statistics, 2011 - Taylor & Francis
Cited By:
0
Sample selection error (ΔS) vanishes when the treatment effect TEi is constant over units i. Random sampling avoids sample selection bias on average, but sample selection error ΔS can still be nonzero in any single sample.
Published By:
K Imai, G King, EA Stuart - … of the Royal Statistical Society Series …, 2008 - academic.oup.com
Cited By:
0
Coarsened Exact Matching (CEM) is a method for improving causal inferences by recoding variables so that substantively indistinguishable values are grouped together, exact matching on the coarsened values, and pruning unmatched units. CEM possesses useful properties and yields treatment effect error bounds.
Published By:
SM Iacus, G King, G Porro - Political analysis, 2012 - cambridge.org
Cited By:
0
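The coarsen-then-prune logic of CEM described in the entry above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the cutpoints, covariate names, and helper functions are hypothetical, and the stratum weights that CEM normally produces for the outcome analysis are omitted.

```python
from collections import defaultdict

def coarsen(value, cutpoints):
    """Return the index of the bin that `value` falls into."""
    return sum(value >= c for c in cutpoints)

def cem(units, cutpoints):
    """Group units by coarsened covariates and prune unmatched strata.

    units: list of dicts with 'id', 'treated' (bool), and covariate values.
    cutpoints: dict mapping covariate name -> sorted list of cutpoints.
    Returns {stratum key: [unit ids]} for strata that contain at least one
    treated and at least one control unit.
    """
    strata = defaultdict(list)
    for u in units:
        key = tuple(coarsen(u[name], cuts) for name, cuts in cutpoints.items())
        strata[key].append(u)
    kept = {}
    for key, members in strata.items():
        if {m["treated"] for m in members} == {True, False}:
            kept[key] = [m["id"] for m in members]
    return kept

# Hypothetical units with age (years) and education (years of schooling).
units = [
    {"id": "a", "treated": True,  "age": 23, "educ": 12},
    {"id": "b", "treated": False, "age": 27, "educ": 11},
    {"id": "c", "treated": True,  "age": 41, "educ": 16},
    {"id": "d", "treated": False, "age": 58, "educ": 16},
    {"id": "e", "treated": False, "age": 44, "educ": 17},
]
matched = cem(units, {"age": [30, 50], "educ": [13]})
```

In this example unit "d" is pruned because no treated unit shares its coarsened age and education bins. Coarser cutpoints retain more units at the cost of less exact matches.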
MatchIt is an R package that uses matching methods to improve causal inferences in observational studies; it implements exact, nearest neighbor, caliper, optimal, full, and subclassification matching and provides summaries of covariate balance to choose the best matching solution.
Published By:
EA Stuart, G King, K Imai, D Ho - Journal of statistical software, 2011 - dash.harvard.edu
Cited By:
0
The study evaluates job training programs. Genetic matching, an algorithm that searches for covariate weights that improve balance, recovered the experimental results, showing that matching can achieve balance and estimate causal effects.
Published By:
A Diamond, JS Sekhon - Review of Economics and Statistics, 2013 - direct.mit.edu
Cited By:
0