Propensity Score

Peter C Austin - One of the best experts on this subject based on the ideXlab platform.

  • Propensity Score matching with competing risks in survival analysis
    Statistics in Medicine, 2019
    Co-Authors: Peter C Austin, Jason P Fine
    Abstract:

    Propensity-Score matching is a popular analytic method to remove the effects of confounding due to measured baseline covariates when using observational data to estimate the effects of treatment. Time-to-event outcomes are common in medical research. Competing risks are outcomes whose occurrence precludes the occurrence of the primary time-to-event outcome of interest. All non-fatal outcomes and all cause-specific mortality outcomes are potentially subject to competing risks. There is a paucity of guidance on the conduct of Propensity-Score matching in the presence of competing risks. We describe how both relative and absolute measures of treatment effect can be obtained when using Propensity-Score matching with competing risks data. Estimates of the relative effect of treatment can be obtained by using cause-specific hazard models in the matched sample. Estimates of absolute treatment effects can be obtained by comparing cumulative incidence functions (CIFs) between matched treated and matched control subjects. We conducted a series of Monte Carlo simulations to compare the empirical type I error rate of different statistical methods for testing the equality of CIFs estimated in the matched sample. We also examined the performance of different methods to estimate the marginal subdistribution hazard ratio. We recommend that a marginal subdistribution hazard model that accounts for the within-pair clustering of outcomes be used to test the equality of CIFs and to estimate subdistribution hazard ratios. We illustrate the described methods by using data on patients discharged from hospital with acute myocardial infarction to estimate the effect of discharge prescribing of statins on cardiovascular death.
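    To make the absolute-effect measure above concrete, here is a minimal sketch (not the authors' code) of a nonparametric cumulative incidence function of the Aalen-Johansen type, which accounts for competing risks rather than censoring them. In a matched analysis it would be computed separately for matched treated and matched control subjects; the data below are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest, eval_time):
    """Nonparametric cumulative incidence at eval_time with competing risks.

    times  : event or censoring times
    causes : 0 = censored; 1, 2, ... = cause of the observed event
    """
    data = sorted(zip(times, causes))
    n = len(data)
    at_risk = n
    surv = 1.0   # overall event-free survival just before the current time
    cif = 0.0
    i = 0
    while i < n and data[i][0] <= eval_time:
        t = data[i][0]
        d_interest = d_any = censored = 0
        j = i
        while j < n and data[j][0] == t:       # handle tied times
            if data[j][1] == cause_of_interest:
                d_interest += 1
            if data[j][1] != 0:
                d_any += 1
            else:
                censored += 1
            j += 1
        cif += surv * d_interest / at_risk     # increment uses all-cause survival
        surv *= 1 - d_any / at_risk
        at_risk -= d_any + censored
        i = j
    return cif

# Four hypothetical subjects: events of cause 1 at t=1 and t=4, a competing
# event (cause 2) at t=2, and a censoring at t=3.
print(cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], 1, 4))  # → 0.75
```

    A naive one-minus-Kaplan-Meier estimate that treats the competing event as censoring would overstate this incidence, which is why the CIF is the appropriate absolute measure here.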

  • Propensity Score matching and complex surveys
    Statistical Methods in Medical Research, 2018
    Co-Authors: Peter C Austin, Nathaniel Jembere, Maria Chiu
    Abstract:

    Researchers are increasingly using complex population-based sample surveys to estimate the effects of treatments, exposures and interventions. In such analyses, statistical methods are essential to minimize the effect of confounding due to measured covariates, as treated subjects frequently differ from control subjects. Methods based on the Propensity Score are increasingly popular. Minimal research has been conducted on how to implement Propensity Score matching when using data from complex sample surveys. We used Monte Carlo simulations to examine two critical issues when implementing Propensity Score matching with such data. First, we examined how the Propensity Score model should be formulated. We considered three different formulations depending on whether or not a weighted regression model was used to estimate the Propensity Score and whether or not the survey weights were included in the Propensity Score model as an additional covariate. Second, we examined whether matched control subjects should retain their natural survey weight or whether they should inherit the survey weight of the treated subject to which they were matched. Our results were inconclusive with respect to which method of estimating the Propensity Score model was preferable. In general, greater balance in measured baseline covariates and decreased bias were observed when the natural weights were retained than when inherited weights were used. We also demonstrated that bootstrap-based methods performed well for estimating the variance of treatment effects when outcomes are binary. We illustrated the application of our methods by using the Canadian Community Health Survey to estimate the effect of educational attainment on lifetime prevalence of mood or anxiety disorders.
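    The weighting question examined above can be illustrated with a toy matched sample (all weights and covariate values below are hypothetical): each matched control either keeps its own survey weight or inherits the weight of the treated subject it was matched to, and weighted summaries differ accordingly.

```python
# Hypothetical matched pairs: (treated survey weight, control survey weight,
# control covariate value).
pairs = [(2.0, 1.0, 0.3), (0.5, 4.0, 0.9), (1.5, 1.5, 0.5)]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

x_control = [p[2] for p in pairs]
natural   = weighted_mean(x_control, [p[1] for p in pairs])  # controls keep own weight
inherited = weighted_mean(x_control, [p[0] for p in pairs])  # controls inherit treated weight
print(round(natural, 3), round(inherited, 3))  # → 0.715 0.45
```

    The two schemes summarize the same matched controls quite differently, which is why the choice matters for balance and bias.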

  • Comparing the performance of Propensity Score methods in healthcare database studies with rare outcomes
    Statistics in Medicine, 2017
    Co-Authors: Jessica M Franklin, Elizabeth A Stuart, Peter C Austin, Wesley Eddings, Sebastian Schneeweiss
    Abstract:

    Nonrandomized studies of treatments from electronic healthcare databases are critical for producing the evidence necessary to making informed treatment decisions, but often rely on comparing rates of events observed in a small number of patients. In addition, studies constructed from electronic healthcare databases, for example, administrative claims data, often adjust for many, possibly hundreds, of potential confounders. Despite the importance of maximizing efficiency when there are many confounders and few observed outcome events, there has been relatively little research on the relative performance of different Propensity Score methods in this context. In this paper, we compare a wide variety of Propensity-based estimators of the marginal relative risk. In contrast to prior research that has focused on specific statistical methods in isolation of other analytic choices, we instead consider a method to be defined by the complete multistep process from Propensity Score modeling to final treatment effect estimation. Propensity Score model estimation methods considered include ordinary logistic regression, Bayesian logistic regression, lasso, and boosted regression trees. Methods for utilizing the Propensity Score include pair matching, full matching, decile strata, fine strata, regression adjustment using one or two nonlinear splines, inverse Propensity weighting, and matching weights. We evaluate methods via a ‘plasmode’ simulation study, which creates simulated datasets on the basis of a real cohort study of two treatments constructed from administrative claims data. Our results suggest that regression adjustment and matching weights, regardless of the Propensity Score model estimation method, provide lower bias and mean squared error in the context of rare binary outcomes. Copyright © 2017 John Wiley & Sons, Ltd.
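    Of the weighting schemes compared above, the matching weights are perhaps the least familiar. A commonly used definition (a sketch, not necessarily the exact variant in the paper) downweights subjects whose Propensity Score makes them over-represented in their own treatment group:

```python
def matching_weight(ps, treated):
    """Matching weight: min(e, 1-e) / e for treated subjects,
    min(e, 1-e) / (1-e) for control subjects, where e is the Propensity Score."""
    return min(ps, 1 - ps) / (ps if treated else 1 - ps)

# A treated subject with a high Propensity Score is downweighted, while a
# control with the same Score (rare among controls) keeps full weight.
print(matching_weight(0.8, True), matching_weight(0.8, False))  # → 0.25 1.0
```

    These weights mimic pair matching while retaining the whole sample, which helps explain their efficiency when outcome events are rare.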

  • A review of Propensity Score methods and their use in cardiovascular research
    Canadian Journal of Cardiology, 2016
    Co-Authors: Peter C Austin, Jack V Tu, Dennis T Ko, David C Mazer, Alex Kiss, Stephen E Fremes
    Abstract:

    Observational studies using Propensity-Score methods have been increasing in the cardiovascular literature because randomized controlled trials are not always feasible or ethical. However, Propensity-Score methods can be confusing, and the general audience may not fully understand the importance of this technique. The objectives of this review are to describe (1) the fundamentals of Propensity Score methods, (2) the techniques to assess for Propensity-Score model adequacy, (3) the 4 major methods for using the Propensity Score (matching, stratification, covariate adjustment, and inverse probability of treatment weighting [IPTW]) using examples from previously published cardiovascular studies, and (4) the strengths and weaknesses of these 4 techniques. Our review suggests that matching or IPTW using the Propensity Score have shown to be most effective in reducing bias of the treatment effect.

  • The use of bootstrapping when using Propensity Score matching without replacement: a simulation study
    Statistics in Medicine, 2014
    Co-Authors: Peter C Austin, Dylan S Small
    Abstract:

    Propensity-Score matching is frequently used to estimate the effect of treatments, exposures, and interventions when using observational data. An important issue when using Propensity-Score matching is how to estimate the standard error of the estimated treatment effect. Accurate variance estimation permits construction of confidence intervals that have the advertised coverage rates and tests of statistical significance that have the correct type I error rates. There is disagreement in the literature as to how standard errors should be estimated. The bootstrap is a commonly used resampling method that permits estimation of the sampling variability of estimated parameters. Bootstrap methods are rarely used in conjunction with Propensity-Score matching. We propose two different bootstrap methods for use when using Propensity-Score matching without replacement and examine their performance with a series of Monte Carlo simulations. The first method involves drawing bootstrap samples from the matched pairs in the Propensity-Score-matched sample. The second method involves drawing bootstrap samples from the original sample, estimating the Propensity Score separately in each bootstrap sample, and creating a matched sample within each of these bootstrap samples. The former approach was found to result in estimates of the standard error that were closer to the empirical standard deviation of the sampling distribution of estimated effects.
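    The first (and better-performing) method described above, resampling the matched pairs themselves, can be sketched as follows; the pair differences are hypothetical.

```python
import random
import statistics

def pair_bootstrap_se(pair_diffs, n_boot=2000, seed=42):
    """Standard error of the mean within-pair difference, estimated by
    drawing bootstrap samples of whole matched pairs."""
    rng = random.Random(seed)
    n = len(pair_diffs)
    boot_means = []
    for _ in range(n_boot):
        resample = [pair_diffs[rng.randrange(n)] for _ in range(n)]
        boot_means.append(sum(resample) / n)
    return statistics.stdev(boot_means)

# Hypothetical treated-minus-control outcome differences for six matched pairs.
se = pair_bootstrap_se([1.2, -0.4, 0.8, 0.1, -0.9, 0.5])
```

    The second method in the abstract follows the same skeleton but resamples the original subjects and repeats Propensity Score estimation and matching inside each bootstrap sample, which is considerably more expensive.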

Guido W Imbens

  • Matching on the estimated Propensity Score
    Econometrica, 2016
    Co-Authors: Alberto Abadie, Guido W Imbens
    Abstract:

    Propensity Score matching estimators (Rosenbaum and Rubin (1983)) are widely used in evaluation research to estimate average treatment effects. In this article, we derive the large sample distribution of Propensity Score matching estimators. Our derivations take into account that the Propensity Score is itself estimated in a first step, prior to matching. We prove that first step estimation of the Propensity Score affects the large sample distribution of Propensity Score matching estimators, and derive adjustments to the large sample variances of Propensity Score matching estimators of the average treatment effect (ATE) and the average treatment effect on the treated (ATET). The adjustment for the ATE estimator is negative (or zero in some special cases), implying that matching on the estimated Propensity Score is more efficient than matching on the true Propensity Score in large samples. However, for the ATET estimator, the sign of the adjustment term depends on the data generating process, and ignoring the estimation error in the Propensity Score may lead to confidence intervals that are either too large or too small.

  • Matching on the estimated Propensity Score
    Research Papers in Economics, 2009
    Co-Authors: Alberto Abadie, Guido W Imbens
    Abstract:

    Propensity Score matching estimators (Rosenbaum and Rubin, 1983) are widely used in evaluation research to estimate average treatment effects. In this article, we derive the large sample distribution of Propensity Score matching estimators. Our derivations take into account that the Propensity Score is itself estimated in a first step, prior to matching. We prove that first step estimation of the Propensity Score affects the large sample distribution of Propensity Score matching estimators. Moreover, we derive an adjustment to the large sample variance of Propensity Score matching estimators that corrects for first step estimation of the Propensity Score. In spite of the great popularity of Propensity Score matching estimators, these results were previously unavailable in the literature.

  • The Propensity Score with continuous treatments
    2005
    Co-Authors: Keisuke Hirano, Guido W Imbens
    Abstract:

    …of the binary treatment Propensity Score, which we label the generalized Propensity Score (GPS). We demonstrate that the GPS has many of the attractive properties of the binary treatment Propensity Score. Just as in the binary treatment case, adjusting for this scalar function of the covariates removes all biases associated with differences in the covariates. The GPS also has certain balancing properties that can be used to assess the adequacy of particular specifications of the Score. We discuss estimation and inference in a parametric…
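    One common parametric choice for estimating a GPS (a sketch, not necessarily the authors' exact specification) models the treatment as normal given covariates, so the GPS is simply the conditional normal density evaluated at the received dose:

```python
import math

def gps_normal(t, x, beta0, beta1, sigma):
    """GPS r(t, x) under the working model T | X = x ~ N(beta0 + beta1*x, sigma^2).
    In practice beta0, beta1, and sigma would be estimated, e.g. by least squares."""
    mu = beta0 + beta1 * x
    return math.exp(-((t - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

print(round(gps_normal(0.0, 0.0, 0.0, 1.0, 1.0), 4))  # → 0.3989
```

    As with the binary Propensity Score, the scalar value gps_normal(t, x, ...) is what one adjusts for when estimating dose-response relationships.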

  • Efficient estimation of average treatment effects using the estimated Propensity Score
    Econometrica, 2003
    Co-Authors: Keisuke Hirano, Guido W Imbens, Geert Ridder
    Abstract:

    We are interested in estimating the average effect of a binary treatment on a scalar outcome. If assignment to the treatment is exogenous or unconfounded, that is, independent of the potential outcomes given covariates, biases associated with simple treatment-control average comparisons can be removed by adjusting for differences in the covariates. Rosenbaum and Rubin (1983) show that adjusting solely for differences between treated and control units in the Propensity Score removes all biases associated with differences in covariates. Although adjusting for differences in the Propensity Score removes all the bias, this can come at the expense of efficiency, as shown by Hahn (1998), Heckman, Ichimura, and Todd (1998), and Robins, Mark, and Newey (1992). We show that weighting by the inverse of a nonparametric estimate of the Propensity Score, rather than the true Propensity Score, leads to an efficient estimate of the average treatment effect. We provide intuition for this result by showing that this estimator can be interpreted as an empirical likelihood estimator that efficiently incorporates the information about the Propensity Score.
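    The weighting estimator analyzed above has, in its simplest form, the familiar inverse-probability (Horvitz-Thompson-type) expression; a sketch with made-up data:

```python
def ipw_ate(y, t, e):
    """Inverse-probability-weighted estimate of the average treatment effect:
    mean of T*Y/e - (1-T)*Y/(1-e). In the paper's result, e is a nonparametric
    estimate of the Propensity Score rather than the true value."""
    n = len(y)
    return sum(ti * yi / ei - (1 - ti) * yi / (1 - ei)
               for yi, ti, ei in zip(y, t, e)) / n

# With a constant Propensity Score of 0.5 this reduces to the difference in means.
print(ipw_ate([2, 1, 1, 0], [1, 1, 0, 0], [0.5] * 4))  # → 1.0
```

    The paper's efficiency result concerns exactly this form, with the true e replaced by a flexible estimate.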

  • The role of the Propensity Score in estimating dose-response functions
    Biometrika, 2000
    Co-Authors: Guido W Imbens
    Abstract:

    Estimation of average treatment effects in observational studies often requires adjustment for differences in pre-treatment variables. If the number of pre-treatment variables is large and their distribution varies substantially with treatment status, standard adjustment methods such as covariance adjustment are often inadequate. Rosenbaum & Rubin (1983, 1984) propose an alternative method for adjusting for pre-treatment variables based on the Propensity Score, the conditional probability of receiving the treatment given pre-treatment variables. They demonstrate that adjusting solely for the Propensity Score removes all bias associated with differences in the pre-treatment variables. The Rosenbaum-Rubin proposals deal exclusively with binary-valued treatments. In many cases of interest, however, treatments take on more than two values. Here an extension of the Propensity Score methodology is proposed that allows for estimation of average causal effects with multi-valued treatments. The key insight is that for estimation of average causal effects it is not necessary to divide the population into subpopulations where causal comparisons are valid, as the Propensity Score does; it is sufficient to divide the population into subpopulations where average potential outcomes can be estimated.

Kosuke Imai

  • Optimal covariate balancing conditions in Propensity Score estimation
    Journal of Business & Economic Statistics, 2021
    Co-Authors: Jianqing Fan, Kosuke Imai, Inbeom Lee, Han Liu, Yang Ning, Xiaolin Yang
    Abstract:

    Inverse probability of treatment weighting (IPTW) is a popular method for estimating the average treatment effect (ATE). However, empirical studies show that IPTW estimators can be sensitive to misspecification of the Propensity Score model. To address this problem, researchers have proposed to estimate the Propensity Score by directly optimizing the balance of pre-treatment covariates. While these methods appear to perform well empirically, little is known about how the choice of balancing conditions affects their theoretical properties. To fill this gap, we first characterize the asymptotic bias and efficiency of the IPTW estimator based on the Covariate Balancing Propensity Score (CBPS) methodology under local model misspecification. Based on this analysis, we show how to optimally choose the covariate balancing functions and propose an optimal CBPS-based IPTW estimator. This estimator is doubly robust; it is consistent for the ATE if either the Propensity Score model or the outcome model is correct. In addition, the proposed estimator is locally semiparametric efficient when both models are correctly specified. To further relax the parametric assumptions, we extend our method by using a sieve estimation approach. We show that the resulting estimator is globally efficient under a set of much weaker assumptions and has a smaller asymptotic bias than the existing estimators. Finally, we evaluate the finite sample performance of the proposed estimators via simulation and empirical studies. An open-source software package is available for implementing the proposed methods.

  • Robust estimation of causal effects via a high-dimensional covariate balancing Propensity Score
    Biometrika, 2020
    Co-Authors: Yang Ning, Peng Sida, Kosuke Imai
    Abstract:

    We propose a robust method to estimate the average treatment effects in observational studies when the number of potential confounders is possibly much greater than the sample size. Our method consists of three steps. We first use a class of penalized M-estimators for the Propensity Score and outcome models. We then calibrate the initial estimate of the Propensity Score by balancing a carefully selected subset of covariates that are predictive of the outcome. Finally, the estimated Propensity Score is used to construct the inverse probability weighting estimator. We prove that the proposed estimator, which we call the high-dimensional covariate balancing Propensity Score, has the sample boundedness property, is root-n consistent, asymptotically normal, and semiparametrically efficient when the Propensity Score model is correctly specified and the outcome model is linear in covariates. More importantly, we show that our estimator remains root-n consistent and asymptotically normal so long as either the Propensity Score model or the outcome model is correctly specified. We provide valid confidence intervals in both cases and further extend these results to the case where the outcome model is a generalized linear model. In simulation studies, we find that the proposed methodology often estimates the average treatment effect more accurately than existing methods. We also present an empirical application, in which we estimate the average causal effect of college attendance on adulthood political participation. An open-source software package is available for implementing the proposed methodology.

  • Covariate balancing Propensity Score for a continuous treatment: application to the efficacy of political advertisements
    The Annals of Applied Statistics, 2018
    Co-Authors: Christian Fong, Chad Hazlett, Kosuke Imai
    Abstract:

    Propensity Score matching and weighting are popular methods when estimating causal effects in observational studies. Beyond the assumption of unconfoundedness, however, these methods also require the model for the Propensity Score to be correctly specified. The recently proposed covariate balancing Propensity Score (CBPS) methodology increases the robustness to model misspecification by directly optimizing sample covariate balance between the treatment and control groups. In this paper, we extend the CBPS to a continuous treatment. We propose the covariate balancing generalized Propensity Score (CBGPS) methodology, which minimizes the association between covariates and the treatment. We develop both parametric and nonparametric approaches and show their superior performance over the standard maximum likelihood estimation in a simulation study. The CBGPS methodology is applied to an observational study, whose goal is to estimate the causal effects of political advertisements on campaign contributions. We also provide open-source software that implements the proposed methods.

  • Covariate balancing Propensity Score
    Journal of The Royal Statistical Society Series B-statistical Methodology, 2014
    Co-Authors: Kosuke Imai, Marc Ratkovic
    Abstract:

    The Propensity Score plays a central role in a variety of causal inference settings. In particular, matching and weighting methods based on the estimated Propensity Score have become increasingly common in observational studies. Despite their popularity and theoretical appeal, the main practical difficulty of these methods is that the Propensity Score must be estimated. Researchers have found that slight misspecification of the Propensity Score model can result in substantial bias of estimated treatment effects. This article introduces a simple yet powerful new methodology, covariate balancing Propensity Score (CBPS) estimation, which significantly improves the empirical performance of Propensity Score methods. The CBPS simultaneously optimizes the covariate balance and the prediction of treatment assignment by exploiting the dual characteristics of the Propensity Score as a covariate balancing Score and the conditional probability of treatment assignment. The CBPS is shown to dramatically improve the poor empirical performance of Propensity Score matching and weighting methods reported in the literature. In addition, the CBPS can be extended to a number of other important settings, including the estimation of the generalized Propensity Score for non-binary treatments, the generalization of experimental estimates to a target population, and causal inference in longitudinal settings with marginal structural models. The open-source R package, CBPS, is available for implementing the proposed methods.
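    The idea of estimating the Propensity Score from balance conditions rather than by maximum likelihood can be shown in a deliberately tiny, just-identified example: a one-parameter logistic model whose slope is chosen (here by bisection) so that the inverse-probability-weighted covariate sums are exactly equal between groups. The data and the single-parameter model are hypothetical simplifications of the CBPS methodology, which balances several moments and also uses the score of the treatment model.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def balance(b, x, t):
    """Covariate-balance moment: inverse-probability-weighted sum of x for the
    treated minus the same for controls; CBPS-style estimation sets this to zero."""
    return sum(ti * xi / logistic(b * xi) - (1 - ti) * xi / (1 - logistic(b * xi))
               for xi, ti in zip(x, t))

def solve_cbps_slope(x, t, lo=0.0, hi=10.0):
    """Bisection for the slope at which the balance moment is zero
    (assumes the moment changes sign on [lo, hi])."""
    f_lo = balance(lo, x, t)
    for _ in range(200):
        mid = (lo + hi) / 2.0
        f_mid = balance(mid, x, t)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0

x = [-1.0, -0.5, 0.5, 1.0]   # hypothetical covariate values
t = [0, 1, 0, 1]             # hypothetical treatment indicators
b = solve_cbps_slope(x, t)   # at this slope the weighted covariate balance is ~0
```

    Maximum likelihood would instead maximize the fit of treatment assignment; solving the balance condition directly is what buys the robustness to misspecification described above.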

  • Causal inference with general treatment regimes: generalizing the Propensity Score
    Journal of the American Statistical Association, 2004
    Co-Authors: Kosuke Imai, David A Van Dyk
    Abstract:

    In this article we develop the theoretical properties of the Propensity function, which is a generalization of the Propensity Score of Rosenbaum and Rubin. Methods based on the Propensity Score have long been used for causal inference in observational studies; they are easy to use and can effectively reduce the bias caused by nonrandom treatment assignment. Although treatment regimes need not be binary in practice, the Propensity Score methods are generally confined to binary treatment scenarios. Two possible exceptions have been suggested for ordinal and categorical treatments. In this article we develop theory and methods that encompass all of these techniques and widen their applicability by allowing for arbitrary treatment regimes. We illustrate our Propensity function methods by applying them to two datasets; we estimate the effect of smoking on medical expenditure and the effect of schooling on wages. We also conduct simulation studies to investigate the performance of our methods.

Geoffrey M Anderson

  • A comparison of the ability of different Propensity Score models to balance measured variables between treated and untreated subjects: a Monte Carlo study
    Statistics in Medicine, 2007
    Co-Authors: Peter C Austin, Paul Grootendorst, Geoffrey M Anderson
    Abstract:

    The Propensity Score—the probability of exposure to a specific treatment conditional on observed variables—is increasingly being used in observational studies. Creating strata in which subjects are matched on the Propensity Score allows one to balance measured variables between treated and untreated subjects. There is an ongoing controversy in the literature as to which variables to include in the Propensity Score model. Some advocate including those variables that predict treatment assignment, while others suggest including all variables potentially related to the outcome, and still others advocate including only variables that are associated with both treatment and outcome. We provide a case study of the association between drug exposure and mortality to show that including a variable that is related to treatment, but not outcome, does not improve balance and reduces the number of matched pairs available for analysis. In order to investigate this issue more comprehensively, we conducted a series of Monte Carlo simulations of the performance of Propensity Score models that contained variables related to treatment allocation, variables that were confounders for the treatment–outcome pair, variables related to outcome, all variables related to either outcome or treatment, or neither. We compared the use of these different Propensity Score models in matching and stratification in terms of the extent to which they balanced variables. We demonstrated that all Propensity Score models balanced measured confounders between treated and untreated subjects in a Propensity-Score matched sample. However, including only the true confounders or the variables predictive of the outcome in the Propensity Score model resulted in a substantially larger number of matched pairs than did using the treatment-allocation model. Stratifying on the quintiles of any Propensity Score model resulted in residual imbalance between treated and untreated subjects in the upper and lower quintiles. Greater balance between treated and untreated subjects was obtained after matching on the Propensity Score than after stratifying on the quintiles of the Propensity Score. When a confounding variable was omitted from any of the Propensity Score models, matching or stratifying on the Propensity Score resulted in residual imbalance in prognostically important variables between treated and untreated subjects. We considered four Propensity Score models for estimating treatment effects: the model that included only true confounders; the model that included all variables associated with the outcome; the model that included all measured variables; and the model that included all variables associated with treatment selection. Reduction in bias when estimating a null treatment effect was equivalent for all four Propensity Score models when Propensity Score matching was used. Reduction in bias was marginally greater for the first two Propensity Score models than for the last two when stratification on the quintiles of the Propensity Score was employed. Furthermore, omitting a confounding variable from the Propensity Score model resulted in biased estimation of the treatment effect. Finally, the mean squared error for estimating a null treatment effect was lower when either of the first two Propensity Score models was used than when either of the last two was used. Copyright © 2006 John Wiley & Sons, Ltd.
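    Balance of the kind assessed in these simulations is conventionally quantified with the standardized mean difference; a minimal sketch (the paper's exact balance metric may differ):

```python
import math
import statistics

def smd(x_treated, x_control):
    """Standardized mean difference: difference in means divided by the pooled
    standard deviation; values near zero indicate good balance on the covariate."""
    m1, m0 = statistics.mean(x_treated), statistics.mean(x_control)
    v1, v0 = statistics.variance(x_treated), statistics.variance(x_control)
    return (m1 - m0) / math.sqrt((v1 + v0) / 2.0)

print(smd([2, 3, 4], [1, 2, 3]))  # → 1.0
```

    Computing the SMD for each covariate before and after matching (or within Propensity Score strata) is how residual imbalance of the kind reported above is detected in practice.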

  • Conditioning on the Propensity Score can result in biased estimation of common measures of treatment effect: a Monte Carlo study
    Statistics in Medicine, 2007
    Co-Authors: Peter C Austin, Paul Grootendorst, Sharon-Lise T Normand, Geoffrey M Anderson
    Abstract:

    Propensity Score methods are increasingly being used to estimate causal treatment effects in the medical literature. Conditioning on the Propensity Score results in unbiased estimation of the expected difference in observed responses to two treatments. The degree to which conditioning on the Propensity Score introduces bias into the estimation of the conditional odds ratio or conditional hazard ratio, which are frequently used as measures of treatment effect in observational studies, has not been extensively studied. We conducted Monte Carlo simulations to determine the degree to which Propensity Score matching, stratification on the quintiles of the Propensity Score, and covariate adjustment using the Propensity Score result in biased estimation of conditional odds ratios, hazard ratios, and rate ratios. We found that conditioning on the Propensity Score resulted in biased estimation of the true conditional odds ratio and the true conditional hazard ratio. In all scenarios examined, treatment effects were biased towards the null treatment effect. However, conditioning on the Propensity Score did not result in biased estimation of the true conditional rate ratio. In contrast, conventional regression methods allowed unbiased estimation of the true conditional treatment effect when all variables associated with the outcome were included in the regression model. The observed bias in Propensity Score methods is due to the fact that regression models allow one to estimate conditional treatment effects, whereas Propensity Score methods allow one to estimate marginal treatment effects. In several settings with non-linear treatment effects, marginal and conditional treatment effects do not coincide. Copyright © 2006 John Wiley & Sons, Ltd.
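    The divergence between conditional and marginal odds ratios described above (non-collapsibility) can be verified with simple arithmetic. The stratum risks below are hypothetical, and note that no confounding is involved: treatment is assigned independently of stratum, yet the marginal odds ratio is closer to the null than the common conditional odds ratio.

```python
def odds(p):
    return p / (1 - p)

# Two equally sized strata; within each, the conditional odds ratio is exactly 2.
p_control = [0.1, 0.5]
cond_or = 2.0
p_treated = [odds(p) * cond_or / (1 + odds(p) * cond_or) for p in p_control]

# Marginal risks average over the strata; the marginal OR is attenuated toward 1
# even though the conditional OR is 2 in every stratum.
marg_t = sum(p_treated) / 2
marg_c = sum(p_control) / 2
marginal_or = odds(marg_t) / odds(marg_c)
print(round(marginal_or, 3))  # → 1.719
```

    This is exactly the pattern the simulations report: Propensity Score methods target the marginal effect, so comparing their estimates against a conditional odds or hazard ratio makes them appear biased toward the null.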

  • conditioning on the Propensity Score can result in biased estimation of common measures of treatment effect a monte carlo study
    Statistics in Medicine, 2007
    Co-Authors: Peter C Austin, Paul Grootendorst, Sharonlise T Normand, Geoffrey M Anderson
    Abstract:

    Propensity Score methods are increasingly being used to estimate causal treatment effects in the medical literature. Conditioning on the Propensity Score results in unbiased estimation of the expected difference in observed responses to two treatments. The degree to which conditioning on the Propensity Score introduces bias into the estimation of the conditional odds ratio or conditional hazard ratio, which are frequently used as measures of treatment effect in observational studies, has not been extensively studied. We conducted Monte Carlo simulations to determine the degree to which Propensity Score matching, stratification on the quintiles of the Propensity Score, and covariate adjustment using the Propensity Score result in biased estimation of conditional odds ratios, hazard ratios, and rate ratios. We found that conditioning on the Propensity Score resulted in biased estimation of the true conditional odds ratio and the true conditional hazard ratio. In all scenarios examined, treatment effects were biased towards the null treatment effect. However, conditioning on the Propensity Score did not result in biased estimation of the true conditional rate ratio. In contrast, conventional regression methods allowed unbiased estimation of the true conditional treatment effect when all variables associated with the outcome were included in the regression model. The observed bias in Propensity Score methods is due to the fact that regression models allow one to estimate conditional treatment effects, whereas Propensity Score methods allow one to estimate marginal treatment effects. In several settings with non-linear treatment effects, marginal and conditional treatment effects do not coincide.

  • a comparison of the ability of different Propensity Score models to balance measured variables between treated and untreated subjects a monte carlo study
    Statistics in Medicine, 2007
    Co-Authors: Peter C Austin, Paul Grootendorst, Geoffrey M Anderson
    Abstract:

    The Propensity Score--the probability of exposure to a specific treatment conditional on observed variables--is increasingly being used in observational studies. Creating strata in which subjects are matched on the Propensity Score allows one to balance measured variables between treated and untreated subjects. There is an ongoing controversy in the literature as to which variables to include in the Propensity Score model. Some advocate including those variables that predict treatment assignment, while others suggest including all variables potentially related to the outcome, and still others advocate including only variables that are associated with both treatment and outcome. We provide a case study of the association between drug exposure and mortality to show that including a variable that is related to treatment, but not outcome, does not improve balance and reduces the number of matched pairs available for analysis. In order to investigate this issue more comprehensively, we conducted a series of Monte Carlo simulations of the performance of Propensity Score models that contained variables related to treatment allocation, or variables that were confounders for the treatment-outcome pair, or variables related to outcome or all variables related to either outcome or treatment or neither. We compared the use of these different Propensity Scores models in matching and stratification in terms of the extent to which they balanced variables. We demonstrated that all Propensity Scores models balanced measured confounders between treated and untreated subjects in a Propensity-Score matched sample. However, including only the true confounders or the variables predictive of the outcome in the Propensity Score model resulted in a substantially larger number of matched pairs than did using the treatment-allocation model. 
Stratifying on the quintiles of any Propensity Score model resulted in residual imbalance between treated and untreated subjects in the upper and lower quintiles. Greater balance between treated and untreated subjects was obtained after matching on the Propensity Score than after stratifying on the quintiles of the Propensity Score. When a confounding variable was omitted from any of the Propensity Score models, matching or stratifying on the Propensity Score resulted in residual imbalance in prognostically important variables between treated and untreated subjects. We considered four Propensity Score models for estimating treatment effects: the model that included only true confounders; the model that included all variables associated with the outcome; the model that included all measured variables; and the model that included all variables associated with treatment selection. Reduction in bias when estimating a null treatment effect was equivalent for all four Propensity Score models when Propensity Score matching was used. Reduction in bias was marginally greater for the first two Propensity Score models than for the last two when stratification on the quintiles of the Propensity Score was employed. Furthermore, omitting a confounding variable from the Propensity Score model resulted in biased estimation of the treatment effect. Finally, the mean squared error for estimating a null treatment effect was lower when either of the first two Propensity Score models was used than when either of the last two was used.
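The balance diagnostic underlying comparisons like these is typically the standardized difference in means of each covariate between treated and untreated subjects in the matched sample. A minimal numpy sketch of that diagnostic (the function name and the simulated data are illustrative, not taken from the paper):

```python
import numpy as np

def standardized_mean_difference(x_treated, x_control):
    """Absolute standardized difference in means, pooling the two group
    variances. Values below roughly 0.1 are commonly taken to indicate
    acceptable balance on a covariate."""
    pooled_sd = np.sqrt((np.var(x_treated, ddof=1) + np.var(x_control, ddof=1)) / 2)
    return abs(np.mean(x_treated) - np.mean(x_control)) / pooled_sd

rng = np.random.default_rng(0)
# A confounder whose mean differs between groups, i.e. imbalanced before matching
confounder_treated = rng.normal(0.5, 1.0, 500)
confounder_control = rng.normal(0.0, 1.0, 500)
print(standardized_mean_difference(confounder_treated, confounder_control))
```

In practice this statistic would be computed for every measured covariate, before and after matching or within each Propensity Score stratum.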

Elizabeth A Stuart - One of the best experts on this subject based on the ideXlab platform.

  • using sensitivity analyses for unobserved confounding to address covariate measurement error in Propensity Score methods
    American Journal of Epidemiology, 2018
    Co-Authors: Kara E Rudolph, Elizabeth A Stuart
    Abstract:

    Propensity Score methods are a popular tool with which to control for confounding in observational data, but their bias-reduction properties (and their internal validity more generally) are threatened by covariate measurement error. There are few easy-to-implement methods of correcting for such bias. In this paper, we describe and demonstrate how existing sensitivity analyses for unobserved confounding (Propensity Score calibration, VanderWeele and Arah's bias formulas, and Rosenbaum's sensitivity analysis) can be adapted to address this problem. In a simulation study, we examine the extent to which these sensitivity analyses can correct for several measurement error structures: classical, systematic differential, and heteroscedastic covariate measurement error. We then apply these approaches to address covariate measurement error in estimating the association between depression and weight gain in a cohort of adults in Baltimore, Maryland. We recommend the use of VanderWeele and Arah's bias formulas and Propensity Score calibration (assuming it is adapted appropriately for the measurement error structure), as both approaches perform well for a variety of Propensity Score estimators and measurement error structures.
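The classical (non-differential) measurement error structure mentioned above is easy to illustrate: adding random noise to a covariate attenuates its estimated association toward the null by the reliability ratio, which is the kind of bias these sensitivity analyses are adapted to correct. A hedged numpy sketch, with illustrative variances chosen so the reliability ratio is 0.5:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x_true = rng.normal(0, 1, n)   # true covariate value
noise = rng.normal(0, 1, n)    # classical, non-differential measurement error
x_obs = x_true + noise         # the error-prone version actually observed

y = 2.0 * x_true + rng.normal(0, 1, n)

# The OLS slope of y on the error-prone covariate is attenuated by the
# reliability ratio var(x_true) / (var(x_true) + var(noise)) = 0.5,
# so it should land near 2.0 * 0.5 = 1.0 rather than the true 2.0.
slope_obs = np.cov(x_obs, y)[0, 1] / np.var(x_obs)
print(slope_obs)
```

The same attenuation contaminates a Propensity Score estimated from error-prone covariates, which is why the bias-formula and calibration corrections are needed.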

  • comparing the performance of Propensity Score methods in healthcare database studies with rare outcomes
    Statistics in Medicine, 2017
    Co-Authors: Jessica M Franklin, Elizabeth A Stuart, Peter C Austin, Wesley Eddings, Sebastian Schneeweiss
    Abstract:

    Nonrandomized studies of treatments from electronic healthcare databases are critical for producing the evidence necessary for making informed treatment decisions, but often rely on comparing rates of events observed in a small number of patients. In addition, studies constructed from electronic healthcare databases, for example, administrative claims data, often adjust for many, possibly hundreds, of potential confounders. Despite the importance of maximizing efficiency when there are many confounders and few observed outcome events, there has been relatively little research on the relative performance of different Propensity Score methods in this context. In this paper, we compare a wide variety of Propensity-based estimators of the marginal relative risk. In contrast to prior research that has focused on specific statistical methods in isolation from other analytic choices, we instead consider a method to be defined by the complete multistep process from Propensity Score modeling to final treatment effect estimation. Propensity Score model estimation methods considered include ordinary logistic regression, Bayesian logistic regression, lasso, and boosted regression trees. Methods for utilizing the Propensity Score include pair matching, full matching, decile strata, fine strata, regression adjustment using one or two nonlinear splines, inverse Propensity weighting, and matching weights. We evaluate methods via a ‘plasmode’ simulation study, which creates simulated datasets on the basis of a real cohort study of two treatments constructed from administrative claims data. Our results suggest that regression adjustment and matching weights, regardless of the Propensity Score model estimation method, provide lower bias and mean squared error in the context of rare binary outcomes. Copyright © 2017 John Wiley & Sons, Ltd.
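Among the weighting approaches compared, matching weights have a simple closed form: each subject's weight is min(e, 1-e) divided by the probability of receiving the treatment actually received, where e is the Propensity Score. A minimal sketch (the function name is ours, not the paper's code):

```python
import numpy as np

def matching_weights(ps, treated):
    """Matching weights: min(e, 1-e) divided by the probability of the
    treatment actually received. Unlike pair matching, subjects with
    extreme Propensity Scores are down-weighted rather than discarded."""
    ps = np.asarray(ps, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    prob_received = np.where(treated, ps, 1.0 - ps)
    return np.minimum(ps, 1.0 - ps) / prob_received

# A treated subject at e=0.5 gets weight 1; a treated subject at e=0.9 is
# down-weighted to 1/9; a control at e=0.9 gets weight 1.
print(matching_weights([0.5, 0.9, 0.9], [1, 1, 0]))
```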

  • generalizing observational study results: applying Propensity Score methods to complex surveys
    Health Services Research, 2014
    Co-Authors: Eva H Dugoff, Megan S Schuler, Elizabeth A Stuart
    Abstract:

    Causal inference—answering questions about the effect of a particular exposure or intervention—is often elusive in health services research. Administrative and survey data capture information on the standard course of treatment or experiences, which allows researchers to measure the effects of treatments or programs that cannot feasibly be evaluated with a randomized trial. In our motivating example, we cannot randomize the type of physician (general practitioner or a specialist) from whom an individual receives primary care, but we can use existing observational datasets to assess the effect of specialist versus generalist care on health care spending. Furthermore, complex survey data frequently yield nationally representative samples. The fundamental challenge in using these data for causal inference is addressing potential confounding while still retaining the representativeness of the data. Confounding occurs when there are variables that affect both whether an individual receives the intervention of interest as well as the outcome. Propensity Score methods are statistical methods used to address potential confounding in observational studies (Rosenbaum and Rubin 1983). Broadly, the goal of Propensity Score methods is to improve the comparability of treatment groups on observed characteristics, to reduce bias in the effect estimates. Primary Propensity Score methods include matching, weighting, and subclassification (Stuart 2010). Although Propensity Score methods help reduce confounding, they cannot fully “recreate” a randomized experiment—randomization ensures balance on both observed and unobserved variables, whereas Propensity Score methods only ensure balance on observed variables. While Propensity Score methods for observational studies in general have been well described, there are few guidelines regarding how to incorporate Propensity Score methods with complex survey data. 
Few researchers, with the exception of Zanutto (2006) and Zanutto, Lu, and Hornik (2005), have focused on the complexities of how to use Propensity Score methods with complex survey data and appropriately interpret the results. To assess current practice, we conducted a limited systematic review of the peer-reviewed literature to identify studies that used Propensity Score analysis and complex survey data in the health services field. For 2010 and 2011, we identified 28 articles in PubMed that contained the key word “Propensity Score” and related to complex surveys in health services research. These studies demonstrated a variety of methodological approaches and interpretations of effect estimates. Of the 28 studies, 16 (57 percent) did not incorporate the survey weights into the final analysis. Of these 16 papers, 6 incorrectly described their results as “nationally representative” or reflective of a “population-based” sample. Only one of these explicitly stated that not incorporating the survey weights “compromises external validity, such that outcomes are not generalizable to national figures” (McCrae et al. 2010). Seven (25 percent) of the 28 studies stated that they included the survey weights in the final outcome regression model. Five (18 percent) studies performed Propensity Score weighting and multiplied the Propensity Score weights and the survey weights, with varying approaches to interpreting the final results. This heterogeneity in the recent literature suggests a variety of approaches and some possible misunderstandings in how to appropriately apply and interpret results from Propensity Score methods with complex survey data. More broadly, failure to properly account for the complex survey design is a common analytic error. 
In a review of statistical (but not Propensity Score) methods used in studies involving three youth surveys, Bell and colleagues (2012) found that nearly 40 percent of reviewed studies did not adequately account for the complex survey design. Survey samples obtained using a complex survey design offer researchers the unique ability to estimate effects that are generalizable to the target population (often, the national population). As this is a major advantage of survey data, we primarily focus on Propensity Score methods that incorporate survey weights to retain the generalizability of the final effect estimates using survey design-based analysis. Statistical analyses that do not include the survey weights will not necessarily be generalizable to the target population if the survey weights are correlated with the independent or dependent variables and may be biased for estimating population effects (Pfeffermann 1993). We provide an illustration of this phenomenon in our simulation study below. Although formal statistical generalizability is not always a goal, as we observed in our review of published studies, a common error is to ignore survey weights when using Propensity Score methods yet describe results as applicable to the original target population, which can lead to misleading conclusions. When assessing the effect of a treatment on an outcome, there are two causal estimands commonly of interest in observational studies: the average treatment effect on the treated (ATT) and the average treatment effect (ATE). The ATT is the average effect for individuals who actually received the treatment. The ATE is the average effect for all individuals, treated and control. In our motivating example (Example 2 below), the ATT represents the comparison of health care spending between individuals who had a specialist as a primary provider (the “treatment” group) and spending for the same individuals, if instead they had a generalist (the “control” group). 
In contrast, the ATE represents the difference in health care spending if everyone in the sample had a specialist compared to if everyone had a generalist. In a randomized experiment, the ATT and ATE are equivalent because the treatment group is a random sample of the full sample; in an observational study, the ATT and ATE are not necessarily the same. When using survey data that represent a target population, there is a further distinction in that it is possible to estimate both sample and population ATTs and ATEs. The Sample ATT and ATE (denoted SATT and SATE, respectively) are the corresponding estimands in the unweighted survey sample. The Population ATT and ATE (denoted PATT and PATE) are the corresponding estimands for the survey’s target population, accounting for the sampling design. See Imai, King, and Stuart (2008) for a more technical discussion of these estimands. The objective of this paper was to provide a tutorial for appropriate use of Propensity Score methods with complex survey data. We first assess the performance of various methods for combining Propensity Score methods and survey weights using a simple simulation. We then present results from the Medical Expenditure Panel Survey (MEPS), estimating health care spending among adults who report a generalist versus a specialist as their usual source of care. We highlight relevant interpretations of various analytic approaches and offer a set of guidelines for researchers to select the most appropriate Propensity Score methods for their study, given their desired estimand.
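One way to combine the two weight systems, as evaluated in tutorials of this kind, is to multiply each subject's Propensity Score weight by their survey weight so the final estimate retains the survey's generalizability. A hedged sketch for PATT- and PATE-style weighting (the helper function and its argument names are illustrative, not the paper's code):

```python
import numpy as np

def combined_weights(ps, treated, survey_wt, estimand="PATT"):
    """Multiply survey weights by Propensity Score weights.
    PATT: treated subjects keep weight 1, controls get e/(1-e) (weighting
    controls to resemble the treated). PATE: inverse probability of the
    treatment actually received."""
    ps = np.asarray(ps, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    survey_wt = np.asarray(survey_wt, dtype=float)
    if estimand == "PATT":
        ps_wt = np.where(treated, 1.0, ps / (1.0 - ps))
    else:  # PATE
        ps_wt = np.where(treated, 1.0 / ps, 1.0 / (1.0 - ps))
    return survey_wt * ps_wt

# Treated subject at e=0.5 keeps its survey weight of 100;
# control at e=0.25 gets (0.25/0.75) * 200.
print(combined_weights([0.5, 0.25], [1, 0], [100.0, 200.0]))
```

Dropping `survey_wt` from the product recovers the sample estimands (SATT/SATE) rather than the population ones, which is exactly the distinction the review found many published studies blurring.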

  • weight trimming and Propensity Score weighting
    PLOS ONE, 2011
    Co-Authors: Brian K. Lee, Justin Lessler, Elizabeth A Stuart
    Abstract:

    Propensity Score weighting is sensitive to model misspecification and to outlying weights that can unduly influence results. The authors investigated whether trimming large weights downward can improve the performance of Propensity Score weighting and whether the benefits of trimming differ by Propensity Score estimation method. In a simulation study, the authors examined the performance of weight trimming following logistic regression, classification and regression trees (CART), boosted CART, and random forests to estimate Propensity Score weights. Results indicate that although misspecified logistic regression Propensity Score models yield increased bias and standard errors, weight trimming following logistic regression can improve the accuracy and precision of final parameter estimates. In contrast, weight trimming did not improve the performance of boosted CART and random forests. The performance of boosted CART and random forests without weight trimming was similar to the best performance obtainable from logistic-regression-estimated Propensity Scores with weight trimming. While trimming may be used to optimize Propensity Score weights estimated using logistic regression, the optimal level of trimming is difficult to determine. These results indicate that although trimming can improve inferences in some settings, in order to consistently improve the performance of Propensity Score weighting, analysts should focus on the procedures leading to the generation of weights (i.e., proper specification of the Propensity Score model) rather than relying on ad hoc methods such as weight trimming.
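Weight trimming itself is mechanically simple: weights above a chosen quantile are capped at that quantile's value. A minimal numpy sketch (the 95th-percentile cutoff below is illustrative; as the abstract notes, the optimal trimming level is difficult to determine):

```python
import numpy as np

def trim_weights(weights, upper_quantile=0.99):
    """Cap (truncate) weights at a chosen upper quantile. A common ad hoc
    remedy when a few subjects receive outlying inverse-probability weights."""
    weights = np.asarray(weights, dtype=float)
    cap = np.quantile(weights, upper_quantile)
    return np.minimum(weights, cap)

rng = np.random.default_rng(2)
w = np.exp(rng.normal(0, 1, 1000))   # log-normal weights: heavy right tail
w_trimmed = trim_weights(w, 0.95)
print(w.max(), w_trimmed.max())      # the cap shrinks the largest weights
```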

  • Improving Propensity Score weighting using machine learning
    Statistics in Medicine, 2010
    Co-Authors: Brian K. Lee, Justin Lessler, Elizabeth A Stuart
    Abstract:

    Machine learning techniques such as classification and regression trees (CART) have been suggested as promising alternatives to logistic regression for the estimation of Propensity Scores. The authors examined the performance of various CART-based Propensity Score models using simulated data. Hypothetical studies of varying sample sizes (n=500, 1000, 2000) with a binary exposure, continuous outcome, and 10 covariates were simulated under seven scenarios differing by degree of non-linear and non-additive associations between covariates and the exposure. Propensity Score weights were estimated using logistic regression (all main effects), CART, pruned CART, and the ensemble methods of bagged CART, random forests, and boosted CART. Performance metrics included covariate balance, standard error, per cent absolute bias, and 95 per cent confidence interval (CI) coverage. All methods displayed generally acceptable performance under conditions of either non-linearity or non-additivity alone. However, under conditions of both moderate non-additivity and moderate non-linearity, logistic regression had subpar performance, whereas ensemble methods provided substantially better bias reduction and more consistent 95 per cent CI coverage. The results suggest that ensemble methods, especially boosted CART, may be useful for Propensity Score weighting.
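The non-linearity scenario can be illustrated by comparing a main-effects logistic model with one that includes the needed quadratic term. In this sketch a plain gradient-descent fit stands in for the paper's logistic regression, the ensemble methods are not reproduced, and the data-generating values are illustrative:

```python
import numpy as np

def fit_logistic(features, t, steps=3000, lr=0.3):
    """Plain gradient-ascent logistic regression (a stand-in for the
    'all main effects' model in the simulations). Returns coefficients
    with the intercept first."""
    X = np.column_stack([np.ones(len(features)), features])
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta += lr * X.T @ (t - p) / len(t)  # average log-likelihood gradient
    return beta

rng = np.random.default_rng(3)
x = rng.normal(0, 1, 5000)
true_ps = 1.0 / (1.0 + np.exp(-(x ** 2 - 1)))  # exposure non-linear in x
t = rng.binomial(1, true_ps)

beta_main = fit_logistic(x.reshape(-1, 1), t)              # misspecified
beta_flex = fit_logistic(np.column_stack([x, x ** 2]), t)  # captures x^2
# beta_main finds almost no linear signal; beta_flex should roughly
# recover the data-generating coefficients [-1, 0, 1].
print(beta_main, beta_flex)
```

Tree-based and ensemble methods are attractive here precisely because they can pick up such non-linear and non-additive structure without the analyst pre-specifying the x² term.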