Title: | Residual Balancing Weights for Marginal Structural Models |
---|---|
Description: | Residual balancing is a robust method of constructing weights for marginal structural models, which can be used to estimate (a) the average treatment effect in a cross-sectional observational study, (b) controlled direct/mediator effects in causal mediation analysis, and (c) the effects of time-varying treatments in panel data (Zhou and Wodtke 2020 <doi:10.1017/pan.2020.2>). This package provides three functions, rbwPoint(), rbwMed(), and rbwPanel(), that produce residual balancing weights for estimating (a), (b), (c), respectively. |
Authors: | Xiang Zhou [cre], Derick da Silva Baum [aut] |
Maintainer: | Xiang Zhou <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.3.2 |
Built: | 2024-11-14 05:14:56 UTC |
Source: | https://github.com/xiangzhou09/rbw |
A dataset containing 15 variables on the campaign contributions of 16,265 zip codes to the 2004 and 2008 US presidential elections in addition to the demographic characteristics of each area (Urban and Niebler 2014; Fong, Hazlett, and Imai 2018).
advertisement
A data frame with 16,265 rows and 15 columns:
zip code
the log transformed TotAds
the total number of political advertisements aired in the zip code
population size
percent of the population over 65
median household income
percent Hispanic
percent black
population density (people per sq mile)
percent college graduates
a dummy variable indicating whether it is possible to commute to the zip code from a competitive state
state FIPS code
campaign contributions (in thousands of dollars)
log population
log median income
Fong, Christian, Chad Hazlett, and Kosuke Imai. 2018. Covariate Balancing Propensity Score for a Continuous Treatment: Application to The Efficacy of Political Advertisements. The Annals of Applied Statistics 12(1):156-77.
Urban, Carly, and Sarah Niebler. 2014. Dollars on the Sidewalk: Should U.S. Presidential Candidates Advertise in Uncontested States? American Journal of Political Science 58(2):322-36.
A dataset containing 19 variables and 565 unit-week records on the campaigns of 113 Democratic candidates in US Senate and gubernatorial elections from 2000 to 2006 (Blackwell 2013).
campaign_long
A data frame with 565 rows and 19 columns:
name of the Democratic candidate
whether the candidate went negative in a campaign-week, defined as whether more than 10% of the candidate's political advertising was negative
whether the candidate went negative in the previous campaign-week
length of the candidate's campaign (in weeks)
whether the candidate was an incumbent
Democratic share in the baseline polls
share of undecided voters in the baseline polls
type of office in contest. 0: governor; 1: senator
Democratic share of the two-party vote in the election
week in the campaign (in the final five weeks preceding the election)
year of the election
state of the election
Democratic share in the polls
Democratic share in the polls in the previous campaign-week
share of undecided voters in the polls
share of undecided voters in the polls in the previous campaign-week
the proportion of advertisements that were negative in a campaign-week
the proportion of advertisements that were negative in the previous campaign-week
candidate id
Blackwell, Matthew. 2013. A Framework for Dynamic Causal Inference in Political Science. American Journal of Political Science 57(2):504-20.
A dataset containing 32 variables and 113 unit records from Blackwell (2013).
campaign_wide
A data frame with 113 rows and 32 columns:
name of the Democratic candidate
length of the candidate's campaign (in weeks)
whether the candidate was an incumbent
Democratic share in the baseline polls
share of undecided voters in the baseline polls
type of office in contest. 0: governor; 1: senator
Democratic share of the two-party vote in the election
year of the election
state of the election
candidate id
Democratic share in week 1 polls
Democratic share in week 2 polls
Democratic share in week 3 polls
Democratic share in week 4 polls
Democratic share in week 5 polls
whether the candidate went negative in week 1
whether the candidate went negative in week 2
whether the candidate went negative in week 3
whether the candidate went negative in week 4
whether the candidate went negative in week 5
the proportion of advertisements that were negative in week 1
the proportion of advertisements that were negative in week 2
the proportion of advertisements that were negative in week 3
the proportion of advertisements that were negative in week 4
the proportion of advertisements that were negative in week 5
share of undecided voters in week 1 polls
share of undecided voters in week 2 polls
share of undecided voters in week 3 polls
share of undecided voters in week 4 polls
share of undecided voters in week 5 polls
the total number of campaign-weeks in which a candidate went negative
the average proportion of advertisements that were negative over the final five weeks of the campaign multiplied by ten
Blackwell, Matthew. 2013. A Framework for Dynamic Causal Inference in Political Science. American Journal of Political Science 57(2):504-20.
eb2 is an adaptation of eb that generates minimum entropy weights subject to a set of balancing constraints. By the method of Lagrange multipliers, the constrained problem has an unconstrained dual that can be solved using Newton's method. When a full Newton step is excessive, an exact line search is used to find the best step size.
eb2(C, M, Q, Z = rep(0, ncol(C)), max_iter = 200, tol = 1e-04, print_level = 1)
C |
A constraint matrix where each column corresponds to a balancing constraint. |
M |
A vector of moment conditions to be met in the reweighted sample. Specifically, in the reweighted sample, we should have C'W = M, where W is the vector of weights. |
Q |
A vector of base weights. |
Z |
A vector of Lagrange multipliers to be initialized. |
max_iter |
Maximum number of iterations for Newton's method in entropy minimization. |
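tol |
Tolerance parameter used to determine convergence. Specifically, convergence is achieved if the maximum deviation between the moments of the reweighted data and the target moments is smaller than tol. |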
print_level |
The level of printing:
|
A list containing the results from the algorithm.
W |
A vector of normalized minimum entropy weights. |
Z |
A vector of Lagrange multipliers. |
converged |
A logical indicator for convergence. |
maxdiff |
A scalar indicating the maximum deviation between the moments of the reweighted data and the target moments. |
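The Newton iteration on the dual problem described for eb2 can be illustrated with a minimal sketch. This is not the package's source code: the function name entropy_balance_sketch, the parameterization W = Q * exp(C %*% Z), and the use of optimize() for the line search are assumptions made for illustration, and the sign convention for Z inside eb2 may differ.

```r
# Sketch of entropy balancing via Newton's method on the dual problem.
# Assumption: weights take the form W = Q * exp(C %*% Z).
entropy_balance_sketch <- function(C, M, Q, Z = rep(0, ncol(C)),
                                   max_iter = 200, tol = 1e-4) {
  # Dual objective: convex in Z, with gradient t(C) %*% W - M
  dual_loss <- function(z) sum(Q * exp(C %*% z)) - sum(M * z)
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    W <- as.vector(Q * exp(C %*% Z))
    gradient <- as.vector(crossprod(C, W)) - M
    if (max(abs(gradient)) < tol) {
      converged <- TRUE
      break
    }
    hessian <- crossprod(C, W * C)      # t(C) %*% diag(W) %*% C
    step <- solve(hessian, gradient)    # full Newton direction
    loss_along <- function(s) dual_loss(Z - s * step)
    step_size <- 1
    if (loss_along(1) >= dual_loss(Z)) {
      # The full step does not reduce the loss: search for a step size in (0, 1)
      step_size <- optimize(loss_along, interval = c(0, 1))$minimum
    }
    Z <- Z - step_size * step
  }
  W <- as.vector(Q * exp(C %*% Z))
  list(W = W, Z = Z, converged = converged,
       maxdiff = max(abs(as.vector(crossprod(C, W)) - M)))
}

# Toy problem: weights that sum to one and shift the mean of x to 0.5
set.seed(1)
n <- 200
x <- rnorm(n)
C <- cbind(1, x)        # one column per balancing constraint
M <- c(1, 0.5)          # target moments
Q <- rep(1 / n, n)      # uniform base weights
fit <- entropy_balance_sketch(C, M, Q)
```

In this toy problem the column of ones in C carries the normalization constraint, so the returned weights sum to one up to the tolerance.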
A dataset containing 17 variables on the views of 1,273 US adults about their support for war against countries that were hypothetically developing nuclear weapons. The data include several variables on the country's features and respondents' demographic and attitudinal characteristics (Tomz and Weeks 2013; Zhou and Wodtke 2020).
peace
A data frame with 1,273 rows and 17 columns:
number of adverse events respondents considered probable if the US did not engage in war
a dummy variable indicating whether the country had signed a military alliance with the US
a dummy variable indicating whether the country had high levels of trade with the US
an index measuring respondent's attitude toward militarism
an index measuring respondent's attitude toward internationalism
an index measuring respondent's identification with the Republican party
an index measuring respondent's attitude toward ethnocentrism
an index measuring respondent's attitude toward religiosity
a dummy variable indicating whether the respondent is male
a dummy variable indicating whether the respondent is white
respondent's age
respondent's education with categories ranging from high school or less to postgraduate degree
a dummy variable indicating whether the country was a democracy
a measure of support for war on a five-point scale
number of negative consequences anticipated if the US engaged in war
whether the respondent thought the operation would succeed. 0: less than 50-50 chance of working even in the short run; 1: efficacious only in the short run; 2: successful both in the short and long run
a dummy variable indicating whether respondents thought it would be morally wrong to strike the country
Tomz, Michael R., and Jessica L. P. Weeks. 2013. Public Opinion and the Democratic Peace. The American Political Science Review 107(4):849-65.
Zhou, Xiang, and Geoffrey T. Wodtke. 2020. Residual Balancing: A Method of Constructing Weights for Marginal Structural Models. Political Analysis 28(4):487-506.
rbwMed is a function that produces residual balancing weights for estimating controlled direct/mediator effects in causal mediation analysis. The user supplies an optional set of baseline confounders and a list of model objects for the conditional mean of each post-treatment confounder given the treatment and baseline confounders. The weights can be used to fit marginal structural models for the joint effects of the treatment and a mediator on an outcome of interest.
rbwMed(treatment, mediator, zmodels, data, baseline_x, interact = FALSE, base_weights, max_iter = 200, tol = 1e-04, print_level = 1)
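treatment |
A symbol or character string for the treatment variable in data. |
mediator |
A symbol or character string for the mediator variable in data. |
zmodels |
A list of fitted model objects, such as those returned by lm, for the conditional mean of each post-treatment confounder given the treatment and baseline confounders. |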
data |
A data frame containing all variables in the model. |
baseline_x |
(Optional) An expression for a set of baseline confounders stored in data. |
interact |
A logical variable indicating whether baseline and post-treatment covariates should be balanced against the treatment-mediator interaction term(s). |
base_weights |
(Optional) A vector of base weights (or its name). |
max_iter |
Maximum number of iterations for Newton's method in entropy minimization. |
tol |
Tolerance parameter used to determine convergence in entropy minimization.
See documentation for eb2. |
print_level |
The level of printing. See documentation for eb2. |
A list containing the results.
weights |
A vector of residual balancing weights. |
constraints |
A matrix of (linearly independent) residual balancing constraints |
eb_out |
Results from calling the eb2 function. |
call |
The matched call. |
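The role of the zmodels residuals can be illustrated on simulated data. The sketch below is an illustration under stated assumptions, not the package's internals: the data frame toy and the variables x, a, z, and m are hypothetical. It shows that OLS residuals of a post-treatment confounder are orthogonal to the model's regressors in the unweighted sample, while they can remain associated with the mediator. That remaining association is the kind of imbalance the weights are meant to remove.

```r
# Toy data; all variable names here are hypothetical
set.seed(2)
n <- 500
toy <- data.frame(x = rnorm(n), a = rbinom(n, 1, 0.5))
toy$z <- 0.5 * toy$x + 0.8 * toy$a + rnorm(n)   # post-treatment confounder
toy$m <- rbinom(n, 1, plogis(toy$z))            # mediator affected by z

# Model for the conditional mean of z given treatment and baseline confounder
zmodel <- lm(z ~ a + x, data = toy)
z_res <- residuals(zmodel)

# In the unweighted sample, OLS residuals are orthogonal to the regressors
unweighted_moments <- crossprod(model.matrix(zmodel), z_res)
print(round(unweighted_moments, 10))

# The same residuals are generally not orthogonal to the mediator
mediator_moment <- mean(z_res * toy$m)
print(mediator_moment)
```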
# models for post-treatment confounders
m1 <- lm(threatc ~ ally + trade + h1 + i1 + p1 + e1 + r1 +
  male + white + age + ed4 + democ, data = peace)

m2 <- lm(cost ~ ally + trade + h1 + i1 + p1 + e1 + r1 +
  male + white + age + ed4 + democ, data = peace)

m3 <- lm(successc ~ ally + trade + h1 + i1 + p1 + e1 + r1 +
  male + white + age + ed4 + democ, data = peace)

# residual balancing weights
rbwMed_fit <- rbwMed(treatment = democ, mediator = immoral,
  zmodels = list(m1, m2, m3), interact = TRUE,
  baseline_x = c(ally, trade, h1, i1, p1, e1, r1, male, white, age, ed4),
  data = peace)

# attach residual balancing weights to data
peace$rbw_cde <- rbwMed_fit$weights

# fit marginal structural model
if(require(survey)){
  rbw_design <- svydesign(ids = ~ 1, weights = ~ rbw_cde, data = peace)
  msm_rbwMed <- svyglm(strike ~ democ * immoral, design = rbw_design)
  summary(msm_rbwMed)
}
rbwPanel is a function that produces residual balancing weights (rbw) for estimating the marginal effects of time-varying treatments. The user supplies a long-format data frame (each row being a unit-period) and a list of fitted model objects for the conditional mean of each post-treatment confounder given past treatments and past confounders. The residuals of each time-varying confounder are balanced across both the current treatment and the regressors of the confounder model. In addition, when future > 0, the residuals are also balanced across future treatments.
rbwPanel(treatment, xmodels, id, time, data, base_weights, future = 1L, max_iter = 200, tol = 1e-04, print_level = 1)
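treatment |
A symbol or character string for the treatment variable in data. |
xmodels |
A list of fitted model objects, such as those returned by lm, for the conditional mean of each time-varying confounder given past treatments and past confounders. |
id |
A symbol or character string for the unit id variable in data. |
time |
A symbol or character string for the time variable in data. |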
data |
A data frame containing all variables in the model. |
base_weights |
(Optional) A vector of base weights (or its name). |
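future |
An integer indicating the number of future treatments in the balancing conditions. When future > 0, the residuals of each time-varying confounder are also balanced across future treatments. |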
max_iter |
Maximum number of iterations for Newton's method in entropy minimization. |
tol |
Tolerance parameter used to determine convergence in entropy minimization.
See documentation for eb2. |
print_level |
The level of printing. See documentation for eb2. |
A list containing the results.
weights |
A data frame containing the unit id variable and residual balancing weights. |
constraints |
A matrix of (linearly independent) residual balancing constraints |
eb_out |
Results from calling the eb2 function. |
call |
The matched call. |
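Because the confounder models condition on past treatments and past confounders, a long-format data frame typically needs lagged columns before the models are fitted. The following sketch shows one way to build within-unit lags in base R. The data frame panel, its columns, and the helper lag1 are hypothetical, and how first-period lags should be filled is a modeling decision that this sketch leaves as NA.

```r
# Toy long-format panel: one row per unit-period; all names are hypothetical
panel <- data.frame(
  id   = rep(1:3, each = 4),
  time = rep(1:4, times = 3),
  a    = c(0, 1, 1, 0,  1, 1, 0, 0,  0, 0, 1, 1),   # time-varying treatment
  x    = c(5, 6, 4, 7,  3, 2, 4, 5,  8, 7, 9, 6)    # time-varying confounder
)

# Sort by unit and period so that lags follow the time order
panel <- panel[order(panel$id, panel$time), ]

# Within-unit lag: shift values down by one period, NA in the first period
lag1 <- function(v) c(NA, v[-length(v)])
panel$a_l1 <- ave(panel$a, panel$id, FUN = lag1)
panel$x_l1 <- ave(panel$x, panel$id, FUN = lag1)

print(panel)
```

Using ave() keeps the lag within each unit, so the last period of one unit never leaks into the first period of the next.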
# models for time-varying confounders
m1 <- lm(dem.polls ~ (d.gone.neg.l1 + dem.polls.l1 + undother.l1) * factor(week),
  data = campaign_long)
m2 <- lm(undother ~ (d.gone.neg.l1 + dem.polls.l1 + undother.l1) * factor(week),
  data = campaign_long)

xmodels <- list(m1, m2)

# residual balancing weights
rbwPanel_fit <- rbwPanel(treatment = d.gone.neg, xmodels = xmodels,
  id = id, time = week, data = campaign_long)

summary(rbwPanel_fit$weights)

# merge weights into wide-format data
campaign_wide2 <- merge(campaign_wide, rbwPanel_fit$weights, by = "id")

# fit a marginal structural model (adjusting for baseline confounders)
if(require(survey)){
  rbw_design <- svydesign(ids = ~ 1, weights = ~ rbw, data = campaign_wide2)
  msm_rbwPanel <- svyglm(demprcnt ~ cum_neg * deminc + camp.length +
    factor(year) + office, design = rbw_design)
  summary(msm_rbwPanel)
}
rbwPoint is a function that produces residual balancing weights in a point treatment setting. It takes a set of baseline confounders and computes the residuals for each confounder by centering it around its sample mean. The weights can be used to fit marginal structural models to estimate the average treatment effect (ATE).
rbwPoint(treatment, data, baseline_x, base_weights, max_iter = 200, tol = 1e-04, print_level = 1)
treatment |
A symbol or character string for the treatment variable in data. |
data |
A data frame containing all variables in the model. |
baseline_x |
An expression for a set of baseline confounders stored in data. |
base_weights |
(Optional) A vector of base weights (or its name). |
max_iter |
Maximum number of iterations for Newton's method in entropy minimization. |
tol |
Tolerance parameter used to determine convergence in entropy minimization.
See documentation for eb2. |
print_level |
The level of printing. See documentation for eb2. |
A list containing the results.
weights |
A vector of residual balancing weights. |
constraints |
A matrix of (linearly independent) residual balancing constraints |
eb_out |
Results from calling the eb2 function. |
call |
The matched call. |
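The centering step in the point treatment setting can be sketched on simulated data. The exact form of the balancing constraints is an assumption here: the sketch takes them to be the centered confounders and their products with the treatment, which is one reading of balancing residuals across the treatment. The data frame toy, the variables x1, x2, and a, and the helper imbalance are hypothetical.

```r
# Toy data; all variable names here are hypothetical
set.seed(3)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = runif(n))
toy$a <- 0.6 * toy$x1 + rnorm(n)    # continuous treatment confounded by x1

# Residuals in a point treatment setting: confounders centered at sample means
X <- as.matrix(toy[, c("x1", "x2")])
X_res <- scale(X, center = TRUE, scale = FALSE)

# Assumed balancing targets: residuals and residuals times the treatment
constraints <- cbind(X_res, X_res * toy$a)
colnames(constraints) <- c("x1", "x2", "x1:a", "x2:a")

# Weighted moments of the constraints; all should be near zero after weighting
imbalance <- function(w) as.vector(crossprod(constraints, w / sum(w)))

uniform_gap <- imbalance(rep(1, n))
names(uniform_gap) <- colnames(constraints)
print(round(uniform_gap, 3))
```

Under uniform weights the first two moments are zero by construction, while the x1:a moment reflects the confounding built into the simulation. A vector of weights such as the one returned by rbwPoint could be passed to imbalance() to check how far those moments move toward zero.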
# residual balancing weights
rbwPoint_fit <- rbwPoint(treat, baseline_x = c(log_TotalPop, PercentOver65,
  log_Inc, PercentHispanic, PercentBlack, density, per_collegegrads, CanCommute),
  data = advertisement)

# attach residual balancing weights to data
advertisement$rbw_point <- rbwPoint_fit$weights

# fit marginal structural model
if(require(survey)){
  rbw_design <- svydesign(ids = ~ 1, weights = ~ rbw_point, data = advertisement)
  # the outcome model includes the treatment, the square of the treatment,
  # and state-level fixed effects (Fong, Hazlett, and Imai 2018)
  msm_rbwPoint <- svyglm(Cont ~ treat + I(treat^2) + factor(StFIPS), design = rbw_design)
  summary(msm_rbwPoint)
}