jwdid: Flexible Estimation of Staggered DID Designs
February 17, 2025
In this framework, the treatment effect (TE) for unit \(i\) is defined as:
\[\theta_i = y_{i,1}(1) - y_{i,1}(0)\]
This unit-level effect cannot be observed, so on its own it may not be useful. Instead, we focus on something different: the average treatment effect on the treated (ATET or ATT).
\[\begin{aligned} ATT &= E[\theta_i | D_i = 1] = E[y_{i,1}(1) - y_{i,1}(0) | D_i = 1] \\ &=E_1[\theta_i] = E_1[y_{i,1}(1)] - E_1[ y_{i,1}(0) ] \end{aligned} \]
But this is still not identified, because \(E_1[y_{i,1}(0)]\) is not observed.
How can we calculate \(E_1[y_{i,1}(0)] = E(y_{i,1}(0)|D=1)\)?
First, decompose it, where \(\lambda_i = y_{i,1}(0) - y_{i,0}(0)\) is the change in the untreated outcome: \[E_1(y_{i,1}(0))=E_1(y_{i,0}(0)) + E_1(\lambda_i)\]
Under no anticipation: \(E_1(y_{i,0}(0))=E_1(y_{i,0}(1))=E_1(y_{i,0})\)
Under parallel trends: \(E_1(\lambda_i)=E_0(\lambda_i)=E_0(y_{i,1}-y_{i,0})\)
Thus putting it all together:
\[\begin{aligned} ATT &= E_1[y_{i,1}] - [ E_1(y_{i,0}) + (E_0[y_{i,1}]-E_0[y_{i,0}]) ] \\ &= [E_1[y_{i,1}] - E_1(y_{i,0})] - [ E_0[y_{i,1}]-E_0[y_{i,0}] ] \end{aligned} \]
Setting aside the estimation of standard errors, the ATT can be estimated by simply comparing average outcomes for the treated and control groups before and after treatment, or by using the following regression:
\[y_{it} = \alpha + \beta D_i + \gamma t + \theta (D_i \times t) + \epsilon_{it} \]
Where \(\theta\) is the ATT.
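The equivalence between the four-means comparison and the regression can be checked on simulated data. The sketch below is only an illustration: the variable names (id, D, t, y), the sample size, and the data-generating process with a true ATT of 2 are all assumptions made for the example, not part of the original material.

```stata
* Hypothetical 2x2 design: 500 control units, 500 treated units, 2 periods
clear
set seed 12345
set obs 1000
gen id = _n
* D = 1 for the treated group
gen D = id > 500
* Two rows per unit: t = 0 (pre) and t = 1 (post)
expand 2
bysort id: gen t = _n - 1
* Assumed DGP: group gap 0.5, common trend 0.3, true ATT 2
gen y = 1 + 0.5*D + 0.3*t + 2*D*t + rnormal()

* ATT from the four group-period means
summarize y if D == 1 & t == 1, meanonly
scalar m11 = r(mean)
summarize y if D == 1 & t == 0, meanonly
scalar m10 = r(mean)
summarize y if D == 0 & t == 1, meanonly
scalar m01 = r(mean)
summarize y if D == 0 & t == 0, meanonly
scalar m00 = r(mean)
display "DID from means:      " (m11 - m10) - (m01 - m00)

* Same quantity as the coefficient on the interaction term
regress y i.D##i.t, vce(cluster id)
display "DID from regression: " _b[1.D#1.t]
```

Because the regression is saturated in D and t, the interaction coefficient reproduces the difference of mean differences exactly; only the standard errors require additional care.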
Answering some of these tough questions!
\[y_{it} = \gamma t + \theta (D_i \times t) + \beta_i + \epsilon_{it} \]
\[y_{it} = \theta (W_{it}) + \beta_i + \gamma_t + \epsilon_{it} \]
\[y_{it} = \theta (W_{it}) + \beta_i + \gamma_t + \delta X_i + \epsilon_{it}\]
\[\begin{aligned} \text{PTA}&: E_1(y_{i,1}-y_{i,0})=E_0(y_{i,1}-y_{i,0}) \\ \text{CPTA}&: E_1(y_{i,1}-y_{i,0}|X)=E_0(y_{i,1}-y_{i,0}|X) \end{aligned} \]
\[ATT(X) = E_1[y_{i,1}|X] - E_1[y_{i,0}|X] - [E_0[y_{i,1}|X] - E_0[y_{i,0}|X]] \]
\[\begin{aligned} y_{it} &= \alpha &&+ \beta D &&+ \gamma t &&+ \theta (D \times t) \\ &+ \lambda X &&+ \lambda_D X \times D &&+ \lambda_T X \times t &&+ \lambda_{DT} \color{blue}{\tilde X} (D \times t) \\ &+ \epsilon_{it} \end{aligned} \]
csdid[2] forces you to use time-fixed characteristics with panel data, but not with repeated cross-section (RC) data.
jwdid does not impose this restriction.
\[y_{it} = \beta_i + \gamma_t + \theta W_{it} + \epsilon_{it}\]
Commands: did2s and did_imputation; csdid[2]; jwdid.
Wrong:
\[y_{it} = \beta_i + \lambda_t + \theta (W_{it}) + \epsilon_{it}\]
Right:
\[y_{it} = \beta_i + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \epsilon_{it}\]
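To make the double sum concrete, consider a hypothetical design (assumed here purely for illustration) with periods \(t = 1, 2, 3\), a never-treated group, and two treated cohorts first treated at \(g = 2\) and \(g = 3\). Reading \(\mathbb{1}(g, t)\) as the indicator that unit \(i\) belongs to cohort \(g\) and the observation is from period \(t\), the specification expands to three cohort-period effects:

\[y_{it} = \beta_i + \lambda_t + \theta_{22} \mathbb{1}(2, 2) + \theta_{23} \mathbb{1}(2, 3) + \theta_{33} \mathbb{1}(3, 3) + \epsilon_{it}\]

Each \(\theta_{gt}\) is its own coefficient, in contrast with the single \(\theta\) of the specification labelled wrong.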
\[y_{it} = \beta_i + \lambda_t + \sum_{g=g_0}^G \sum_{t=t_0}^{t=g-2} \theta_{gt} \mathbb{1}(g, t) + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \epsilon_{it}\]
csdid[2] without controls and with a balanced panel:
\[y_{it} = \beta_g + \lambda_t + \color{red}{\sum_{g=g_0}^G \sum_{t=t_0}^{t=g-2} \theta_{gt} \mathbb{1}(g, t)} + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \epsilon_{it}\]
jwdid may produce too many ATT(G,T) to analyze \(\rightarrow\) aggregate!
Simple (estat simple): \(ATT = \frac{\sum_g \sum_t \theta_{gt} \omega(g, t) \mathbb{1}(t\geq g)}{\sum_g \sum_t \omega(g, t) \mathbb{1}(t\geq g)}\)
Group (estat group): \(ATT(g) = \frac{\sum_{t} \theta_{gt} \omega(g, t)\mathbb{1}(t\geq g)}{\sum_{t} \omega(g, t)\mathbb{1}(t\geq g)}\)
Time (estat calendar): \(ATT(t) = \frac{\sum_{g} \theta_{gt} \omega(g, t)\mathbb{1}(t\geq g)}{\sum_{g} \omega(g, t)\mathbb{1}(t\geq g)}\)
Event (estat event): \(ATT(e) =\frac{\sum_g \sum_t \theta_{gt} \omega(g, t) \mathbb{1}(t-g=e) }{\sum_g \sum_t \omega(g, t)\mathbb{1}(t-g=e)}\)
where \(\omega(g, t)\) is the weight (the total number of units in group \(g\) observed at time \(t\)).
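A small worked example may help fix ideas. The numbers are made up for illustration: two cohorts (\(g = 2, 3\)), last period \(T = 3\), effects \(\theta_{22} = 1\), \(\theta_{23} = 2\), \(\theta_{33} = 3\), and weights \(\omega(2,2) = \omega(2,3) = 100\), \(\omega(3,3) = 50\).

\[\begin{aligned}
\text{Simple}: \ ATT &= \frac{100(1) + 100(2) + 50(3)}{100 + 100 + 50} = \frac{450}{250} = 1.8 \\
\text{Group}: \ ATT(g=2) &= \frac{100(1) + 100(2)}{200} = 1.5, \quad ATT(g=3) = \frac{50(3)}{50} = 3 \\
\text{Time}: \ ATT(t=2) &= \frac{100(1)}{100} = 1, \quad ATT(t=3) = \frac{100(2) + 50(3)}{150} \approx 2.33 \\
\text{Event}: \ ATT(e=0) &= \frac{100(1) + 50(3)}{150} \approx 1.67, \quad ATT(e=1) = \frac{100(2)}{100} = 2
\end{aligned}\]

The same three \(\theta_{gt}\) feed every aggregation; only the cells selected by the indicator and their weights change.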
Allowing for \(X\) heterogeneity is simple: consider a flexible model with interactions!
\[\begin{aligned} y_{it} &= \color{red}{\beta_0} &&+ \beta_i &&+ \lambda_t &&+ \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) &&+ \\ &\ \ \ \ \delta X_{it} &&+ \sum_g \delta_g X_{it} \mathbb{1}(g) &&+ \sum_t \delta_t X_{it} \mathbb{1}(t) &&+ \sum_{g=g_0}^G \sum_{t=g}^{t=T} \delta_{gt} \color{blue}{X_{it}} \mathbb{1}(g, t) &&+\\ &\ \ \ \ \epsilon_{it} \end{aligned} \]
Considerations:
In Stata, jwdid can also estimate both types of models:
* Demeaning Data (Θ is ATT(G,T))
jwdid y x1 i.x2, tvar(tvar) ivar(ivar) gvar(gvar) [never]
* Using X as is (Θ is not ATT(G,T))
* May be faster
jwdid y x1 i.x2, tvar(tvar) ivar(ivar) gvar(gvar) [never] xasis
What about xasis? The ATT(G,T) is then recovered as an average difference in predicted outcomes:
\[\begin{aligned} \theta_{gt} &= E[\hat y_{i,t}(X_{it},\mathbb{1}(g,t)=1) - \hat y_{i,t}(X_{it},\mathbb{1}(g,t)=0)|g,t] \end{aligned} \]
There are a few changes to consider:
\[\begin{aligned} y_{it}^* &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) \\ E(y_{it}) &= G(y_{it}^*) \end{aligned} \]
Thus we need a model appropriate for \(G()\) (logit, poisson, tobit, etc)
jwdid can estimate these types of models simply by using method().
jwdid will use cohort FE instead of individual FE.
ppmlhdfe (so far) would still add individual fixed effects.
The cre option is also available.
Note that \(\theta_{gt}\) is not the ATT(G,T) on the outcome, but on the latent variable.
However, one can request aggregations of the latent or outcome variable afterwards.
jwdid already incorporates this by using flexible specifications, but produces average ATTs.
With margins, it is possible to estimate ATTs for different discrete sub-groups (not continuous):
** setup
jwdid y i.x1 x2, tvar(tvar) ivar(ivar) gvar(gvar) never
** ATTs for specific groups
estat simple // average ATT
estat simple, over(x1) // ATT estimated for each group of x1
// For observations where x2 is between 0 and 1
estat [simple|calendar|group|event], ores( x2>0 & x2<1 )
// For observations where x1 is 0
estat [simple|calendar|group|event], ores( x1==0 )
The jwdid framework can be adapted to these scenarios, albeit with limitations.
It requires a variable, trtvar, that defines treatment intensity.
** setup
jwdid y i.x1 x2, tvar(tvar) ivar(ivar) gvar(gvar) trtvar(trtvar) [never]
** ATT aggregation
*** Estimates Treatment effect, assuming Full Intensity (T=1)
estat [simple|group|calendar|event]
*** Estimates Treatment effect, assuming intensity as observed
estat [simple|group|calendar|event], asis
Use ores() or over() to estimate ATTs for different sub-groups.
\[\begin{aligned} y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \sum_h \theta_{gth} \mathbb{1}(g, t, h) + \epsilon_{it} \end{aligned} \]
Estimation with jwdid is very simple:
** setup
jwdid y i.x1 x2, tvar(tvar) ivar(ivar) gvar(gvar) xattvar( trt_l trt_h) [never]
** Assume trt_m is the base treatment. trt_l and trt_h are potential treatments.
** The baseline treatment will be dropped.
estat [simple|group|calendar|event] provides a single ATT.
Using over() or ores(), one could estimate ATT heterogeneity.
An advantage of jwdid is that we do not need to be concerned with the estimation of standard errors.
Add vce(unconditional) to aggregation commands:
estat [simple|group|calendar|event], vce(unconditional)
This does not apply with reghdfe or ppmlhdfe as the estimation method.
Use regress cre or poisson cre instead (Stata does this).
jwdid has other “advanced” options that could further help model specification.
fevar(): Allows introducing FE other than Panel (only with reghdfe or ppmlhdfe)
\[\begin{aligned} y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \omega_j+ \epsilon_{it} \end{aligned} \]
exovar(): Variables that are interacted with neither treatment group \(G\), nor time \(T\), nor both.
\[\begin{aligned} y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) + \phi X_{it} + \epsilon_{it} \end{aligned} \]
xgvar() and xtvar(): Variables that will only interact with the group or the time fixed effects, respectively.
\[\begin{aligned}
y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) +
\sum_{g=g_0}^G \gamma_{g} \mathbb{1}(g) X_{it} + \epsilon_{it} \\
y_{it} &= \beta_g + \lambda_t + \sum_{g=g_0}^G \sum_{t=g}^{t=T} \theta_{gt} \mathbb{1}(g, t) +
\sum_{t=t_0}^T \gamma_{t} \mathbb{1}(t) X_{it} + \epsilon_{it}
\end{aligned}
\]
anticipation(#): Sets a different period as the baseline for the treatment. The default is 1, i.e. period g-1.
hettype(): Imposes some restrictions on the heterogeneity type. The default is timecohort.
Available types: time, cohort, event, twfe, eventcohort
And for event aggregation:
window(#1 #2): as-is window
cwindow(#1 #2): censored window
csdid, did_imputation, did_multiple_dyn and jwdid
jwdid is designed to be as flexible as possible.
If you are interested, you can install the latest version of jwdid using:
net install jwdid, from(https://raw.githubusercontent.com/friosavila/stpackages/main)
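Once installed, a first session might look like the sketch below. It runs on simulated data so that it does not depend on any external file. The variable names, the cohort structure, the coding of never-treated units as gvar = 0, and the assumed effect path (1 + 0.5 per period since treatment) are all assumptions made for this illustration. Only options that appear earlier in this talk are used, and the default estimator may additionally require reghdfe to be installed.

```stata
* Hypothetical staggered panel: 300 units, 8 periods
clear
set seed 101
set obs 300
gen id = _n
* Cohorts (assumed): never treated (coded 0), first treated in 4, first treated in 6
gen gvar = cond(id <= 100, 0, cond(id <= 200, 4, 6))
* Unit-level heterogeneity
gen ai = rnormal()
expand 8
bysort id: gen tvar = _n
* Treatment switches on at tvar >= gvar for treated cohorts
gen treated = gvar > 0 & tvar >= gvar
* Assumed DGP: effect is 1 at adoption and grows by 0.5 per period
gen y = ai + 0.2*tvar + treated*(1 + 0.5*(tvar - gvar)) + rnormal()

* Estimate cohort-period effects, then aggregate
jwdid y, ivar(id) tvar(tvar) gvar(gvar) never
estat simple
estat event
```

Under this assumed data-generating process, the post-treatment event-study estimates should lie close to 1 + 0.5e for event time e, which gives a quick sanity check that the setup is coded as intended.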
You can find me on
Oceania Stata Conference 2025