fixes

Overview

Note
The fixes package currently supports only data with annual time intervals.
For datasets with finer intervals, such as monthly or quarterly data, create a new column of sequential time numbers (e.g., 1, 2, 3, …) that represents the time order, and use that column as the time variable in the analysis.
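
As a minimal sketch of the workaround described in the note, the following base R snippet builds a sequential time index from separate year and month columns. The data frame and the column names (id, year, month, time_id) are hypothetical and only serve as an illustration.

```r
# Hypothetical monthly panel: two units observed from Nov 2019 to Feb 2020.
df <- data.frame(
  id    = rep(1:2, each = 4),
  year  = rep(c(2019, 2019, 2020, 2020), times = 2),
  month = rep(c(11, 12, 1, 2), times = 2)
)

# Months elapsed since the first observed period, shifted so the index starts at 1.
months_total <- df$year * 12 + df$month
df$time_id   <- months_total - min(months_total) + 1

df$time_id
```

Counting elapsed months, rather than ranking the observed dates, keeps gaps in the index when a month is missing from the data. The new column would then be supplied as the time variable, with the treatment timing expressed on the same index rather than as a calendar year.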

The fixes package runs event study analyses and plots the results. Event studies are commonly used to assess the parallel trends assumption in two-way fixed effects (TWFE) difference-in-differences (DID) analysis.
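
For reference, a common textbook way to write an event study specification with two-way fixed effects is sketched below. The notation is an assumption made for illustration and is not taken from the package documentation: D_i is the treatment dummy, T is the treatment timing, k is the relative period, k_0 is the omitted baseline period (for example -1), and L and K are the numbers of leads and lags.

```latex
y_{it} = \alpha_i + \lambda_t
       + \sum_{\substack{k = -L \\ k \neq k_0}}^{K} \beta_k \, D_i \cdot \mathbf{1}[t - T = k]
       + \varepsilon_{it}
```

Under this reading, pre-treatment coefficients (k < 0) that are close to zero are consistent with parallel trends, while the post-treatment coefficients trace out the effect over time.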

The package includes two main functions:

  1. run_es(): Accepts a data frame, generates lead and lag variables, and performs event study analysis. The function returns the results as a data frame.
  2. plot_es(): Creates plots using ggplot2 based on the data frame generated by run_es(). Users can choose between a plot with geom_ribbon() or geom_errorbar() to visualize the results.

Installation

You can install the package like so:

# install.packages("pak")
pak::pak("fixes")

or

install.packages("fixes")

To install the development version, install it from the GitHub repository:

pak::pak("yo5uke/fixes")

How to use

First, load the library.

library(fixes)

Data frame

The data frame to be analyzed must include the following variables:

  1. A variable to identify individuals.
  2. A dummy variable indicating treated individuals (e.g., is_treated).
  3. A variable representing time (e.g., year).
  4. An outcome variable.

For example, a data frame like the following:

firm_id state_id year is_treated y
1 21 1980 1 0.8342158
1 21 1981 1 -0.5354355
1 21 1982 1 1.1372828
1 21 1983 1 0.7339165
1 21 1984 1 1.4232840
1 21 1985 1 1.2783362
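
To experiment without real data, a data frame with this layout can be simulated in base R. The sketch below is only an illustration: the numbers of firms and years, the state assignment, and the effect size of 1 from 1998 onward are all made up.

```r
set.seed(1)  # reproducible draws

n_firms <- 6
years   <- 1995:2001

# One row per firm-year combination (year varies fastest).
df <- expand.grid(year = years, firm_id = seq_len(n_firms))

# Hypothetical assignment: two firms per state, first half of the firms treated.
df$state_id   <- (df$firm_id - 1) %/% 2 + 1
df$is_treated <- as.integer(df$firm_id <= n_firms / 2)

# Outcome: noise plus a made-up effect of 1 for treated firms from 1998 onward.
df$y <- rnorm(nrow(df)) + 1 * df$is_treated * (df$year >= 1998)

# Reorder the columns to match the layout shown above.
df <- df[, c("firm_id", "state_id", "year", "is_treated", "y")]
head(df)
```

Each firm appears once per year, and is_treated is constant within a firm, which matches the four required variables listed above plus a state identifier for clustering.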

run_es()

run_es() has twelve arguments.

Argument     Description
data         Data frame to be used.
outcome      Outcome variable. Can be specified as a raw variable or a transformation (e.g., log(y)). Provide it unquoted.
treatment    Dummy variable indicating the treated units. Provide it unquoted.
time         Time variable. Provide it unquoted.
timing       Time value indicating when the treatment occurs.
lead_range   Number of pre-treatment periods to include (e.g., 3 = lead3, lead2, lead1).
lag_range    Number of post-treatment periods to include (e.g., 2 = lag0, lag1, lag2).
covariates   Additional covariates to include in the regression. Must be a one-sided formula (e.g., ~ x1 + x2).
fe           Fixed effects to control for unobserved heterogeneity. Must be a one-sided formula (e.g., ~ id + year).
cluster      Specifies clustering for standard errors. Can be a character vector (e.g., c("id", "year")) or a formula (e.g., ~ id + year).
baseline     Relative time value to be used as the reference category. The corresponding dummy is excluded from the regression. Must lie within the lead/lag range.
interval     Time interval between observations (e.g., 1 for yearly data, 5 for 5-year intervals).
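
To make the roles of timing, lead_range, lag_range, baseline, and interval concrete, the following base R sketch constructs lead and lag dummies by hand. It is an illustration under stated assumptions, not the package's internal code: the toy data and the helper names (rel_time, periods) are hypothetical, and only the lead/lag naming follows the table above.

```r
# Hypothetical inputs mirroring the run_es() arguments described above.
timing     <- 1998
lead_range <- 3
lag_range  <- 2
baseline   <- -1
interval   <- 1

# Toy data: one treated and one untreated unit, observed 1995-2000.
df <- data.frame(
  id         = rep(1:2, each = 6),
  year       = rep(1995:2000, times = 2),
  is_treated = rep(c(1, 0), each = 6)
)

# Relative time: number of periods between each observation and the treatment.
df$rel_time <- (df$year - timing) / interval

# One dummy per relative period in the window, skipping the baseline period.
periods <- setdiff(seq(-lead_range, lag_range), baseline)
for (k in periods) {
  name <- if (k < 0) paste0("lead", -k) else paste0("lag", k)
  df[[name]] <- as.integer(df$is_treated == 1 & df$rel_time == k)
}

names(df)
```

With baseline = -1, no lead1 column is created, so every remaining coefficient would be read relative to the period just before treatment. Dividing by interval means that, with 5-year data and timing = 1998, the years 1988, 1993, and 1998 would map to relative times -2, -1, and 0.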

Then, perform the analysis as follows:

event_study <- run_es(
  data       = df, 
  outcome    = y, 
  treatment  = is_treated, 
  time       = year, 
  timing     = 1998, 
  lead_range = 5, 
  lag_range  = 5, 
  fe         = ~ firm_id + year, 
  cluster    = ~ state_id, 
  baseline   = -1, 
  interval   = 1
)

Note: The fe argument must be a one-sided formula (e.g., ~ firm_id + year). The cluster argument can be either a one-sided formula (e.g., ~ state_id) or a character vector (e.g., "state_id").

run_es() returns the event study results as a tidy data frame [1].

You can use this data to create your own plots, but fixes also provides convenient plotting functions.
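
As a rough sketch of a hand-made plot, the snippet below draws point estimates with confidence intervals in base R graphics (chosen here only to keep the example free of dependencies). The data frame is a stand-in with placeholder numbers, and the column names relative_time, estimate, conf_low, and conf_high are assumptions; check names(event_study) and adapt them before use.

```r
# Stand-in for the run_es() output. Column names and values are hypothetical
# placeholders, not real results.
es <- data.frame(
  relative_time = -2:2,
  estimate      = c(0.05, 0.00, 0.40, 0.55, 0.60),
  conf_low      = c(-0.15, 0.00, 0.15, 0.30, 0.30),
  conf_high     = c(0.25, 0.00, 0.65, 0.80, 0.90)
)

# Point estimates, with the y-axis wide enough to hold the intervals.
plot(
  es$relative_time, es$estimate,
  type = "b", pch = 19,
  ylim = range(es$conf_low, es$conf_high),
  xlab = "Relative time", ylab = "Estimate"
)

# Vertical segments for the confidence intervals.
segments(es$relative_time, es$conf_low, es$relative_time, es$conf_high)

# Reference lines: zero effect and the treatment period.
abline(h = 0, lty = 2)
abline(v = 0, lty = 3)
```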

To include covariates, pass them as a one-sided formula:

event_study <- run_es(
  data       = df, 
  outcome    = y, 
  treatment  = is_treated, 
  time       = year, 
  timing     = 1998, 
  lead_range = 5, 
  lag_range  = 5, 
  covariates = ~ cov1 + cov2 + cov3, 
  fe         = ~ firm_id + year, 
  cluster    = ~ state_id, 
  baseline   = -1, 
  interval   = 1
)

plot_es()

The plot_es() function creates a plot based on ggplot2.

plot_es() has 12 arguments.

Argument      Description
data          Data frame created by run_es()
type          The type of confidence interval visualization: "ribbon" (default) or "errorbar"
vline_val     The x-intercept for the vertical reference line (default: 0)
vline_color   Color for the vertical reference line (default: "#000")
hline_val     The y-intercept for the horizontal reference line (default: 0)
hline_color   Color for the horizontal reference line (default: "#000")
linewidth     The width of the lines for the plot (default: 1)
pointsize     The size of the points for the estimates (default: 2)
alpha         The transparency level for ribbons (default: 0.2)
barwidth      The width of the error bars (default: 0.2)
color         The color for the lines and points (default: "#B25D91FF")
fill          The fill color for ribbons (default: "#B25D91FF")

If the defaults are sufficient, pass the data frame created by run_es() and the plot is complete.

plot_es(event_study)

plot_es(event_study, type = "errorbar")

plot_es(event_study, type = "errorbar", vline_val = -.5)

Because the plot is built with ggplot2, you can adjust details by adding ggplot2 layers.

plot_es(event_study, type = "errorbar") + 
  ggplot2::scale_x_continuous(breaks = seq(-5, 5, by = 1)) + 
  ggplot2::ggtitle("Result of Event Study")

Debugging

If you find an issue, please report it on the GitHub Issues page.


  1. Behind the scenes, fixest::feols() is used for estimation.