Data Science for Economists
2026-03-01
Event: 13 March 2024 – European Parliament adopts the EU Artificial Intelligence Act.
Google (AI deployer)
Stock rose – less strict regulation than expected?
Nvidia (AI infrastructure)
Stock fell – signal of weaker AI demand ahead?
Why do two companies involved in AI move in opposite directions?
How can we tell if an event truly changed something?
Our toolkit:
We start from time. Then move to causality.
Why events in time matter
Event data is any measurement tied to an event, that is, to something that happened at a known point in time
Parsing, aligning, and manipulating timestamps (lubridate)
\[ \boxed{\;\text{datum} = (\text{ID},\, \textcolor{#e5b567}{t},\, \textcolor{#9e86c8}{\text{attributes}})\;} \]
From now on every method we use must respect this ordering.
Year --- Quarter --- Month --- Day --- Hour --- Minute --- Second --- Tick
(coarsest on the left, finest on the right)
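A minimal sketch of moving along this granularity scale, assuming lubridate is installed. The timestamp is a made-up example and the UTC time zone is an assumption chosen for illustration.

```r
library(lubridate)

# Hypothetical tick timestamp (value and time zone are illustrative)
tick <- ymd_hms("2024-03-13 14:37:52", tz = "UTC")

# Coarsen the same instant step by step, from minute up to year
units <- c("minute", "hour", "day", "month", "quarter", "year")
coarsened <- lapply(units, function(u) floor_date(tick, unit = u))
names(coarsened) <- units

# Print one line per granularity level
for (u in units) {
  cat(sprintf("%-8s %s\n", u, format(coarsened[[u]], "%Y-%m-%d %H:%M:%S")))
}
```

Coarsening always discards information, so it is usually safer to store the finest timestamp available and aggregate later.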
Base R stores time as Date (days) and POSIXct/POSIXlt (date–times, seconds).
The lubridate package (tidyverse) provides a grammar for working with them.
Equally spaced
Observations at fixed intervals (daily close, monthly GDP).
Irregular / event-driven
Observations arrive at uneven times (trades, tweets, sensor pings).
Use HF data only when theory needs sub-daily resolution, and always report how you filtered and aligned the raw feed.
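One way to make the "filter and align" step explicit is sketched below. The trades are invented, and the rule (last price per one-minute bar, gaps filled by carrying the previous price forward) is only one of several defensible choices.

```r
library(lubridate)

# Hypothetical irregular trades (timestamps and prices are made up)
trades <- data.frame(
  time  = ymd_hms(c("2024-03-13 09:30:02", "2024-03-13 09:30:41",
                    "2024-03-13 09:31:15", "2024-03-13 09:33:07",
                    "2024-03-13 09:33:59"), tz = "UTC"),
  price = c(100.0, 100.2, 100.1, 99.8, 99.9)
)
trades <- trades[order(trades$time), ]

# Assign every trade to its one-minute bar
trades$bar <- floor_date(trades$time, "minute")

# Alignment rule: keep the last trade observed in each bar
last_in_bar <- !duplicated(trades$bar, fromLast = TRUE)
bars <- trades[last_in_bar, c("bar", "price")]

# Build the regular grid and expose the minutes without any trade
grid <- data.frame(bar = seq(min(bars$bar), max(bars$bar), by = "1 min"))
regular <- merge(grid, bars, by = "bar", all.x = TRUE)

# Fill rule: carry the last observed price forward, and flag filled bars
regular$filled <- is.na(regular$price)
idx <- cumsum(!regular$filled)
regular$price_filled <- regular$price[!regular$filled][idx]

print(regular)
```

Keeping the `filled` flag in the output makes it easy to report how many bars were imputed rather than observed.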
zoo (Zeileis, Grothendieck)
rollmean(), rollapply()
xts, quantmod
lubridate (part of tidyverse)
Date/POSIX objects
year(), month(), wday()
dplyr, ggplot2
Use zoo for time-series math. Use lubridate to parse, clean, and wrangle timestamps.
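A short sketch of that division of labour, assuming both packages are installed. The prices are invented, and the 3-day window is an arbitrary choice for illustration.

```r
library(zoo)
library(lubridate)

# Hypothetical daily closes for ten consecutive calendar days
dates  <- ymd("2024-03-04") + 0:9
prices <- c(10, 11, 12, 11, 13, 14, 13, 15, 16, 15)

# zoo: time-series math on an ordered index
z    <- zoo(prices, order.by = dates)
ma3  <- rollmean(z, k = 3, align = "right")              # trailing 3-day mean
rng3 <- rollapply(z, width = 3, align = "right",
                  FUN = function(x) max(x) - min(x))     # trailing 3-day range

# lubridate: pull calendar components out of the timestamps
parts <- data.frame(
  date    = dates,
  year    = year(dates),
  month   = month(dates),
  weekday = wday(dates)   # 1 = Sunday under the default week start
)

print(ma3)
print(head(parts, 3))
```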
Isolating causal effects from time series
The event study is probably the oldest and simplest causal inference research design
Fama calls event studies a test of how quickly security prices reflect public information announcements (Fama 1991, p. 1576).
(\(\neq\) Marketing lit: assume market efficiency to measure the value of campaigns, …)
Treatment \(\longrightarrow\) Outcome
A DAG is a map of our model assumptions – not data.
Z
\(\swarrow\) \(\searrow\)
T \(\longrightarrow\) Y
Controlling for Z blocks the confounding path and helps isolate the causal effect.
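A small simulation can illustrate the point. Everything here is assumed for the sketch: the data are simulated, the true effect of T on Y is set to 1, and Z affects both T and Y linearly.

```r
set.seed(42)
n <- 5000

# Z is a common cause of treatment and outcome (the fork Z -> T, Z -> Y)
z     <- rnorm(n)
treat <- 0.8 * z + rnorm(n)
y     <- 1.0 * treat + 1.5 * z + rnorm(n)   # true effect of T on Y is 1

# Naive regression leaves the path T <- Z -> Y open
naive <- coef(lm(y ~ treat))["treat"]

# Adding Z to the regression blocks that path
adjusted <- coef(lm(y ~ treat + z))["treat"]

round(c(naive = unname(naive), adjusted = unname(adjusted)), 3)
```

The naive coefficient should land well above 1, while the adjusted one should be close to 1. This only works because Z is observed and the DAG is correct, which is an assumption rather than something the data can confirm.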
Survey to SME: “roughly how much cash (e.g. in savings, checking) do you have access to without seeking further loans or money from family or friends to pay for your business?”
Bartik et al. (2020), The impact of COVID-19 on small business outcomes
Would the firms that went bankrupt have gone bankrupt even without the pandemic?
Boeing stock plunges again after coronavirus bailout quest spooks investors
Would this happen even without the bailout? What does the red line tell you?
library(tidyverse)
library(lubridate)
library(quantmod) # pulls Yahoo Finance data
# 1. Download daily Boeing prices covering the weeks around the event
getSymbols("BA", from = "2020-01-01", to = "2020-04-26", src = "yahoo")
ba <- fortify.zoo(BA) |>
rename(date = Index) |>
mutate(date = as_date(date))
# 2. Mark the event
event <- ymd("2020-02-20") # Event 1: bailout shock
window <- ba |> filter(between(date, event - days(30), event + days(60)))
# 3. Plot the closing price with both events marked
ggplot(window, aes(date, BA.Close)) +
geom_line(colour = "steelblue") +
geom_vline(xintercept = as.numeric(event), linetype = "dashed",
colour = "orange", linewidth = .6) +
geom_vline(xintercept = as.numeric(ymd("2020-03-11")), linetype = "dashed",
colour = "red", linewidth = .6) +
labs(
title = "Boeing around the Bailout",
y = "Close price (USD)",
caption = "Event 1: Bailout shock | Event 2: WHO declares COVID-19 a pandemic (11 Mar 2020)"
)
On 2 February 2022, Meta (FB) reported that its global daily active users had declined from the previous quarter for the first time, to 1.929 billion from 1.930 billion.
Event Identification:
Pick an estimation period
Pick an observation period
Use the data from the estimation period to estimate a model predicting stock returns in each period:
\[R = \alpha + \beta R_{M} + \epsilon \qquad \hat{R} = E[R \mid R_{M}]\]
library(tidyverse); library(lubridate)
sp500 <- read_csv("~/08-event-study/sp_500.csv")
meta <- read_csv("~/08-event-study/META.csv")
event <- ymd("2022-02-02")
# Daily simple returns (close to close); keep Date as a Date so filters compare dates
sp500 <- sp500 %>%
  mutate(return_sp_500 = (Close - lag(Close)) / lag(Close),
         Date = as.Date(Date))
meta <- meta %>%
  mutate(return_meta = (Close - lag(Close)) / lag(Close),
         Date = as.Date(Date))
# Join the two return series on the trading date
returns <- left_join(sp500, meta, by = "Date") %>%
  select(Date, return_meta, return_sp_500) %>%
  drop_na()
# Create estimation data set
est_data <- returns %>%
  filter(Date < event - days(4))
# And observation data (taken from the full series so that it includes the event)
obs_data <- returns %>%
  filter(Date >= event - days(15) & Date <= event + days(7))
# Estimate a model predicting stock return with market return
m <- lm(return_meta ~ return_sp_500, data = est_data)
# Get AR
obs_data <- obs_data %>%
  # Using mean of estimation return
  mutate(AR_mean = return_meta - mean(est_data$return_meta),
         # Then comparing to market return
         AR_market = return_meta - return_sp_500,
         # Then using model fit with estimation data
         risk_predict = predict(m, newdata = obs_data),
         AR_risk = return_meta - risk_predict)
# Graph the results
ggplot(obs_data, aes(x = Date, y = AR_risk, group = 1)) +
  geom_line(color = "steelblue") +
  geom_vline(xintercept = as.numeric(event),
             linetype = "dashed", color = "yellow") +
  ylab("Abnormal Return") + xlab("Date")
Meta (FB) global daily active users declined from the previous quarter for the first time, to 1.929 billion from 1.930 billion.
What we observe: META’s stock dropped sharply after Feb 2, 2022 – but the abnormal return lasted only 1–2 days.
Why? Efficient Markets Digest News Quickly
Key idea: Abnormal return captures the difference from expected return, not the full price level.
Abnormal return is short-lived because markets are fast. No new surprise, no new abnormal return.
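The same logic can be checked on simulated data, where the surprise is known by construction. In this sketch all numbers are assumptions: a market beta of 1.2, a one-off shock of -10% on day 0, and an 11-day event window.

```r
set.seed(1)
n_est <- 120                 # estimation days
n_obs <- 11                  # event window: days -5 to +5
n     <- n_est + n_obs
event_idx <- n_est + 6       # position of day 0

# Simulated market and stock returns (market model with beta = 1.2)
r_m <- rnorm(n, mean = 0.0005, sd = 0.01)
r   <- 0.0002 + 1.2 * r_m + rnorm(n, sd = 0.008)
r[event_idx] <- r[event_idx] - 0.10   # one-off surprise on day 0 only

d   <- data.frame(day = seq_len(n) - event_idx, r = r, r_m = r_m)
est <- d[d$day < -5, ]       # estimation period
obs <- d[d$day >= -5, ]      # observation period

# Expected return from the market model, fitted on the estimation period only
fit <- lm(r ~ r_m, data = est)

# Abnormal return, cumulative abnormal return, and a rough t-statistic
obs$ar     <- obs$r - predict(fit, newdata = obs)
obs$car    <- cumsum(obs$ar)
obs$t_stat <- obs$ar / sd(resid(fit))

print(round(obs[, c("day", "ar", "car", "t_stat")], 3))
```

The abnormal return spikes on day 0 and returns to noise afterwards, while the cumulative abnormal return stays shifted: the price level remains lower even though no further abnormal returns occur.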
\[ Y_t = \beta_0 + \beta_1 t + \beta_2 \text{After}_t + \beta_3 (t \times \text{After}_t) + \varepsilon_t \]
When to use it? Any intervention that keeps working over time: regulations, infrastructure, training programmes.
Serial correlation is inevitable – report HAC/Newey-West SEs.
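A hedged sketch of this segmented regression with Newey-West standard errors, assuming the sandwich and lmtest packages are installed. The series is simulated with AR(1) errors, and the intervention date (period 60), the coefficients, and the lag length of 4 are all illustrative assumptions.

```r
library(sandwich)  # NeweyWest()
library(lmtest)    # coeftest()

set.seed(123)
n     <- 120
time  <- 1:n
after <- as.integer(time > 60)     # hypothetical intervention after period 60

# Serially correlated errors (AR(1) with coefficient 0.6)
eps <- as.numeric(arima.sim(model = list(ar = 0.6), n = n, sd = 1))

# Outcome generated from the interrupted time series equation
y <- 10 + 0.05 * time + 2 * after + 0.10 * time * after + eps

# Segmented regression: trend, shift, and change in trend
fit <- lm(y ~ time + after + time:after)

# HAC (Newey-West) covariance matrix; lag = 4 is an arbitrary illustrative choice
nw <- NeweyWest(fit, lag = 4, prewhite = FALSE)

coeftest(fit)                # conventional standard errors
coeftest(fit, vcov. = nw)    # Newey-West standard errors
```

With time entered uncentered, the coefficient on `after` is the shift extrapolated back to t = 0. Centering time at the intervention date makes that coefficient read as the jump at the intervention itself.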
Policy introduced mid-2010 to improve pre-hospital care for heart attack / stroke.