Data Science for Economists
2026-03-01
By the end of today you should be able to:
Storing and processing vast quantities of digital text has become increasingly affordable, which has triggered an explosion of empirical research that uses text as data.
Historical cost of computer memory and storage
Finance — Tetlock (2007): pessimism in the WSJ “Abreast of the Market” column predicts next-day stock market declines and subsequent reversals
Labour — Hershbein & Kahn (2018): job posting text shows skill requirements rose faster in cities hit hardest by the Great Recession
Political economy — Gentzkow & Shapiro (2010): compare newspaper text to congressional speech to measure media slant; find strong demand-side pressure from readers
Macroeconomics — Baker, Bloom & Davis (2016): newspaper keyword counts measure Economic Policy Uncertainty \(\rightarrow\) Application 1
Macroeconomics / finance — Bybee, Kelly, Manela & Xiu (2024): topic model applied to WSJ articles tracks business-cycle themes in real time
Public finance / surveys — Ferrario & Stantcheva (2022): open-ended survey responses reveal people’s first-order concerns about tax policy
Industrial Organisation — Hoberg & Phillips (2016): cosine similarity of 10-K product descriptions defines dynamic industry boundaries \(\rightarrow\) Application 2
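Two of the measures named in the list above, keyword counts (Application 1) and cosine similarity (Application 2), can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of either paper: the term sets are illustrative stand-ins rather than the published lists, the helper names (`tokenize`, `mentions_policy_uncertainty`, `flagged_share`, `cosine_similarity`) are hypothetical, and the similarity is computed on raw word counts with no vocabulary filtering.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Illustrative term sets (assumptions for this sketch, not the published lists).
ECONOMY = {"economy", "economic"}
UNCERTAINTY = {"uncertain", "uncertainty"}
POLICY = {"congress", "deficit", "legislation", "regulation"}


def mentions_policy_uncertainty(article):
    """True if the article contains at least one term from every set."""
    words = set(tokenize(article))
    return (
        bool(words & ECONOMY)
        and bool(words & UNCERTAINTY)
        and bool(words & POLICY)
    )


def flagged_share(articles):
    """Share of articles flagged by the keyword rule (0.0 for an empty list)."""
    if not articles:
        return 0.0
    flagged = sum(mentions_policy_uncertainty(a) for a in articles)
    return flagged / len(articles)


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    a = Counter(tokenize(text_a))
    b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so report no similarity
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    articles = [
        "Congress debates as economic uncertainty grows",
        "Local team wins the cup",
    ]
    print(flagged_share(articles))
    print(cosine_similarity("cloud software for banks", "software for hospitals"))
```

The keyword rule turns each article into a yes/no flag, so a time series comes from computing the flagged share per period. The cosine measure is bounded between 0 and 1 for count vectors and does not depend on document length, which is why it suits comparing descriptions of very different sizes.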
Google Trends: Ukraine
Google Trends: US abortion