Data Science for Economists
2026-03-01
AI and Your Toolkit
AI coding agents are powerful — but they are not magic.
Claude Code and Codex are terminal-based agents that:
- … (cat, ls, mkdir)
- … (Rscript, make, git)

They use the exact same tools you learn this semester.
If you understand the toolkit, you can:
Ghost text — inline completions as you type:
- Tab to accept, Esc to dismiss

Particularly useful for:

- ggplot2 plots from natural language descriptions
- data.table syntax (:=, .SD, by)

We will use Copilot throughout this course: it makes you faster, not lazier.
Covered in the previous session — remember the 8 building blocks from Gentzkow & Shapiro:
Automation, Version Control, Directories, Keys, Abstraction, Documentation, Management, Code Style
Everything we do today serves these principles.
Shell / Bash
Navigation, Files and Directories
- username denotes a specific user
- hostname denotes the name of the computer
- :~ denotes the directory path (where ~ signifies the user’s home directory)
- $ denotes the start of the command prompt (# for root)
- Command syntax: command option(s) argument(s)
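A minimal sketch of the command option(s) argument(s) pattern, using the file commands covered in this section. It runs inside a throwaway directory created with mktemp -d; the names project, notes.txt, and backup.txt are hypothetical.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd                                 # print the working directory

mkdir -p project/data               # command: mkdir, option: -p, argument: project/data
touch project/notes.txt             # create an empty file
cp project/notes.txt project/data   # no new name given, so the copy keeps "notes.txt"
mv project/data/notes.txt project/data/backup.txt   # mv moves and renames

ls -a project/data                  # option -a also lists hidden entries (. and ..)
listing=$(ls project/data)

rm -rf project/data                 # recursive (-r) and force (-f): no undo, use with care
remaining=$(ls project)
echo "$remaining"
```

Note that rm does not move anything to a trash folder, so it is worth running ls on the target first.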
- pwd to print working directory
- cd to change directory
- touch and mkdir to create files and directories
- rm to remove, with “recursive” (-r or -R) and “force” (-f) options
- cp object path/copyname (keeps old name if not provided with new one)
- mv object path/newobjectname

Working with Text and Pipes
- cat (“concatenate”)
- head and tail
- grep (“Global regular expression print”)
- Redirect output with > or >> (> overwrites, >> appends)
- Pipe output into the next command with |
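The text tools above can be sketched on a small sample file. The file name data.csv and its contents are made up for illustration.

```shell
# Build a small sample file in a throwaway directory (contents are made up).
workdir=$(mktemp -d)
cd "$workdir"

printf 'id,country,gdp\n' >  data.csv      # > creates or overwrites the file
printf '1,France,2.9\n'   >> data.csv      # >> appends to it
printf '2,Germany,4.3\n'  >> data.csv
printf '3,France,3.0\n'   >> data.csv

cat data.csv                               # print the whole file
head -n 1 data.csv                         # first line only (the header)
tail -n 2 data.csv                         # last two lines

# Pipe: grep keeps the lines containing "France", wc -l counts them.
# tr strips the padding that some versions of wc add.
n_france=$(grep 'France' data.csv | wc -l | tr -d ' ')
echo "$n_france"

# Overwriting with > discards the previous contents.
echo 'replaced' > data.csv
n_lines=$(wc -l < data.csv | tr -d ' ')
echo "$n_lines"
```

The last two commands show why > deserves care: the four-line file is reduced to a single line with no warning.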
Loops and Scripting
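A minimal sketch of a script containing a loop, assuming a POSIX shell is available at /bin/sh. The file name hello.sh and the arguments Ada and Grace are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Write a small script. The quoted 'EOF' stops the shell from expanding
# $name and $@ while the file is being written.
cat > hello.sh <<'EOF'
#!/bin/sh
# Loop over the arguments passed to the script.
for name in "$@"; do
  echo "Hello, $name"
done
EOF

chmod +x hello.sh                  # make the file executable
output=$(./hello.sh Ada Grace)     # the shebang line selects /bin/sh as the interpreter
echo "$output"
```

The same pattern works for any sequence of commands you would otherwise type by hand, including calls to Rscript.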
- .sh files with code can be executed
- #!/bin/sh is a shebang, indicating which program to run the script with
- Rscript runs R scripts from the command line

Make
- make automates the sequence from raw data → results → paper
- The rules live in a Makefile:

# Makefile
paper.pdf: paper.tex figures/plot.png
	pdflatex paper.tex

figures/plot.png: output/results.csv code/plot.R
	Rscript code/plot.R

output/results.csv: input/data.csv code/analysis.R
	Rscript code/analysis.R
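To make the rebuild decision concrete, here is a rough sketch of the timestamp comparison make performs for each rule in the Makefile above: a target is rebuilt when it is missing or when a prerequisite is newer. The function needs_rebuild and the flat file names are hypothetical, and the -nt test is widely supported (bash, dash, zsh) but not strictly POSIX. This illustrates the idea only; it is not how make is implemented.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Rough equivalent of make's check: rebuild when the target is missing
# or when the prerequisite is newer than the target.
needs_rebuild() {
  target=$1
  prereq=$2
  if [ ! -e "$target" ] || [ "$prereq" -nt "$target" ]; then
    echo yes
  else
    echo no
  fi
}

# touch -t sets an explicit timestamp (format: YYYYMMDDhhmm).
touch -t 202601010900 data.csv
first=$(needs_rebuild results.csv data.csv)    # target does not exist yet

touch -t 202601011000 results.csv              # "built" after the data
second=$(needs_rebuild results.csv data.csv)   # target is up to date

touch -t 202601011100 data.csv                 # data edited later
third=$(needs_rebuild results.csv data.csv)    # prerequisite is newer

echo "$first $second $third"
```

Because the decision depends only on file timestamps, editing code/plot.R in the Makefile above would rebuild the figure and the paper but leave output/results.csv alone.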
- Run make, and it figures out what needs rebuilding

Git (Refresher)
Module 01 introduced version control — here are the commands you’ll use daily:
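One common daily cycle is sketched below as an illustration; it is not necessarily the exact list from Module 01. The file name analysis.R, the commit message, and the identity passed with -c are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init -q                            # start a new repository here

echo 'x <- 1' > analysis.R
git status --short                     # "?? analysis.R": an untracked file
git add analysis.R                     # stage the file
# -c sets an identity for this one command; normally you configure it once.
git -c user.name='Student' -c user.email='student@example.com' \
    commit -q -m 'Add analysis script'

echo 'x <- 2' > analysis.R
changed=$(git diff --stat)             # summary of the unstaged change
git diff                               # line-by-line view of the same change
n_commits=$(git log --oneline | wc -l | tr -d ' ')
echo "$n_commits"
```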
- git diff shows exactly what changed since the last commit

R Basics
- … (==), matching (%in%)
- all.equal()
- = or <-
- help(plot) or ?plot
- #
- … data.table and tibble)
- []
- $
- $ (continued)
- $ and the Global Environment
- lm()
- pacman: single-line install + load; good for reproducible teaching setups
- %>%
- We use %>% (magrittr pipe) throughout this course
- |> (native pipe): same idea, no import needed

You’ll encounter all three. They’re all rectangular data, with different trade-offs:

| | data.frame | tibble | data.table |
|---|---|---|---|
| Package | base R | tidyverse | data.table |
| Printing | all rows | fits screen | fits screen |
| Speed | base | base | very fast |
| Syntax | df[row, col] | dplyr verbs | dt[i, j, by] |
| Best for | small data | tidy pipelines | large data |
- tibble is a data.frame with nicer defaults
- data.table modifies in place, which is crucial for memory on large data

- A .qmd file combines prose, code, and output
- Preferred over R Markdown (.Rmd) for new projects

Wrap Up
- See code/00-shell-cheatsheet.md in this module