01 — Getting started
Course outlook and good research practice.
Slack channel: #01-getting-started
Welcome
Welcome to the course “Data Science in International Economics Research”. To get the course up and running, here are a few things to do.
Join the course’s Slack Workspace: datascience2024.slack.com. We’ll use this as our main communication tool.
We strongly recommend you sign up for a free Github student account (if you don’t already have one). This allows you to fork the course repository that contains all reproducible code and even the reproducible environment (more on this just below).1
You’re encouraged to use Visual Studio Code (abbreviated VSCode) in combination with the custom course Docker image to make sure all code runs neatly on “your” machine. You’re of course welcome to use another environment (e.g. RStudio) at your own peril. More on our recommended setup below.
Course repository
The course repository on Github will contain all code that we produce over the course of the semester. It also allows you to fork and contribute. Your forked copy of the repository should also be the place where you work on your course project. More on how Git and Github work will be discussed in the lecture.
To sign up for Github, follow instructions here: https://github.com/signup
Recommended setup
In the afternoon application sessions, you will follow along and implement coding examples yourself, using the data and methods discussed in the morning session. To avoid spending enormous amounts of time getting everything up and running on everyone’s personal machine, we have prepared a Dockerfile that creates a reproducible environment.2
So, here are the instructions to install Docker and VSCode.
Docker
To install Docker, go to https://docs.docker.com/get-docker/, pick your operating system and follow the setup instructions.
Mac: Be careful to choose the right chipset. Older Macs have Intel chips, more recent ones likely ARM (“Apple Silicon”). You may also simply install Docker Desktop via
brew install --cask docker
if that’s your thing.
Windows: You may need to enable virtualization in your BIOS. If you don’t know how to do this, let us know in #it-questions.
Linux: There are precompiled binaries, but these are still in beta. Depending on your distribution, you may want to install via the command line. You may also need to add your username to the docker user group. As you’re a Linux user, we think you know how to do this. If not, let us know in #it-questions.
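For Linux users who have not done this before, here is a minimal sketch of the group step. It assumes the group is called docker and already exists, which is what a standard Docker installation sets up. The helper name in_docker_group is made up for this example. The script only checks your membership and prints the command to run, because the change itself needs administrator rights.

```shell
#!/usr/bin/env bash
# Succeeds if the current user is already a member of the "docker" group.
# (in_docker_group is a helper name invented for this sketch.)
in_docker_group() {
  id -nG | tr ' ' '\n' | grep -qx 'docker'
}

if in_docker_group; then
  echo "You are already in the docker group."
else
  # Printed rather than executed, because it needs administrator rights.
  echo "Run: sudo usermod -aG docker \"\$USER\""
  echo "Then log out and back in so the new group membership takes effect."
fi
```

If the check still fails after you have run the printed command, logging out and back in is usually the missing step.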
Once installed, the opened Docker Desktop app looks like this:
You likely won’t ever need to do anything here; it’s just important that it runs smoothly in the background.
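If you would rather confirm from a terminal that Docker is running in the background, the following sketch may help. The function names have and docker_status are invented for this example. The only Docker command it relies on is docker info, which fails when the Docker daemon cannot be reached.

```shell
#!/usr/bin/env bash
# Succeeds if the given command-line tool is on the PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Prints one of: not-installed, installed-but-not-running, running.
docker_status() {
  if ! have docker; then
    echo "not-installed"
  elif docker info >/dev/null 2>&1; then
    echo "running"
  else
    echo "installed-but-not-running"
  fi
}

echo "Docker: $(docker_status)"
```

If this reports that Docker is installed but not running, starting the Docker Desktop app (or the Docker service on Linux) should fix it.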
VSCode
Installing VSCode is just as easy. Go to https://code.visualstudio.com, choose your OS, download the binaries, follow the instructions and you’re good to go.
Mac and Linux: You can of course also install VSCode via the command line. Instructions for Linux are here; for Mac, just use
brew install --cask visual-studio-code
Opening VSCode should greet you like this:
Finally, install the “Remote - Containers” extension (since renamed “Dev Containers”). VSCode will probably ask you right away whether you want to install it. You may also install other extensions that you find useful. You can, and should, also log in with your Github credentials, which allows you to commit/push/pull changes3 and syncs settings across machines.
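The extension can also be installed from a terminal. The sketch below assumes that the code command is on your PATH; on a Mac you may first have to enable it from within VSCode. It uses the extension's Marketplace identifier, ms-vscode-remote.remote-containers, and the variable name EXT_ID is just a label chosen for this example.

```shell
#!/usr/bin/env bash
# Marketplace identifier of the containers extension.
EXT_ID="ms-vscode-remote.remote-containers"

if command -v code >/dev/null 2>&1; then
  # Install only if the extension is not listed yet.
  if code --list-extensions | grep -qix "$EXT_ID"; then
    echo "$EXT_ID is already installed."
  else
    code --install-extension "$EXT_ID"
  fi
else
  echo "The 'code' command is not on your PATH; install the extension from within VSCode instead."
fi
```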
Optional but recommended: Github Desktop
Installing the Github Desktop app is optional, but highly recommended if you’re a Github novice.
Mac and Windows: Go to https://desktop.github.com and follow instructions.
Linux: Follow instructions here: https://github.com/shiftkey/desktop
Once installed, open the Github Desktop app to see this window:
To get started, clone the course repository:
Finally, open the repository in VSCode:
Voila!
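If you prefer the terminal over Github Desktop, the same two steps (clone, then open in VSCode) can be sketched as follows. REPO_URL is a placeholder and is deliberately left empty here: set it to the URL of your own fork. The helper repo_dir_name is invented for this example, and the sketch assumes that git and the code command are installed.

```shell
#!/usr/bin/env bash
# REPO_URL is a placeholder: set it to the URL of your own fork.
REPO_URL="${REPO_URL:-}"

# Derive the folder name that "git clone" creates from a repository URL.
repo_dir_name() {
  basename "$1" .git
}

if [ -z "$REPO_URL" ]; then
  echo "Set REPO_URL to the URL of your fork first."
else
  # Clone the repository, then open the new folder in VSCode.
  git clone "$REPO_URL" && code "$(repo_dir_name "$REPO_URL")"
fi
```

A typical call would be REPO_URL=... bash clone.sh, where clone.sh is whatever name you save the script under.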
Note: Macs with Apple Silicon may need to have Rosetta 2 enabled. You can do so with
softwareupdate --install-rosetta
You may then also need to manually pull the required Docker image with
docker pull --platform linux/x86_64 ghcr.io/rocker-org/devcontainer/geospatial:4.3
After that, the rest should work automatically.
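If you are unsure whether this note applies to your machine, the following sketch checks what uname reports. The function name apple_silicon_hint is invented for this example. It assumes that an Apple Silicon Mac reports Darwin and arm64; a terminal that itself runs under Rosetta may report x86_64 instead, so treat the result as a hint only.

```shell
#!/usr/bin/env bash
# Given an OS name and CPU architecture (as printed by "uname -s" and
# "uname -m"), say whether the Apple Silicon steps are likely to apply.
apple_silicon_hint() {
  local os="$1" arch="$2"
  if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
    echo "apple-silicon"
  else
    echo "other"
  fi
}

hint="$(apple_silicon_hint "$(uname -s)" "$(uname -m)")"
if [ "$hint" = "apple-silicon" ]; then
  echo "Apple Silicon detected: the Rosetta 2 and 'docker pull --platform' steps may apply."
else
  echo "No Apple Silicon detected: the extra steps are probably not needed."
fi
```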
Lecture slides
Afternoon session slides
Further resources
- Karl Broman’s “initial steps toward reproducible research”: https://kbroman.org/steps2rr/
- Grant McDermott’s introduction to Git: https://raw.githack.com/uo-ec607/lectures/master/02-git/02-Git.html
- A bit more advanced, an introduction to Git for CS at MIT: https://missing.csail.mit.edu/2020/version-control/