03 — Web scraping & APIs
I can haz all the data.
Slack channel: #03-web-scraping-api
This week is about data that is all around us online but not easily accessible through CSV
files or similar. Even though it is visible in plain sight, we sometimes need to scrape the information from websites.
Lecture slides
Morning session slides
Code
Check the course repository for the application.
We get these two wonderful plots:
Note that we dropped some countries because of some missing data for the exchange rates, but for now: ¯\_(ツ)_/¯
Further recommended resources
the Billy index was (is?) indeed a thing a while ago, here’s Der Spiegel reporting about it: The Billy Instead Of The Big Macs
as usual, Grant McDermott has superb extra resources on webscraping: https://raw.githack.com/uo-ec607/lectures/master/06-web-css/06-web-css.html and https://raw.githack.com/uo-ec607/lectures/master/07-web-apis/07-web-apis.html
we used rvest for scraping; if you’re using Python as an alternative, you should definitely look into Beautiful Soup
and finally the link to the Billion Prices Project
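If you want a feel for what scraping looks like on the Python side, here is a minimal sketch that pulls a small HTML table into a list of records. It is not the course application (that one lives in the repository and uses rvest). To keep it runnable without installing anything, it uses Python's standard-library `html.parser` rather than Beautiful Soup, and it parses an inline HTML string instead of fetching a live page. The names `SAMPLE_HTML`, `TableParser`, and `scrape_table` are made up for this example, and the table values are invented placeholders, not real prices.

```python
from html.parser import HTMLParser

# Invented HTML standing in for a downloaded page; the values are placeholders.
SAMPLE_HTML = """
<table id="prices">
  <tr><th>Country</th><th>Price</th></tr>
  <tr><td>Germany</td><td>49.00</td></tr>
  <tr><td>France</td><td>55.00</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table as a list of dicts, using the first row as the header."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    for record in scrape_table(SAMPLE_HTML):
        print(record)
```

With a real page you would first download the HTML and then hand it to the parser; Beautiful Soup makes the cell-picking step shorter, at the cost of an extra dependency.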