Web Scraping and the IEA Database

Wait! What is web scraping? Learn more about it here.

As part of the COMPLIANCE project funded by the BMBF (the German Federal Ministry of Education and Research), I have been working on a research project that examines how the design of international environmental agreements (IEAs) has evolved.

For this work, I have been using data from the IEA Database, an amazing resource put together by Ronald Mitchell (University of Oregon) in 2002. It includes over 1,300 multilateral environmental agreements (MEAs), over 2,200 bilateral environmental agreements (BEAs), and 250 other environmental agreements, and it is constantly being revised and updated.

My first thought was to figure out a way to web scrape the data I needed.

I love web scraping. It has an entry cost, but it can save you a crazy amount of time and spare you a lot of effort.

Unfortunately, the IEA Database was not designed to make things easy for me at the time. So instead of going through the hassle of collecting the data manually, I decided to complain about it (and ended up collecting the data manually anyway).

Luckily for future generations, Prof. Mitchell is committed to making our lives easier. After a few exchanges about what could be improved in this regard, it is now possible to retrieve data from the IEA Database using simple web scraping techniques (you are welcome).

Hence, I wanted to offer a series of web scraping tutorials (along with the code) specifically designed to retrieve data from the IEA Database using Python.
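To give a flavour of what "simple web scraping techniques" means in practice, here is a minimal sketch that pulls link titles and addresses out of a page using only Python's standard library. It is not the code from the tutorials: the sample HTML, the link paths and the names `LinkExtractor` and `extract_links` are all made up for illustration, and the real IEA Database pages will most likely be structured differently.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (link text, href) pairs from every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (title, href) tuples found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page: NOT the real IEA Database markup.
SAMPLE_HTML = """
<ul>
  <li><a href="/treaty/1">Example Agreement A</a></li>
  <li><a href="/treaty/2">Example Agreement B</a></li>
</ul>
"""

if __name__ == "__main__":
    for title, href in extract_links(SAMPLE_HTML):
        print(title, href)
```

In a real script, the HTML string would come from an HTTP request (for instance with `urllib.request.urlopen`) rather than from a hard-coded sample, and you would then loop over the extracted links to download each page.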

If you are looking for more general web scraping tutorials, Julia Kho, The Big Data Guy and Sicelo Masango already offer really cool guides for beginners. Check them out!

Web scraping the IEA Database with Python

Tutorial #1: Web scraping IEA texts from the IEA Database [GitHub]

Tutorial #2: Web scraping membership data from the IEA Database (number of members at a specific date) [GitHub]

If there is a specific tutorial you would like to see here, or you have a piece of code that you would like to share (even if you are using a different language), don't hesitate to contact me at alice.solda@awi.uni-heidelberg.de