Data, Maps, Usability, and Performance

Making Maps from CIA World Factbook

Last updated on August 7, 2013 in Development

World Population Map for 2013

D3 World Map with Country Information
2013 World Population Map
2013 World Population Density Choropleth Map

The CIA World Factbook is a great resource for information about countries and territories, and it has often been used to create world maps for specific data points. The problem is that a lot of the data changes frequently and retrieving it is not straightforward. The pages are not well formed and the data is not delivered in JSON or CSV format. Instead, these great country metrics are just dumped into the content of the page, and even though the pages are nice and readable, the source is painful to scrape.

But people have been web scraping and mapping this data for quite a long time. One of the best implementations I have seen is KMLfactbook: it offers many metrics, transforms the data into CSV, and visualizes the country data on a Google Map with tooltips. The big problem is that it is outdated, with all the metrics collected in 2008. The same thing applies to the other data extractions of the CIA Factbook that I came across.

Unfortunately, I could not use any of this data. I thought about plotting world populations, but it is disappointing to be 2 years behind with the data. There are also bigger problems: large countries that formed recently, such as South Sudan (2011), end up as a big gap on a world map. Finally, I noticed that some of these web scrapes were not well sanitized before being transformed into JSON, and there were quite a few errors.

So, I decided to use CasperJS and web scrape this data myself, using a process that can be repeated in the future:

  1. Retrieve all the country pages from the CIA Factbook and save them to local files.
  2. Scrape the local country files for a text-based metric: country background information, saved into a JSON file.
  3. Scrape the local country files for a numeric metric: country population data, saved into a CSV file.

The first script is really simple and should be run once a year to grab the latest files from the CIA Factbook into a local repository. The next 2 scripts deal with specific metrics; I wanted to show an example of text stored in a JSON file versus a single data point stored in CSV format. You should decide which format is best for your needs. I decided that it makes more sense to store country descriptions in JSON and population numbers in CSV.
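To make the last two steps a bit more concrete, here is a minimal sketch of the clean-up that happens after the raw text has been pulled out of a saved page. It is not the actual CasperJS script: the helper names (`cleanText`, `parsePopulation`, `toCsv`), the field names, and the sample strings are all assumptions for illustration, including the guess that a population value looks like a comma-grouped number followed by an estimate date.

```javascript
// Sketch only: helper names, field layout, and sample strings are assumptions.

// Strip tags and collapse whitespace so the text is safe to store in JSON.
function cleanText(html) {
  return html
    .replace(/<[^>]*>/g, " ")  // drop any HTML tags
    .replace(/&nbsp;/g, " ")   // a common entity in scraped pages
    .replace(/\s+/g, " ")      // collapse runs of whitespace
    .trim();
}

// Pull the leading number out of a string such as "1,234,567 (July 2013 est.)".
// Returns null when no number is found, so bad rows can be skipped.
function parsePopulation(raw) {
  const match = /^\s*([\d,]+)/.exec(raw);
  if (!match) return null;
  const digits = match[1].replace(/,/g, "");
  return digits === "" ? null : Number(digits);
}

// Quote a CSV field only when it contains a comma, a quote, or a newline.
function csvField(value) {
  const s = String(value);
  return /[",\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Build the CSV text, leaving out rows whose population could not be parsed.
function toCsv(rows) {
  const lines = rows
    .filter(r => r.population !== null)
    .map(r => [r.code, r.name, r.population].map(csvField).join(","));
  return ["code,name,population"].concat(lines).join("\n");
}

// Made-up sample rows standing in for values scraped from the local files.
const scraped = [
  { code: "aa", name: "Exampleland", raw: "1,234,567 (July 2013 est.)" },
  { code: "bb", name: "Sample, Republic of", raw: "12,345 (July 2013 est.)" },
  { code: "cc", name: "Nodata Island", raw: "uninhabited" }
];

const rows = scraped.map(c => ({
  code: c.code,
  name: c.name,
  population: parsePopulation(c.raw)
}));
const csv = toCsv(rows);

// JSON.stringify takes care of escaping quotes and newlines in the text.
const backgrounds = JSON.stringify({
  aa: {
    name: "Exampleland",
    background: cleanText("<p>A made-up&nbsp;country.\n  It has <b>no</b> data.</p>")
  }
});

console.log(csv);
console.log(backgrounds);
```

Returning `null` for unparseable values, instead of writing `NaN` or an empty string, is one way to avoid the kind of unsanitized rows mentioned earlier.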

Now, let’s plot this data on a D3.js World Map. I have re-coded the Natural Earth TopoJSON file to use CIA country codes as identifiers, which makes it easier to map CIA country data to the country geographies. As a result, when someone hovers over a country, I can filter my JSON file and find the matching country description. Check out this demo here.
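The lookup on hover can be sketched roughly as follows. This is an illustration and not the demo's code: it builds a `Map` keyed by country code once instead of filtering the array on every hover, and the field names (`code`, `name`, `background`) and the function names are hypothetical.

```javascript
// Sketch only: field and function names are assumptions about the JSON layout.

// Index the country records by (lower-cased) CIA code for quick lookup.
function indexByCode(countries) {
  const index = new Map();
  countries.forEach(c => index.set(c.code.toLowerCase(), c));
  return index;
}

// Build the tooltip text for a country code, with a fallback for gaps in the data.
function tooltipFor(index, code) {
  const country = index.get(String(code).toLowerCase());
  return country
    ? country.name + ": " + country.background
    : "No data available";
}

// Made-up records standing in for the scraped JSON file.
const countries = [
  { code: "aa", name: "Exampleland", background: "A made-up country." },
  { code: "bb", name: "Sampleton", background: "Another made-up country." }
];

const index = indexByCode(countries);
console.log(tooltipFor(index, "AA"));
console.log(tooltipFor(index, "zz"));
```

In the map itself, something like `tooltipFor(index, d.id)` would be called from the mouseover handler of each country path, assuming the feature `id` holds the CIA code.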

With population, I wanted to do a bit more work. I had generated a nice CSV of 2013 population metrics for each country, but I also wanted to use nice colors to visualize the data, provide a legend, and show all the results in a table. The first problem I ran into was splitting the data into buckets. D3 has a nice quantize scale, but it divides the domain into equal-width intervals, and world population data is very unevenly distributed. For example, China, which ranks #1, has over 1 billion more people than the USA, which ranks #3.
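A small sketch shows why equal-width buckets work poorly here. The `quantizeBucket` function below is a hand-rolled stand-in that mimics equal-width bucketing; it is not D3's own code. The populations are rounded, illustrative figures and not the scraped 2013 values.

```javascript
// Sketch only: a hand-rolled equal-width bucketing function, not D3 itself.
function quantizeBucket(value, min, max, bucketCount) {
  const width = (max - min) / bucketCount;
  const i = Math.floor((value - min) / width);
  // Clamp so that the maximum value lands in the last bucket.
  return Math.min(bucketCount - 1, Math.max(0, i));
}

// Rounded, illustrative populations: two very large countries, one large,
// and several mid-sized and small ones.
const populations = [1350e6, 1220e6, 317e6, 251e6, 201e6, 65e6, 10e6, 1e6];

const bucketCount = 5;
const max = Math.max(...populations);
const counts = new Array(bucketCount).fill(0);
populations.forEach(p => {
  counts[quantizeBucket(p, 0, max, bucketCount)] += 1;
});

console.log(counts); // [ 5, 1, 0, 0, 2 ]
```

Most countries fall into the lowest bucket and two of the five colors are never used, so the map would be nearly one color.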

So, I decided to do some manual work and figure out the thresholds for each bucket of countries and the color for each bucket. I looped through these buckets and colors to create a legend for the map. I was hoping to leverage this nice d3.legend script, but it was easier to implement my own legend. That script works well when values are similar across countries, but it would require some customization for my data. I also added the population data to each tooltip and looped through the countries to dynamically generate an HTML table that shows all the results below the map.
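Here is a minimal sketch of that approach: manual thresholds, one color per bucket, a legend generated by looping over the buckets, and table rows generated from the data. The threshold values, the colors, and all function names are hypothetical and not the ones used on the map.

```javascript
// Sketch only: thresholds, colors, and names are assumptions for illustration.
const thresholds = [1e6, 10e6, 50e6, 100e6, 500e6];
const colors = ["#eff3ff", "#c6dbef", "#9ecae1", "#6baed6", "#3182bd", "#08519c"];

// Count how many thresholds the value has reached; that count is the bucket index.
function bucketIndex(value) {
  let i = 0;
  while (i < thresholds.length && value >= thresholds[i]) i += 1;
  return i;
}

function colorFor(value) {
  return colors[bucketIndex(value)];
}

// Insert thousands separators, e.g. 1234567 -> "1,234,567" (integers only).
function formatNumber(n) {
  return n.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// One legend entry per color, labelled with the range that bucket covers.
function legendEntries() {
  return colors.map((color, i) => {
    const lo = i === 0 ? null : thresholds[i - 1];
    const hi = i === thresholds.length ? null : thresholds[i];
    let label;
    if (lo === null) label = "Under " + formatNumber(hi);
    else if (hi === null) label = formatNumber(lo) + " and over";
    else label = formatNumber(lo) + " to " + formatNumber(hi);
    return { color: color, label: label };
  });
}

// HTML rows for the results table, largest population first.
// Names are not HTML-escaped here; real data may need that.
function tableRows(countries) {
  return countries
    .slice()
    .sort((a, b) => b.population - a.population)
    .map((c, i) =>
      "<tr><td>" + (i + 1) + "</td><td>" + c.name + "</td><td>" +
      formatNumber(c.population) + "</td></tr>")
    .join("\n");
}

console.log(legendEntries().map(e => e.color + "  " + e.label).join("\n"));
```

With buckets chosen by hand, each color can cover a sensible share of the countries even though the raw values span several orders of magnitude.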

Here is the D3.js World Map of Country Population.

Update:

I have also added a World Population Density Choropleth Map.

External:

US Census Visualization with D3.js
D3.js Unemployment US Map
D3 Choropleth with threshold scale for quantization
Fun World Maps
D3 Quantitative Scales
Geomapping, GeoJSON, Paths, and D3
Simple Choropleth style map with D3

