Billionaires dataset — Documentation

[Figure: mountain chart screenshot with data sources]

1. Mountain shapes dataset

2. Billionaires dataset

This dataset is available to download for free:–gapminder–billionaires

Gapminder has scraped a list of billionaires from Forbes’ ‘The World’s Billionaires’ and Hurun’s ‘Hurun Global Rich List’, with the goal of combining both sources. Forbes has a detailed profile of each billionaire who has been on the list, with their full name, age, source of wealth, country, and most recent net worth. According to Forbes, they keep track of each billionaire’s moves and take into account their assets, including stakes in public and private companies, real estate, yachts, art, and cash. Hurun also provides each billionaire’s full name, age, the name of the company they work for, and their industry.

Gapminder then processed the list from each source separately and gave a unique ID to each listed billionaire (a combination of their country code, birth year, and full name), first to identify matches between the two sources and then to filter through the unmatched names to check for remaining duplicates. For duplicate IDs, we kept the data from Forbes. The ID matching is carried out in this spreadsheet. You can follow the successive waves of duplicate filtering by looking through the tabs in that file.
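The ID matching described above can be sketched roughly as follows. This is a minimal illustration and not Gapminder's actual pipeline (which lives in the spreadsheet): the function names `make_id` and `combine`, the record field names, and the exact ID format (lowercase, underscore-separated) are all assumptions, and the sample records in the usage note are made up.

```python
import re

def make_id(country_code, birth_year, full_name):
    """Build an ID from country code, birth year, and full name.

    The concrete format (lowercase, underscores) is an assumption;
    the source only says the three fields are combined.
    """
    name = re.sub(r"\W+", "_", full_name.lower()).strip("_")
    return f"{country_code.lower()}_{birth_year}_{name}"

def combine(forbes, hurun):
    """Merge two lists of records keyed by ID.

    Forbes records are written last, so for duplicate IDs
    the Forbes data is the one that is kept.
    """
    merged = {}
    for rec in hurun:
        merged[make_id(rec["country"], rec["birth_year"], rec["name"])] = rec
    for rec in forbes:
        merged[make_id(rec["country"], rec["birth_year"], rec["name"])] = rec
    return merged
```

For example, a made-up person listed by both sources ends up as a single entry carrying the Forbes record, while a person listed only by Hurun is kept as is. Names that are spelled differently in the two sources would not match automatically, which is presumably why the unmatched names are checked again by hand.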

Next, we adjust the billionaires’ net worth from different years to 2011 USD, and then calculate daily income from net worth using 3% as the annual capital return rate.
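As a rough sketch of this conversion: the 3% return rate comes from the text, while the 365-day year, the function name `daily_income`, and the `deflator_to_2011` parameter (a price-level ratio that converts a given year's USD to 2011 USD) are assumptions made for illustration. The source does not say which price index is used.

```python
ANNUAL_RETURN = 0.03  # 3% annual capital return rate, as stated in the text
DAYS_PER_YEAR = 365   # assumption: the source does not state the day count

def daily_income(net_worth_nominal, deflator_to_2011):
    """Convert net worth in a given year's USD to daily income in 2011 USD.

    deflator_to_2011 is a hypothetical ratio (2011 price level divided by
    the price level of the year the net worth was reported in).
    """
    net_worth_2011 = net_worth_nominal * deflator_to_2011
    return net_worth_2011 * ANNUAL_RETURN / DAYS_PER_YEAR
```

Under these assumptions, a net worth of 1 billion in 2011 USD corresponds to a daily income of roughly $82,000.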

In each year, the billionaires are binned by income bracket, which makes it possible to stack them on top of each other.
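The binning can be pictured with a small sketch like the one below. The bracket boundaries used on the chart are not given here, so the `edges` argument (ascending lower bounds of each bracket, in $/day) and the function name `count_per_bracket` are assumptions.

```python
from bisect import bisect_right
from collections import Counter

def count_per_bracket(daily_incomes, edges):
    """Count how many incomes fall into each bracket.

    edges holds the ascending lower bounds of the brackets (hypothetical
    values); incomes below the first bound are dropped, and the last
    bracket is open-ended.
    """
    counts = Counter()
    for income in daily_incomes:
        i = bisect_right(edges, income) - 1  # index of the bracket containing income
        if i >= 0:
            counts[i] += 1
    return [counts[i] for i in range(len(edges))]
```

Running this once per country and year gives one list of counts per country, and lists that share the same brackets can then be added bracket by bracket to stack them.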

The billionaires data is only available from 2002 to 2022.

The billionaires of Taiwan, Hong Kong and Macao are counted as Chinese billionaires.

3. Bridge shapes (Unknown rich persons)

Bridge shapes in the current version are NOT based on data and are used for illustrative purposes only. We make “the millionaire assumption” as follows:

the millionaire assumption: there are more millionaires than billionaires

We would like to show the billionaires on the mountain chart but avoid giving the impression that there is an income gap between them and the rest of the population. Instead, we assume that there is a long tail of high-income people and that only the tip of it is visible in the list of billionaires.

For billionaires we can check the Forbes or Hurun lists. But millionaires are not on those lists, and they are so few compared to a country’s population that an income survey would miss them too. Bridge shapes, marked in the chart as “Unknown rich persons”, show that there must still be people at the millionaire income levels even though we don’t know their names.

Currently the calculation goes like this: the bridge shapes follow the stacked billionaire count in each income bracket, starting from the right (richest) end, until the bracket with the maximum billionaire count is reached. From this point the shape continues to the left, assuming a 10% increase in the number of people with each successive income bracket.
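A minimal sketch of this rule, assuming the stacked counts are given as a list ordered from the poorest bracket on the left to the richest on the right. The function name `bridge_shape` is hypothetical, and picking the right-most bracket when several share the maximum count is an assumption. The sketch extends the shape all the way to the first bracket in the list and does not model where the bridge meets the mountain shape.

```python
def bridge_shape(counts, growth=0.10):
    """Return the bridge height per bracket (left = poorest, right = richest).

    From the right end up to the bracket with the maximum count, the shape
    equals the stacked billionaire count. Left of that bracket, each bracket
    holds 10% more people than its right-hand neighbour.
    """
    if not counts or max(counts) == 0:
        return [0.0] * len(counts)
    peak_value = max(counts)
    # Right-most bracket holding the maximum (tie-breaking is an assumption).
    peak = len(counts) - 1 - counts[::-1].index(peak_value)
    shape = [float(c) for c in counts]
    for i in range(peak - 1, -1, -1):
        shape[i] = shape[i + 1] * (1 + growth)
    return shape
```

For counts of `[0, 0, 3, 10, 4, 1]`, the peak is the bracket with 10 billionaires; the three brackets to its right keep their counts, and the brackets to its left grow to roughly 11, 12.1, and 13.3.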

When there are too few billionaires on a chart for the displayed countries or regions, we don’t draw the bridge shapes because we can’t make the millionaire assumption.

We are looking for a better way to estimate these bridge shapes. In the future we plan to refine them using the right-most values of mountain shapes (which are based on survey data) and left-most values of billionaire count.

4. Poverty lines dataset

Global extreme poverty line

$2.15/day, as per the World Bank’s September 2022 update

National poverty lines dataset

This is the minimum level of income deemed adequate in a particular country. It is converted to international dollars using the purchasing power parity rate and is expressed in per capita terms per day. The full dataset and detailed methodology are available here:

5. Homes dataset

The homes on the mountain chart are the homes from Dollar Street. See the about page, section “How were the $ values calculated?”, to learn how we place the homes on the income axis. Conversion from Dollar Street incomes in $/month per adult to $/day on the mountain chart is a simple division by 30.
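As a one-line illustration of that conversion (the function name `monthly_to_daily` is hypothetical; the divisor of 30 comes from the text):

```python
DAYS_PER_MONTH = 30  # divisor stated in the text

def monthly_to_daily(income_per_month):
    """Convert an income in $/month per adult to $/day."""
    return income_per_month / DAYS_PER_MONTH
```

A home with an income of $900/month per adult would therefore be placed at $30/day on the income axis.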