Monthly Archives: February 2014

shows reduction of preventable diseases

Bill Gates: Too many kids are dying, but we have the solutions

I love this graph because it shows that while the number of people dying from communicable diseases is still far too high, those numbers continue to come down.  In fact, fewer kids are dying, more kids are going to school and more diseases are on their way to being eliminated.  But there remains much to do to cut down the deaths in that yellow block even more dramatically.  We have the solutions.  But we need to keep up the support where they’re being deployed, and pressure to get them into places where they’re desperately needed.

- Bill Gates is Co-Chair of the Bill and Melinda Gates Foundation.

There is an interesting discussion about Thomas Porostocky’s infographic at Stephen Few’s site Perceptual Edge.  This is my contribution to that discussion.

Related information

Washington Post 27 Dec 2013 Source of Bill Gates’s quote and a copy of the Wired graphic.
Wired, Lee Simmons 15 Nov 2013 Source of the infographic and introduction to GBD Compare.
@BillGates 18 Nov 2013 Tweeted the graphic to a large following.
GBD Compare Source of the data; site supports interactive data exploration.
Perceptual Edge 10 Jan 2014 Bryan Pierce and Stephen Few redesign the graph.

Key elements of Gates’s story

  1. The number of kids dying from preventable diseases… continue[s] to decline.
  2. Those numbers are still far too high.
  3. Fewer kids are dying.
  4. We have the solutions.
  5. But we need to keep up the support where they’re being deployed, and pressure to get them into places where they’re desperately needed.
  6. More diseases are on their way to being eliminated.
  7. More kids are going to school

Reviewing the key statements and related information generated the following.

  1. Continue[s] to decline:  Suggests a time-series: line plot deaths x year.
  2. Still far too high:  Needs a comparison group.  Try Developed vs Developing countries.
  3. Fewer kids are dying:  Filter to show only kids data.  Age < 15.
  4. We have the solutions:  Low death rates for Developed countries supports this.  They also provide an achievable target.
  5. Solutions deployed vs. desperately needed:  Countries differ in commitment.  Map showing countries color coded based on outcomes.  Variation in adjacent regions suggests governance issues.  Immunizations might add support to governance being an important factor.
  6. Diseases being eliminated:  Time-series of WHO diseases targeted for elimination.  Line per disease by developed vs. developing.  This is a component of 1 and may add more noise than signal.
  7. More kids are going to school:  This is an important but unrelated story.  Drop it.

Points 1-4 tell the main story; focus on them.  Then see if 5 & 6 add to or distract from the main story.

The original Wired article by Lee Simmons has a secondary focus: introducing the IHME site and its data exploration capabilities.  The availability of a good interactive data site allows the redesign to focus on cleanly telling the core story.  The redesign can direct readers to the IHME GBD Compare tool to dig into the details.  Treemaps are the primary visualization tool in GBD Compare, which suggests that Porostocky may have used a treemap to provide a clean transition to the sub-story.

What I see when I read the key story elements.

Gates in two lines with text

While this is how I conceptualize the graph it is not how I would present it.  The following graph is my publishable version paired with a loose restatement of the Gates quote.

I love this bittersweet graph.  It plots the Years of Life Lost (YLL) due to kids dying from preventable diseases.  The lower rates in developed countries are a testimony to the effectiveness of basic public health practices: immunizations, medications, clean water, and neonatal care.  While the number of kids dying from these diseases in developing countries is still far too high, those numbers continue to come down.  There remains much to do to cut down the deaths in developing countries and speed the descent of the red line.

The map at left color codes African countries by level of YLL (red = high, blue = low).  The variation among adjacent countries indicates that governance matters.  We have the solutions.  We need to keep up the support where they’re being deployed, and pressure to get them into places where they’re desperately needed.

See Wired for an introduction to these data and GBD Compare for detailed data exploration.

Related quote

An early graph showing the prevalence of deaths from preventable diseases in 1858.  Think of all the preventable Years of Life Lost in the ensuing 150 years, the places and the causes.  Those charted below occurred in hospitals.

The blue…represent… deaths from preventable diseases
…the red…deaths from wounds
…the black…deaths from all other causes
— Florence Nightingale 1858


Xbar-s chart

Control charts: a lesson in variation

A manufacturing plant has two machine operators with different styles. At shift start, both would carefully set up the machine and begin making parts. Operator A would measure a sample of parts each half hour and tune the machine accordingly. If the mean diameter was .002” oversize, Operator A would adjust the machine to cut .002” smaller. Operator B measured and adjusted the machine only when it was restarted (e.g. after maintenance, breaks, and lunch). The plant manager noticed the different approaches and measured a sample of parts made by each operator. Operator A’s parts showed more variation. Yes, methodical Operator A was making poorer parts.
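The outcome can be reproduced with a small simulation. The sketch below is not the plant's data: the target diameter of 1.000, the common cause standard deviation of 0.0037, the 50 parts per half hour, and the helper name `simulate_shift` are all assumptions chosen for illustration. The run is also made much longer than a single shift so that the comparison is stable.

```python
import random
import statistics


def simulate_shift(adjust_every_sample, periods=1000, parts_per_period=50,
                   sample_size=5, target=1.000, sigma=0.0037, seed=1):
    """Return the standard deviation of all part diameters made in a run.

    All numeric defaults are illustrative assumptions, not plant data.
    """
    rng = random.Random(seed)   # same seed: both operators face the same noise
    offset = 0.0                # machine setting error; zero after careful setup
    diameters = []
    for _ in range(periods):
        # Parts made in one half-hour period: setting plus common cause noise.
        batch = [target + offset + rng.gauss(0.0, sigma)
                 for _ in range(parts_per_period)]
        diameters.extend(batch)
        if adjust_every_sample:
            # Operator A: measure the last few parts and "correct" the machine
            # by the full deviation of the sample mean from target.
            sample_mean = statistics.mean(batch[-sample_size:])
            offset -= sample_mean - target
    return statistics.pstdev(diameters)


sd_a = simulate_shift(adjust_every_sample=True)    # adjusts every half hour
sd_b = simulate_shift(adjust_every_sample=False)   # leaves the machine alone
print(f"Operator A: sd = {sd_a:.5f}")
print(f"Operator B: sd = {sd_b:.5f}")
print(f"ratio A/B  = {sd_a / sd_b:.3f}")
```

Under these assumptions each adjustment simply copies the sampling error of a five-part mean into the machine setting, so Operator A's parts should show roughly 10% more spread than Operator B's (the variance grows by about one fifth).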

Understanding the sources of variation explains this surprising outcome. The Xbar-s chart plots the average and standard deviation of the diameters of five consecutive parts taken each half hour.

Xbar-s chart

These charts represent a simple and elegant use of statistics and logic. Manufacturing processes have two primary sources of variation, common and special causes. Common variation is inherent in the system; the only way to improve it is to get a new system (e.g. buy a better machine). The rest of the variation is due to special causes, which can be controlled.

The insight at the heart of control charts is that variation within subgroups is due to common causes while variation between subgroups is special cause variation. Consecutive parts minimize special cause variation. They are made from adjacent sections of raw material, by the same operator in a similar frame of mind, at similar ambient and coolant temperatures and with the machine near the same maintenance level.

The control chart software plots the mean of each sample and uses the within-subgroup variation as the estimate of common variation. These estimates are used to calculate control limits such that points outside the control limits are a reliable indication that the process should be studied and adjusted. If a sample mean falls outside the control limits, adjust the machine. If it falls inside the limits, it is in the range of normal machine variation; leave the process alone. The chart indicates that the machine was “in control” the entire 10-hour run. No adjustments were needed, yet Operator A adjusted the machine 20 times. Common cause variation is not reduced by adjusting the machine. In fact, unnecessary machine adjustments are another source of special cause variation. Operators need to know when a sample indicates that the machine is no longer running well. The control limits (0.995 and 1.005 for mean, 0.0078 for standard deviation) provide the needed screening.
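As a rough illustration of how such limits are typically computed, here is a minimal sketch using the standard Xbar-s constants for subgroups of five (A3 = 1.427, B3 = 0, B4 = 2.089). The subgroup measurements and the function names `xbar_s_limits` and `signals` are made up for the example; the post does not say which software or formulas produced its chart.

```python
import statistics

# Standard Xbar-s chart constants for a subgroup size of 5.
A3, B3, B4 = 1.427, 0.0, 2.089


def xbar_s_limits(subgroups):
    """Estimate control limits from the within-subgroup variation."""
    means = [statistics.mean(g) for g in subgroups]
    sds = [statistics.stdev(g) for g in subgroups]   # sample standard deviations
    grand_mean = statistics.mean(means)
    s_bar = statistics.mean(sds)                     # average within-subgroup sd
    return {
        "xbar_lcl": grand_mean - A3 * s_bar,
        "xbar_ucl": grand_mean + A3 * s_bar,
        "s_lcl": B3 * s_bar,
        "s_ucl": B4 * s_bar,
    }


def signals(sample, limits):
    """True if the sample mean or sd falls outside the control limits."""
    mean, sd = statistics.mean(sample), statistics.stdev(sample)
    return (not limits["xbar_lcl"] <= mean <= limits["xbar_ucl"]
            or not limits["s_lcl"] <= sd <= limits["s_ucl"])


# Hypothetical baseline subgroups of five consecutive diameters each.
baseline = [
    [1.001, 0.998, 1.003, 0.997, 1.001],
    [0.999, 1.004, 0.996, 1.002, 0.999],
    [1.002, 0.995, 1.000, 1.004, 0.999],
]
limits = xbar_s_limits(baseline)

in_control = signals([1.002, 0.999, 1.001, 0.998, 1.003], limits)
shifted = signals([1.007, 1.005, 1.008, 1.004, 1.006], limits)
print(limits)
print("leave alone" if not in_control else "study and adjust")
print("leave alone" if not shifted else "study and adjust")
```

If the chart in the post was built with these same constants, its numbers hang together: an upper limit of 0.0078 on the standard deviation implies an average subgroup standard deviation of about 0.0078 / 2.089 ≈ 0.0037, and 1.427 × 0.0037 ≈ 0.005, which matches limits of 0.995 and 1.005 around a mean of 1.000.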

It is not enough to do your best; you must know what to do, and then do your best. — W. Edwards Deming


Photo of two rocks, larger 25 times heavier than smaller.

Drop two rocks

On your next walk pick up two rocks which vary in size; drop them simultaneously. Aristotle, in 350 BCE, stated that the rocks would fall with a speed directly proportional to their weight.


The large rock on the dinner plate is 25 times heavier than the small one. I dropped them from eye level. If Aristotle was correct, the large one should have hit the ground about the time the small one passed my lips. Instead, they fell at the same speed and hit the ground at the same time. Apparently Aristotle never did this simple test; nor did anyone else until Galileo presented a thought experiment in 1638. For almost 2000 years the entire community of natural philosophers accepted and propagated this delusion. For good reason: this is how pre-scientific natural philosophy worked. Starting with a given set of truths, the philosopher used deductive reasoning to arrive at new truths. Aristotle’s reasoning was sound; the starting truths were wrong. Once a truth, it stayed a truth.
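The "passed my lips" estimate is easy to check. The sketch below assumes an eye height of 1.6 m (not stated in the post) and reads Aristotle's rule as fall speed strictly proportional to weight, so the small rock covers 1/25 of the distance in the same time. The function names are illustrative only.

```python
import math


def aristotle_small_rock_fall(drop_height_m, weight_ratio):
    """Distance the lighter rock has fallen when the heavier one lands,
    if fall speed were proportional to weight (Aristotle's rule)."""
    return drop_height_m / weight_ratio


def galileo_fall_time(drop_height_m, g=9.81):
    """Fall time ignoring air resistance; it does not depend on weight."""
    return math.sqrt(2 * drop_height_m / g)


eye_height_m = 1.6   # assumed eye level, not stated in the post
weight_ratio = 25    # from the post: large rock is 25 times heavier

gap = aristotle_small_rock_fall(eye_height_m, weight_ratio)
fall_time = galileo_fall_time(eye_height_m)
print(f"Aristotle: small rock has fallen {gap * 100:.1f} cm when the large one lands")
print(f"Galileo: both rocks land after about {fall_time:.2f} s")
```

With these assumptions the small rock would have dropped only a few centimetres, roughly the distance from eyes to lips, by the time the large one hit the ground.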

Bias: we all have it.
In 11 of the past 20 years the Gallup poll has asked the following question:
Which of the following statements comes closest to your views on the origin and development of human beings –

  • human beings have developed over millions of years from less advanced forms of life, but God guided this process, (blue line, mean = 37%)
  • human beings have developed over millions of years from less advanced forms of life, but God had no part in this process, (orange line, mean = 12%)
  • God created human beings pretty much in their present form at one time within the last 10,000 years or so? (grey line, mean = 45%)
  • No opinion (yellow line, mean = 6%)

The groups have maintained the same rank order in every poll for 20 years.  A Newsweek poll in 2007 had similar values and the same rank order.
The rocks I dropped as part of the Galileo story are millions of years old.  This information should eliminate the top line in the chart and raise questions about the second line, but it hasn’t and it won’t for a very long time. The 80% of the population in the top two lines include teachers, doctors, sales people, business owners, and executives who routinely make valid data-based decisions on other issues. They are friends and neighbors. It is critical to understand that we each have beliefs which bias our data decisions.

If an analyst has made a choice, he has also made a value judgment – Jonathan Koomey