Even if you've never heard of Sir Francis Galton, you're probably familiar with some of his work. He is credited with making the first weather map. He's responsible for popularizing the termsnature and nurture to refer to traits that are inherited versus created by one's environment. In the second half of his career, inspired by his half-cousin, Charles Darwin's work, Galton turned his focus toeugenics. His contributions to the advancement of science and his focus on good breeding make him not unlike a real-life Voldemort, one who did"great things – terrible, yes, but great."
Regression to the mean
Aside from his more odious contributions to breeding the ideal race, one of Galton's most lasting achievements is a simple concept that we rarely stop to notice: "Regression to the mean"
Galton studied 930 adult children, as well as their parents, and found that unusually tall parents generally had children who were shorter than them. Unusually short parents generally had children who were taller than them. When the adults' and children's heights were plotted on a graph, there was much more variation in the adults' heights than the children's heights.
Galton termed this phenomenon of subsequent generations moving toward the average, "Regression towards mediocrity."
How attention to regression impacts our choices
Regression to the mean has had an impressive impact on not just the field of genetics, but on medicine, statistics, and really, any other subject in which there are measured averages. Understanding how regression works can also help us answer everyday questions.
Let's say that you and your spouse are planning your first vacation without the kids and want to pick a hotel. You start the search at your favorite travel site and find that Hotel A has an average of 4 stars, based on one thousand reviews. Hotel B has an average of 5 stars with one dozen reviews. The overall average review of hotels at your vacation destination is 3.5 stars.
Which hotel do you pick: the 4-star Hotel A or the 5-star Hotel B?
Although Hotel B appears to be the better option, you should pick Hotel A. The phenomenon at work here is regression to the mean. The large number of reviews for Hotel A suggests that it is an above-average hotel for that area. The reviews for Hotel B, although higher in average than the number of reviews for Hotel A, are considerably fewer in number. As Hotel B receives more and more reviews, it's likely that patrons will offer differing reviews.
Picking the right hotel is really important because you're dropping the kids at Grandma and Grandpa's house so that you and your spouse can have a crazy weekend in Vegas. At the craps table, your wife blows on your dice, and you have an amazing winning streak. But then, Lady Luck plied with too many free drinks meant to keep you both in the casino longer, seeks the ladies' room. You lose.
Was it your lucky charm leaving that led you to lose all of your winnings on that last toss of the dice?
No. It was regression to the mean. You had a run of abnormally successful tosses, and, with subsequent rolls of the dice, your results moved closer and closer to the mean. On average, the house wins, so as you play more games your cumulative losses are likely to outweigh your wins.
Now imagine that you're awoken early the next morning by your mom, who says that one of your kids has a fever, sore throat, and runny nose. You spend the next few hours contemplating an early return home when your mom calls back to tell you she gave your child some apple cider vinegar and now she's doing fine.
Should you credit that apple cider vinegar with curing your child?
No. You should instead credit regression to the mean. Your child felt lousy, but then she felt better. If you took a large population of people with colds and gave them all apple cider vinegar, within a few days you'd notice that many of those people got better. You wouldn't know whether the apple cider vinegar had anything to do with that improvement.
The gold standard of scientific research is the randomized controlled trial, in part because it features a control group and a test group, the study of which allows researchers to account for regression to the mean. Without a strong randomized controlled trial, there's no way to prove that the apple cider vinegar had any effect, because of regression to the mean.
Why we fail to identify regression to the mean
Why, given the concept's 150-year history and its apparent ubiquity, is regression to the mean often invisible to us? Galton's original terminology of "regression towards mediocrity" may hold one answer. Regression to the mean is, as scientific concepts go, one of the most disappointing because it offers the bland truth that eventually things average out.
Understanding regression could have helped you from being taken in by reviews that appeared too good to be true. Understanding regression to the mean when at the craps table could have sent you home with a lot more money. Understanding regression to the mean when on the phone with your parents could have saved you future dependence on a nonsense cure for the common cold. But telling the story of a Vegas vacation is much less exciting with regression to the mean.
You could come home from vacation and tell your friends that your hotel was as reliably comfortable as its reviews suggested, that you had a few lucky rolls and left while you were ahead and that your child had an illness which quickly ran its course.
Or you could tell them a version in which you had to escape a hotel from hell, your good luck charm left and ruined your winning streak, and your child miraculously improved after consuming fermented apples.
The second version is a much better story, and that's the one you tell your friends when you're back home. Regression to the mean that what happens in Vegas stays in Vegas.