The Correlation Fallacy

Did you know that the divorce rate in Maine and the per capita consumption of margarine are related?

It’s true.

Whenever one goes up, so does the other. When one goes down, same thing.

So, is margarine consumption a result of divorces in Maine? Do the prospects of court deliberations, split assets and alimony have Mainers running to the store for some Country Crock with their lobster dinner?

Not necessarily.

The link between Maine’s divorce rate and margarine consumption is a prime example of the adage “correlation is not causation.”

In other words, even if two things rise and fall together, one might have nothing to do with the other.
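
To see how easily this happens, here is a minimal Python sketch. The two series are made-up numbers, not real divorce or margarine figures, and the `pearson` helper is written out just for illustration. Each list simply drifts downward over ten periods. Neither one is computed from the other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical, made-up series. Both happen to decline over ten periods,
# but they were typed in independently of each other.
series_a = [10, 9, 9, 8, 7, 7, 6, 5, 5, 4]
series_b = [52, 50, 47, 45, 41, 40, 36, 35, 31, 30]

r = pearson(series_a, series_b)
print(f"r = {r:.3f}")  # a value close to 1, with no causal link behind it
```

A coefficient near 1 tells us only that the two lists move together. It says nothing about why.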

We’ve heard this time and again. Yet we continue to search for correlations, seemingly everywhere.

This has as much to do with innovation as anything.

With the growth of technology and the proliferation of big data sets, we have more raw records to peruse than ever before. More than we know what to do with.

There is no guidebook for turning this data into intelligible information. No rinse-and-repeat process to transform the data at hand into knowledge and solutions to make the world a better place.

With no roadmap to follow, we try to find needles in haystacks. We dive into the data, trying to find whatever relationships we can.

On the surface, this seems innocent enough. And it would be — if we were robots. Or Spock.

But we’re not.

We’re humans. Hot-blooded, emotion-driven and filled with inherent biases.

A search for meaning is at the heart of our actions. We’re hard-wired for this quest.

So, a simple dive through terabytes of data is actually a complex treasure hunt for causality. The objective: Find relationships that support our assertions and complete our narratives.

Instead of panning for gold, we’re data mining for affirmations. We’re finding whatever ammunition we can to support five words: I’m right and you’re wrong.

Those words are subjective. But with more access to data than ever before, we feel we have license to treat them as objective. Even if we must commit the correlation fallacy to do so.

This is how we end up with a world of alternative facts. A world of filter bubbles, chronic mistrust and divisiveness.

All because we refuse to abide by the rules of data assessment.
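
The treasure hunt described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not anyone’s actual analysis: the target series and all 1,000 candidates are pure random noise, and the seed, the sizes and the `pearson` helper are arbitrary choices made for the example.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical "metric we care about": ten random values.
target = [rng.gauss(0, 1) for _ in range(10)]

# 1,000 candidate series, each one unrelated random noise.
candidates = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(1000)]

# Keep whichever candidate happens to line up best with the target.
best_r = max((pearson(target, c) for c in candidates), key=abs)
print(f"strongest correlation found in pure noise: r = {best_r:.2f}")
```

Search enough unrelated series and one of them will look like a striking match. With only ten data points apiece, a strong-looking coefficient is nearly guaranteed to turn up by chance alone.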


The world of statistics is filled with obscure names. While the dawning of America made the names Washington, Jefferson and Franklin renowned, fewer people know of Bayes, Boole, Pearson and Box.

The difference is as unsurprising as it is stark. One group of historic figures addressed its audience as We the People and spoke of Life, Liberty and the Pursuit of Happiness. The other group came up with hypotheses and then rejected — or failed to reject — them using math.

One group did work that was invigorating and captivating. (Heck, they even made Broadway hip-hop musicals about it.) The other did work that was arcane and ambiguous.

It’s no surprise that we’re drawn to the narrative of the Founding Fathers over that of the Fathers of Statistics. The underdog story of how the United States came to be has spawned centuries of free enterprise, free speech and freedom to pursue the American dream. The story of statistics has left us running regressions in Excel and figuring out how Z-scores work on a normal distribution.

Yet, ideas and ideals can only get us so far. While it’s a blessing to live in a free society, it’s also true that hopes, dreams and $3 can get us a cup of coffee at Starbucks.

In order to thrive, we must be able to quantify our impact. Use of data is critical.

This is why the government has a Census every 10 years. It’s why companies and investors track their stock market performance. It’s why we monitor the number of steps we take when we exercise.

We are effectively data-driven. Particularly when something is up for debate.

When we need answers quickly, there are few resources to turn to that are more universal than numbers. The strategy is simple: Pull the right data. Win the argument. Seize the day.

Yet, in our zeal to make data our Excalibur, we forget one key point. Statistics are not set up to be definitive.

On the contrary, they’re intentionally ambiguous.

There are too many strange factors out there — from freak occurrences to that which we cannot explain — for us to confidently say that a set of statistical equations can explain the whole world around us. It’s just not true.

The best we can do is point out which factors are related to — or correlated with — other factors. And then use that knowledge to make our arguments.

When we do this, time after time, we say we’re letting the numbers speak.

But the numbers are not speaking. Our inherent bias is.

By looking to settle a debate, we dive into the numbers with a narrative in mind. The correlations and relationships we find are those that either fulfill our narrative or reframe it in a way that still paints it in a positive light.

This is sleazy enough when it comes to matters of opinion. (Hence the issues with the filter bubble society we live in.) But it’s downright reckless when it comes to matters of healthcare treatment, financial wellness, security and public policy.

The decisions we affect in these areas have wide-ranging implications. Whether our role is that of an industry professional, a politician, a journalist, a civic voter or something else, a subjective set of correlation analyses won’t cut it.

Yet, time and again, that’s what key decisions are made on. And we suffer the consequences, whether we notice them or not.


It’s time we break with this destructive pattern.

It’s time we stop treating statistics as our white horse, and correlations as our armor.

It’s time that we get some common sense.

When making key decisions, key arguments and key points, let us do more than hold blindly to the data.

Let us open our eyes and consider what’s going on in the world around us.

Let us consider opposing viewpoints, and how they might be valid.

Let us treat learning as discovery, not validation.

It’s only when we do all this that the data speak volumes. It’s only when we do all this that the resulting decisions bring the most good.

Statistics are a powerful tool, but a delicate one.

Handle with care.

Analyst or Innovator?

When I was growing up, I loved baseball. I loved playing it. I loved watching it. But most of all, I loved checking out baseball statistics.

Even though I was no math whiz, my young mind recognized that those numbers I saw in the newspaper box scores were actually a barometer. A player who batted to a .330 average with 30 Home Runs and 100 Runs Batted In would be someone I’d want to see starting for my favorite team. One who batted .210 with 5 homers and 25 RBI would not.

Whenever I saw those guys with poor statistics in a box score, I responded with bemusement. Why would a team run a player out there who hadn’t proven he could hit?

Of course, I failed to consider the ancillary reasons for those low numbers. Maybe the player was known for his outstanding defense. Maybe he was anxious because his wife was due any day with their first child. Maybe he was suffering from colitis but trying to tough it out anyway.

These scenarios wouldn’t erase goose eggs in a box score. But they would put them into context.

In particular, they had the power to integrate the human element into an industry based on numerical benchmarks. And given baseball’s legacy of pageantry and tradition, this element was sorely needed.

***

Sadly, that human element is harder to find these days.

It’s long gone from baseball. Statisticians are now an integral part of the sport’s brain trust, and players are judged on obscure metrics like WAR, Exit Velocity, Launch Angle and Spin Rate. (Sometimes, when I tune in to a baseball broadcast, I feel like I’m watching cyborgs.)

But it’s disappeared from many other industries as well. Big data is in vogue and seemingly every decision out there comes from cold, hard numbers. A whole new class of employees spends its days looking at analytics and reporting to the boss solely on those very same numbers. They might not know it, but these analysts are now the key cogs that define their employers’ strategies.

This all seems well and good on the surface. More young adults can now have access to corporate jobs that actually impact their employers’ strategies. And companies don’t have to gamble with profitability each time they change things up; the cold, hard data is within arm’s reach.

But dig a little deeper, and you’ll find the quandary.

***

We were never meant to take the human element out of the equation. Anyone who’s watched Star Trek knows that instinct and emotion are just as critical as logic in completing our mission.

On a high level, our love affair with data-based decision making excludes us from any growth opportunities that require breaking from the norm, or bending the rules. It sacrifices our independence of thought in favor of hard numbers, thereby compromising our integrity.

But on a more basic level, our all-in data approach has created a new class of professionals. A class that is as stuck in the mud as Joe Pesci was in My Cousin Vinny.

You see, it’s relatively easy to analyze data that’s already there. Assuming one has a certain level of specialization, it’s even a secure area to work in.

But this type of occupation doesn’t provide a great opportunity for growth. There’s no need to go beyond the numbers. After all, no one’s looking for us to do that.

***

We were meant for something greater. We weren’t meant to be analysts. We were meant to be innovators.

And while the world at large seems to be pulling in the other direction, we don’t have to follow suit.

We have more to contribute than the digits on our spreadsheets and the colored arrows on our charts. There are untold stories behind those trends and totals. Stories that tie the often-unpredictable course of human psychology to the concrete data we cultivate like corn on a Nebraska field.

We must tell these stories to tie everything together. We must tell these stories to forge a new way forward for a society that has doubled down on a solitary variable. We must tell these stories to lead.

This process might seem uncomfortable. Unsafe even.

That’s OK. Innovators never take the well-worn path.

But regardless of our apprehension, we owe it to ourselves to explore our true potential. We owe it to humanity to take that leap. We owe it to our future to make the right choice.

Analyst or innovator?

The answer should be clear.