Correlation and Causation Explained in the Real World

Anyone who has worked under a label like Decision Support Systems, Business Analyst, Business Intelligence, or Data Science has had to explain the difference between correlation and causation. In my experience, these conversations can be tense, confusing, and sometimes futile.

First, forget the academic explanation in the business world. We have all looked into the occasional blank stare. We have all seen the bored look at the watch. I have never had any real success with any reference to Bayes or entropy in the office cubicle.

I have, however, had success drawing a contrast between correlation, association, and causation in the business world using parables steeped in the business of the client. Let me give you two examples from two business domains where I have crunched a few numbers.

Let’s say you find a positive linear correlation between house prices and a certain feature. Homes with feature X tend to sell for Y dollars more. That price increase is more than the cost of feature X. The real estate agents begin to think about advising clients to install X so that their homes sell for more. Time to caution the agents, but how?

I can’t speak for the current market or any market outside my locality, but about 30 years ago homes with a built-in vacuum system sold for significantly more than homes without one. Did the built-in vacuum system cause the price increase? No. Those homes tended to be built in subdivisions with better school systems, better homeowners associations, and better maintained homes. Real estate agents completely understood that dynamic, and it helped them generalize to other features.
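For readers who want to see the vacuum parable in numbers, here is a minimal sketch. Every figure in it is made up for illustration: the two neighborhood tiers, the prices, and the counts of homes are assumptions, not data from any market. In this toy setup the price depends only on the neighborhood, yet homes with a built-in vacuum still look far more expensive until you compare within a neighborhood.

```python
from statistics import mean

# Hypothetical numbers: price is set by the neighborhood tier alone.
TIER_PRICE = {"premium": 400_000, "standard": 250_000}
# Hypothetical counts per tier: (homes with vacuum, homes without vacuum).
TIER_COUNTS = {"premium": (80, 20), "standard": (20, 80)}


def build_homes():
    """Build the toy list of homes from the made-up counts above."""
    homes = []
    for tier, (with_vac, without_vac) in TIER_COUNTS.items():
        price = TIER_PRICE[tier]
        homes += [{"tier": tier, "vacuum": True, "price": price}] * with_vac
        homes += [{"tier": tier, "vacuum": False, "price": price}] * without_vac
    return homes


def vacuum_gap(homes):
    """Average price of homes with a vacuum minus average price without."""
    with_vac = mean(h["price"] for h in homes if h["vacuum"])
    without_vac = mean(h["price"] for h in homes if not h["vacuum"])
    return with_vac - without_vac


homes = build_homes()

# Pooled comparison: the vacuum appears to be worth a lot of money.
overall_gap = vacuum_gap(homes)

# Same comparison inside each tier: the apparent effect disappears.
gap_by_tier = {
    tier: vacuum_gap([h for h in homes if h["tier"] == tier])
    for tier in TIER_PRICE
}

print("Overall gap:", overall_gap)
print("Gap within each tier:", gap_by_tier)
```

The pooled gap comes entirely from where the vacuum homes happen to be located. Installing a vacuum in a standard-tier home does nothing to its price in this toy model.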

Let’s say your client is shipping product and tracking damages.  There is a positive correlation between a particular condition and the number of damaged units reported.  Is it safe to say removing that condition solves the problem?

Consider our mythical dairy farm. It uses two shippers: Smith Shipping and Thompson Trucking. Smith Shipping reports only 1 gallon in 500,000 as damaged. Thompson Trucking reports 1 gallon in 120,000 as damaged. It looks like Thompson Trucking is the source of the damages. However, Smith Shipping is all tankers taking the milk to the packaging facility, and Thompson Trucking is all reefers taking the packaged milk to the warehouse. The two shippers haul different cargo at different stages, so their damage rates are not directly comparable.
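The arithmetic behind the dairy example can be sketched in a few lines. The two damage rates come from the story above; the `Shipper` class, the cargo labels, and the `comparable` check are hypothetical names I am introducing only for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Shipper:
    name: str
    cargo: str             # what the shipper actually hauls
    damage_rate: Fraction  # damaged gallons per gallon shipped


smith = Shipper("Smith Shipping", "bulk milk in tankers", Fraction(1, 500_000))
thompson = Shipper("Thompson Trucking", "packaged milk in reefers", Fraction(1, 120_000))


def naive_ratio(a, b):
    """How many times higher a's damage rate is than b's, ignoring cargo."""
    return a.damage_rate / b.damage_rate


def comparable(a, b):
    """Rates are only a fair comparison when both shippers haul the same cargo."""
    return a.cargo == b.cargo


ratio = naive_ratio(thompson, smith)
print(f"Thompson's rate is about {float(ratio):.2f} times Smith's rate")
print("Fair comparison?", comparable(thompson, smith))
```

The naive ratio makes Thompson Trucking look roughly four times worse, but the check on cargo shows the comparison mixes bulk milk with packaged milk. A fair test would compare shippers that haul the same product over the same leg of the trip.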

The bottom line to all of this is simple. Academic explanations are great. Some of us, myself included, enjoy them. However, the vast majority of folks we serve are only concerned about getting a job done in the safest, cheapest, and quickest way possible.

