Correlation coefficients are used to measure the strength of the linear relationship between two variables.
Source: Investopedia
While I was researching best practices for designing a scatter graph for one of my digital marketing projects, I came across an important warning:
correlation does not imply causation.
What does it mean?
Simply that two variables can be correlated without one being the cause of the other. There might be a third variable (or even more), not plotted on the correlation chart, at the origin of this seemingly causal relationship.
Let’s take two examples.
1° SEO Analysis
If you analyse the correlation between the number of keywords driving traffic to a set of websites via Google and the actual organic traffic of those websites (as displayed on the chart above), you see a strong correlation (coefficient = 0.842).
You can definitely say that the more keywords a site is ranking for on Google, the more organic traffic it will have, since the traffic from each search query ends up compounding (and organic traffic is solely generated by the accumulation of successful search queries).
So the number of ranking keywords has a causal impact on the evolution of organic traffic (which in turn helps more keywords to rank by increasing the overall site authority, due to the way the algorithm works).
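For readers who want to see how such a coefficient is obtained, here is a minimal sketch of the Pearson correlation in plain Python. The function name `pearson_r` and the two data lists are hypothetical and made up for illustration. They are not the dataset behind the chart, so the result will not match 0.842.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each point from its own mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Numerator: how the two variables vary together.
    co_variation = sum(a * b for a, b in zip(dx, dy))
    # Denominator: how much each variable varies on its own.
    spread = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_variation / spread


# Hypothetical figures for six websites (illustration only).
ranking_keywords = [120, 450, 900, 1500, 3200, 5000]
organic_visits = [800, 2100, 6500, 7000, 21000, 26000]

print(round(pearson_r(ranking_keywords, organic_visits), 3))
```

A value close to 1 means the points line up along a rising straight line, a value close to -1 along a falling one, and a value near 0 means no linear pattern.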
2° Shark Attacks
If you analyse the correlation between the frequency of shark attacks and ice cream consumption, you could be tempted to conclude that selling more ice cream results in more shark attacks.
Obviously the cause lies elsewhere: there are usually more people on the beach on hotter days, which draws sharks to the shore, where a lot of people happen to enjoy ice cream.
If holidaymakers reduced their ice cream consumption, it would not deter the sharks from attacking.
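The hidden third variable can be illustrated with a small simulation. This is a sketch under stated assumptions: the data is synthetic, temperature is assumed to drive both ice cream sales and shark attacks, and every number and function name (`simulate`, `partial_r`) is invented for the example. The raw correlation between the two comes out strong, but once temperature is controlled for with a partial correlation, it should drop to roughly zero.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_variation = sum(a * b for a, b in zip(dx, dy))
    return co_variation / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def partial_r(xs, ys, zs):
    """Correlation between xs and ys once the linear effect of zs is removed."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=2000, seed=42):
    """Synthetic days: temperature drives both variables (assumed, not measured)."""
    rng = random.Random(seed)
    temperature = [rng.uniform(15, 35) for _ in range(n)]
    # Ice cream sales depend on temperature only, plus noise.
    ice_cream = [10 * t + rng.gauss(0, 20) for t in temperature]
    # The attack index also depends on temperature only, never on ice cream.
    shark_attacks = [0.5 * t + rng.gauss(0, 1) for t in temperature]
    return temperature, ice_cream, shark_attacks


temperature, ice_cream, shark_attacks = simulate()
raw = pearson_r(ice_cream, shark_attacks)
controlled = partial_r(ice_cream, shark_attacks, temperature)
print("raw correlation:", round(raw, 3))
print("controlling for temperature:", round(controlled, 3))
```

In this toy model ice cream never appears in the formula for shark attacks, yet the two still correlate strongly because they share a cause.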
Another example would be the very strong correlation between days and nights. It’s actually a perfect one-to-one correlation. Of course we all know that days don’t cause nights (their alternation being the result of the rotation of the Earth on its axis).
Nevertheless, it doesn’t mean that cause and correlation can’t be… correlated.
Recurring causes revealed in correlations
A recurring causal factor can of course be revealed in a correlation.
If we plot on a chart the count of deaths among the elderly vs. mean temperatures, we’ll see a correlation between very low / very high temperatures and an increase in mortality.
In this case, temperature extremes are part of the causal chain of events leading to more deaths. It might not be the temperature itself but it’s definitely a contributing factor, as were comorbidities like obesity or diabetes during the Covid pandemic.
A distinction should be made between a contributing (or reinforcing / aggravating) causal factor and a primal cause.
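One practical caveat when charting this kind of data: since a correlation coefficient measures the strength of a linear relationship, a U-shaped pattern (more deaths at both temperature extremes) can produce a coefficient close to zero over the full range, even though the link is real. The sketch below uses an invented, noise-free mortality curve with an assumed minimum at 18°C, purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_variation = sum(a * b for a, b in zip(dx, dy))
    return co_variation / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical mean temperatures from 0 to 36 degrees Celsius.
temperatures = list(range(0, 37))
# Invented U-shaped curve: deaths rise as temperature moves away from 18.
deaths = [100 + (t - 18) ** 2 for t in temperatures]

# Over the full range, the cold side and the hot side cancel each other out.
full_range = pearson_r(temperatures, deaths)

# Looking at each side separately reveals the relationship.
cold_side = pearson_r(temperatures[:19], deaths[:19])   # 0 to 18 degrees
hot_side = pearson_r(temperatures[18:], deaths[18:])    # 18 to 36 degrees

print("full range:", round(full_range, 3))
print("cold side:", round(cold_side, 3))
print("hot side:", round(hot_side, 3))
```

Plotting the scatter graph before trusting a single coefficient is the simplest way to spot this kind of shape.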
Embracing complexity
If we refer to a death from Covid, in the chain of unfortunate events the trigger was the infection by the virus. The poor health condition of a deceased patient had a causal impact on the outcome but it wasn’t the original cause of the fatality.
All complex systems (e.g. a living organism, a society, an industry) require a careful analysis of the cluster of causes leading to an effect. The objective should always be to weigh each of those factors and determine their position in the chain of events, which is not necessarily linear.
Some of the impacts might be simultaneous. They might flow from different directions, not always gently aligned one after the other. Think about the way your attention is challenged in a 360° VR immersive game.
Complexity should be evaluated as constantly evolving in four dimensions (x,y,z + t).
Complex systems are systems whose behaviour is intrinsically difficult to model due to the dependencies, competitions, relationships, or other types of interactions between their parts or between a given system and its environment. Source: Wikipedia
Complex systems face both a chain of internal reactions and a continuous reaction to external influences. In return, they can also exert feedback on their environment.
Some of the properties of complex systems emerge, in a causal manner, from the interaction of their tiniest components, whose behaviour can be seen as elegantly intertwined.
Vision is for humans the result of a range of complex synaptic connections, which start with light being captured by approximately 126 million light-sensitive cells.
It is crucial to embrace the intricacies of emerging behaviours to properly address a chaotic issue. Environmental sustainability for instance can’t be achieved simply by tightening some bolts in our daily routine.
In a globally connected ecosystem, all the efforts, from the tiniest to the boldest, should be orchestrated in a symphonic symbiosis, even more importantly when you acknowledge that it’s by definition impossible to reverse the course of events.