TU-L0022 - Statistical Research Methods D, Lecture, 25.10.2022-29.3.2023
Causal inference (9:35)
In a typical quantitative study, we observe a correlation or another statistical association and want to make a causal claim. We do so because causal claims have policy implications such as "if you want your company to be profitable, name a woman as a CEO." This is also important because even if we do not provide any causal interpretations of a research study, the readers of the research might nevertheless interpret correlational results as causal. To make a causal claim, three conditions must be satisfied:
- Association - the cause and effect must be statistically associated.
- The direction of influence - the cause must precede the effect in our research design.
- Elimination of rival explanations - we must consider all relevant other causes that might produce the association and rule them out.
Eliminating rival explanations is the hard part. Randomized experiments and statistical controls are the two most common strategies for eliminating rival explanations.
Link to slides: https://osf.io/ejyrh
After we have established that there is a statistical association in the population, the next step in research is typically causal inference. We want to say that there is actually a variable x that causes a variable y, instead of a mere statistical association.
Let's go back to our example of the Talouselämä 500 list and the profitability difference between men-led and women-led companies. Let's assume now that we want to claim that naming a woman as a CEO causes profitability to increase, so that we can attribute this profitability difference to women CEOs.
Now, why would these kinds of causal claims be important? There are two reasons:
1) First, causal claims allow us to make policy recommendations. For example, if we can make a valid causal claim, then we can claim that we should increase the number of women as CEOs.
2) Second, another important reason for making causal claims is that if we don't make a causal claim, someone else will interpret our results causally.
When this difference was originally published in 2005, there were many discussions online and in various newspapers about whether, based on this result, we should nominate more women as CEOs.
Here's another example: there's a report that companies led by women are more profitable. This isn't a unique observation. McKinsey showed that there's a difference between men-led and women-led companies; women-led companies are more profitable. And while this profitability difference doesn't allow us to make a causal claim, they nevertheless suggest a policy recommendation to have more women on boards or as CEOs. When someone reads that kind of claim, they might interpret it to mean that, to improve financial performance, companies should name women to CEO positions. So, people will make causal interpretations of your data. You either have to make the causal interpretation yourself, so that you can make sure it's valid, or you have to explicitly caution that the relationship is not causal, like McKinsey did.
So, how do we make a causal claim then? We have identified that there is a difference of 4.7 percentage points, and let's say this can't be by chance alone. So there is a consistent association: women-led companies are more profitable than men-led companies. How do we know it's a causal effect? We have to ask why there's a difference. We need to list the different possible explanations and then rule out the alternatives. Are women-led companies more profitable because of the CEO's gender, or is there another reason? To answer that, we need a theory, which offers a set of propositions or claims that explain what happens and why.
Several potential explanations are:
1) Having a woman as a CEO improves firm performance. This need not be a direct effect; it could be that women facilitate better top management teamwork, which in turn leads to better firm performance.
2) Smaller companies are more profitable and are more likely to hire women, leading to a spurious relationship.
3) Certain industries are more profitable and more likely to hire women.
4) There could be reverse causation: a company that is already profitable might be in a better position to hire a woman as CEO.
5) Women CEOs are more capable on average because of selection. Given the discrimination against women in CEO appointments (only 22 out of 500 companies were led by a woman), a woman has to clear a higher bar to be appointed, so the last woman to be named a CEO is likely better than the last man.
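To see how explanation 2 can produce an association without any causal effect, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the Talouselämä 500 data), the appointment rates and profitability levels are made up, and the names `simulate_firms` and `gap` are hypothetical. CEO gender has no effect on profitability in this simulated world, yet the raw comparison still shows a gap.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible


def simulate_firms(n=50000):
    """Synthetic firms in which CEO gender has NO effect on profitability."""
    firms = []
    for _ in range(n):
        small = random.random() < 0.5
        # Assumption: small firms appoint women more often (10% vs 2%).
        woman_ceo = random.random() < (0.10 if small else 0.02)
        # Assumption: small firms are 5 points more profitable on average.
        # Gender does not appear in this line at all.
        roa = (10.0 if small else 5.0) + random.gauss(0, 3)
        firms.append((small, woman_ceo, roa))
    return firms


def gap(firms):
    """Mean profitability of women-led firms minus men-led firms."""
    women = [r for _, w, r in firms if w]
    men = [r for _, w, r in firms if not w]
    return mean(women) - mean(men)


firms = simulate_firms()
raw_gap = gap(firms)                             # all firms pooled
gap_small = gap([f for f in firms if f[0]])      # small firms only
gap_large = gap([f for f in firms if not f[0]])  # large firms only

print(f"raw gap, all firms:      {raw_gap:.2f}")
print(f"gap within small firms:  {gap_small:.2f}")
print(f"gap within large firms:  {gap_large:.2f}")
```

In the pooled data, women-led firms look clearly more profitable, but once we compare firms of the same size, the gap is close to zero. The pooled difference comes entirely from firm size, which is what a spurious relationship means.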
To investigate further, we need additional data on CEO gender, profitability, and more. Then we start ruling out these alternative explanations. Three conditions for causality are:
1) There must be a statistical association between cause (X) and effect (Y).
2) The cause must come before the effect.
3) Rival explanations must be eliminated.
To establish the direction of influence, we simply measure the cause before the effect. To eliminate rival explanations, we can use either randomized assignment or statistical modeling. The former involves randomly assigning men and women as CEOs and observing profitability, so that CEO gender cannot be related to any other cause of profitability. The latter involves modeling profitability as a function of CEO gender and other factors, so that the effect of gender is estimated while the other factors are held constant.
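The two strategies can be compared in a small simulation sketch. This is an illustration under stated assumptions, not an analysis of real data: the firms are synthetic, firm size is the only rival explanation, the true effect of CEO gender is set to zero (`TRUE_EFFECT`), and the helper names `slope`, `residuals`, and `make_firms` are hypothetical. The statistical control is done by removing the part of both variables that firm size explains and then regressing the leftovers on each other, which gives the same gender coefficient as a regression that includes both predictors.

```python
import random
from statistics import mean

random.seed(2)     # fixed seed so the sketch is reproducible
TRUE_EFFECT = 0.0  # assumption: CEO gender has no causal effect


def slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(y, z):
    """What is left of y after regressing it on z (with an intercept)."""
    b = slope(z, y)
    a = mean(y) - b * mean(z)
    return [yi - (a + b * zi) for yi, zi in zip(y, z)]


def make_firms(n, randomized):
    """Synthetic firms; returns lists of firm size, CEO gender, profitability."""
    small, woman, roa = [], [], []
    for _ in range(n):
        s = 1.0 if random.random() < 0.5 else 0.0
        if randomized:
            # Experiment: a coin flip decides CEO gender, regardless of size.
            w = 1.0 if random.random() < 0.5 else 0.0
        else:
            # Observational data: small firms appoint women more often.
            w = 1.0 if random.random() < (0.10 if s else 0.02) else 0.0
        # Profitability depends on size; gender enters only via TRUE_EFFECT.
        r = 5.0 + 5.0 * s + TRUE_EFFECT * w + random.gauss(0, 3)
        small.append(s)
        woman.append(w)
        roa.append(r)
    return small, woman, roa


# Observational data: the naive comparison versus controlling for size.
small, woman, roa = make_firms(50000, randomized=False)
naive = slope(woman, roa)
adjusted = slope(residuals(woman, small), residuals(roa, small))

# Randomized assignment: the naive comparison is already unbiased.
_, woman_r, roa_r = make_firms(50000, randomized=True)
experimental = slope(woman_r, roa_r)

print(f"naive estimate, observational data: {naive:.2f}")
print(f"estimate controlling for firm size: {adjusted:.2f}")
print(f"estimate from randomized data:      {experimental:.2f}")
```

Both strategies bring the estimate close to the true effect of zero, but for different reasons. Randomization works without knowing what the rival explanations are, whereas statistical control only removes the rival explanations that we have measured and included in the model.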