TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022
Interpretation of regression results (25:12)
This video explains how to interpret regression results, such as the
R-squared statistic and beta coefficients in natural and standardized
units.
We will now take a look at the interpretation of regression coefficients. Interpreting what the results actually mean is more difficult than calculating them.
Whenever you run a
regression analysis, the regression coefficients "beta" have to be
interpreted, because the readers of your research article don't know
what the betas mean, so you have to tell them. There are also other ways
in which the regression analysis can be quantified.
Regression analysis tells us the direction of an effect and whether the effect is statistically significant. What we want to know, however, is whether the effects are large or not, and that depends on interpretation. In some contexts, a regression coefficient of 10 is very large. In other contexts, a regression coefficient of 10 is very small. So you have to consider the context and also the variables involved.
One of the easiest ways to start interpreting a regression analysis is to look at the R-squared statistic. The R-squared statistic is calculated from the regression results and is typically presented at the bottom of the regression table.
Another related statistic is the adjusted R-squared. The R-squared statistic tells us how much the independent variables together explain the dependent variable, and it's an estimate of the quality of a model in some sense. Sometimes it is referred to as the goodness-of-fit of a regression model or as the coefficient of determination, but most people just call it the R-squared. The R-squared varies between 0 and 1: 0 means that the independent variables don't explain the dependent variable at all, and 1 means that they explain it completely.
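As a reminder of what the statistic measures, the standard definition, with $\hat{y}_i$ the model's fitted values and $\bar{y}$ the mean of the dependent variable, is

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}.$$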
One problem with R-squared is that it always goes up when you add variables to a model. When the number of variables approaches the number of observations, for example, if you fit a model with 99 variables to 100 observations, the R-squared will be exactly 1. So it always increases, and it's positively biased. The bias here means that if we calculate the regression using sample data, the R-squared can be expected to be larger than if we ran the same regression on the full population.
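A minimal sketch of this inflation in Python, using simulated data where the predictors are pure noise (the sizes and seed are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
y = rng.normal(size=n)  # outcome with no real relationship to any predictor

for k in (1, 10, 50, 99):
    # intercept plus k random noise predictors
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    print(f"{k:3d} noise predictors: R^2 = {r2:.3f}")  # climbs toward 1, reaching 1 at k = 99
```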
Because the R-squared is positively biased, the adjusted R-squared statistic was introduced, which penalizes complex models. When your R-squared goes up just because you have too many variables in the model, the adjusted R-squared adjusts the R-squared down to compensate for that bias. The adjustment is based on the number of variables and the sample size. When the sample size is large and you have a small number of variables, for example, 5 independent variables and 500 observations, so 100 observations for each independent variable, the adjustment is very small. If you have, let's say, 25 independent variables and 100 observations in your sample, then the adjustment is pretty large, because you have only 4 observations for each independent variable.
One problem is that the adjusted R-squared is not unbiased either, but it can be expected to be less biased than the plain R-squared. Actually getting an unbiased estimate of the population R-squared is quite difficult, so we don't normally do that.
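The adjustment referred to here is the standard one, where $n$ is the number of observations and $k$ the number of independent variables:

$$R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$$

With 5 variables and 500 observations the correction factor is 499/494, about 1.01, so the adjustment is tiny; with 25 variables and 100 observations it is 99/74, about 1.34.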
The R-squared tells us whether the model explains the data at all: when the R-squared is 0, that's the end of the interpretation, because the independent variables don't explain the dependent variable at all.
Then the question is, how much is a meaningful explanation? If you explain 1% of a phenomenon, in some contexts that is meaningful, in other contexts it's not. The behavior of people and the performance of organizations are very difficult to predict or explain, because they depend on so many different things. Therefore, in the social sciences, the R-squared typically varies in the 10-30% ballpark. If you have a 30% R-squared, then you have a pretty good explanation, or you could also have a flawed study, but we'll talk about that a bit later. So you have to consider the context. In the natural sciences, an R-squared of 99% could be considered not large enough.
The R-squared is useful as a first check of whether further interpretation of the results makes sense. If the R-squared is too small, then we know that none of the variables in the model actually matter for the dependent variable, and interpreting the effects of each independent variable separately is a waste of time.
Also,
the R-squared offers us an intuitive way of explaining whether the
results are large or not. If I tell you that the choice between three
investment strategies, for example, explains 30% of the variation of
your investment profits, then that's a big deal. We understand that 30%
is a big deal in that context. Because R-squared can be understood in
percentages, it has a natural interpretation for most people.
We'll take a look at how Hekman and colleagues use the R-squared in their paper. They don't really interpret what the actual regression coefficients in their study mean; instead, they base their interpretation of the magnitude of the effects on the R-squared. They note that between their controls-only model and the model with the gender and race variables, the R-squared increases by 15 to 20%, which can be interpreted to mean that the effects of race and gender are in that ballpark, assuming that there's no bias in the R-squared, which is not true. So they should really be looking at the adjusted R-squared in this case.
But everyone who understands percentages sees that if we say that one-fourth of the variation in customer satisfaction scores is explained by gender and race, that's a big deal. It provides us with an easy way of saying whether the results have any practical meaning.
When you have looked at the R-squared, the next thing we want to know is which of the individual variables matter, and that's where we get to the interpretation of the regression coefficients. Let's take a look at the Talouselämä 500 example. We have a sample where women-led companies are 4.7 percentage points more profitable than male-led companies, and that's a big difference in ROA. We want to know whether the difference is caused by CEO gender or by some third factor, so we have to present alternative competing hypotheses. One competing hypothesis is that it is not an effect of CEO gender; instead, it's a spurious correlation caused by firm revenue: smaller companies are more likely to hire women, and smaller companies are also more profitable. The second competing hypothesis is that this is an industry difference: manufacturing companies are less profitable on the ROA metric, because ROA depends on assets and manufacturing companies tend to have more assets than service companies, and manufacturing companies are also more likely to hire male CEOs than female CEOs.
Now, regression analysis tells us the effect of CEO gender ceteris paribus, which is an economics term for holding other variables constant. So when CEO gender changes from zero (man) to one (woman), what is the expected change in return on assets? Holding things constant means that you are comparing two cases that are exactly comparable on the other variables. If we have two companies of the same size and in the same industry, then the woman-led company is on average beta-1 percentage points more profitable. So the regression coefficient directly tells us the profitability difference. Whether it's 1, 2, or 3 percentage points, it's up to us to interpret whether that's a big effect or not. We know that 4.7 percentage points is a big difference; one point is probably not such a big difference.
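A minimal sketch of how such a regression could be run in Python with statsmodels, on entirely made-up data (the variable names, sample, and effect sizes are all invented for illustration; the coefficient on female_ceo plays the role of beta 1 here):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated data loosely mimicking the example: ROA in percentage points,
# a CEO gender dummy, firm size, and industry as controls.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "female_ceo": rng.integers(0, 2, n),
    "log_revenue": rng.normal(4.0, 1.0, n),
    "industry": rng.choice(["manufacturing", "services"], n),
})
df["roa"] = (5.0 + 1.5 * df["female_ceo"] - 0.8 * df["log_revenue"]
             + np.where(df["industry"] == "services", 2.0, 0.0)
             + rng.normal(0, 4, n))

# The coefficient on female_ceo estimates the ROA difference between woman-led
# and man-led firms of the same size and industry (ceteris paribus).
model = smf.ols("roa ~ female_ceo + log_revenue + C(industry)", data=df).fit()
print(model.params)
```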
Interpreting regression coefficients is relatively straightforward when the variables have meaningful units. ROA has a meaningful unit for managers: if we told a manager that my company's ROA is 20%, they would know that's pretty good for most industries. The CEO gender dummy also has a meaning for us: 1 is a woman, 0 is a man.
Sometimes we have units that don't really have any meaning, and that complicates the interpretation. So let's take a look at this question: does a one-unit increase in education pay off? We have a regression result stating that a one-unit increase in education leads to a one-unit increase in salary. Is that a big deal? We would need to know the unit of education and the unit of salary. Let's say that education is measured in years and salary in euros per year. So we say that a one-year increase in education leads to a one-euro increase in annual salary. Does it make a difference? I would think not, for most people: pretty much no one wants to go to school for one additional euro of income per year. So that effect is not meaningful.
How about a one-year increase leading to a 1000-euro increase in annual salary? That's a harder question. If we consider Finland, where annual salaries are in the tens of thousands of euros, maybe at the lower end, if you make 20 thousand per year, 1000 euros is 5%, so it may or may not be worth one year of education, depending on how much you like to go to school. On the other hand, if this data were from a developing country, where annual salaries are in the 1000-2000 euro ballpark, then a 1000-euro increase in annual salary is a big deal: in some cases you can basically double your income by going to one additional year of school, and that's a big thing for those people. So you have to think about what the units are, the unit of the independent variable and the unit of the dependent variable, and the context in which you're evaluating the effect.
What if we say that a one-year increase leads to a one-Bitcoin increase in annual salary? So we get one additional year of education and one Bitcoin per year more. That's more problematic, because people don't have an intuitive understanding of the value of Bitcoin. Obviously, when you tell somebody "I'll give you a Bitcoin", the first question they'll ask is: "What's the value of a Bitcoin in euros?" In this case, we can convert the value of Bitcoin to euros and express the regression coefficient in a way that's more understandable. Let's say that a one-year increase leads to a 3000-euro increase in annual salary; I don't know the current value of Bitcoin, but let's assume it's 3000 euros. Then we know that it's probably a big deal for some people. So sometimes we can convert the units to something that we can understand, even if the original unit was something that we don't understand easily.
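The arithmetic behind such a conversion is simple: multiplying the dependent variable by a constant multiplies the regression coefficient by the same constant. With the assumed rate of 3000 euros per Bitcoin,

$$\beta_{\text{EUR}} = 3000 \times \beta_{\text{BTC}},$$

so a coefficient of one Bitcoin per year of education becomes 3000 euros per year.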
What if we have a unit that cannot be converted? Let's say that our result is that a one-year increase leads to a one-Buckazoid increase in annual salary. Buckazoid is a fictional currency in a computer game, and I don't think anyone has ever developed an exchange rate from Buckazoids to euros. So we can't convert this effect into euros; what do we do? One way of dealing with the Buckazoid issue is to first find out the average salary in Buckazoids in this fictional universe and how much the salaries are dispersed. Saying "I'll give you ten Buckazoids" or "I'll give you a million Buckazoids" doesn't really mean anything unless we know the mean income. If we know that the mean income in that fictional world is ten Buckazoids, then a million Buckazoids is probably a lot. If we give somebody a million Buckazoids and the annual income is a billion Buckazoids, then it's not such a big deal.
To understand how a variable varies, we look at its mean and standard deviation. This is useful when variables don't have any naturally interpretable units: we look at how the variable is distributed. Let's assume that in our sample the income in Buckazoids is normally distributed. A normal distribution implies that one and two standard deviations from the mean have a special interpretation. In a normal distribution, 68% of observations fall within plus or minus one standard deviation of the mean. So if we say that our income is one standard deviation above the mean, we know that we are solidly in the high-income segment, well above the average. If our income in Buckazoids is two standard deviations above the mean, we know that we are in the top 2.5% of the income distribution. We can also see that the effect of a one-standard-deviation increase is generally pretty big: if you start solidly below the mean, one standard deviation takes you to the average, and two standard deviations make you pretty rich, in the top 2.5%.
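These percentages can be verified directly from the normal distribution; a quick check in Python, assuming scipy is available:

```python
from scipy.stats import norm

# Share of observations within one standard deviation of the mean (~68%)
print(norm.cdf(1) - norm.cdf(-1))  # 0.6827

# Share of observations more than two standard deviations above the mean
print(1 - norm.cdf(2))             # 0.0228, i.e. roughly the top 2.5%
```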
So standard deviation units can be useful for interpreting regression results. If we say that one additional year of education increases your income by one standard deviation in Buckazoid units, is that a large effect? For people it would be, but then we would have to think about the lifespan of these aliens: if they only live one year on average, a one-year investment in education is a huge deal for them. So we have to think about the context again.
Let's take a look at an empirical example: Deephouse's paper, Table 2, Model 2 of the regression results. We'll be interpreting these purely through standard deviations; the dependent variable, relative ROA, has a meaningful unit, but we'll ignore it for now. The regression coefficient was -0.02 for the effect of strategic deviation on relative return on assets. Is that a big effect? To understand it, we would need to know the unit of strategic deviation, but that's a completely made-up number, so it doesn't have a meaning by itself. What we need to know are the standard deviations of these variables: the standard deviation of ROA is 0.7, and the standard deviation of strategic deviation is 2.9. That tells us that, if the data are normally distributed, 95% of the ROA observations are within plus or minus 1.4 units, two standard deviations, of the mean. The difference between the top 2.5% and the bottom 2.5% is then 2.8 units, or four standard deviations.
So what is the effect of strategic deviation? A one-standard-deviation increase in strategic deviation is 2.932 multiplied by -0.020, which is about a -0.059 change in relative ROA. Then we compare: how large is this -0.059 relative to the 2.8 units? The full scale of ROA, from the worst 2.5% to the best 2.5%, is 2.8 units, and increasing your strategic deviation by one standard deviation changes your ROA by about -0.059. So it's a smallish effect.
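The same comparison as a small worked computation, using the numbers from the example above:

```python
sd_deviation = 2.932   # standard deviation of strategic deviation
beta = -0.020          # regression coefficient from Deephouse's Model 2
sd_roa = 0.7           # standard deviation of relative ROA

effect_one_sd = sd_deviation * beta  # about -0.059 change in relative ROA
roa_full_scale = 4 * sd_roa          # 2.8 units, from the bottom 2.5% to the top 2.5%

print(effect_one_sd, roa_full_scale)  # -0.0586 vs. 2.8: a smallish effect
```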
We can also understand effect size interpretation, and how it's reported, through this nice example about the sauna (a Finnish thing). When we ask whether the sauna is warm, a typical research paper would say that the temperature of the sauna is statistically significantly different from normal room temperature. That tells us that maybe the sauna is heating, maybe it's ready for going in, maybe it's too hot, maybe it was on the day before and is still cooling; it doesn't really tell us whether the sauna is warm. That's the equivalent of saying that the effect of strategic deviation on ROA is negative and statistically significantly different from zero. Statistical significance just tells us that there is some effect; it doesn't tell us whether the effect is large or not.

A slightly better answer is that the temperature of the sauna is currently 80 degrees; the comparable statement is that the effect of strategic deviation on ROA is -0.020. That is useful for people who understand what 80 degrees means and what -0.020 means. Most people who go to a sauna often know what 80 centigrade means, but you can't assume that the readers of your research study will understand your units, so you have to explain what they mean. A really good answer to whether the sauna is hot is to say that the temperature is currently 80 degrees, and that most people who go to the sauna regularly would say it's too hot, but they could still go in. That quantifies that the sauna is pretty hot, more so than just saying that it's 80 centigrade.

In the same way, you can say that the effect of strategic deviation on ROA is -0.020 per unit, and that the difference in strategic deviation between the least deviant and the most deviant firms, four standard deviations, is about 12 units. Going from the least deviant to the most deviant therefore changes ROA by about -0.23, while the same full scale for ROA is 2.8 units. We can see that 0.23 is pretty small compared to 2.8, so the effect is quite small: there are more effective ways to improve your profitability than changing how strategically deviant you are.
Let's take a look at yet another example, from Hekman's paper. The paper shows a regression table where the coefficient for the number of patients in a panel, that is, how many people go to see a doctor, is -0.04, and the coefficient for the age of the doctor is -0.13. Are these large effects or not? Normally we would look at the correlation table, the standard deviations, and the means to judge that. But this is actually not a normal case, because these are standardized regression coefficients. The authors don't report it, but you can see it: if you start to interpret the effect of the number of patients in the panel, which is in the thousands, against age, which is in the tens, the effect sizes don't make any sense at face value. Also, all these effects fall between -1 and 1, which is the typical range for standardized regression coefficients; they can go beyond that range, but they are typically plus or minus zero point something. So these are standardized coefficients, which means that the data have been standardized: every variable has been scaled to a standard deviation of 1 and a mean of 0 before estimation. In that case, we interpret the coefficients directly in standard deviations. Instead of saying that a one-unit increase in physician productivity is associated with a beta-1 increase in patient satisfaction, we say that a one-standard-deviation increase in physician productivity is associated with a beta-1 standard-deviation increase in satisfaction. So we interpret the coefficients directly as standard deviations.
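A minimal sketch of what standardization does to a regression slope, on simulated data (all numbers are invented; only the mechanics matter):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(50.34, 6.58, n)            # e.g. physician age, as in the example
y = 3.5 - 0.02 * x + rng.normal(0, 1, n)  # outcome with a small negative effect

b_raw = np.polyfit(x, y, 1)[0]            # slope in natural units

# Standardized coefficient: the same slope expressed in standard deviations
beta_std = b_raw * x.std() / y.std()

# Identical to regressing the z-scores of y on the z-scores of x
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
print(beta_std, np.polyfit(zx, zy, 1)[0])  # the two agree
```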
This looks like the way to do it always, so it would seem to simplify life to always use standardized estimates, but that's actually not the case. I recommend that you never standardize a variable that has a meaningful scale: if you have euros or years or something else that makes sense to people as a unit, don't standardize. The reason is that standardized estimates depend on how dispersed the variables are in your particular sample, because the standardization uses the sample standard deviation.
So let's say that here the standard deviation of age is 6.58 and the mean is 50.34, so the doctors are quite old. What would happen if the doctors in this sample were newly graduated, between 24 and 28 years old, so that the standard deviation of age were 1? The standardized regression coefficient for the exact same effect would be only about -0.02, which invites a very different interpretation than -0.13. It's about seven times smaller, yet it's the exact same effect, just scaled differently. This difference in scaling means that the two coefficients, -0.02 and -0.13, are not comparable. So standardization doesn't make your results comparable across samples. If you can interpret the results without standardization, it is always better to do so.
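A quick numeric check of that scaling argument, using the numbers above and assuming the outcome is kept in standardized units:

```python
beta_std_old = -0.13       # standardized coefficient when SD(age) = 6.58
sd_old, sd_new = 6.58, 1.0

b_raw = beta_std_old / sd_old   # raw effect per year of age
beta_std_new = b_raw * sd_new   # standardized coefficient if SD(age) were 1

print(b_raw, beta_std_new)      # drops to about -0.02: same effect, different scaling
```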
So, as a rule of thumb, use standardization only if none of your variables has a natural scale. Otherwise, interpret in standard deviation units only those variables for which a natural scale does not exist.