TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022
Interpretation of regression results (25:12)
This video explains how to interpret regression results, such as the
R-squared statistic and beta coefficients in natural and standardized
units.
We will now take a look at the interpretation of regression coefficients. Interpreting what the results actually mean is more difficult than calculating them.
Whenever you run a
regression analysis, the regression coefficients "beta" have to be
interpreted, because the readers of your research article don't know
what the betas mean, so you have to tell them. There are also other ways
in which the regression analysis can be quantified.
Regression analysis tells us the direction of an effect and whether the effect is statistically significant. What we want to know, however, is whether the effects are large or not, and that depends on interpretation. In some contexts, a regression coefficient of 10 is very large. In other contexts, a regression coefficient of 10 is very small. So you have to consider the context and also the variables involved.
One of the easiest ways to start interpreting a regression analysis is to look at the R-squared statistic. The R-squared statistic is calculated from the regression results and is typically presented at the bottom of the regression table.
Another related statistic is the adjusted R-squared. The R-squared statistic tells us how much the independent variables together explain the dependent variable, and it's an estimate of the quality of a model in some sense. Sometimes it is referred to as the goodness-of-fit of a regression model or as the coefficient of determination, but most people just call it the R-squared. The R-squared varies between 0 and 1: 0 means that the independent variables don't explain the dependent variable at all, and 1 means that they explain it completely.
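As a reminder of what the statistic measures, the standard definition, with $\hat{y}_i$ the model's fitted values and $\bar{y}$ the mean of the dependent variable, is

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}.$$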
One problem with R-squared is that it always goes up when you add variables to a model. When the number of variables approaches the number of observations, for example, if you fit a model with 99 variables to 100 observations, the R-squared will be exactly 1. So it always increases, and it's positively biased. The bias here means that if we calculate the regression using sample data, the R-squared can be expected to be larger than if we ran the same regression on the full population.
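A minimal sketch of this inflation in Python, using simulated data where the predictors are pure noise (the sizes and seed are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
y = rng.normal(size=n)  # outcome with no real relationship to any predictor

for k in (1, 10, 50, 99):
    # intercept plus k random noise predictors
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    print(f"{k:3d} noise predictors: R^2 = {r2:.3f}")  # climbs toward 1, reaching 1 at k = 99
```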
Because the R-squared is positively biased, the adjusted R-squared statistic was introduced, which penalizes complex models. When your R-squared goes up just because you have too many variables in the model, the adjusted R-squared adjusts the R-squared down to compensate for that bias. The adjustment is based on the number of variables and the sample size. When the sample size is large and you have a small number of variables, for example, 5 independent variables and 500 observations, so 100 observations for each independent variable, the adjustment is very small. If you have, let's say, 25 independent variables and 100 observations in your sample, then the adjustment is pretty large, because you have only 4 observations for each independent variable.
One problem is that the adjusted R-squared is not unbiased either, but it can be expected to be less biased than the plain R-squared. Actually getting an unbiased estimate of the population R-squared is quite difficult, so we don't normally do that.
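The adjustment referred to here is the standard one, where $n$ is the number of observations and $k$ the number of independent variables:

$$R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$$

With 5 variables and 500 observations the correction factor is 499/494, about 1.01, so the adjustment is tiny; with 25 variables and 100 observations it is 99/74, about 1.34.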
The R-squared tells us whether the model explains the data at all: when the R-squared is 0, that's the end of the interpretation, because the independent variables don't explain the dependent variable at all.
Then the question is, how much is a meaningful explanation? If you explain 1% of a phenomenon, in some contexts that is meaningful, in other contexts it's not. The behavior of people and the performance of organizations are very difficult to predict or explain, because they depend on so many different things. Therefore, in the social sciences, the R-squared typically varies in the 10-30% ballpark. If you have a 30% R-squared, then you have a pretty good explanation, or you could also have a flawed study, but we'll talk about that a bit later. So you have to consider the context. In the natural sciences, an R-squared of 99% could be considered not large enough.
The R-squared is useful as a first check of whether further interpretation of the results makes sense. If the R-squared is too small, then we know that none of the variables in the model actually matter for the dependent variable, and interpreting the effects of each independent variable separately is a waste of time.
Also,
the R-squared offers us an intuitive way of explaining whether the
results are large or not. If I tell you that the choice between three
investment strategies, for example, explains 30% of the variation of
your investment profits, then that's a big deal. We understand that 30%
is a big deal in that context. Because R-squared can be understood in
percentages, it has a natural interpretation for most people.
We'll take a look at how Hekman and colleagues use the R-squared in their paper. They don't really interpret what the actual regression coefficients in their study mean; instead, they base their interpretation of the magnitude of the effects on the R-squared. They note that between their controls-only model and the model with the gender and race variables, the R-squared increases by 15 to 20%, which can be interpreted to mean that the effects of race and gender are in that ballpark, assuming that there's no bias in the R-squared, which is not true. So they should really be looking at the adjusted R-squared in this case.
But everyone who understands percentages sees that if we say that one-fourth of the variation in customer satisfaction scores is explained by gender and race, that's a big deal. It provides us with an easy way of saying whether the results have any practical meaning.
When you have looked at the R-squared, the next thing we want to know is which of the individual variables matter, and that's where we get to the interpretation of the regression coefficients. Let's take a look at the Talouselämä 500 example. We have a sample where women-led companies are 4.7 percentage points more profitable than male-led companies, and that's a big difference in ROA. We want to know whether the difference is caused by CEO gender or by some third factor, so we have to present alternative competing hypotheses. One competing hypothesis is that it is not an effect of CEO gender; instead, it's a spurious correlation caused by firm revenue: smaller companies are more likely to hire women, and smaller companies are also more profitable. The second competing hypothesis is that this is an industry difference: manufacturing companies are less profitable on the ROA metric, because ROA depends on assets and manufacturing companies tend to have more assets than service companies, and manufacturing companies are also more likely to hire male CEOs than female CEOs.
Now, regression analysis tells us the effect of CEO gender ceteris paribus, which is an economics term for holding other variables constant. So when CEO gender changes from zero (man) to one (woman), what is the expected change in return on assets? Holding things constant means that you are comparing two cases that are exactly comparable on the other variables. If we have two companies of the same size and in the same industry, then the woman-led company is on average beta-1 percentage points more profitable. So the regression coefficient directly tells us the profitability difference. Whether it's 1, 2, or 3 percentage points, it's up to us to interpret whether that's a big effect or not. We know that 4.7 percentage points is a big difference; one point is probably not such a big difference.
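A minimal sketch of how such a regression could be run in Python with statsmodels, on entirely made-up data (the variable names, sample, and effect sizes are all invented for illustration; the coefficient on female_ceo plays the role of beta 1 here):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated data loosely mimicking the example: ROA in percentage points,
# a CEO gender dummy, firm size, and industry as controls.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "female_ceo": rng.integers(0, 2, n),
    "log_revenue": rng.normal(4.0, 1.0, n),
    "industry": rng.choice(["manufacturing", "services"], n),
})
df["roa"] = (5.0 + 1.5 * df["female_ceo"] - 0.8 * df["log_revenue"]
             + np.where(df["industry"] == "services", 2.0, 0.0)
             + rng.normal(0, 4, n))

# The coefficient on female_ceo estimates the ROA difference between woman-led
# and man-led firms of the same size and industry (ceteris paribus).
model = smf.ols("roa ~ female_ceo + log_revenue + C(industry)", data=df).fit()
print(model.params)
```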
Interpreting regression coefficients is relatively straightforward when the variables have meaningful units. ROA has a meaningful unit for managers: if we told a manager that my company's ROA is 20%, they would know that's pretty good for most industries. The CEO gender dummy also has a meaning for us: 1 is a woman, 0 is a man.
Sometimes we have units that don't really have any meaning, and that complicates the interpretation. So let's take a look at this question: does a one-unit increase in education pay off? We have a regression result stating that a one-unit increase in education leads to a one-unit increase in salary. Is that a big deal? We would need to know the unit of education and the unit of salary. Let's say that education is measured in years and salary in euros per year. So we say that a one-year increase in education leads to a one-euro increase in annual salary. Does it make a difference? I would think not, for most people: pretty much no one wants to go to school for one additional euro of income per year. So that effect is not meaningful.
How about a one-year increase leading to a 1000-euro increase in annual salary? That's a harder question. If we consider Finland, where annual salaries are in the tens of thousands of euros, maybe at the lower end, if you make 20 thousand per year, 1000 euros is 5%, so it may or may not be worth one year of education, depending on how much you like to go to school. On the other hand, if this data were from a developing country, where annual salaries are in the 1000-2000 euro ballpark, then a 1000-euro increase in annual salary is a big deal: in some cases you can basically double your income by going to one additional year of school, and that's a big thing for those people. So you have to think about what the units are, the unit of the independent variable and the unit of the dependent variable, and the context in which you're evaluating the effect.
What if we say that a one-year increase leads to a one-Bitcoin increase in annual salary? So we get one additional year of education and one Bitcoin per year more. That's more problematic, because people don't have an intuitive understanding of the value of Bitcoin. Obviously, when you tell somebody "I'll give you a Bitcoin", the first question they'll ask is: "What's the value of a Bitcoin in euros?" In this case, we can convert the value of Bitcoin to euros and express the regression coefficient in a way that's more understandable. Let's say that a one-year increase leads to a 3000-euro increase in annual salary; I don't know the current value of Bitcoin, but let's assume it's 3000 euros. Then we know that it's probably a big deal for some people. So sometimes we can convert the units to something that we can understand, even if the original unit was something that we don't understand easily.
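The arithmetic behind such a conversion is simple: multiplying the dependent variable by a constant multiplies the regression coefficient by the same constant. With the assumed rate of 3000 euros per Bitcoin,

$$\beta_{\text{EUR}} = 3000 \times \beta_{\text{BTC}},$$

so a coefficient of one Bitcoin per year of education becomes 3000 euros per year.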
What if we have a unit that cannot be converted? Let's say that our result is that a one-year increase leads to a one-Buckazoid increase in annual salary. Buckazoid is a fictional currency in a computer game, and I don't think anyone has ever developed an exchange rate from Buckazoids to euros. So we can't convert this effect into euros; what do we do? One way of dealing with the Buckazoid issue is to first find out the average salary in Buckazoids in this fictional universe and how much the salaries are dispersed. Saying "I'll give you ten Buckazoids" or "I'll give you a million Buckazoids" doesn't really mean anything unless we know the mean income. If we know that the mean income in that fictional world is ten Buckazoids, then a million Buckazoids is probably a lot. If we give somebody a million Buckazoids and the annual income is a billion Buckazoids, then it's not such a big deal.
To understand how a variable varies, we look at its mean and standard deviation. This is useful when variables don't have any naturally interpretable units: we look at how the variable is distributed. Let's assume that in our sample the income in Buckazoids is normally distributed. A normal distribution implies that one and two standard deviations from the mean have a special interpretation. In a normal distribution, 68% of observations fall within plus or minus one standard deviation of the mean. So if we say that our income is one standard deviation above the mean, we know that we are solidly in the high-income segment, well above the average. If our income in Buckazoids is two standard deviations above the mean, we know that we are in the top 2.5% of the income distribution. We can also see that the effect of a one-standard-deviation increase is generally pretty big: if you start solidly below the mean, one standard deviation takes you to the average, and two standard deviations make you pretty rich, in the top 2.5%.
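These percentages can be verified directly from the normal distribution; a quick check in Python, assuming scipy is available:

```python
from scipy.stats import norm

# Share of observations within one standard deviation of the mean (~68%)
print(norm.cdf(1) - norm.cdf(-1))  # 0.6827

# Share of observations more than two standard deviations above the mean
print(1 - norm.cdf(2))             # 0.0228, i.e. roughly the top 2.5%
```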
So standard deviation units can be useful for interpreting regression results. If we say that one additional year of education increases your income by one standard deviation in Buckazoid units, is that a large effect? For people it would be, but then we would have to think about the lifespan of these aliens: if they only live one year on average, a one-year investment in education is a huge deal for them. So we have to think about the context again.
Let's take a look at an empirical example: Deephouse's paper, Table 2, Model 2 of the regression results. We'll be interpreting these purely through standard deviations; the dependent variable, relative ROA, has a meaningful unit, but we'll ignore it for now. The regression coefficient was -0.02 for the effect of strategic deviation on relative return on assets. Is that a big effect? To understand it, we would need to know the unit of strategic deviation, but that's a completely made-up number, so it doesn't have a meaning by itself. What we need to know are the standard deviations of these variables: the standard deviation of ROA is 0.7, and the standard deviation of strategic deviation is 2.9. That tells us that, if the data are normally distributed, 95% of the ROA observations are within plus or minus 1.4 units, two standard deviations, of the mean. The difference between the top 2.5% and the bottom 2.5% is then 2.8 units, or four standard deviations.
So what is the effect of strategic deviation? A one-standard-deviation increase in strategic deviation is 2.932 multiplied by -0.020, which is about a -0.059 change in relative ROA. Then we compare: how large is this -0.059 relative to the 2.8 units? The full scale of ROA, from the worst 2.5% to the best 2.5%, is 2.8 units, and increasing your strategic deviation by one standard deviation changes your ROA by about -0.059. So it's a smallish effect.
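The same comparison as a small worked computation, using the numbers from the example above:

```python
sd_deviation = 2.932   # standard deviation of strategic deviation
beta = -0.020          # regression coefficient from Deephouse's Model 2
sd_roa = 0.7           # standard deviation of relative ROA

effect_one_sd = sd_deviation * beta  # about -0.059 change in relative ROA
roa_full_scale = 4 * sd_roa          # 2.8 units, from the bottom 2.5% to the top 2.5%

print(effect_one_sd, roa_full_scale)  # -0.0586 vs. 2.8: a smallish effect
```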
We can also understand effect size interpretation, and how it's reported, through this nice example about the sauna (a Finnish thing). When we ask whether the sauna is warm, a typical research paper would say that the temperature of the sauna is statistically significantly different from normal room temperature. That tells us that maybe the sauna is heating, maybe it's ready for going in, maybe it's too hot, maybe it was on the day before and is still cooling; it doesn't really tell us whether the sauna is warm. That's the equivalent of saying that the effect of strategic deviation on ROA is negative and statistically significantly different from zero. Statistical significance just tells us that there is some effect; it doesn't tell us whether the effect is large or not.

A slightly better answer is that the temperature of the sauna is currently 80 degrees; the comparable statement is that the effect of strategic deviation on ROA is -0.020. That is useful for people who understand what 80 degrees means and what -0.020 means. Most people who go to a sauna often know what 80 centigrade means, but you can't assume that the readers of your research study will understand your units, so you have to explain what they mean. A really good answer to whether the sauna is hot is to say that the temperature is currently 80 degrees, and that most people who go to the sauna regularly would say it's too hot, but they could still go in. That quantifies that the sauna is pretty hot, more so than just saying that it's 80 centigrade.

In the same way, you can say that the effect of strategic deviation on ROA is -0.020 per unit, and that the difference in strategic deviation between the least deviant and the most deviant firms, four standard deviations, is about 12 units. Going from the least deviant to the most deviant therefore changes ROA by about -0.23, while the same full scale for ROA is 2.8 units. We can see that 0.23 is pretty small compared to 2.8, so the effect is quite small: there are more effective ways to improve your profitability than changing how strategically deviant you are.
Let's take a look at yet another example, from Hekman's paper. The paper shows a regression table where the coefficient for the number of patients in a panel, that is, how many people go to see a doctor, is -0.04, and the coefficient for the age of the doctor is -0.13. Are these large effects or not? Normally we would look at the correlation table, the standard deviations, and the means to judge that. But this is actually not a normal case, because these are standardized regression coefficients. The authors don't report it, but you can see it: if you start to interpret the effect of the number of patients in the panel, which is in the thousands, against age, which is in the tens, the effect sizes don't make any sense at face value. Also, all these effects fall between -1 and 1, which is the typical range for standardized regression coefficients; they can go beyond that range, but they are typically plus or minus zero point something. So these are standardized coefficients, which means that the data have been standardized: every variable has been scaled to a standard deviation of 1 and a mean of 0 before estimation. In that case, we interpret the coefficients directly in standard deviations. Instead of saying that a one-unit increase in physician productivity is associated with a beta-1 increase in patient satisfaction, we say that a one-standard-deviation increase in physician productivity is associated with a beta-1 standard-deviation increase in satisfaction. So we interpret the coefficients directly as standard deviations.
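A minimal sketch of what standardization does to a regression slope, on simulated data (all numbers are invented; only the mechanics matter):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(50.34, 6.58, n)            # e.g. physician age, as in the example
y = 3.5 - 0.02 * x + rng.normal(0, 1, n)  # outcome with a small negative effect

b_raw = np.polyfit(x, y, 1)[0]            # slope in natural units

# Standardized coefficient: the same slope expressed in standard deviations
beta_std = b_raw * x.std() / y.std()

# Identical to regressing the z-scores of y on the z-scores of x
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
print(beta_std, np.polyfit(zx, zy, 1)[0])  # the two agree
```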
This looks like the way to do it always, so it would seem to simplify life to always use standardized estimates, but that's actually not the case. I recommend that you never standardize a variable that has a meaningful scale: if you have euros or years or something else that makes sense to people as a unit, don't standardize. The reason is that standardized estimates depend on how dispersed the variables are in your particular sample, because the standardization uses the sample standard deviation.
So let's say that here the standard deviation of age is 6.58 and the mean is 50.34, so the doctors are quite old. What would happen if the doctors in this sample were newly graduated, between 24 and 28 years old, so that the standard deviation of age were 1? The standardized regression coefficient for the exact same effect would be only about -0.02, which invites a very different interpretation than -0.13. It's about seven times smaller, yet it's the exact same effect, just scaled differently. This difference in scaling means that the two coefficients, -0.02 and -0.13, are not comparable. So standardization doesn't make your results comparable across samples. If you can interpret the results without standardization, it is always better to do so.
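A quick numeric check of that scaling argument, using the numbers above and assuming the outcome is kept in standardized units:

```python
beta_std_old = -0.13       # standardized coefficient when SD(age) = 6.58
sd_old, sd_new = 6.58, 1.0

b_raw = beta_std_old / sd_old   # raw effect per year of age
beta_std_new = b_raw * sd_new   # standardized coefficient if SD(age) were 1

print(b_raw, beta_std_new)      # drops to about -0.02: same effect, different scaling
```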
So, as a rule of thumb, use standardization only if none of your variables has a natural scale. Otherwise, interpret in standard deviation units only those variables for which a natural scale does not exist.