TU-L0022_aalto-CUR-166088-3086795: Non-linear effects with log transformation (13:17)


According to the course settings, the course ended on 29.03.2023.




This video explains how the log transformation can be used to model non-linear relationships between one dependent variable and one or more independent variables. It introduces the technique and walks through how to interpret the results.


Regression analysis tells us about the relationship between one dependent variable and one or more independent variables. One of the limitations of regression analysis is that it focuses on linear relationships only. However, many relationships in nature and social life are nonlinear. One very useful technique for dealing with that kind of relationship is the log transformation, or logarithm transformation if we write it out in long form.

What does log transformation do? Many papers contain statements like this: "We use the log of revenue since revenue for our firms is highly skewed." That's very common: the researchers say that something is skewed and take a log of it to make it more normal. That kind of statement has a couple of issues. But let's first look at what log transformation does to address skewness.

These are the revenues of the 500 largest Finnish companies in 2005. We have one very large company here, then some companies here, and most are here, around a few hundred million euros of revenue. We have a couple of billion-euro companies, and most companies are in the hundreds of millions range. This distribution is highly skewed, which means that there is this long tail here: most observations are clustered here, and then some go out into this long tail. This kind of skewed distribution is sometimes problematic, but we must understand that regression analysis, for example, makes no assumptions about how the observed variables are distributed. It makes some assumptions, but the distribution of the observed variables is not one of them.

If we take a logarithm of every revenue here, we get a distribution that looks like that. We get something that doesn't have as long a tail as before, so the observations are now more closely clustered around the mean. There is still some tail here, but it is not as severe. The units here are now logarithms. I'm using base 10 here for ease of exposition, but normally we use the natural logarithm; it doesn't really make a difference for your analysis. This is the 100 million threshold, this is the 1 billion threshold, and then we have the 10 billion and 100 billion thresholds here. We change the scale of the variable by taking a logarithm.
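The effect of the base-10 logarithm on a skewed variable can be sketched in a few lines of Python. The revenue figures below are hypothetical and only mimic the shape described in the video (they are not the Finnish top-500 data), and the `skewness` helper is an illustrative name using a simple moment-based formula.

```python
import math

# Hypothetical revenues in euros (NOT the actual Finnish top-500 data):
# most firms in the hundreds of millions, a few in the billions, one very large.
revenues = [1.2e8, 1.5e8, 2.0e8, 2.5e8, 3.0e8, 4.0e8, 6.0e8, 1.0e9, 3.0e9, 4.0e10]


def skewness(values):
    """Moment-based skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


log_revenues = [math.log10(r) for r in revenues]

raw_skew = skewness(revenues)
log_skew = skewness(log_revenues)
print(f"skewness of raw revenues:   {raw_skew:.2f}")
print(f"skewness of log10 revenues: {log_skew:.2f}")

# On the base-10 log scale, each tenfold threshold is exactly one unit apart.
thresholds = {
    "100 million": math.log10(1e8),
    "1 billion": math.log10(1e9),
    "10 billion": math.log10(1e10),
    "100 billion": math.log10(1e11),
}
for label, value in thresholds.items():
    print(f"{label}: log10 = {value:.0f}")
```

With these made-up numbers the log-transformed values are still right-skewed, only less so, which matches the point that the transformation reduces skewness without necessarily removing it.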

What does the logarithm transformation do? It changes the shape of the distribution: this is highly skewed, and this is still skewed but less so. In some cases it reduces the skewness of the data, but that's not the reason why we actually use it. We don't need our data to be normal. Instead, thinking in terms of relative units sometimes makes a lot more sense than thinking in terms of absolute units. Absolute units here mean that the difference between 0 and 1 billion is the same as the difference between 1 billion and 2 billion.

Let's think for a while. Does it make sense to say that when a company grows from 0 to 1 billion, it is the same kind of transformation for the company as when it grows from 1 billion to 2 billion? No, that doesn't make any sense. Also, companies rarely say that they grew by this many euros; instead, they say they grew by 10% or 15% compared to the previous year's revenue. Quite often we like to compare things in relative terms. You get your salary increases based on labor union negotiations, and they are hardly ever fixed euro amounts; they are 1% or 2%, something related to your current salary level. They are relative units. Here, relative units mean that the difference between 100 million and 1 billion is relatively the same as the difference between 1 billion and 10 billion. The space between two ticks doesn't refer to a unit increase; instead, it refers to a tenfold increase. Things increase relative to the previous level.
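A small Python sketch can make the difference between absolute and relative units concrete. The helper name `log10_gap` is made up for this illustration; the revenue levels are the thresholds mentioned above.

```python
import math


def log10_gap(start, end):
    """Distance between two values on the base-10 log scale."""
    return math.log10(end) - math.log10(start)


# In absolute units these two steps are very different (0.9 vs 9 billion euros)...
abs_step_small = 1e9 - 1e8
abs_step_large = 1e10 - 1e9

# ...but on the log scale both are one tick, that is, a tenfold increase.
gap_100m_to_1b = log10_gap(1e8, 1e9)
gap_1b_to_10b = log10_gap(1e9, 1e10)

# The same holds for percentage growth: 10% growth is the same log-scale step
# regardless of the starting level.
growth_small_firm = log10_gap(1e8, 1.1e8)
growth_large_firm = log10_gap(1e9, 1.1e9)

print(f"absolute steps: {abs_step_small:.1e} vs {abs_step_large:.1e}")
print(f"log10 steps:    {gap_100m_to_1b:.3f} vs {gap_1b_to_10b:.3f}")
print(f"10% growth on the log10 scale: {growth_small_firm:.4f} vs {growth_large_firm:.4f}")
```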

Let's take a look at what it means to run a regression analysis with log transformation, and why we would want to do that. Transforming the variables to be less skewed is not the right reason to use log transformation. If you want to reduce skewness you can, of course, use log transformation, but you have to understand that there are other, more important reasons to use it, and that it also influences how you interpret your results. This example uses the Prestige data set; these are occupations from the Canadian census of 1970-something. We have the prestige score of an occupation and the average income of an occupation. We're interested in learning how much income depends on prestige. We can see that there is an effect here: prestige goes from 20 to 80, and income first increases and then starts to increase in a nonlinear fashion. If we were to draw a curve, it would first go fairly flat and then curve up a bit. The line here is not the best description of the data. We can see that these observations are below the regression line, and these are above the regression line. Instead of fitting a line, fitting some kind of curve that bends up would be better, something like that.

Instead of saying that these observations are characterized by a line, we say that they are characterized by this blue curve here. That is what the log transformation does for us, and it's the important reason why we use it. Instead of saying that income increases by a constant amount for each unit of prestige, we say that income increases relative to its current level as a function of prestige. Let's take a log transformation of income and run a regression analysis.

Here's my regression analysis, done with R using this data. This is the model for income. We can see that a one-unit increase in prestige leads to 176 Canadian dollars more per year. When we use the log of income, the log of income increases by 0.03 for every additional unit of prestige. We can see that the log model has a slightly higher R-squared, and also a slightly higher adjusted R-squared, than the income model. Based on that metric, we can make an informed judgment that this could be a better model. It's not certain that a higher R-squared means a better model, but it could. How we judge models will come up in later videos.
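The two regressions can be imitated with a minimal Python sketch. The analysis in the video was done in R on the Prestige data; the code below is not that analysis. It fits a one-predictor least-squares line with a hand-written `ols` helper (an illustrative name) to hypothetical, noise-free data in which income grows by a constant relative amount per unit of prestige.

```python
import math


def ols(x, y):
    """One-predictor least squares; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical, noise-free data (NOT the Prestige data set): the generating
# values 7.46 and 0.025 are assumptions chosen only for illustration.
prestige = list(range(20, 81, 5))
income = [math.exp(7.46 + 0.025 * p) for p in prestige]

# Model 1: income ~ prestige (slope is in currency units per prestige point).
linear_intercept, linear_slope = ols(prestige, income)

# Model 2: log(income) ~ prestige (slope is in log units per prestige point).
log_intercept, log_slope = ols(prestige, [math.log(v) for v in income])

print(f"income ~ prestige:      slope = {linear_slope:.1f}")
print(f"log(income) ~ prestige: slope = {log_slope:.3f}")
```

Because the made-up data follow an exact exponential curve, the log model recovers the generating slope exactly, while the linear model can only report one average increase per unit of prestige.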

How do we interpret this? What does this 0.03 increase in the log of income mean? For most people, the metric of a log of income doesn't have any meaning. If someone tells me that the logarithm of my income will increase by 0.01, I know what it means because I've done this and I've read my statistics books; most people don't. There are two ways of interpreting log transformation results. One is the general way of interpreting any nonlinear effect, and that is plotting. Here are the regression results for the log transformation model. What we do is calculate the fitted values of the logarithm of income based on prestige. This is simply applying the formula: the intercept 7.46 plus 0.025 times 20. That gives us the fitted log income, and the hat here denotes that this is a fitted value from the regression analysis. Then we take the exponentials of these fitted values. When you take a logarithm of a number, you get another number. When you apply the exponential to that other number, you get back your original number. We say that the exponential is the inverse function of the logarithm, and the logarithm is the inverse function of the exponential, because we can apply one to get back the original number that was used as the input for the other. The exponential transformation allows us to undo the log transformation, and we get these predicted incomes for each prestige level.
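The fitted-value calculation and the back-transformation can be sketched as follows. The intercept 7.46 and the prestige coefficient 0.025 are the rounded values quoted in this transcript; the sketch assumes the model used the natural logarithm, and the function names are illustrative.

```python
import math

INTERCEPT = 7.46  # intercept as quoted in the transcript (rounded)
SLOPE = 0.025     # prestige coefficient as quoted in the transcript (rounded)


def fitted_log_income(prestige):
    """Fitted value of log(income) from the log-transformed regression."""
    return INTERCEPT + SLOPE * prestige


def predicted_income(prestige):
    """Undo the natural-log transformation with the exponential function."""
    return math.exp(fitted_log_income(prestige))


# Predicted incomes over the prestige range discussed in the video.
predictions = {p: predicted_income(p) for p in (20, 40, 60, 80)}
for p, income in predictions.items():
    print(f"prestige {p}: fitted log income {fitted_log_income(p):.2f}, "
          f"predicted income {income:,.0f}")
```

Plotting `predicted_income` against prestige gives an upward-bending curve like the one described above: equal steps in prestige multiply the predicted income by the same factor instead of adding the same amount.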

Then we plot the data: we plot these exponentiated predicted logs of income as a function of prestige, and we get this curve. Whenever you don't know how to interpret a particular regression estimate that has been calculated based on some transformation, one very good way is to plot the effect. You can also plot the linear model's effect and then compare which one looks more reasonable. Here the blue curve, the log-transformed result, looks like a much more reasonable explanation for the data than the red line. That is the general way that you can interpret any nonlinear effect. This kind of plot, where you draw the predicted line, is called a marginal prediction plot. We will cover this later in the course.

Another way of interpreting regression results after log transformation is to interpret them directly. Log transformation is a special case among transformations because it has a natural interpretation. These interpretations are given in Wooldridge's book here. When we take the log of the dependent variable, each of the regression coefficients, here only the one for prestige, changes its meaning. The effect of a unit increase in prestige is now a relative increase. Beta1 of prestige here doesn't tell us what the effect of a one-unit increase in prestige is on income in absolute terms. Instead, it tells us the effect of a one-unit increase in prestige on relative income. If the regression coefficient of prestige is 0.025, like it is here, then a one-unit increase in prestige leads to approximately a 2.5% increase in salary compared to the current salary level. It's an exponential growth model; that's why we use the exponential function. Every time the prestige of your occupation increases by one, your salary goes up 2.5% compared to the previous level. Calculating how much, for example, ten units would mean is a bit more difficult, because we have to take the 2.5% and apply it ten times. So it's 1.025 to the power of 10, which is about 1.28, and that gives you the effect of a 10-unit increase in prestige: roughly a 28% increase. In practice, your statistical software will do the calculations of the marginal effect for you.
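The compounding arithmetic can be checked with a short Python sketch. It uses the coefficient 0.025 quoted above and assumes a natural-log model. Applying a 2.5% increase ten times means multiplying by 1.025 ten times; the exact model-implied effect uses the exponential of the coefficient, so the two figures differ slightly.

```python
import math

BETA = 0.025  # prestige coefficient quoted in the transcript (rounded)

# One-unit increase: the coefficient read directly as a percentage is an
# approximation of the exact relative change exp(BETA) - 1.
one_unit_approx = BETA
one_unit_exact = math.exp(BETA) - 1

# Ten-unit increase: compound the 2.5% step ten times...
ten_units_compounded = 1.025 ** 10 - 1
# ...or use the exact relative change implied by the log model.
ten_units_exact = math.exp(BETA * 10) - 1

print(f"one unit:  approx {one_unit_approx:.2%}, exact {one_unit_exact:.2%}")
print(f"ten units: compounded {ten_units_compounded:.2%}, exact {ten_units_exact:.2%}")
```

Both routes give an increase of roughly 28% for ten units of prestige, noticeably more than the 25% a naive ten-times-2.5% calculation would suggest.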

Doing a plot like that simplifies the interpretation, because you can see directly what the effect of moving from a prestige of 40 to a prestige of 60 is by reading it off the curve. Also, the software will give you the numbers behind these plots. That's how you calculate marginal effects. The actual calculation is covered in a different video.


TU-L0022 - Statistical Research Methods D, Lecture, 25.10.2022-29.3.2023
