TU-L0022 - Statistical Research Methods D, Lecture, 25.10.2022-29.3.2023
According to the course settings, the course ended on 29.03.2023.
Section description
-
To complete this unit you need to
- Watch the video lectures
- Read the materials for the written assignment 2 (optional)
- Complete the unit 3 discussion forum task
- Return written assignment 2 (optional)
- View the model answer for written assignment 2 and the instructor's comments on written assignment 2 (optional)
- Participate in the seminar (mandatory)
- Participate in the computer class (optional)
- Submit data analysis assignment 1 (mandatory)
- Complete your caption assignments for unit 3 and all earlier units
- Submit reflection and feedback for unit 3
The unit discusses assumptions and principles behind regression analysis. After this unit, you should
- Understand that all statistical techniques make assumptions, of which some are empirically testable and others are not, and that some assumptions are more important than others.
- Understand how log transformation can be applied to model non-linear, relative effects.
- Have a basic understanding of the concept of endogeneity and why it is a serious challenge for non-experimental research.
- Have a basic understanding of how and why regression results can be visualized using marginal prediction plots.
- Understand the relationship between a linear model and a correlation matrix, and why this relationship is very useful when learning about linear models such as regression.
-
How to calculate covariance matrices. This rule is useful because it allows us to see that the variance of Y is a sum of all these different sources of variation.
Note: This video contains errors and will be re-recorded.
Transcript
In this video, I will extend the principle from the previous video to covariance matrices. A correlation matrix is a special case of a covariance matrix that has been scaled so that the variance of each variable is 1. So a correlation matrix is kind of like a standardized version of a covariance matrix. Some features of linear models are better understood in the covariance metric, so understanding the same set of rules in covariance form is useful. Let's take a look at the covariance between X1 and Y. We calculate the covariance of X1 and Y the same way as we calculated the correlation. So we take the unstandardized regression coefficients here. Previously we were working with standardized regression coefficients; these are now unstandardized because we are working in the raw metric instead of the correlation metric. So we have the path from X1 to Y, and beta1 goes here.
Then another way from X1 to Y is to travel one covariance, from X1 to X2, and then the regression path from X2 to Y, so that's a covariance and then a regression path. Then we go from X1 to X3, one covariance, and then to Y, and that's all. We sum those together. That gives us the covariance between X1 and Y, and that's the same math that we had in the correlation example, but instead of working with correlations, we work with covariances. Things get more interesting when we look at the variance of Y. The variance of Y is given by that equation here. The idea is that we go from Y to each source of variance of Y and then we come back. So we go from Y to X1, we take the variance of X1, and then we come back. So it's the variance of X1 times beta1 squared. In the correlation metric we just take beta1 times beta1, that is beta1 squared, because the variances in a correlation matrix are one, so we just ignore that term.
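The equations shown in the video are not reproduced in the transcript. As a sketch of what the tracing rules give, assume a model with three explanatory variables and an error term u that is uncorrelated with all of them (the intercept \beta_0 and the exact notation are assumptions, not taken from the slides). In the covariance metric the direct path from X1 to Y is weighted by the variance of X1, which equals one in the correlation metric:

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + u,
  \qquad \operatorname{Cov}(X_j, u) = 0 \\[4pt]
\operatorname{Cov}(X_1, Y) &= \beta_1 \operatorname{Var}(X_1)
  + \beta_2 \operatorname{Cov}(X_1, X_2)
  + \beta_3 \operatorname{Cov}(X_1, X_3) \\[4pt]
\operatorname{Var}(Y) &= \sum_{j=1}^{3} \beta_j^{2} \operatorname{Var}(X_j)
  + 2 \sum_{j<k} \beta_j \beta_k \operatorname{Cov}(X_j, X_k)
  + \operatorname{Var}(u)
\end{align*}
```

The factor 2 in the variance of Y reflects that each path through a covariance between two explanatory variables can be travelled in both directions.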
When we go from Y to X1, across the covariance to X2, and back through beta2, we get that term here, and we go both ways. This is a useful rule because it allows us to see that the variance of Y is a sum of all these different sources of variation: we get variation due to each X, covariation due to X1 and X2, and variation due to the error term. So the variance of Y is the sum of all these variances and covariances of the explanatory variables, plus the variance of u, the error term, which is uncorrelated with all the explanatory variables. This covariance form of the model-implied correlation matrix rule is useful when you start working with more complicated models, such as confirmatory factor analysis models.
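The decomposition described in the transcript can be checked numerically. The following is a minimal sketch in Python with NumPy, not the course's own material (the course uses Stata); the coefficients, the covariance matrix of the explanatory variables, and the error variance are hypothetical values chosen only for illustration. It computes the model-implied covariances and the variance of Y from the rules, then compares them with a large simulated sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population values, chosen only for illustration.
beta = np.array([0.5, -0.3, 0.8])        # unstandardized slopes for X1, X2, X3
sigma_x = np.array([[2.0, 0.6, 0.3],
                    [0.6, 1.5, 0.4],
                    [0.3, 0.4, 1.0]])    # covariance matrix of X1, X2, X3
var_u = 0.7                              # variance of the error term u

# Model-implied moments in matrix form.
cov_xy = sigma_x @ beta                  # Cov(Xj, Y) for j = 1, 2, 3
var_y = beta @ sigma_x @ beta + var_u    # Var(Y)

# The same Var(Y), written out source by source as in the transcript.
variance_terms = sum(beta[j] ** 2 * sigma_x[j, j] for j in range(3))
covariance_terms = sum(2 * beta[j] * beta[k] * sigma_x[j, k]
                       for j in range(3) for k in range(j + 1, 3))
var_y_by_terms = variance_terms + covariance_terms + var_u

# Simulate data from the model; u is drawn independently of the X variables.
n = 200_000
x = rng.multivariate_normal(np.zeros(3), sigma_x, size=n)
u = rng.normal(0.0, np.sqrt(var_u), size=n)
y = x @ beta + u

# Sample covariance matrix of (X1, X2, X3, Y); the last row/column belongs to Y.
sample_cov = np.cov(np.column_stack([x, y]), rowvar=False)

print("implied Cov(Xj, Y):", cov_xy)
print("sample  Cov(Xj, Y):", sample_cov[:3, 3])
print("implied Var(Y):", var_y, "term by term:", var_y_by_terms)
print("sample  Var(Y):", sample_cov[3, 3])
```

The matrix expression and the term-by-term sum agree exactly, and the simulated sample moments should match them up to sampling error. Setting all diagonal entries of `sigma_x` to 1 gives the correlation-metric version of the same rules.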
-
Correct answers for Regression diagnostics and analysis workflow video tasks (PDF file)
-
Model answer for data analysis assignment 1 (Stata) (file)