Topic outline

  • To complete this unit you need to

    1. Watch the video lectures
    2. Read the materials for written assignment 2 (optional)
    3. Complete the unit 3 discussion forum task
    4. Submit written assignment 2 (optional)
    5. View the model answer and instructor's comments for written assignment 2 (optional)
    6. Participate in the seminar (mandatory)
    7. Participate in the computer class (optional)
    8. Submit data analysis assignment 1 (mandatory)
    9. Complete your caption assignments for unit 3 and all earlier units
    10. Submit reflection and feedback for unit 3

    The unit discusses assumptions and principles behind regression analysis. After this unit, you should

    1. Understand that all statistical techniques make assumptions, some of which are empirically testable and others not, and that some assumptions are more important than others.
    2. Understand how a log transformation can be applied to model non-linear, relative effects.
    3. Have a basic understanding of the concept of endogeneity and why it is a serious challenge for non-experimental research.
    4. Have a basic understanding of how and why regression results can be visualized using marginal prediction plots.
    5. Understand the relationship between a linear model and its correlation matrix, and why this relationship is useful when learning about linear models such as regression.

    • Choice icon

      Please indicate your preferred participation mode for this seminar. The in-person seminar will be organized only if at least two people sign up.

    • Forum icon
    • Video lectures

    • Topic 1: Revisiting unit 2 concepts

    • H5P icon
    • Topic 2: More on the use of regression analysis

    • H5P icon
    • H5P icon
    • Topic 3: Statistical tests after regression

    • H5P icon
    • H5P icon
    • H5P icon
    • H5P icon
    • Topic 4: Model implied correlation matrix and misunderstandings of regression

    • H5P icon
    • Because your group is "Reader of quantitative research", a video about model implied covariance matrix will not be shown.

    • H5P icon

      How to calculate a covariance matrix. This rule is useful because it allows us to see that the variance of Y is a sum of all these different sources of variation.

      Note: This video contains errors and will be re-recorded.

      Click to view transcript

      In this video, I will extend the previous video's principle to covariance matrices. A correlation matrix is a special case of a covariance matrix that has been scaled so that the variance of each variable is 1, so a correlation matrix is kind of like a standardized version of a covariance matrix. Some features of linear models are better understood in the covariance metric, so knowing the same set of rules in covariance form is useful. Let's take a look at the covariance between X1 and Y. We calculate the covariance of X1 and Y the same way as we calculated the correlation, except that we take the unstandardized regression coefficients here: previously we were working with standardized regression coefficients, but these are now unstandardized because we are working on the raw metric instead of the correlation metric. So we have the X1-to-Y path: beta 1 goes here.

      Then another way from X1 to Y is to first travel one covariance, from X1 to X2, and then take the regression path, so we get that term. Then from X1 to X3: one covariance, and then to Y. We sum those together, and that gives us the covariance between X1 and Y. It is the same math that we had in the correlation example, but instead of working with correlations, we work with covariances. Things get more interesting when we look at the variance of Y, which is given by the equation here. The idea is that we go from Y to each source of variance of Y and then come back. So we go from Y to X1, take the variance of X1, and come back: that is the variance of X1 times beta 1 squared. In the correlation metric we just took beta 1 times beta 1, beta 1 squared, because the variance in a correlation matrix is one, so we just ignored that.

      When we go from Y to X1, over the covariance to X2, and back through beta 2, we get that term here, and we count it both ways. This rule is useful because it allows us to see that the variance of Y is a sum of all these different sources of variation: we get variation due to X1, covariance due to X1 and X2, and variation due to the error term. So the variance of Y is the sum of all the variances and covariances of the explanatory variables, plus the variance of U, the error term, which is uncorrelated with all the explanatory variables. This covariance form of the model-implied correlation matrix rule is useful when you start working with more complicated models, such as confirmatory factor analysis models.
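      The path rules described in the transcript can be checked numerically. Below is a small sketch (not part of the course materials; all numbers are made-up example values): cov(X1, Y) is the direct path beta 1 times var(X1), plus one covariance "travel" to each other predictor followed by its path, and var(Y) is the coefficient-weighted sum of all variances and covariances of the predictors plus the error variance.

```python
import numpy as np

# Model: Y = b1*X1 + b2*X2 + b3*X3 + U, with U uncorrelated with the X's.
# Example (made-up) covariance matrix of (X1, X2, X3), coefficients, and
# error variance.
S_x = np.array([[2.0, 0.5, 0.3],
                [0.5, 1.5, 0.4],
                [0.3, 0.4, 1.0]])
b = np.array([0.7, -0.2, 0.4])
var_u = 0.6

# Path rule for cov(X1, Y): direct path b1*var(X1), plus one covariance
# "travel" from X1 to each other predictor followed by that predictor's path.
cov_x1_y = b[0]*S_x[0, 0] + b[1]*S_x[0, 1] + b[2]*S_x[0, 2]

# Path rule for var(Y): go from Y to every source of variation and back,
# i.e. sum over i, j of b_i * b_j * cov(Xi, Xj), plus the error variance.
var_y = b @ S_x @ b + var_u

# Cross-check by simulation: generate data consistent with S_x and the model.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(S_x)                      # so (Z @ L.T) has cov S_x
X = rng.standard_normal((1_000_000, 3)) @ L.T
Y = X @ b + np.sqrt(var_u)*rng.standard_normal(1_000_000)

print(cov_x1_y, np.cov(X[:, 0], Y)[0, 1])       # path rule vs simulated
print(var_y, Y.var(ddof=1))                     # path rule vs simulated
```

      With enough simulated observations, the sample covariance of X1 and Y and the sample variance of Y should agree closely with the path-rule values, which is the point of the rule: the model's coefficients and the covariances among the predictors fully determine the model-implied covariance matrix.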


    • H5P icon
    • H5P icon
    • Topic 5: Regression assumptions and diagnostics

    • H5P icon
    • H5P icon
    • H5P icon
    • H5P icon
    • H5P icon
    • File icon
      Correct answers for Regression diagnostics and analysis workflow video tasks File PDF
    • H5P icon
    • URL icon

      Check that you have completed your captioning assignments for this unit and all previous units. This item will be marked as completed once the course staff have reviewed your captions.

    • Assignments and model answers

      Model answers are shown only to students whose assignments have been graded.

    • Turnitin Assignment 2 icon
    • Reflection and feedback Unit 3

      Reflection is a key element of learning. The end of the unit is a good time to look back at what you have learned, where you did well, and what you can still improve on. After you have completed all parts of the unit and received grades and feedback for all your submitted work, fill in the short feedback form below.

    • Materials

    • Additional resources