Course: TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022, Topic: Unit 3: Assumptions and diagnostics in linear regression models

To complete this unit you need to

Watch the video lectures
Read the materials for the written assignment 2 (optional)
Complete the unit 3 discussion forum task
Return written assignment 2 (optional)
View the model answer for written assignment 2 and the instructor's comments on written assignment 2 (optional)
Participate in the seminar (mandatory)
Participate in the computer class (optional)
Submit data analysis assignment 1 (mandatory)
Complete your caption assignments for unit 3 and all earlier units
Submit reflection and feedback for unit 3

The unit discusses assumptions and principles behind regression analysis. After this unit, you should

Understand that all statistical techniques make assumptions, of which some are empirically testable and others are not, and that some assumptions are more important than others.
Understand how log transformation can be applied to model non-linear, relative effects.
Have a basic understanding of the concept of endogeneity and why it is a serious challenge for non-experimental research.
Have a basic understanding of how and why regression results can be visualized using marginal prediction plots.
Understand the relationship between a linear model and a correlation matrix, and why this relationship is very useful when learning about linear models such as regression.

Choose online or in-person Unit 3 seminar participation Choice

Students must

Make a choice

Please indicate your preferred mode of participation in this seminar. The in-person seminar will be organized if at least two people sign up for it.
Unit 3 discussion forum Forum

Students must

Post messages: 1
Video lectures

Topic 1: Revisiting unit 2 concepts

NHST problems and controversies (16:18) H5P

Students must

Receive a grade
Topic 2: More on the use of regression analysis

Non-linear effects with log transformation (13:17) H5P

Students must

Receive a grade

Categorical independent variables (5:15) H5P

Students must

Receive a grade
Topic 3: Statistical tests after regression

Degrees of freedom (2:31) H5P

Students must

Receive a grade

Basic statistical tests (9:48) H5P

Students must

Receive a grade

Testing linear hypotheses after regression (8:57) H5P

Students must

Receive a grade

Model comparisons (7:56) H5P

Students must

Receive a grade
Topic 4: Model implied correlation matrix and misunderstandings of regression

Linear model implies a correlation matrix (17:34) H5P

Students must

Receive a grade

Because your group is "Reader of quantitative research", a video about the model-implied covariance matrix will not be shown.

Linear model implies a covariance matrix (3:17) H5P

Students must

Receive a grade

How to calculate covariance matrices. This rule is useful because it shows that the variance of Y is a sum of all these different sources of variation.
Note: This video contains errors and will be re-recorded.

Transcript

In this video, I will expand the previous video's principle to covariance matrices. A correlation matrix is a special case of a covariance matrix that has been scaled so that the variance of each variable is 1. So a correlation matrix is kind of like a standardized version of a covariance matrix. Some features of linear models are better understood in the covariance metric, so understanding the same set of rules in covariance form is useful. Let's take a look at the covariance between X1 and Y. We calculate the covariance of X1 and Y the same way as we calculated the correlation. So we take the unstandardized regression coefficients here. Previously we were working with standardized regression coefficients; these are now unstandardized because we are working in the raw metric instead of the correlation metric. So we have the X1 to Y path, and beta1 goes here.
Then another way from X1 to Y is to travel one covariance, X1 to X2, and then a regression path, so we get that. And then X1 to X3, one covariance, and then to Y, so that's all. We sum those together. That gives us the covariance between X1 and Y, and that's the same math that we had in the correlation example, but instead of working with correlations, we work with covariances. Things get more interesting when we look at the variance of Y. The variance of Y is given by the equation here. The idea is that we go from Y to each source of variance of Y and then we come back. So we go from Y to X1, we take the variance of X1, and then we come back. So it's the variance of X1 times beta1 squared. In the correlation metric we just take beta1 squared, because the variance in a correlation matrix is one, so we just ignore that.
When we go from Y to X1, then across the covariance to X2, and back through beta2, we get that term here, and we travel it both ways. So this is a useful rule because it allows us to see that the variance of Y is a sum of all these different sources of variation: we get variation due to each X, covariation due to X1 and X2, and variation due to the error term. So the variance of Y is the sum of all these variances and covariances of the explanatory variables, plus the variance of u, the error term that is uncorrelated with all the explanatory variables. This covariance form of the model-implied correlation matrix rule is useful when you start working on more complicated models, such as confirmatory factor analysis models.
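The tracing rule described in the transcript can be checked numerically. The sketch below is not taken from the lecture (which is flagged above as containing errors); it uses standard covariance algebra for the model Y = beta1*X1 + beta2*X2 + beta3*X3 + u, with u uncorrelated with every X. Under that assumption, cov(X1, Y) = beta1*var(X1) + beta2*cov(X1, X2) + beta3*cov(X1, X3), so in the covariance metric the direct path also carries var(X1), and var(Y) is the sum of beta_j*beta_k*cov(X_j, X_k) over all pairs j, k plus var(u). The covariance matrix, coefficients, and error variance are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical population values, chosen only for illustration.
sigma_x = np.array([
    [4.0, 1.0, 0.5],   # var(X1), cov(X1,X2), cov(X1,X3)
    [1.0, 9.0, 2.0],   # cov(X2,X1), var(X2), cov(X2,X3)
    [0.5, 2.0, 1.0],   # cov(X3,X1), cov(X3,X2), var(X3)
])
beta = np.array([0.5, -0.2, 1.0])  # unstandardized regression coefficients
var_u = 2.0                        # variance of the error term u

# Matrix form of the model-implied moments.
cov_xy = sigma_x @ beta                 # cov(X_j, Y) for j = 1, 2, 3
var_y = beta @ sigma_x @ beta + var_u   # var(Y)

# Tracing rule for cov(X1, Y): direct path plus one path through each covariance.
cov_x1_y_traced = (
    beta[0] * sigma_x[0, 0]    # X1 -> Y, weighted by var(X1)
    + beta[1] * sigma_x[0, 1]  # X1 <-> X2 -> Y
    + beta[2] * sigma_x[0, 2]  # X1 <-> X3 -> Y
)

# Tracing rule for var(Y): leave Y along beta_j, cross cov(X_j, X_k), return along beta_k.
var_y_traced = var_u
for j in range(3):
    for k in range(3):
        var_y_traced += beta[j] * beta[k] * sigma_x[j, k]

# Simulation check: sample moments should be close to the implied ones.
rng = np.random.default_rng(0)
n = 200_000
x = rng.multivariate_normal(np.zeros(3), sigma_x, size=n)
u = rng.normal(0.0, np.sqrt(var_u), size=n)
y = x @ beta + u
sample_cov = np.cov(np.column_stack([x, y]), rowvar=False)

print("implied cov(X, Y):", cov_xy)
print("sample  cov(X, Y):", sample_cov[:3, 3])
print("implied var(Y):", var_y, " sample var(Y):", sample_cov[3, 3])
```

Setting every diagonal entry of `sigma_x` to 1 turns the same calculation into the correlation-metric rule from the previous video, provided `beta` then holds standardized coefficients.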
Suppression in regression (7:10) H5P

Students must

Receive a grade

Multicollinearity (19:37) H5P

Students must

Receive a grade
Topic 5: Regression assumptions and diagnostics

Overview of the OLS assumptions (17:02) H5P

Students must

Receive a grade

Perfect collinearity of independent variables (3:31) H5P

Students must

Receive a grade

Endogeneity and endogenous independent variables (10:56) H5P

Students must

Receive a grade

Heteroskedasticity of error term (8:32) H5P

Students must

Receive a grade

Outliers (5:55) H5P

Students must

Receive a grade

Regression diagnostics and analysis workflow (17:47) H5P

Students must

Receive a grade
Available only when: You achieve the required grade in the activity Regression diagnostics and analysis workflow (17:47)

Correct answers for Regression diagnostics and analysis workflow video tasks File PDF

Added variable plot or partial regression plot (8:49) H5P

Students must

Receive a grade
Caption assignments for Unit 3 URL

Students must

Mark as done

Check that you have completed your captioning assignments for this unit and all previous units. The course staff will mark this item as completed once they have reviewed your captions.
Assignments and model answers
Model answers are shown only to students whose assignments have been graded.

Written assignment 2 (optional) Turnitin Assignment 2

Students must

Open

Receive a grade

Data analysis assignment 1 (mandatory) Turnitin Assignment 2

Students must

Receive a grade
Screencasts for data analysis assignment 1 Page

Description of the Prestige dataset File PDF
Reflection and feedback Unit 3
Reflection is a key element of learning. The end of the unit is a good time to look back at what you have learned, where you did well, and what you can still improve on. After you have completed all parts of the unit and received grades and feedback for all your submitted work, fill in the short feedback form below.
Materials

Unit 3 slides File PPTX

Flinga board 1: Unit overview URL

Flinga board 2: Correlations and descriptive statistics URL

Additional resources

Simulation demonstration of regression assumptions File ZIP

Screencasts for simulation demonstration of regression assumptions Page
