TU-L0022_aalto-CUR-141790-3063741: Principal component analysis (3:52)

Principal component analysis is related to factor analysis and is commonly confused with it. In this video, principal component analysis is contrasted with factor analysis, and its usefulness in social science is discussed.

Principal component analysis is a statistical technique that is related to factor analysis and commonly confused with it. What principal component analysis does is summarize the variables into a smaller set of weighted sums of the variables, called components. It's a data reduction technique, concerned with how we can reduce the number of variables while losing as little information from the data as possible. It doesn't answer the question of what the indicators have in common, at least not directly.
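To make the "weighted sums" idea concrete, here is a minimal sketch in Python with NumPy. The data are simulated and every number in it (sample size, the 0.7 weight, the seed) is an assumption made for illustration, not something from the lecture. It computes the components from the eigenvectors of the correlation matrix, which is one common way to do principal component analysis.

```python
import numpy as np

# Hypothetical data: 500 observations of 4 indicators that share one
# common source plus indicator-specific noise (all values are made up).
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
X = 0.7 * common + np.sqrt(1 - 0.7**2) * rng.normal(size=(500, 4))

# Standardize the indicators, then eigen-decompose their correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]         # reorder so the largest comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each component is a weighted sum of the standardized indicators;
# the weights are the columns of eigvecs.
components = Z @ eigvecs

# Share of the total variance carried by each component.
explained = eigvals / eigvals.sum()
print(np.round(explained, 3))
```

If all four components are kept, they carry all the variance of the four indicators, so nothing is lost. The data reduction comes from keeping only the first few components and accepting the loss of the variance in the rest.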

It's not a very useful technique for assessing measurement models, because principal component analysis considers all the variance in the data, whereas factor analysis considers only the common variance. What that means is that principal component analysis also tries to explain the unreliability of the indicators. In factor analysis we take the unreliability and other unique aspects of the indicators and eliminate them, so that we can extract what is common to the indicators. In practice, using a factor loading as an estimate of indicator reliability is OK under some assumptions. If you use a component loading as an estimate of individual indicator reliability, the reliability is severely overestimated.
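The overestimation can be illustrated with a small sketch. It assumes a population in which four standardized indicators each load 0.7 on a single factor, so the true reliability of each indicator (the squared loading) is 0.49. Principal axis factoring stands in for factor analysis here; that estimator and the helper name `first_loadings` are choices made for this example only.

```python
import numpy as np

# Assumed population model: 4 indicators, each loading 0.7 on one factor,
# so every indicator's true reliability is 0.7**2 = 0.49.
true_loading = 0.7
lam = np.full(4, true_loading)
R = np.outer(lam, lam)
np.fill_diagonal(R, 1.0)                  # implied correlation matrix

def first_loadings(matrix):
    """Loadings on the first dimension: eigenvector * sqrt(eigenvalue)."""
    vals, vecs = np.linalg.eigh(matrix)   # ascending eigenvalues
    v = vecs[:, -1] * np.sqrt(vals[-1])
    return v if v.sum() >= 0 else -v      # fix the arbitrary sign

# PCA: decompose R with ones on the diagonal (all variance).
pca_loadings = first_loadings(R)

# Principal axis factoring: replace the diagonal with communality
# estimates (common variance only) and iterate until they settle.
communalities = 1 - 1 / np.diag(np.linalg.inv(R))  # squared multiple correlations
for _ in range(100):
    reduced = R.copy()
    np.fill_diagonal(reduced, communalities)
    paf_loadings = first_loadings(reduced)
    communalities = paf_loadings**2

print("PCA reliability estimate:", np.round(pca_loadings**2, 3))
print("PAF reliability estimate:", np.round(paf_loadings**2, 3))
```

Under these assumptions the squared component loadings come out at about 0.62, while the squared factor loadings recover the true value of 0.49. The only difference between the two calculations is what sits on the diagonal of the matrix being decomposed: all of the variance, or just the common part.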

The same thing happens if you apply the so-called Harman's single factor test, which assesses whether one factor can explain the intercorrelations in the data; if it can, that would be evidence of a common method problem. Applying a component analysis instead of a factor analysis will practically never indicate that you have a common method variance problem, even if you actually do.

So this is not a substitute for factor analysis. It's not a factor analysis technique; it's a data summary technique instead, and it's not very useful when we work with measurement. So why do people use principal component analysis? The reason is that when you use SPSS and do a factor analysis from the menu, you get a dialogue that looks like this. Then when you click the factor extraction button here, it gives you different factor analysis techniques, so it can estimate the factor model in different ways. The default is to do a principal component analysis, and that's not a factor analysis technique. Then there are the others: whether you use principal axis factoring, maximum likelihood, or minimum residual doesn't matter, because they all estimate the factor analysis model. Principal component analysis is not a factor analysis model, because it doesn't discover underlying dimensions; instead it summarizes the data.

There are really no good reasons to use principal component analysis in social science research, because factor analysis can also be used to summarize data. If you just want to summarize your indicators with a smaller number of weighted sums, then factor analysis and principal component analysis will give you pretty similar solutions. If you want to assess whether an underlying dimension explains the data, then factor analysis will give you the correct solution under certain assumptions; principal component analysis will not. So it's a good rule never to use principal component analysis in your own research. And if you see someone using principal component analysis, or using SPSS without reporting which factor analysis technique they applied, then it's a good idea to question the authors' choices.
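On the point that the two approaches give similar summaries, here is a rough sketch under stated assumptions: a population with one factor and made-up loadings of 0.8, 0.7, 0.6 and 0.5, with the factor loadings taken as known rather than estimated. It compares the first principal component with a regression-method factor score, which is one of several ways to compute factor scores.

```python
import numpy as np

# Assumed population model with unequal loadings on a single factor.
lam = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(lam, lam)
np.fill_diagonal(R, 1.0)                  # implied correlation matrix

# PCA weights: first eigenvector of the correlation matrix.
vals, vecs = np.linalg.eigh(R)            # ascending eigenvalues
w_pca = vecs[:, -1]
w_pca = w_pca if w_pca.sum() >= 0 else -w_pca

# Factor score weights (regression method), taking the loadings as known.
w_fa = np.linalg.solve(R, lam)

# Correlation between the two weighted sums, as implied by R.
cov = w_pca @ R @ w_fa
corr = cov / np.sqrt((w_pca @ R @ w_pca) * (w_fa @ R @ w_fa))
print(round(float(corr), 3))
```

In this setup the two weighted sums correlate well above 0.95 even though the weights differ: the factor score weights favor the most reliable indicators more strongly than the component weights do. As summaries they are close to interchangeable; the difference matters when the loadings themselves are interpreted.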

TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022
