TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022
Conceptual explanation of exploratory factor analysis (5:50)
This video gives a conceptual explanation of exploratory factor
analysis: the basic idea and the steps involved.
Factor analysis extracts underlying dimensions from the data and answers
the question of what the indicators have in common. Sometimes, however,
factor analysis doesn't give you the solution that you expect, and then
you must understand why that happened. To do so, you must understand
what exactly factor analysis is doing. In this video, I will provide
a conceptual explanation of exploratory factor analysis. The
idea of factor analysis is that there are different variance
components in the data. On any measurement occasion, some of the
variance is caused by the construct. We have indicators a1, a2, and a3
that are supposedly valid measures of construct A, and we have b1, b2,
and b3 that are supposedly valid measures of construct B. Each indicator
also has random noise (unreliability) and some unique aspects. If
these are survey questions, then each question measures the construct,
it may also measure something else, and then there's unreliability. In
factor analysis, we add one or more latent variables to this model. The
indicators are observed variables, and we try to explain the
correlations between them by using a smaller number of latent
variables. For example, we add one factor here that we think explains
the inter-correlations between these items. There are two strategies:
exploratory analysis, where we allow the computer to specify the
factors, and confirmatory analysis, where we specify the factor
structure ourselves. The
factor analysis model is a statistical model, so it's a set of
equations, and here is the model. We are saying that each of the
indicators a1, a2, a3, b1, b2, b3 is a function of the factor times a
factor loading, for which we use the Greek letter lambda, plus some
error that we don't observe. It's basically a regression equation. The
only difference is that we only observe the dependent variable. We don't
observe the key independent variable; it is a latent variable. If it
were an observed variable, we could just regress all indicators on
the factor, but we can't because the factor is not observed. These are
the factor loadings, and these are the item uniquenesses. It's important
to note that factor analysis cannot separate unreliability from other
unique variance. If the a1 indicator has some unique aspects Q,
then you cannot separate them from unreliability. This applies to
basically any reliability statistic. If
your indicators have variation that is not shared with the other
indicators but is still reliable, so it's not random noise but
systematic variation, then it cannot be distinguished from
unreliability. We assume that all variance that cannot be explained by
the other items, that is, the unique variance, is unreliability. That's
the workaround for that limitation. We
had exploratory analysis and confirmatory analysis. The idea of
exploratory factor analysis is that the computer first tries to
explain the data with one factor. It estimates a one-factor model in
which one factor explains all correlations between the indicators. Then
we remove the variance explained by that factor from the data, fit the
same single-factor model again on the residual variance, and repeat
this until there is no more covariance between indicators to
explain. What does the process look like? We have the data here. We have
the A variance here, due to the A construct, and the B variance, due to
the B construct. We want to know how much of the variation of these
indicators is due to the A construct and how much to the B construct. We
first fit a single-factor model, and let's say that the single factor
picks up all the A variance. All the A variance goes to the factor and
the remaining variance goes to the error terms. We take apart the
variation in the observed variables: we assign some to the factor and
some to the error terms. This model doesn't fit well because the
errors are assumed to be uncorrelated, but we can see here that because
this error term takes the B variance and this one does as well, they are
correlated. The factor analysis wouldn't stop here because there is
evidence that there are still correlations left after this factor. We
take the A variance, we put it aside here, and then from the remaining
data we fit another factor. It picks up the B variance here and the B
variance here, and then what remains of the indicators here is
uncorrelated, in which case the factor analysis stops. That's
how factor analysis works. We pick up some variation, then we continue
with the remainder and pick up some more variation, until all that
remains of the indicators is uncorrelated, in which case the factor
analysis has discovered two factors. These two factors explain the
inter-correlations between the variables completely. The remaining
variance in the data is simply unique features of these indicators or
unreliability. That's the conceptual idea. We extract variation, then we do it over and over until there are no more covariances to extract.
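
The extract-and-repeat idea can be sketched in a few lines of Python. This is only a minimal illustration of the conceptual description, not how any particular statistics package implements exploratory factor analysis. Everything specific in it is an assumption made for the example: the loadings (0.8 for a1-a3 on factor A, 0.7 for b1-b3 on factor B), the uncorrelated factors, the use of a population correlation matrix instead of sample data, the iterated principal-axis step used to fit each single factor, and the function names `one_factor_loadings` and `extract_factors`.

```python
import numpy as np


def one_factor_loadings(R, n_iter=50):
    """Fit one factor to a (residual) correlation matrix R.

    Uses a simple iterated principal-axis step: put communality
    estimates on the diagonal and take the leading eigenvector.
    """
    off = R - np.diag(np.diag(R))          # keep only the covariances
    h2 = np.max(np.abs(off), axis=1)       # starting communalities
    lam = np.zeros(R.shape[0])
    for _ in range(n_iter):
        reduced = off + np.diag(h2)
        vals, vecs = np.linalg.eigh(reduced)   # eigenvalues ascending
        lam = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
        h2 = lam ** 2
    if lam.sum() < 0:                      # eigenvector sign is arbitrary
        lam = -lam
    return lam


def extract_factors(R, tol=0.05, max_factors=5):
    """Extract one factor at a time until no covariance is left."""
    residual = R.copy()
    loadings = []
    for _ in range(max_factors):
        off = residual - np.diag(np.diag(residual))
        if np.max(np.abs(off)) < tol:      # nothing left to explain
            break
        lam = one_factor_loadings(residual)
        loadings.append(lam)
        # Put the explained variance aside and continue with the rest.
        residual = residual - np.outer(lam, lam)
    return np.array(loadings).T, residual


# Hypothetical population model: indicator = lambda * factor + error.
Lambda = np.array([
    [0.8, 0.0], [0.8, 0.0], [0.8, 0.0],   # a1, a2, a3 load on A
    [0.0, 0.7], [0.0, 0.7], [0.0, 0.7],   # b1, b2, b3 load on B
])
uniqueness = 1.0 - np.sum(Lambda ** 2, axis=1)
R = Lambda @ Lambda.T + np.diag(uniqueness)

loadings, residual = extract_factors(R)
print("factors extracted:", loadings.shape[1])
print(np.round(loadings, 2))
```

With this constructed matrix, the first pass picks up the A variance, the B items remain correlated in the residual, and the second pass picks up the B variance, after which the loop stops. What is left on the diagonal of `residual` is the uniqueness of each indicator, which, as noted above, mixes unique variance and unreliability. With real sample data the residual correlations never reach exactly zero, so the stopping threshold `tol` becomes a judgment call.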