TU-L0022 - Statistical Research Methods D, Lecture, 2.11.2021-6.4.2022
According to the course settings, the course ended on 06.04.2022.
Formative measurement (15:12)
This video covers the concept of formative measurement, explains why it is controversial, and describes its proper application.
Formative measurement is a controversial concept that nevertheless sees some applications in research. You need to understand the concept to understand why it is controversial, so that you can make an informed decision about whether it is something you should use. Also, when you review work by others, you will eventually encounter people who claim that they use formative indicators, formative measures or causal indicators.
So what is this concept about and why is there such controversy? The normal measurement model that we use says that the concept that we measure is a cause of the indicators. The definition of validity that I'll be using in these videos says that an indicator is valid if the variation in the indicator is causally produced by the variation of the construct. So the idea is that the indicators vary because the construct varies. In formative measurement this idea is reversed. The idea is that the measures cause the construct or, for example, that a set of three measures together causes the concept that we're measuring.
What exactly does it mean that a measure causes the construct? It's easy to understand how the innovativeness of a company, for example, could cause some people to give high responses to a question about innovation and other people to give low responses. So what does it mean that this is reversed? The problem is that the literature doesn't really explain what it means that the measure causes the construct. If we take this literally, it would mean that when a CEO responds positively to a question about innovativeness, that response causes the company to be innovative. That's clearly implausible.
Another example that is commonly used is socio-economic status. The claim is that how people respond to questions about income, education and other things defines their socio-economic status. How you respond to questions certainly has no causal effect on your socio-economic status. So there's controversy, and some methodologists say that this idea should be abandoned altogether. For example, Lee, Cadogan and Chamberlain say that formative indicators are not measures at all, Hardin and Marcoulides say that researchers should abandon this approach until we figure out the problems, and Edwards says that looking at the problems of formative measurement leads to the logical conclusion that the approach should be abandoned.
So what kinds of problems are there in the idea that the indicators cause the construct, and that there can be other causes of the construct as well? Let's first take a look at how the advocates of this approach recommend that it be used. Commonly, when you read an article about formative measurement, you see guidelines of this kind. There are many guideline-type articles that tell you when you should use these approaches and when you shouldn't. According to the advocates, there are two basic rules for when you should apply formative measurement. The first one is this: when you have a set of indicators, how do you expect those indicators, as a set, to be related to the concept? Do you expect the concept to be higher when one of the indicators, but not the others, changes, or do you expect all the indicators to be higher when the concept changes?
The idea of the normal factor-analysis-based measurement model is unidimensionality. When we have a set of indicators that are supposed to measure innovativeness, then for highly innovative companies we should expect the answers to all of those questions to be, on average, higher than for companies that are not so innovative. In formative measurement the idea is that the indicators represent different causes or different dimensions. For example, if socio-economic status is measured with income and education, we don't necessarily expect income and education to covary. So that's the first rule: should we expect the indicators to covary? In the normal factor-analysis-based measurement model we do; in the formative measurement model we don't. That's one rule of thumb for when the advocates suggest that you should apply formative measurement.
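The covariation rule of thumb can be illustrated with a small simulation. This is only a sketch under assumed values: the loading of 0.8, the sample size and the variable names are hypothetical and do not come from the lecture. In the first data set one construct causes all three indicators; in the second the three indicators are generated independently, as separate dimensions that one might still combine into an index.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # hypothetical sample size

# Reflective case: one latent construct causes all three indicators.
construct = rng.normal(size=n)
reflective = np.column_stack(
    [0.8 * construct + 0.6 * rng.normal(size=n) for _ in range(3)]
)

# Index case: three indicators that are generated independently of each other.
independent = rng.normal(size=(n, 3))


def mean_offdiag_corr(x):
    """Average correlation over all distinct pairs of columns."""
    r = np.corrcoef(x, rowvar=False)
    return r[np.triu_indices_from(r, k=1)].mean()


reflective_r = mean_offdiag_corr(reflective)
independent_r = mean_offdiag_corr(independent)
print(f"mean correlation, common cause:   {reflective_r:.2f}")
print(f"mean correlation, no common cause: {independent_r:.2f}")
```

With a common cause the indicators correlate substantially; without one they do not, even though summing them is still possible.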
Another one is an empirical test, and one particularly commonly used test is the vanishing tetrad test by Bollen and Ting. The idea here is that the factor analytical model says that the correlation structure should be such that all the indicators are positively correlated and the correlations follow certain constraints. When the chi-square test rejects that model, the claim is that the reflective model, or the factor analytical model, is untenable for the data and that therefore the formative model, where we model the latent variable as being caused by the indicators, is more reasonable.
This is a logical fallacy, because the rejection of one model does not imply the acceptance of another model. It could be that your indicators are reflecting the concept, that is, the concept causes variation in the indicators, but they're just bad measures. Because they're bad measures, the model doesn't work. Having bad measures doesn't imply that you should just take a sum of those indicators and ignore the fact that your original model didn't work with them. So the thought experiment about how the indicators relate to the concept has some merit. The empirical tests really don't.
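The constraints that a one-factor model places on the covariances are the vanishing tetrads: for four indicators, products of covariances such as σ12·σ34 and σ13·σ24 must be equal. The sketch below computes these differences from model-implied (population) covariance matrices; it is not the Bollen and Ting test itself, which is a statistical test on sample tetrads. The loadings, error variances and factor correlation are hypothetical values chosen for illustration. The second model is still fully reflective but has two factors, which shows that non-vanishing tetrads do not by themselves point to a formative model.

```python
import numpy as np


def tetrads(cov):
    """Three tetrad differences for four indicators (indices 0..3)."""
    s = cov
    return np.array([
        s[0, 1] * s[2, 3] - s[0, 2] * s[1, 3],
        s[0, 2] * s[1, 3] - s[0, 3] * s[1, 2],
        s[0, 1] * s[2, 3] - s[0, 3] * s[1, 2],
    ])


# Hypothetical error variances, shared by both models.
error_vars = np.diag([0.19, 0.36, 0.51, 0.64])

# Model 1: one factor causes all four indicators.
lam1 = np.array([[0.9], [0.8], [0.7], [0.6]])
one_factor = lam1 @ lam1.T + error_vars

# Model 2: two correlated factors, each causing two indicators (still reflective).
lam2 = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
phi = np.array([[1.0, 0.3], [0.3, 1.0]])  # factor correlation matrix
two_factor = lam2 @ phi @ lam2.T + error_vars

one_factor_tetrads = tetrads(one_factor)
two_factor_tetrads = tetrads(two_factor)
print("one factor: ", np.round(one_factor_tetrads, 4))
print("two factors:", np.round(two_factor_tetrads, 4))
```

All three tetrads vanish under the one-factor model, while the two-factor reflective model violates two of them.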
Let's take a closer look at why the idea of formative measurement is troublesome. I'm not against taking multiple indicators of different things and making an index. There are many cases where combining things that don't really correlate into one number makes a lot of sense. For example, stock indices are made that way. The individual stocks correlate to some extent, but they are not very highly correlated. Yet taking a sum of these weakly correlated variables produces a very good measure of overall stock market performance. So taking sums is not problematic. What is problematic is attaching the idea of measurement to this sum of indicators, the index.
The idea of measurement was that we have the theoretical concept and the measurement result, and they have some kind of relationship. So there must be some kind of statistical association between the measurement result and the theoretical concept. The traditional way is, again, to think that variation in the theoretical concept causes variation in the measurement results, and we model it that way: we model a latent variable representing the theoretical concept, and the arrows go towards the measurement results. We take the measurement results and build a statistical model based on those results. So we have three different things that we need to consider: the theoretical concept, the measurement results, and how we build a statistical model based on those measurement results. The idea is that the statistical model should be a representation of the theoretical concept, and if the model represents a measurement relationship, that is, the relation between the measurement result and the thing being measured, then we call the resulting model a measurement model.
So the statistical model is a representation of the theoretical concept. What, then, is the problem with formative measurement? Keith Markus is one of the people who don't really think this approach makes sense. His recent article goes over a couple of conceptual impediments to the discussion about the merits and weaknesses of formative measurement, and he states that one of the big problems is that the literature on formative measurement is too focused on the modelling part, that is, on how we construct models. It often makes sense to make indices out of indicators, but a sum of indicators alone does not represent measurement. So there's a clear distinction between how we model things and what it means to measure things, and that's what he is saying.
So I think he would agree that no one is seriously saying that how you respond to questions about income and education causes your socio-economic status. We can take a sum of those indicators and use that sum as a measure of socio-economic status, but that really has nothing to do with causality and measurement. You're just aggregating things into a useful index. It's not measurement. So let's look again at this figure. The formative measurement model looks like that. We specify that the indicators are freely correlated; we don't say that the construct represented by this latent variable is the cause of the indicator covariances. And then we say that if this model is a valid measurement model, it should be a good representation of the relation between the theoretical concept and the measure.
In practice we can't really defend the idea that the measurement results cause the concept. If that were the case, we could easily manipulate social behaviour by just having people respond in a particular way to a survey instrument, and then we would see an effect in reality. It just doesn't happen that way. So the relationship between the theoretical concept and the measure always goes from the theoretical concept to the measure and not the other way around. At least, there is no evidence that it goes the other way around.
If this is a good measurement model, it should be a good representation of these causal relationships and, for that reason, calling these models measurement models or formative measurement models is misguided. When we aggregate things into an index, the model can be useful, but it's not a model of measurement. That's one of the key points of the opponents of this approach. Nobody is saying that you should never aggregate different dimensions into one index, simply that doing so is not about validating measurement. So the label of formative measurement is the problematic thing. Saying that you construct an index and explaining why the index is useful is fairly unproblematic.
There are alternatives to formative measurement. So how do we model measurement? We could say that we have three indicators, x1, x2 and x3, and that these indicators measure three different latent variables that then cause the latent variable of interest. There is nothing wrong with this kind of model. We are saying that the indicators x1, x2 and x3 are valid because their variation is causally produced by the variation of the latent variables that they measure, and then we are interested in the outcome of these three latent variables. There's nothing wrong with that statistically. It is nearly identical to the formative measurement model, because for identification we have to assume that the indicators x1, x2 and x3 are perfectly reliable. We cannot estimate their error variances.
Of course this model can be extended. We can add multiple indicators for each of the latent variables that we say are causes of the latent variable of ultimate interest. That's a second-order model, and it's a fairly defensible model. You measure three different dimensions, each with parallel indicators or indicators that you assume to be parallel, then you take a sum of those three latent variables, justify why it makes sense, and publish the paper. Nobody is going to argue against that. The problem is that when we say that an indicator causes the construct, that's implausible.
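The alternative model can be written out as equations. The notation here is an assumption made for illustration and does not appear in the lecture: ξ1, ξ2, ξ3 denote the three measured latent variables, η the latent variable of interest, γ the weights, ζ a disturbance, and λ and δ the loadings and errors in the multiple-indicator extension.

```latex
% Single-indicator version: each indicator is assumed perfectly reliable,
% so there is no error term in the measurement equations.
\begin{align*}
x_i  &= \xi_i, \qquad i = 1, 2, 3 \\
\eta &= \gamma_1 \xi_1 + \gamma_2 \xi_2 + \gamma_3 \xi_3 + \zeta
\end{align*}

% Extended version: several indicators j per latent variable,
% each with its own loading and measurement error.
\begin{align*}
x_{ij} &= \lambda_{ij}\, \xi_i + \delta_{ij} \\
\eta   &= \gamma_1 \xi_1 + \gamma_2 \xi_2 + \gamma_3 \xi_3 + \zeta
\end{align*}
```

In both versions the measurement arrows run from the latent variables to the indicators; only the relationship between the three latent variables and η runs in the aggregating direction.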
So let's summarize formative measurement. This is something that you may find useful when you review work done by others or if you consider using formative measurement models yourself. The first thing to understand is that formative measurement does not exist: indicators don't cause the concept. If you think that an indicator really causes a concept, that's fairly easy to test with experimental research. Just make a survey form where you think that the indicators cause the construct, then instruct one set of people to always answer on the left-hand side of the scale and another set of people to always answer on the right-hand side. If you randomize the groups, that's a valid experiment. Then wait one year, measure the latent variable that is supposed to be caused by these indicators, and see if there are differences between the groups. If you could find a difference, that would be a huge finding. I don't think that anyone ever will, because it's not a realistic idea.
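The design of that experiment can be sketched as a simulation. Everything here is hypothetical: the sample size, the variable names, and above all the data-generating assumption that the construct measured a year later does not depend on the instructed answers, which is the lecture's expectation rather than an empirical result. The sketch only shows how random assignment and the group comparison would be set up.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000  # hypothetical number of respondents

# Randomly assign half of the respondents to answer on the right-hand
# (high) side of the scale and the other half on the left-hand (low) side.
answer_high = rng.permutation(np.repeat([False, True], n // 2))

# ASSUMPTION: the construct measured one year later is generated
# independently of the instructed answers (no effect of the responses).
construct_later = rng.normal(size=n)

high = construct_later[answer_high]
low = construct_later[~answer_high]

# Welch-type t statistic for the difference between the group means.
diff = high.mean() - low.mean()
se = np.sqrt(high.var(ddof=1) / high.size + low.var(ddof=1) / low.size)
t_stat = diff / se
print(f"difference in means: {diff:.3f}, t statistic: {t_stat:.2f}")
```

If the indicators really caused the construct, the group difference in a real study would be clearly different from zero.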
Formative models, or indices, can be used. You can take sums of different variables, and there are good reasons to do so. I'll go through those reasons in a different video. But you have to validate the items that go into the index separately. Just taking a sum of three different indicators has nothing to do with validation. We can sum things such as a person's height and weight, but that doesn't tell us whether those measures of height and weight are valid or reliable, and the sum doesn't really make any sense anyway. You also have to justify why the index is useful. If you take a sum of people's height and weight, what use would such an index be? That argument is non-statistical: you have to explain why combining different things makes sense.