Age-related nuances in knowledge assessment - A modeling perspective

This is the second post in a series on a recent paper entitled “Age-related nuances in knowledge assessment” that we wrote with Luc Watrin and Oliver Wilhelm. The first post dealt with how we conceptualize the organization of knowledge as a hierarchy in a multidimensional knowledge space. This second post reflects on the way we measure or model knowledge. In textbooks, knowledge assessments have a special standing because they can be modeled from both a reflective and a formative perspective.

The reflective model is considered the genuine model of the psychological world, in which constructs represent dispositions (e.g., intelligence, motivation) that cannot be measured directly. Items are then indicators of a latent ability, which serves as a common cause influencing participants’ responses to the indicators (Markus & Borsboom, 2013). In the context of knowledge assessment, this corresponds to the view that the latent ability fully accounts for the correlations among individual knowledge items, while their unique variance reflects random error (unless the model allows for domain-, area-, or item-specific variance). If someone is more knowledgeable than someone else, this should generate a transfer effect of knowledge. This has become known as the “rich-get-richer effect” of knowledge acquisition and has fueled the “knowledge-is-power hypothesis” (e.g., Hambrick & Engle, 2002; Witherby & Carpenter, 2021). In narrow knowledge domains such as chess (Chase & Simon, 1973), baseball (Hambrick & Engle, 2002), football, or cooking (Witherby & Carpenter, 2021), prior knowledge can indeed be essential (and even causal) for the acquisition of new knowledge (see also Watrin et al., 2021, for more details on this topic). Recently, Savi et al. (2019, p. 1052) proposed an idiographic explanation for this mutualism, in which intelligence is conceptualized as an evolving network. The question that arises is whether such transfer effects are also detectable when knowledge is only loosely connected, as in a general knowledge test.
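The core claim of the reflective model can be made concrete with a small simulation (our own illustration, not taken from the paper; all variable names and loadings are arbitrary assumptions). One latent ability generates all item responses; the raw items correlate, but once the factor's contribution is removed, the residual correlations are only randomly different from zero:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy reflective (common-factor) model: one latent ability drives all
# item responses; item-specific variance is treated as random error.
n_persons, n_items = 5000, 6
theta = rng.normal(size=n_persons)              # latent knowledge
loadings = rng.uniform(0.5, 0.9, size=n_items)  # arbitrary item loadings
items = theta[:, None] * loadings + rng.normal(size=(n_persons, n_items))

# Raw items correlate because they share the common cause ...
raw_corr = np.corrcoef(items, rowvar=False)

# ... but removing the factor's contribution leaves residual
# correlations that are only randomly different from zero.
residuals = items - theta[:, None] * loadings
resid_corr = np.corrcoef(residuals, rowvar=False)

off_diag = ~np.eye(n_items, dtype=bool)
print(round(raw_corr[off_diag].mean(), 2))
print(round(resid_corr[off_diag].mean(), 2))
```

The first printed value is substantial while the second is near zero, which is exactly the "common cause" structure the reflective view assumes for knowledge items.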

In formative models, the indicators jointly determine an emergent composite variable (van der Maas et al., 2014). Thus, indicators cannot be used interchangeably; they are bound to the construct they are intended to measure. In this view, knowledge is exhaustively represented by its indicators. From the perspective of a psychologist, the formative approach is somewhat odd and has therefore been criticized for substantial shortcomings, such as measurement without error or the fact that a formatively measured construct does not exist independently of its indicators (for a point-by-point rebuttal of these criticisms, see Bollen & Diamantopoulos, 2017). Examples of such finite item sets are driving or pilot license tests. In these tests, knowledge has been canonized or, put differently, the elements of the test are tied to the specific construct it measures. In most cases, however, knowledge in a given field is not defined or narrowly circumscribed; that is, there is no exhaustive list of indicators that belong to a domain. Accordingly, gc cannot be understood as a fixed quantity assessed with a limited pool of knowledge items.
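The non-interchangeability of formative indicators can also be sketched in a few lines (again our own toy example with made-up indicator names, not material from the paper). The composite is simply defined as a sum of a fixed indicator set; adding a new item redefines the construct and shifts its relation to an external criterion:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy formative model: a driving-test score is *defined* as the sum of
# a fixed, canonized indicator set -- the composite has no error term
# and does not exist apart from its indicators.
n = 10_000
signs = rng.normal(size=n)         # knowledge of traffic signs
right_of_way = rng.normal(size=n)  # right-of-way rules
parking = rng.normal(size=n)       # parking rules
composite = signs + right_of_way + parking

# An external criterion driven by only one of the indicators:
accidents = -signs + rng.normal(size=n)
r_before = np.corrcoef(composite, accidents)[0, 1]

# Adding a new indicator redefines the construct and changes its
# relation to the criterion -- items are not interchangeable.
first_aid = rng.normal(size=n)
composite_new = composite + first_aid
r_after = np.corrcoef(composite_new, accidents)[0, 1]

print(round(r_before, 2), round(r_after, 2))
```

The two correlations differ, illustrating why, under a formative reading, the item set must be treated as fixed.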

So, what is the correct way to model knowledge? The answer depends on the measurement intention. Let me explain: In general, the purpose of any scientific model is either to explain observed data (i.e., explanatory modeling) or to predict new data accurately (predictive modeling), whereby the former seems to be much more popular in psychology (Shmueli, 2010). In our opinion, when it comes to prediction, both the formative and the reflective modeling approaches fall short because neither allows the indicators (sampled or not) to carry information beyond the aggregate scores. The reflective model rules this out because it implies that the covariance between any two indicators is (only randomly different from) zero once the latent factor(s) are controlled for; such modeling therefore always homogenizes the item pool. The formative model presupposes that the sampled items are a specific and fixed set of indicators that represent the construct in question; adding items is not admissible, especially if doing so modifies relations with external variables. But this is by no means a sobering conclusion; it simply points to the fact that item sampling issues in knowledge assessment do matter. Unfortunately, such construct-relevant variance at the item level is understudied and often neglected.
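What "construct-relevant variance at the item level" buys for prediction can be sketched as follows (a hypothetical setup of our own; the `r2` helper and all parameter values are assumptions, not the paper's analysis). When a criterion depends partly on one item's unique variance, predicting from the individual items outperforms predicting from the aggregate score:

```python
import numpy as np

rng = np.random.default_rng(123)

# Items follow a reflective-style structure: latent ability plus noise.
n, k = 10_000, 5
theta = rng.normal(size=n)
items = theta[:, None] + rng.normal(size=(n, k))
total = items.sum(axis=1)  # the aggregate (sum) score

# The criterion depends on the latent ability *and* on item 0's unique
# part -- i.e., construct-relevant item-specific variance.
criterion = theta + 0.8 * (items[:, 0] - theta) + rng.normal(size=n)

def r2(X, y):
    """R^2 of an ordinary least-squares fit of y on X (with intercept)."""
    X = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - (y - X @ beta).var() / y.var()

print(round(r2(total[:, None], criterion), 2))  # aggregate score only
print(round(r2(items, criterion), 2))           # individual items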

tl;dr: Predictive modeling is compatible with neither formative nor reflective modeling.