Science Self-Concept – More Than the Sum of its Parts?

The article “Science Self-Concept – More Than the Sum of its Parts?” has now been published in “The Journal of Experimental Education” (which, by the way, has been in existence since 1932). The first 50 copies are free, in case you are interested.

In comparison to the preprint, some substantial changes have been made to the final version of the manuscript, especially in the research questions and in the presentation of the results. Due to word-count restrictions, we also removed a section from the discussion in which we summarized commonalities and differences between the bifactor and higher-order models. There, we also speculated about why the choice of modeling approach may depend on the study’s subject matter, that is, on conceptual differences between intelligence and self-concept research. The argument may be a bit wonky, but I find the idea persuasive enough to reproduce it in the following. If you have any comments, please feel free to drop me a line.

Hierarchical vs. Bifactor Modeling

Reviewing the psychometric literature on hierarchical and bifactor modeling, one gets the impression that there are large statistical or conceptual differences between these modeling approaches. For example, Chen et al. (2006) listed several advantages of the bifactor model over the second-order model, but the differences are presumably more subtle (Gustafsson & Balke, 1993). Recall that a higher-order model can be turned into a constrained version of a bifactor model by means of the Schmid-Leiman transformation (Reise et al., 2010; Schulze, 2004); that is, an (unconstrained) bifactor model and a higher-order model will only produce different results to the extent that the proportionality constraints are violated.
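To make the Schmid-Leiman transformation concrete, here is a minimal numpy sketch with made-up loadings (not estimates from our study or any cited one): each item's general-factor loading is its first-order loading times the second-order loading of its factor, and its specific-factor loading is scaled by the residual standard deviation of that factor.

```python
import numpy as np

# Illustrative loadings for a higher-order model with three first-order
# factors; all numbers are invented for demonstration purposes.
# first_order[i, j]: loading of item i on first-order factor j
first_order = np.array([
    [0.7, 0.0, 0.0],
    [0.6, 0.0, 0.0],
    [0.0, 0.8, 0.0],
    [0.0, 0.7, 0.0],
    [0.0, 0.0, 0.6],
    [0.0, 0.0, 0.5],
])
# second_order[j]: loading of first-order factor j on the general factor
second_order = np.array([0.8, 0.6, 0.7])

# Schmid-Leiman transformation:
# general-factor loadings = first-order loadings weighted by the
# second-order loadings of the respective factor
general = first_order @ second_order
# residualized specific loadings = first-order loadings scaled by
# sqrt(1 - second-order loading^2), i.e., the factor's residual SD
specific = first_order * np.sqrt(1 - second_order**2)

# The transformation preserves each item's explained variance:
# general^2 + sum of specific^2 equals the original first-order loading^2.
assert np.allclose(general**2 + (specific**2).sum(axis=1),
                   (first_order**2).sum(axis=1))
```

The resulting pattern is exactly a bifactor structure, but one in which the general and specific loadings within a factor are proportional; a freely estimated bifactor model relaxes that proportionality, which is why the two approaches diverge only when those constraints are violated.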

In our reading, the long debate about the appropriate modeling approach is blurred by the fact that the indicators of such models are often either parcels or subtest scores rather than items. In the case of aggregated scores from different scales or subtests (e.g., Swedish and mathematics achievement as marker tests for crystallized intelligence in Gustafsson & Balke, 1993), the bifactor model is often preferred because the higher uniqueness of the indicators makes it hard to establish a common trait in the higher-order model (see also Cucina & Byle, 2017). In the case of parcels, the influence on modeling is more opaque (Cole, Perkins, & Zelkowitz, 2016), but parceling is often misused to mask heterogeneity by leveling out content differences (Little, Cunningham, Shahar, & Widaman, 2002), which leads to an artificial homogenization of the indicators and generally weakens the subject-specific factors.

Compared to the studies discussed in the psychometric literature on hierarchical vs. bifactor modeling, the present case differs in some respects. First, all models were estimated at the item level (with rather homogeneous sets of items), making aggregation of the responses unnecessary. Second, in contrast to research on cognitive abilities, which relies on high interrelations (i.e., the positive manifold; van der Maas et al., 2006), self-concept research has to deal with two opposing self-concepts—the verbal and the mathematical self-concept—that are almost unrelated (Möller, Pohlmann, Köller, & Marsh, 2009). Also, self-concepts are only moderately correlated in the sciences (Jansen, Schroeders, & Lüdtke, 2014). These differences might explain why, in our study, the bifactor and the second-order model did not differ significantly despite the large sample size. Moreover, both models concur that the aggregated science self-concept and the subject-unspecific science self-concept are very highly correlated (r = .94). One might therefore be inclined to say that both measurements are equivalent, but this is not necessarily true.

Two points we do discuss in the published version, to which we refer the interested reader, are that statistical unity must not be confused with causal unity and that issues of measurement invariance have to be taken into account.