Tagged "measurement invariance"

Method-Toolbox

Sample size estimation in Item Response Theory

Although Item Response Theory (IRT) models offer well-established psychometric advantages over traditional scoring methods, they have largely been confined to specific areas of psychology, such as educational assessment and personnel selection, while their broader potential remains underutilized in practice. One reason for this is the challenge of meeting the (presumed) larger sample size requirements, especially in complex measurement designs. A priori sample size estimation is essential for obtaining accurate estimates of item and person parameters, effects, and model fit. As such, it is a key tool for effective study planning, especially for pre-registrations and registered reports.
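
To give a rough idea of what such an a priori estimation can look like, here is a minimal simulation-based sketch using the mirt package; the item parameters, candidate sample sizes, number of replications, and the RMSE criterion are arbitrary placeholders, not recommendations:

# Minimal sketch: simulation-based sample size planning for a 2PL model
# (illustrative values only; assumes the mirt package is installed)
library(mirt)

set.seed(1)
n_items <- 20
a <- matrix(rlnorm(n_items, meanlog = 0.2, sdlog = 0.3))  # "true" discriminations
d <- matrix(rnorm(n_items))                               # "true" intercepts

recovery <- function(N, replications = 50) {
  rmse <- replicate(replications, {
    dat <- simdata(a, d, N, itemtype = "2PL")             # simulate item responses
    fit <- mirt(dat, 1, itemtype = "2PL", verbose = FALSE)
    est <- coef(fit, simplify = TRUE)$items               # estimated a1 and d
    sqrt(mean((est[, "a1"] - a)^2 + (est[, "d"] - d)^2))
  })
  mean(rmse)                                              # average recovery error
}

# compare parameter recovery for several candidate sample sizes
sapply(c(250, 500, 1000), recovery)

The logic is simply to pick the smallest N at which parameter recovery is acceptable; the same scheme extends to other criteria such as standard errors, power for a specific effect, or model fit.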

Science self-concept – More than the sum of its parts?

The article “Science Self-Concept – More Than the Sum of its Parts?” has now been published in “The Journal of Experimental Education” (btw in existence since 1932). The first 50 copies are free, in case you are interested.

In comparison to the preprint version, some substantial changes have been made to the final version of the manuscript, especially in the research questions and in the presentation of the results. Due to word count restrictions, we also removed a section from the discussion in which we summarized differences and commonalities between the bifactor and higher-order models. We also speculated about why the choice of model may depend on the study’s subject, that is, on conceptual differences between intelligence and self-concept research. The argument may be a bit wonky, but I find the idea persuasive enough to reproduce it in the following. If you have any comments, please feel free to drop me a line.
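
As a side note for readers less familiar with the two specifications, here is a schematic lavaan sketch with purely hypothetical indicators (x1–x9); it is not the model syntax from the paper. In the higher-order model, the general factor operates only through the specific factors, whereas in the bifactor model it loads directly on all indicators and is kept orthogonal to the specific factors:

# Schematic contrast of higher-order vs. bifactor specification in lavaan
# (indicator names x1-x9 and factor labels are placeholders)
library(lavaan)

higher_order <- '
  bio     =~ x1 + x2 + x3
  chem    =~ x4 + x5 + x6
  phys    =~ x7 + x8 + x9
  science =~ bio + chem + phys   # general factor defined by the specific factors
'

bifactor <- '
  g    =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
  bio  =~ x1 + x2 + x3
  chem =~ x4 + x5 + x6
  phys =~ x7 + x8 + x9
'

# with a (hypothetical) data set "dat" containing x1-x9:
# fit_ho <- cfa(higher_order, data = dat)
# fit_bf <- cfa(bifactor, data = dat, orthogonal = TRUE)  # g and specific factors uncorrelated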

Testing for equivalence of test data across media

In 2009, I wrote a small chapter that was part of an EU conference book on the transition to computer-based assessment. Now and then I come back to this piece of work - in my teaching and in my publications (e.g., the EJPA paper on testing reasoning ability across different devices). Now I want to make it publicly available; hopefully, it will be interesting to some of you. The chapter is the (unaltered) preprint version of the book chapter, so if you want to cite it, please use the following citation:

Pitfalls in measurement invariance testing

In a new paper in the European Journal of Psychological Assessment, Timo Gnambs and I examined the soundness of reporting measurement invariance (MI) testing in the context of multigroup confirmatory factor analysis (MGCFA). Of course, there are several good primers on MI testing (e.g., Cheung & Rensvold, 2002; Wicherts & Dolan, 2010) and textbooks that elaborate on the theoretical basis (e.g., Millsap, 2011), but a clearly written tutorial with example syntax on how to implement MI testing in practice was still missing. In the first part of the paper, we demonstrate that a soberingly large number of reported degrees of freedom (df) do not match the df recalculated from the information given in the articles. More specifically, we both reviewed 128 studies, comprising 302 MGCFA measurement invariance testing procedures, from six leading peer-reviewed journals that focus on psychological assessment. Overall, about a quarter of all articles included at least one discrepancy, with some systematic differences between the journals. Interestingly, the metric and scalar steps of invariance testing were affected more frequently than the other steps.
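
For readers looking for a starting point, the usual MGCFA sequence can be sketched in lavaan as follows; this uses the Holzinger-Swineford example data shipped with lavaan and is not the syntax accompanying the paper:

# Minimal MGCFA invariance sequence in lavaan (illustrative model only)
library(lavaan)
data("HolzingerSwineford1939")   # example data shipped with lavaan

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

configural <- cfa(model, data = HolzingerSwineford1939, group = "school")
metric     <- cfa(model, data = HolzingerSwineford1939, group = "school",
                  group.equal = "loadings")
scalar     <- cfa(model, data = HolzingerSwineford1939, group = "school",
                  group.equal = c("loadings", "intercepts"))

# nested model comparison; the df difference between adjacent steps should
# equal the number of newly constrained parameters
lavTestLRT(configural, metric, scalar)

The df column printed by lavTestLRT() is exactly what should match the numbers reported in an article, which is essentially the recalculation check described above.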

Longitudinal measurement invariance testing with categorical data

In a recent paper – Edossa, Schroeders, Weinert, & Artelt, 2018 – we came across the issue of longitudinal measurement invariance testing with categorical data. There are quite good primers and textbooks on longitudinal measurement invariance testing with continuous data (e.g., Geiser, 2013). However, at the time of writing the manuscript, there was no application of measurement invariance testing with categorical data in a longitudinal context. In case you are interested in using such an invariance testing procedure, we have uploaded the R syntax for all measurement invariance steps.
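
To give a rough impression, and explicitly not as a substitute for the uploaded syntax, a two-wave model with ordered (here: binary) indicators and with loadings and thresholds constrained over time could be sketched in lavaan like this; variable names are placeholders and identification details are abbreviated:

# Sketch of longitudinal invariance with ordered indicators (two waves,
# three items per wave; placeholder variable names)
library(lavaan)

model_metric <- '
  # the same factor measured at two occasions
  f1 =~ l1*y1_t1 + l2*y2_t1 + l3*y3_t1
  f2 =~ l1*y1_t2 + l2*y2_t2 + l3*y3_t2    # loadings constrained equal over time

  # thresholds constrained equal over time (one threshold per binary item)
  y1_t1 | t1a*t1
  y1_t2 | t1a*t1
  y2_t1 | t2a*t1
  y2_t2 | t2a*t1
  y3_t1 | t3a*t1
  y3_t2 | t3a*t1

  # residual correlations between repeated indicators
  y1_t1 ~~ y1_t2
  y2_t1 ~~ y2_t2
  y3_t1 ~~ y3_t2
'

# with a (hypothetical) data set "dat" containing the six binary items:
# fit <- cfa(model_metric, data = dat,
#            ordered = c("y1_t1","y2_t1","y3_t1","y1_t2","y2_t2","y3_t2"),
#            parameterization = "theta", estimator = "WLSMV")

Further steps (e.g., equal residual variances) can be added analogously, and the nested models can again be compared with lavTestLRT().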

Equivalence of screen versus print reading comprehension depends on task complexity and proficiency

Reference. Lenhard, W., Schroeders, U., & Lenhard, A. (2017). Equivalence of screen versus print reading comprehension depends on task complexity and proficiency. Discourse Processes, 54(5-6), 427–445. https://doi.org/10.1080/0163853X.2017.1319653

Abstract. As reading and reading assessment become increasingly implemented on electronic devices, the question arises whether reading on screen is comparable with reading on paper. To examine potential differences, we studied reading processes on different proficiency and complexity levels. Specifically, we used data from the standardization sample of the German reading comprehension test ELFE II (n = 2,807), which assesses reading at word, sentence, and text level with separate speeded subtests. Children from grades 1 to 6 completed either a test version on paper or via computer under time constraints. In general, children in the screen condition worked faster but at the expense of accuracy. This difference was more pronounced for younger children and at the word level. Based on our results, we suggest that remedial education and interventions for younger children using computer-based approaches should likewise foster speed and accuracy in a balanced way.