Tagged "assessment"

Research

Technology-Based Assessment

In the last decades, the digitalization of educational content, the integration of computers into different educational settings, and the opportunity to connect knowledge and people via the Internet have led to fundamental changes in the way we gather, process, and evaluate information. Moreover, more and more tablet PCs and notebooks are used in schools, and, compared to traditional sources of information such as textbooks, the Internet seems more appealing, versatile, and accessible. Technology-based assessment has long been concerned with the comparability of test scores across test media, that is, with transferring existing measurement instruments to digital devices. Nowadays, researchers are more interested in enriching assessment with interactive tasks and video material, or in making testing more efficient by drawing on digital behavior traces.

Bee Swarm Optimization (BSO)

“Bees are amazing, little creatures” (Richardson, 2017) – I agree. Bees have fascinated people since time immemorial, and yet even today novel and fascinating discoveries are still being made (see the PLOS collection for some mind-boggling facts). Although bees might seem like the prime example of colony-building social insects, highly social forms of community are the exception among bees: the large majority of all bee species are solitary bees or cuckoo bees that do not form colonies.

Tests & Questionnaires

A 120-item gc test

This is a 120-item measure of crystallized intelligence (gc), more precisely, of declarative knowledge. Based on previous findings on the dimensionality of gc (Steger et al., 2019), we sampled items from four broad knowledge areas: humanities, life sciences, natural sciences, and social sciences. Each knowledge area contains three domains with ten items each, resulting in a total of 120 items. Items were selected to cover a wide range of difficulty and to cover the content domain both broadly and deeply. Items are available in German and English. The development of the initial item pool is detailed in Steger et al. (2019). We used and described the 120-item gc measure in two recent publications (Schroeders et al., 2021; Watrin et al., 2021). The items can be found in the accompanying OSF project.
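
To make the sampling design more tangible, here is a minimal sketch of the 4 × 3 × 10 item structure; the domain labels are placeholders, not the actual domains used in the test.

```python
# Minimal sketch of the 120-item design: 4 knowledge areas x 3 domains x 10 items.
# The domain labels are placeholders, not the actual domains of the test.

AREAS = {
    "humanities":       ["domain_1", "domain_2", "domain_3"],
    "life_sciences":    ["domain_1", "domain_2", "domain_3"],
    "natural_sciences": ["domain_1", "domain_2", "domain_3"],
    "social_sciences":  ["domain_1", "domain_2", "domain_3"],
}
ITEMS_PER_DOMAIN = 10

item_ids = [
    f"{area}_{domain}_item{i:02d}"
    for area, domains in AREAS.items()
    for domain in domains
    for i in range(1, ITEMS_PER_DOMAIN + 1)
]

assert len(item_ids) == 120  # 4 areas * 3 domains * 10 items
```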

New methods and assessment approaches in intelligence research

Maybe you have seen my recent tweet about it. In any case, here is the complete Call for the Special Issue in the Journal of Intelligence:

Dear Colleagues,
Our understanding of intelligence has been—and still is—significantly influenced by the development and application of new computational and statistical methods, as well as novel testing procedures. In science, methodological developments typically follow new theoretical ideas. In contrast, great breakthroughs in intelligence research followed the reverse order. For instance, the once-novel factor analytic tools preceded and facilitated new theoretical ideas such as the theory of multiple group factors of intelligence. Therefore, the way we assess and analyze intelligent behavior also shapes the way we think about intelligence.
We want to summarize recent and ongoing methodological advances inspiring intelligence research and facilitating thinking about new theoretical perspectives. This Special Issue will include contributions that:

Meta-analysis proctored vs. unproctored assessment

Our meta-analysis – Steger, Schroeders, & Gnambs (2018) – comparing test scores from proctored and unproctored assessments is now available as an online-first publication and will eventually appear in the European Journal of Psychological Assessment. In more detail, we examined mean score differences and correlations between both assessment contexts with a three-level random-effects meta-analysis based on 49 studies with 109 effect sizes. We think this is a timely topic, since web-based assessments are frequently compromised by a lack of control over the participants’ test-taking behavior, but researchers nevertheless need to compare data obtained under unproctored test conditions with data from controlled settings. The inevitable question is to what extent such a comparison is feasible.
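
For readers unfamiliar with the general approach, the following sketch shows a conventional (two-level) random-effects pooling of standardized mean differences with the DerSimonian-Laird estimator. It is only meant to convey the basic idea: it does not reproduce the three-level model used in the paper (which accounts for multiple effect sizes nested within studies), and the effect sizes below are made up.

```python
import numpy as np

# Two-level random-effects pooling of standardized mean differences
# (DerSimonian-Laird). NOT the three-level model used in the paper.
# Effect sizes and variances below are made up for illustration.

d = np.array([0.10, -0.05, 0.20, 0.00, 0.15])   # standardized mean differences
v = np.array([0.02,  0.03, 0.01, 0.02, 0.04])   # their sampling variances

w_fixed = 1.0 / v
d_fixed = np.sum(w_fixed * d) / np.sum(w_fixed)

# Between-study heterogeneity (tau^2), DerSimonian-Laird estimator
Q = np.sum(w_fixed * (d - d_fixed) ** 2)
df = len(d) - 1
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - df) / c)

# Random-effects weights, pooled estimate, and its standard error
w_re = 1.0 / (v + tau2)
d_re = np.sum(w_re * d) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))

print(f"pooled d = {d_re:.3f} (SE = {se_re:.3f}), tau^2 = {tau2:.3f}")
```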

Recalculating df in MGCFA testing


Please cite as follows:
Schroeders, U., & Gnambs, T. (2020). Degrees of freedom in multigroup confirmatory factor analyses: Are models of measurement invariance testing correctly specified? European Journal of Psychological Assessment, 36(1), 105–113. https://doi.org/10.1027/1015-5759/a000500

A. Number of indicators
B. Number of factors
C. Number of cross-loadings
D. Number of orthogonal factors
E. Number of residual covariances
F. Number of groups



MI testing   Constraints                           df   Comparison              Δdf
configural   (item:factor pattern)                  0   –                        –
metric       (loadings)                             2   metric vs. configural    2
scalar       (loadings + intercepts)                4   scalar vs. metric        2
residual     (loadings + residuals)                 5   residual vs. metric      3
strict       (loadings + intercepts + residuals)    7   strict vs. scalar        3

Additional information

A Indicates the number of indicators or items.
B Indicates the number of latent variables or factors.
C Indicates the number of cross-loadings. For example, in the case of a bifactor model the number equals twice the number of indicators (A).
D Indicates the number of orthogonal factors. For example, in the case of a nested-factor model with six indicators loading on a common factor and three items additionally loading on a nested factor, you have to specify 2 factors (B) and 1 orthogonal factor (D).
E Indicates the number of residual covariances.
F Indicates the number of groups.
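
To illustrate the bookkeeping behind such a table, here is a minimal sketch for the simplest case only: a single factor measured by p indicators in g groups, with a mean structure and the latent variance and latent mean fixed in the reference parameterization. Cross-loadings, orthogonal factors, and residual covariances (C, D, and E above) are deliberately ignored, and the function name is just for illustration.

```python
# Degrees-of-freedom bookkeeping for measurement invariance testing in a
# multigroup CFA: single factor, p indicators, g groups, mean structure,
# latent variance fixed to 1 and latent mean fixed to 0 in the configural model.
# Cross-loadings, orthogonal factors, and residual covariances are NOT covered.

def mi_degrees_of_freedom(p: int, g: int) -> dict:
    moments_per_group = p * (p + 1) // 2 + p    # (co)variances + means
    params_per_group = 3 * p                    # loadings, intercepts, residual variances
    configural = g * (moments_per_group - params_per_group)

    # Each step adds (g - 1) sets of equality constraints, minus the
    # group-specific latent parameters that are freed in return.
    metric   = configural + (g - 1) * (p - 1)   # equal loadings, latent variances freed
    scalar   = metric     + (g - 1) * (p - 1)   # + equal intercepts, latent means freed
    residual = metric     + (g - 1) * p         # + equal residual variances (no intercepts)
    strict   = scalar     + (g - 1) * p         # + equal residual variances
    return {"configural": configural, "metric": metric,
            "scalar": scalar, "residual": residual, "strict": strict}

# With three indicators and two groups, this yields the pattern shown in the table above:
print(mi_degrees_of_freedom(p=3, g=2))
# {'configural': 0, 'metric': 2, 'scalar': 4, 'residual': 5, 'strict': 7}
```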

Further reading

  • Beaujean, A. A. (2014). Latent variable modeling using R: A step-by-step guide. New York: Routledge/Taylor & Francis Group.
  • Millsap, R. E., & Olivera-Aguilar, M. (2012). Investigating measurement invariance using confirmatory factor analysis. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 380–392). New York: Guilford Press.
  • Kline, R. B. (2011). Principles and practice of structural equation modeling. New York: Guilford Press.

The Rosenberg Self-Esteem Scale – a Drosophila melanogaster of psychological assessment

I had the great chance to co-author two recent publications by Timo Gnambs, both dealing with the Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965). As a reminder, the RSES is a popular ten-item self-report instrument measuring a respondent’s global self-worth and self-respect. But both papers are not really about the RSES per se; rather, they are applications of two recently introduced, powerful and flexible extensions of the Structural Equation Modeling (SEM) framework: Meta-Analytic Structural Equation Modeling (MASEM) and Local Weighted Structural Equation Modeling (LSEM), which will be described in more detail later on.
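
Since LSEM is only mentioned in passing here, a brief sketch of its core idea, local weighting, may help: instead of splitting the sample into discrete groups along a continuous moderator (e.g., age), every observation receives a kernel weight reflecting its distance to a focal point, and the SEM is re-estimated at each focal point with these weights (in practice with specialized software such as the R package sirt). The following is only an illustration of the weighting step with arbitrary data and an arbitrary bandwidth, not a full LSEM implementation.

```python
import numpy as np

# Sketch of the local weighting idea behind LSEM (not a full LSEM implementation):
# observations are weighted by a Gaussian kernel of their distance to a focal point
# on a continuous moderator (here: age); a weighted SEM would then be fitted at
# each focal point. Data and bandwidth are arbitrary illustration values.

rng = np.random.default_rng(1)
age = rng.uniform(18, 80, size=500)      # continuous moderator

def kernel_weights(moderator, focal_point, bandwidth):
    """Gaussian kernel weights for one focal point of the moderator."""
    z = (moderator - focal_point) / bandwidth
    return np.exp(-0.5 * z ** 2)

focal_points = range(20, 81, 5)          # grid along the moderator
bandwidth = 5.0                          # arbitrary bandwidth (in years)

for fp in focal_points:
    w = kernel_weights(age, fp, bandwidth)
    # ...fit the SEM with case weights w here and collect the parameter
    # estimates to see how they change along the moderator...
    print(f"focal age {fp:2d}: sum of weights = {w.sum():.1f}")
```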

Equivalence of screen versus print reading comprehension depends on task complexity and proficiency

Reference. Lenhard, W., Schroeders, U., & Lenhard, A. (2017). Equivalence of screen versus print reading comprehension depends on task complexity and proficiency. Discourse Processes, 54(5-6), 427–445. https://doi.org/10.1080/0163853X.2017.1319653

Abstract. As reading and reading assessment become increasingly implemented on electronic devices, the question arises whether reading on screen is comparable with reading on paper. To examine potential differences, we studied reading processes on different proficiency and complexity levels. Specifically, we used data from the standardization sample of the German reading comprehension test ELFE II (n = 2,807), which assesses reading at word, sentence, and text level with separate speeded subtests. Children from grades 1 to 6 completed either a test version on paper or via computer under time constraints. In general, children in the screen condition worked faster but at the expense of accuracy. This difference was more pronounced for younger children and at the word level. Based on our results, we suggest that remedial education and interventions for younger children using computer-based approaches should likewise foster speed and accuracy in a balanced way.