
Example 1: Pilot Testing with a Linked Test Design

Objective

The example demonstrates how to determine the sample size required to estimate the item difficulties of a one-parameter item response model with a given precision. In the study, two test versions, A and B, are administered, each containing 18 items: twelve items are unique to each test version, while six items are common to both versions and link the two forms. The quantity of interest is the mean squared error (MSE) of the item difficulty estimates in the one-parameter item response model.
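The Monte Carlo logic behind such a sample-size study can be sketched in Python. The sketch below is a simplification, not the actual study procedure: it assumes hypothetical true difficulties spread over [-2, 2], treats person abilities as known when estimating each difficulty (a real analysis would fit the full model with a dedicated IRT package), and places items 0-11 on form A only, items 12-17 on both forms, and items 18-29 on form B only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def estimate_difficulty(y, theta, n_iter=25):
    """ML estimate of difficulty b in P(y = 1) = sigmoid(theta - b),
    treating the person abilities theta as known (a simplification)."""
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(theta - b)
        step = (y - p).sum() / (p * (1.0 - p)).sum()  # Newton step
        b -= np.clip(step, -1.0, 1.0)                 # damped for stability
    return b

def mse_linked_design(n_per_form, n_rep=50, seed=0):
    """Monte Carlo MSE of 30 item difficulties in the linked design:
    items 0-11 on form A only, 12-17 on both forms, 18-29 on form B only."""
    rng = np.random.default_rng(seed)
    b_true = np.linspace(-2.0, 2.0, 30)  # hypothetical true difficulties
    sq_errs = []
    for _ in range(n_rep):
        theta_a = rng.normal(size=n_per_form)
        theta_b = rng.normal(size=n_per_form)
        est = np.empty(30)
        for j in range(30):
            groups = []
            if j < 18:           # item appears on form A
                groups.append(theta_a)
            if j >= 12:          # item appears on form B (common: both)
                groups.append(theta_b)
            th = np.concatenate(groups)
            y = (rng.random(th.size) < sigmoid(th - b_true[j])).astype(float)
            est[j] = estimate_difficulty(y, th)
        sq_errs.append(np.mean((est - b_true) ** 2))
    return float(np.mean(sq_errs))

# The MSE shrinks as the number of respondents per form grows; common
# items are answered by both groups and are estimated more precisely.
print(mse_linked_design(100), mse_linked_design(400))
```

Running the function over a grid of candidate sample sizes and picking the smallest one whose MSE falls below the target precision gives the required sample size.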

Example 2: Test Validation With Randomized Item Sampling

Objective

The example describes the validation of a newly developed computerized personality test with a forced-choice response format comparable to the Eysenck Personality Inventory (e.g., “Do you prefer reading to going out?”, yes/no). The test is assumed to contain 30 items. Each respondent is administered a random sample of the personality test items, and the correlation with an external metric criterion (e.g., number of Facebook friends) is estimated. The quantity of interest is the standard error of the correlation between the latent trait and the criterion, which depends on both the sample size and the amount of missingness.
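A Monte Carlo sketch of this design is shown below. The numbers are illustrative assumptions, not values from the study: a true latent correlation of 0.4, binary items with hypothetical difficulties, and a sum score over the randomly administered item subset as the test score; the standard error is approximated as the standard deviation of the correlation estimate across replications.

```python
import numpy as np

def corr_se(n_persons, n_items_admin, n_items_total=30, rho=0.4,
            n_rep=500, seed=0):
    """Monte Carlo standard error of the correlation between a sum score
    on a random item subset and an external criterion.
    rho is the assumed true latent trait-criterion correlation."""
    rng = np.random.default_rng(seed)
    b = np.linspace(-1.5, 1.5, n_items_total)  # hypothetical difficulties
    cors = []
    for _ in range(n_rep):
        theta = rng.normal(size=n_persons)
        # criterion correlates rho with the latent trait
        crit = rho * theta + np.sqrt(1 - rho**2) * rng.normal(size=n_persons)
        # each replication administers a random subset of the items
        items = rng.choice(n_items_total, size=n_items_admin, replace=False)
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[items][None, :])))
        y = (rng.random(p.shape) < p).astype(float)
        score = y.sum(axis=1)
        cors.append(np.corrcoef(score, crit)[0, 1])
    return float(np.std(cors, ddof=1))

# More respondents shrink the standard error; administering fewer items
# (more missingness) attenuates the observed correlation and inflates it.
print(corr_se(100, 15), corr_se(400, 15))
```

Evaluating `corr_se` over a grid of sample sizes and subset sizes shows how the two design factors trade off against the target precision.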

Example 3: Conditional Reliabilities of Three Measures

Objective

The example demonstrates how to identify the sample size required to estimate the conditional reliability of a test using the graded response model (GRM; Samejima, 1969) with a given precision. It is assumed that respondents are randomly administered two out of three depression instruments, that is, the 21-item Beck Depression Inventory-II (BDI-II), the 20-item Center for Epidemiological Studies Depression Scale (CES-D), and the 9-item Patient Health Questionnaire (PHQ). These instruments are intended to screen patients for clinically relevant levels of depression. Therefore, the focus is on the measurement precision, that is, the conditional reliability, at two standard deviations above the mean.
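The target quantity can be sketched in Python using Samejima's item information formula: for an item with discrimination a and ordered thresholds b_k, the conditional reliability at ability theta is I(theta) / (I(theta) + 1) for a standard-normal trait. The item parameters below are hypothetical stand-ins for a 21-item, four-category BDI-II-like scale; a full sample-size study would additionally refit the GRM on simulated samples of varying size and track the precision of this quantity.

```python
import numpy as np

def grm_item_information(theta, a, b):
    """Fisher information of one graded-response-model item at ability theta.
    a: discrimination; b: ordered category thresholds (K - 1 values)."""
    pstar = np.concatenate((
        [1.0],
        1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b)))),
        [0.0],
    ))                                   # cumulative category probabilities
    pk = pstar[:-1] - pstar[1:]          # probabilities of the K categories
    w = pstar * (1.0 - pstar)            # P*(1 - P*) at each boundary
    return a**2 * np.sum((w[:-1] - w[1:]) ** 2 / pk)

def conditional_reliability(theta, items):
    """r(theta) = I(theta) / (I(theta) + 1), assuming theta ~ N(0, 1)."""
    info = sum(grm_item_information(theta, a, b) for a, b in items)
    return info / (info + 1.0)

# hypothetical parameters: 21 four-category items, BDI-II-like in length
items = [(1.5, [-0.5, 0.5, 1.5])] * 21
# measurement precision two standard deviations above the mean
print(conditional_reliability(2.0, items))
```

Because the screening decision concerns severely depressed patients, the reliability is evaluated at theta = 2 rather than averaged over the trait distribution.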