Problem: Practitioners working with multiple-choice tests have long used Item Response Theory (IRT) models to evaluate item performance for quality assurance. Similar applications for performance tests, however, are often hindered by complicated data sets for which local calibrations alone provide a poor model fit.
Purpose: This study investigated whether the item calibration process for a performance test, the computer-based case simulations (CCS) from the United States Medical Licensing Examination® (USMLE®) Step 3® examination, may be improved through explanatory IRT models. It was hypothesized that explanatory IRT may improve data modeling for performance assessment tests by allowing important predictors to be added to a conventional IRT model, which is limited to item predictors alone.
Methods: The responses of 767 examinees to a six-item CCS test were modeled using the Partial Credit Model (PCM) and four explanatory model extensions, each incorporating one predictor variable of interest. The predictor variables were the examinees’ gender, the order in which examinees encountered an individual item (item sequence), the time it took each examinee to respond to each item (response time), and examinees’ ability score on the multiple-choice part of the examination.
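As a rough illustration of the modeling idea, the sketch below computes PCM category probabilities for a single item and lets one predictor shift the examinee's position on the logit scale. The function name `pcm_probs`, the parameters `covariate` and `gamma`, and the additive form of the predictor effect are assumptions made for illustration only; the abstract does not state how the study parameterized its explanatory extensions.

```python
import math


def pcm_probs(theta, deltas, covariate=0.0, gamma=0.0):
    """Category probabilities for one polytomous item under the PCM.

    theta     : latent ability of the examinee
    deltas    : step difficulties delta_1..delta_m (m + 1 score categories)
    covariate : hypothetical predictor value (e.g. a standardized score)
    gamma     : hypothetical regression weight; gamma = 0 gives the plain PCM
    """
    # Assumption: the predictor enters as an additive shift on the logit scale.
    eta = theta + gamma * covariate
    # Cumulative sums of (eta - delta_j); category 0 has an empty sum of 0.
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + (eta - d))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cum)
    expd = [math.exp(c - top) for c in cum]
    total = sum(expd)
    return [e / total for e in expd]


def expected_score(probs):
    """Expected item score given category probabilities."""
    return sum(k * p for k, p in enumerate(probs))


if __name__ == "__main__":
    # Made-up step difficulties and weights, purely for demonstration.
    plain = pcm_probs(0.0, [-0.5, 0.5])
    shifted = pcm_probs(0.0, [-0.5, 0.5], covariate=1.0, gamma=0.8)
    print(expected_score(plain), expected_score(shifted))
```

With `gamma` set to zero the function reduces to the conventional PCM, so the two models can be compared on the same data by estimating the model with and without the predictor term.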
Results: The explanatory PCM that included examinees’ ability score from the multiple-choice portion of Step 3 showed a superior model fit. Explanatory IRT model extensions might prove useful in complex performance assessment settings where item calibrations are often problematic because of short tests and small samples.
Recommendations: The findings have practical value and implications for researchers working with small or complicated response data sets. Explanatory IRT methodology not only provides a way to improve data modeling for performance assessment tests but also enhances the inferences made by allowing important person predictors to be incorporated into a conventional IRT model.
Keywords: Explanatory Item Response Theory, Partial Credit Model, Item Response Theory, Performance Tests, Item Calibration, Ability Estimation, Small Tests