Problem Statement: Computerized adaptive testing (CAT) is a sophisticated and efficient way of delivering examinations. In CAT, items for each examinee are selected from an item bank based on the examinee’s responses to previously administered items. In this way, the difficulty level of the test is adjusted to the examinee’s ability level. Instead of administering very long tests, CAT can estimate examinees’ ability levels with a small number of items. A number of operational testing programs have implemented CAT during the last decade. However, CAT has not yet been applied to any operational test in Turkey, where several standardized assessments are taken by millions of people every year. Therefore, this study investigates the applicability of CAT to a high-stakes test in Turkey.
Purpose of Study: The purpose of this study is to examine the applicability of the CAT procedure to the Entrance Examination for Graduate Studies (EEGS), which is used in selecting students for graduate programs in Turkish universities.
Methods: In this study, post-hoc simulations were conducted using real responses from examinees. First, all items in EEGS were calibrated using the three-parameter item response theory (IRT) model. Then, ability estimates were obtained for all examinees. Using the item parameters and responses to EEGS, post-hoc simulations were run to estimate abilities in CAT. The Expected A Posteriori (EAP) method was used for ability estimation. The test termination rule was based on the standard error of measurement (SEM) of the estimated abilities.
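The post-hoc simulation procedure described above can be sketched as follows. This is a minimal illustration only, not the study’s actual implementation: the item bank size, parameter ranges, the 0.30 SEM cutoff, and the maximum-information item-selection rule are all illustrative assumptions, and EAP is computed here by simple numerical quadrature under a standard-normal prior.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL IRT model."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c)
    return a**2 * ((p - c) / (1 - c))**2 * ((1 - p) / p)

def eap_estimate(responses, a, b, c, grid=np.linspace(-4, 4, 81)):
    """EAP ability estimate and its SEM (posterior SD) by quadrature,
    assuming a standard-normal prior over ability."""
    prior = np.exp(-0.5 * grid**2)
    like = np.ones_like(grid)
    for resp, ai, bi, ci in zip(responses, a, b, c):
        p = p_3pl(grid, ai, bi, ci)
        like *= p if resp == 1 else (1.0 - p)
    post = prior * like
    post /= post.sum()
    theta_hat = (grid * post).sum()
    sem = np.sqrt(((grid - theta_hat) ** 2 * post).sum())
    return theta_hat, sem

# Hypothetical calibrated item bank and one examinee's recorded
# responses to the full paper-and-pencil test (post-hoc data).
n_items = 40
a = rng.uniform(0.8, 2.0, n_items)
b = rng.uniform(-2.5, 2.5, n_items)
c = rng.uniform(0.1, 0.25, n_items)
true_theta = 0.5
recorded = (rng.random(n_items) < p_3pl(true_theta, a, b, c)).astype(int)

# Post-hoc CAT loop: administer the most informative unused item at the
# current estimate, score it with the recorded response, and stop once
# the SEM falls below the (illustrative) 0.30 cutoff.
administered, responses = [], []
theta_hat, sem = 0.0, np.inf
while sem > 0.30 and len(administered) < n_items:
    unused = [i for i in range(n_items) if i not in administered]
    nxt = max(unused, key=lambda i: info_3pl(theta_hat, a[i], b[i], c[i]))
    administered.append(nxt)
    responses.append(recorded[nxt])
    theta_hat, sem = eap_estimate(responses, a[administered],
                                  b[administered], c[administered])

print(f"items used: {len(administered)}, "
      f"theta: {theta_hat:.2f}, SEM: {sem:.2f}")
```

In a study like this one, the simulated recorded responses would be replaced by each examinee’s actual EEGS answer pattern, so the CAT estimate can be compared directly against the full-length test.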
Findings and Results: The results indicated that CAT provided accurate ability estimates with fewer items compared to the paper-and-pencil format of EEGS. Correlations between ability estimates from CAT and the real administration of EEGS were found to be 0.93 or higher under all conditions. The average number of items given in CAT ranged from 9 to 22, so the number of items given to the examinees could be reduced by up to 70%. Even with a high SEM termination criterion, CAT provided very reliable ability estimates. EAP was the best among the ability estimation methods compared (e.g., MAP, MLE).
Conclusions and Recommendations: CAT can be useful in administering EEGS. With a large item bank, EEGS can be administered to examinees in a reliable and efficient way. The use of CAT can help to minimize the cost of the test, since test booklets, examinee response sheets, etc. will no longer be needed. It can also help to prevent cheating during the test.
Keywords: Computerized adaptive testing, item response theory, standardized assessment, reliability.