Purpose: One of the most common errors in grading arises when two or more different test scores are combined. This study investigated the agreement between grades calculated by weighting raw scores and grades calculated by weighting standard scores.
Research Methods: In this simulation study, data were simulated for midterm and final measurements. Nine conditions [3 (class level: poor, average, good) x 3 (standard deviation (SD) difference: 0, 10, and 20 units)] were considered. The sample size for each measurement was 60, and the number of replications was set to 100. The midterm and final measurements were weighted 40% and 60%, respectively. The students’ norm-referenced grades were calculated in two ways: (1) based on T scores of weighted raw success scores (TWRSS) and (2) based on T scores of weighted standardized success scores (TWSSS). The agreement between TWRSS and TWSSS grades was measured with simple percentage agreement, extended (±1 grade) percentage agreement, and the kappa coefficient. Agreement across conditions was compared using two-way ANOVA.
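The two grading routes and the three agreement measures can be illustrated with a minimal sketch. It is not the study's own code and rests on several assumptions that the abstract does not specify: the grade cut-offs on the T scale (`cutoffs`), the simulated means and SDs, the use of normally distributed scores, and the population form of the SD (`ddof=0`). All function names are hypothetical.

```python
import numpy as np

def t_scores(x):
    """Convert scores to T scores (mean 50, SD 10) within the group."""
    x = np.asarray(x, dtype=float)
    return 50 + 10 * (x - x.mean()) / x.std()

def twrss(midterm, final, w_mid=0.4, w_final=0.6):
    """T scores of the weighted RAW composite."""
    return t_scores(w_mid * midterm + w_final * final)

def twsss(midterm, final, w_mid=0.4, w_final=0.6):
    """T scores of the weighted STANDARDIZED (z score) composite."""
    z_mid = (midterm - midterm.mean()) / midterm.std()
    z_final = (final - final.mean()) / final.std()
    return t_scores(w_mid * z_mid + w_final * z_final)

def to_grades(t, cutoffs=(40, 50, 60, 70)):
    """Map T scores to ordinal grades 0..4. Cut-offs are an assumption."""
    return np.digitize(t, cutoffs)

def simple_agreement(g1, g2):
    """Proportion of students given the same grade by both methods."""
    return np.mean(g1 == g2)

def extended_agreement(g1, g2):
    """Proportion of students whose grades differ by at most one step."""
    return np.mean(np.abs(g1 - g2) <= 1)

def cohen_kappa(g1, g2, n_categories):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_observed = np.mean(g1 == g2)
    p1 = np.bincount(g1, minlength=n_categories) / len(g1)
    p2 = np.bincount(g2, minlength=n_categories) / len(g2)
    p_expected = np.sum(p1 * p2)
    if p_expected == 1.0:
        return float("nan")  # undefined when both raters use one category
    return (p_observed - p_expected) / (1 - p_expected)

# One replication for one hypothetical condition: n = 60, SD difference = 20.
# The means and SDs below are illustrative, not the study's values, and the
# simulated scores are not clipped to a 0-100 range.
rng = np.random.default_rng(0)
midterm = rng.normal(50, 10, 60)
final = rng.normal(50, 30, 60)

grades_raw = to_grades(twrss(midterm, final))
grades_std = to_grades(twsss(midterm, final))

simple = simple_agreement(grades_raw, grades_std)
extended = extended_agreement(grades_raw, grades_std)
kappa = cohen_kappa(grades_raw, grades_std, n_categories=5)
print(simple, extended, kappa)
```

When the two measurements have exactly equal SDs, the raw and standardized composites are linear transformations of each other, so the two sets of T scores coincide; the routes diverge only as the SDs differ.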
Findings: The main effect of SD difference on the agreement between grades was significant. Agreement was highest when the midterm and final measurements had equal SDs and lowest when the SD difference was largest.
Implications for Research and Practice: Scores should be standardized before being combined in a norm-referenced grading system. Further studies could investigate the effects of distribution shape (skewness or kurtosis) on norm-referenced grades.
Keywords: Standard scores, norm-referenced grading, agreement.