The Effects of Sample Size and Missing Data Rates on Generalizability Coefficients

May 1, 2018

Sumeyra SOYSAL¹, Haydar KARAMAN², Nuri DOGAN³
¹Corresponding author, Hacettepe University, TURKEY.
²Karamanoğlu Mehmet Bey University, TURKEY.
³Hacettepe University, TURKEY.
DOI: 10.14689/ejer.2018.75.10

ABSTRACT

Purpose of the Study: Missing data are a common problem in the administration of measurement instruments. Yet the extent to which the reliability, validity, average discrimination and difficulty of test results are affected by missing data has received little study. Because missing data inevitably affect the psychometric properties of measurement instruments, the topic was considered worth investigating. In response to this need, a simulation study was conducted on the effects of missing data on reliability. The reliability estimates were examined within the framework of generalizability theory (G theory).

Research Methods: In line with the research questions, complete data sets with different sample sizes (100, 200, 400, 1000) were generated under a normal distribution for weak and strong one-dimensional structures. Missing data sets were then created by randomly deleting data from the complete sets at different rates (5%, 10%, 20%, 30%).
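The deletion step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `make_mcar` and `code_missing_as_incorrect`, the 20-item test length, the dichotomous scoring, and the random seed are all assumptions made for illustration. Cells are removed completely at random (MCAR), and a second helper shows the "missing coded incorrect" (zero imputation) treatment named in the keywords.

```python
import numpy as np


def make_mcar(data, rate, rng):
    """Return a copy of `data` with a fraction `rate` of cells set to NaN.

    Cells are drawn without replacement from the whole matrix, so every
    cell has the same chance of being deleted (MCAR).
    """
    out = np.array(data, dtype=float)  # copy; the input is left untouched
    n_missing = int(round(rate * out.size))
    flat_idx = rng.choice(out.size, size=n_missing, replace=False)
    rows, cols = np.unravel_index(flat_idx, out.shape)
    out[rows, cols] = np.nan
    return out


def code_missing_as_incorrect(data):
    """Replace every missing cell with 0, i.e. score it as incorrect."""
    return np.where(np.isnan(data), 0.0, data)


# Hypothetical complete data: 100 persons x 20 dichotomous items.
# The item count and the 0.6 success probability are illustrative only.
rng = np.random.default_rng(2018)
complete = (rng.random((100, 20)) < 0.6).astype(float)

for rate in (0.05, 0.10, 0.20, 0.30):
    with_missing = make_mcar(complete, rate, rng)
    zero_imputed = code_missing_as_incorrect(with_missing)
    print(rate, np.isnan(with_missing).mean(), zero_imputed.mean())
```

Deleting a fixed number of cells, rather than deleting each cell independently with a given probability, keeps the realized missingness rate exactly at the target; either choice is consistent with MCAR.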

Findings and Results: When the estimates obtained from the missing and complete data sets were compared, the G and phi coefficients were found to be significantly affected for the weak one-dimensional design when missingness was 20% or more. For the strong one-dimensional design, however, those coefficients were negligibly affected even at 30% missingness. Moreover, the estimates obtained when missing responses were coded as incorrect were lower than the estimates from the missing data matrix, particularly for the weak one-dimensional data. In addition, the error statistics for the weak one-dimensional data with missing responses coded as incorrect were significantly higher than those for the strong one-dimensional data, especially at 20% and 30% missingness.
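For readers unfamiliar with the two coefficients, the following Python sketch shows how G (relative) and phi (absolute) coefficients are commonly computed for complete data. It assumes a single-facet crossed persons x items design with ANOVA-based variance components; the abstract does not state the design or estimation procedure used in the study, and the function name `g_and_phi` is hypothetical. The sketch requires a matrix without missing cells, so it applies to complete or zero-imputed data only.

```python
import numpy as np


def g_and_phi(scores):
    """Estimate G and phi for a complete persons x items score matrix.

    Assumes a crossed p x i design with one observation per cell and
    uses the expected-mean-squares (ANOVA) variance component estimates.
    """
    x = np.asarray(scores, dtype=float)
    n_p, n_i = x.shape
    grand = x.mean()

    # Sums of squares for persons, items, and the residual (p x i, e)
    ss_p = n_i * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_i = n_p * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_res = np.sum((x - grand) ** 2) - ss_p - ss_i

    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Variance components; negative estimates are truncated at zero
    var_p = max((ms_p - ms_res) / n_i, 0.0)
    var_i = max((ms_i - ms_res) / n_p, 0.0)
    var_res = ms_res

    # Relative error excludes the item main effect; absolute error includes it
    g = var_p / (var_p + var_res / n_i)
    phi = var_p / (var_p + (var_i + var_res) / n_i)
    return g, phi
```

Because absolute error variance adds the item component to the relative error variance, phi can never exceed G under this design. Coding missing responses as 0 changes the person, item, and residual components together, which is one way the two coefficients can shift relative to estimates from the complete data.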

Implications for Research and Practice: Coding missing responses as incorrect is therefore not recommended as a missing data treatment in reliability estimation. Instead, generalizability theory, which allows data matrices containing missing values to be analyzed, might be recommended.

Keywords: Reliability, G coefficient, phi coefficient, zero imputation, MCAR, generalizability theory, matrix of missing data.

Eurasian Journal of Educational Research