Test Items Analysis of Mathematical Problem Solving Ability using a Classical Test Theory Approach
), Edi Istiyono (2), Widihastuti Widihastuti (3)
(1) STKIP YPUP Makassar, Indonesia
(2) STKIP YPUP Makassar, Indonesia
(3) Yogyakarta State University, Indonesia
Corresponding Author
This study aims to analyze the item characteristics of a mathematics problem-solving ability test instrument using the Classical Test Theory model. Data were collected from the documented test results of 359 students, scored dichotomously. Qualitative content validation by experts used the panel method and was quantified with the Aiken index and the Content Validity Ratio (CVR), while empirical item validity was estimated with the point-biserial correlation. Test reliability was estimated with the Cronbach's Alpha method. The qualitative validation shows that all items correspond to the indicators of mathematical problem solving, although the technical writing of a few items needs improvement. The Aiken index and CVR show that all items are valid, with good content validity. The reliability of the test is adequate, with a coefficient of 0.83. The instrument consists of items at very difficult, difficult, and easy levels; all items were able to distinguish between test takers' abilities, and the distractors functioned properly.
Keywords: classical test theory, point-biserial correlation, validity, reliability.
DOI: http://dx.doi.org/10.23960/jpmipa/v22i1.pp98-111
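The Classical Test Theory statistics named in the abstract can be sketched in code. The following is a minimal illustration, not the authors' analysis code: the function names, the panel ratings, and the simulated dichotomous responses (359 examinees, matching the study's sample size, and a hypothetical 10 items) are all assumptions, but the formulas (Aiken's V, Lawshe's CVR, item difficulty as proportion correct, corrected item-total point-biserial correlation, and Cronbach's alpha) are the standard ones.

```python
import numpy as np

def aiken_v(ratings, lo, hi):
    """Aiken's V = sum(rating - lowest category) / (n * (c - 1))."""
    ratings = np.asarray(ratings)
    n, c = ratings.size, hi - lo + 1
    return (ratings - lo).sum() / (n * (c - 1))

def cvr(n_essential, n_panelists):
    """Lawshe's Content Validity Ratio = (ne - N/2) / (N/2)."""
    return (n_essential - n_panelists / 2) / (n_panelists / 2)

def item_difficulty(scores):
    """Proportion of correct answers per item (p-value)."""
    return scores.mean(axis=0)

def point_biserial(scores):
    """Point-biserial correlation of each item with the total score
    of the remaining items (corrected item-total correlation)."""
    n_items = scores.shape[1]
    r = np.empty(n_items)
    for i in range(n_items):
        rest = scores.sum(axis=1) - scores[:, i]
        r[i] = np.corrcoef(scores[:, i], rest)[0, 1]
    return r

def cronbach_alpha(scores):
    """Alpha = k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical dichotomous data: 359 examinees x 10 items.
rng = np.random.default_rng(0)
ability = rng.normal(size=(359, 1))
difficulty = np.linspace(-1.5, 1.5, 10)
scores = (ability > difficulty + rng.normal(size=(359, 10))).astype(int)

print("difficulty:", np.round(item_difficulty(scores), 2))
print("discrimination:", np.round(point_biserial(scores), 2))
print("alpha:", round(cronbach_alpha(scores), 2))
print("Aiken V (4 raters, 1-5 scale):", aiken_v([4, 5, 4, 5], lo=1, hi=5))
```

In practice the p-values map onto the difficulty categories reported in the study (very difficult, difficult, easy), and the point-biserial values indicate how well each item separates high- from low-ability examinees.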
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.