Measuring Critical Thinking in Physics: A Rasch Analysis of Instrument Quality and Gender Equivalence
(1) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(2) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(3) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(4) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(5) School of Education, Adelaide University, Australia
This study evaluates the quality of a Critical Thinking Skills (CTS) instrument for high school students on dynamic fluids, focusing on reliability, item validity, and respondent ability assessment using the Rasch model. The research used a quantitative approach with a descriptive design. The sample comprised 200 11th-grade science students from several high schools in Manggarai Regency, East Nusa Tenggara, Indonesia, of whom 140 were female and 60 male. Data were analyzed with the Rasch model using WINSTEPS software version 3.73. The findings indicated that the instrument exhibited generally acceptable psychometric properties, with a Cronbach's Alpha of 0.89 and a reliability of 0.93, indicating strong internal consistency and measurement stability. However, Rasch analysis revealed that approximately 10% of the items did not fit the model expectations (misfit), around 15% indicated potential gender bias based on Differential Item Functioning (DIF) analysis, and 20.5% of respondents showed misfitting response patterns. These results suggest that, while the overall reliability indices were high, certain items and response patterns require further refinement to achieve optimal measurement precision and fairness. The person reliability of 0.83 indicated that the instrument reliably differentiated between respondents of different ability levels in assessing critical thinking skills. In conclusion, this CTS instrument demonstrates acceptable overall measurement quality within the Rasch framework, although several psychometric limitations remain evident. These findings position the instrument as a preliminary yet functional assessment tool for measuring students' critical thinking skills in dynamic fluid topics, while highlighting the importance of continued empirical validation.
Future studies are encouraged to expand the range of item difficulty, re-examine items exhibiting misfit and gender-related DIF, and involve more diverse samples to enhance measurement precision, fairness, and generalizability.
Keywords: critical thinking skills, Rasch model, dynamic fluids.
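As a rough illustration of the statistics reported in the abstract, the sketch below computes Cronbach's alpha, the dichotomous Rasch success probability, and an item outfit mean-square in plain Python. This is a minimal sketch under the assumption of dichotomous (0/1) scoring; it is not the WINSTEPS procedure, and the function names are illustrative only.

```python
import math
import statistics

def cronbach_alpha(scores):
    """Internal-consistency estimate for a persons-by-items score matrix.
    scores: list of per-person lists of dichotomous (0/1) item scores."""
    k = len(scores[0])
    item_vars = [statistics.pvariance([p[i] for p in scores]) for i in range(k)]
    total_var = statistics.pvariance([sum(p) for p in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def rasch_probability(theta, b):
    """Dichotomous Rasch model: probability that a person of ability
    theta answers an item of difficulty b correctly (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def outfit_mnsq(responses, thetas, b):
    """Unweighted (outfit) mean-square for one item: the mean squared
    standardized residual over persons. Values near 1.0 fit the model;
    0.5-1.5 is a commonly cited acceptable range."""
    z2 = []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        z2.append((x - p) ** 2 / (p * (1 - p)))  # Bernoulli variance p(1 - p)
    return sum(z2) / len(z2)
```

For example, `rasch_probability(1.0, 0.0)` gives roughly 0.73: a person one logit above an item's difficulty is expected to answer it correctly about three times in four. Operational estimation of the ability and difficulty parameters themselves (e.g., by joint maximum likelihood, as in WINSTEPS) is beyond this sketch.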
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.