Measuring Critical Thinking in Physics: A Rasch Analysis of Instrument Quality and Gender Equivalence

Maria Goreti Halim(1), Duden Saepuzaman(2), Lina Aviyanti(3), Judhistira Aria Utama(4), Abu Nawas(5)


(1) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(2) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(3) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(4) Department of Physics Education, Universitas Pendidikan Indonesia, Indonesia
(5) School of Education, Adelaide University, Australia

Corresponding Author: Maria Goreti Halim


© 2026 Maria Goreti Halim, Duden Saepuzaman, Lina Aviyanti, Judhistira Aria Utama, Abu Nawas

This study evaluates the quality of a Critical Thinking Skills (CTS) instrument for high school students on dynamic fluids, focusing on reliability, item validity, and respondent ability assessment using the Rasch model. The research used a quantitative approach with a descriptive design. The sample comprised 200 11th-grade science students from several high schools in Manggarai Regency, East Nusa Tenggara, Indonesia, of whom 140 were female and 60 were male. Data were analyzed with the Rasch model using WINSTEPS software version 3.73. The findings indicated that the instrument exhibited generally acceptable psychometric properties, with a Cronbach’s Alpha of 0.89 and a reliability of 0.93, indicating strong internal consistency and measurement stability. However, the Rasch analysis revealed that approximately 10% of the items did not fit model expectations (misfit), around 15% showed potential gender bias based on Differential Item Functioning (DIF) analysis, and 20.5% of respondents displayed misfitting response patterns. These results suggest that, while the overall reliability indices were high, certain items and response patterns require further refinement to achieve optimal measurement precision and fairness. A person reliability of 0.83 indicated that the instrument reliably and accurately differentiated between respondents of varying ability in critical thinking. In conclusion, the CTS instrument demonstrates acceptable overall measurement quality within the Rasch framework, although several psychometric limitations remain. These findings position the instrument as a preliminary yet functional tool for assessing students’ critical thinking skills on dynamic fluid topics, while highlighting the need for continued empirical validation. Future studies should expand the range of item difficulty, re-examine items exhibiting misfit and gender-related DIF, and involve more diverse samples to enhance measurement precision, fairness, and generalizability.
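To make the fit diagnostics above concrete, the following is a minimal sketch of how a dichotomous Rasch model can be estimated and how infit and outfit mean-square statistics are computed to flag misfitting items. It is not the study's WINSTEPS procedure: the response matrix is simulated, joint maximum likelihood with Newton-style updates is used purely for illustration, and the sample size and item count are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 200, 20

# Simulated "true" person abilities and item difficulties (in logits).
true_theta = rng.normal(0.0, 1.0, n_persons)
true_beta = np.linspace(-2.0, 2.0, n_items)
p_true = 1.0 / (1.0 + np.exp(-(true_theta[:, None] - true_beta[None, :])))
X = (rng.random((n_persons, n_items)) < p_true).astype(float)  # 0/1 responses

# Joint maximum likelihood: alternate Newton-style updates for persons and items.
theta = np.zeros(n_persons)
beta = np.zeros(n_items)
for _ in range(50):
    P = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))
    W = P * (1.0 - P)
    theta += (X - P).sum(axis=1) / np.clip(W.sum(axis=1), 1e-9, None)
    P = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))
    W = P * (1.0 - P)
    beta -= (X - P).sum(axis=0) / np.clip(W.sum(axis=0), 1e-9, None)
    beta -= beta.mean()  # centre item difficulties to identify the scale

# Item fit: outfit is the mean squared standardized residual, infit the
# information-weighted version; values far from 1.0 (e.g. outside roughly
# 0.5-1.5) are commonly read as misfit.
P = 1.0 / (1.0 + np.exp(-(theta[:, None] - beta[None, :])))
W = P * (1.0 - P)
outfit = ((X - P) ** 2 / np.clip(W, 1e-9, None)).mean(axis=0)
infit = ((X - P) ** 2).sum(axis=0) / np.clip(W.sum(axis=0), 1e-9, None)

for i in range(n_items):
    print(f"item {i + 1:2d}: difficulty = {beta[i]:+.2f}, "
          f"infit = {infit[i]:.2f}, outfit = {outfit[i]:.2f}")
```

Gender DIF could be probed in a similar spirit by re-estimating item difficulties separately for the female and male subsamples, with person measures anchored to a common scale, and flagging items whose difficulty estimates differ substantially between groups.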

 

Keywords: critical thinking skills; Rasch model; dynamic fluids




This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.