http://dx.doi.org/10.14697/jkase.2010.30.2.275

An Analysis on Reliabilities of Scoring Methods and Rubric Ratings Number for Performance Assessments of Middle School Students' Science Investigation Activities  

Kim, Hyung-Jun (Gisan Middle School)
Yoo, June-Hee (Seoul National University)
Publication Information
Journal of The Korean Association For Science Education / v.30, no.2, 2010, pp. 275-290
Abstract
In this study, the reliabilities of the holistic and analytic scoring methods were analyzed for performance assessments of middle school students' science investigation activities. For the analytic scoring method, the reliabilities of 2-level, 3-level, and 4- to 7-level rubric ratings were compared to determine the optimal number of rating levels. Two trained raters scored four activity sheets completed by 60 students, using both scoring methods and all three kinds of rubric ratings. Internal consistency reliabilities were higher for the holistic scoring method than for the analytic scoring method, while intra-rater reliabilities were higher for the analytic scoring method than for the holistic scoring method. The internal consistency and intra-rater reliabilities of the 3-level rubric ratings showed patterns similar to those of the 4- to 7-level rubric ratings. In addition, student discrimination, item difficulties, and item-response curves showed that the 3-level rubric ratings were reliable. These results suggest that the holistic scoring method could be adopted to increase internal consistency reliability, provided that intra-rater reliability is improved through rater conferences. When the analytic scoring method is adopted, a 3-level rubric rating would be sufficient for good reliability.
Keywords
middle school student; science investigation activity; performance assessment; analytic scoring; holistic scoring; rubric ratings; reliability;
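The abstract compares internal consistency reliabilities without stating which coefficient was used. As a minimal illustration only, the sketch below assumes Cronbach's alpha, a common internal consistency coefficient, and applies it to made-up 3-level rubric scores (0, 1, 2) on four activity sheets. The function name `cronbach_alpha` and the data in `example_scores` are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a students-by-items score table.

    scores: list of rows, one row per student, each row holding that
    student's score on every item (here: every activity sheet).
    Assumption: this coefficient is used for illustration; the study's
    actual reliability coefficient is not stated in the abstract.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two students")
    # Sample variance of each item (column) across students.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each student's total score.
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 students, 4 activity sheets, 3-level rubric (0-2).
example_scores = [
    [2, 2, 1, 2],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [2, 1, 2, 2],
    [1, 0, 1, 1],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

Running the same function on scores produced under different scoring methods or rubric level counts would give coefficients that can be compared in the way the abstract describes, under the stated assumption about the coefficient.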