Search | Korea Science

Analysis of Assessment Types, Scoring Methods and Reliability of Science Performance Assessment in Middle and High School (중등학교 과학 수행평가의 평가 유형과 채점 방식 및 신뢰도 분석)

Lee, Ki-Young;An, Hui-Soo
- Journal of The Korean Association For Science Education / v.25 no.2 / pp.173-183 / 2005
In this study, we asked which assessment types and scoring methods of science performance assessment (SPA) were being used in middle and high schools, and how reliable (generalizable) the resulting SPA scores were. To answer these questions, SPA data obtained from seven schools were classified according to assessment type and scoring method. Based on this classification, we analyzed reliability by applying generalizability theory. The classification showed that the SPA types of the seven schools fell into two types: paper-pencil type and task type. The paper-pencil type consisted solely of answer (content)-restricted essay-type tests. The task type had two parts: process assessment and outcome assessment. The analysis of scoring methods showed two cases: in one, a single teacher scored all essay-type items and performance tasks; in the other, two teachers each scored the performance tasks assigned to them. No case was found in which essay-type items were divided among teachers for scoring, or in which two or more teachers cross-scored the same responses. The findings of the reliability analysis are as follows: (1) The effect of essay-type items on SPA scores was larger than that of performance tasks. (2) There were remarkable differences among the seven schools in the person-by-rater interaction effect in scoring performance tasks. (3) Most generalizability (reliability) coefficients of SPA for the seven schools were smaller than the acceptable generalizability coefficient (0.80). Therefore, the numbers of items, tasks, and raters should be increased to approach the acceptable generalizability level.
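
The generalizability coefficient mentioned in finding (3) can be illustrated with a minimal sketch. It assumes a simplified one-facet design in which every person is scored by every rater; the study's own design also involves items and tasks and is not detailed in the abstract. The function name `g_coefficient` and the sample scores are hypothetical.

```python
def g_coefficient(scores, n_raters_decision=None):
    """Relative generalizability coefficient for a crossed person x rater design.

    scores[p][r] is the score rater r gave person p (hypothetical layout).
    n_raters_decision is the number of raters assumed in the decision study;
    it defaults to the number of raters actually observed.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(scores[p][r] for p in range(n_p)) / n_p for r in range(n_r)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_person = n_r * sum((m - grand) ** 2 for m in person_means)
    ss_rater = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_resid = ss_total - ss_person - ss_rater

    ms_person = ss_person / (n_p - 1)
    ms_resid = ss_resid / ((n_p - 1) * (n_r - 1))

    # Variance components: person, and person-by-rater interaction plus error.
    var_person = max((ms_person - ms_resid) / n_r, 0.0)
    var_resid = ms_resid

    k = n_raters_decision or n_r
    return var_person / (var_person + var_resid / k)


# Hypothetical scores: 4 students, each scored by the same 2 raters.
sample = [[1, 3], [4, 2], [5, 6], [2, 2]]
print(g_coefficient(sample))                        # with the 2 observed raters
print(g_coefficient(sample, n_raters_decision=4))   # projected for 4 raters
```

Under these assumptions, raising the number of raters shrinks the error term, which is the mechanism behind the abstract's recommendation to increase the numbers of items, tasks, and raters.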

Building an Automated Scoring System for a Single English Sentences (단문형의 영작문 자동 채점 시스템 구축)

Kim, Jee-Eun;Lee, Kong-Joo;Jin, Kyung-Ae
- The KIPS Transactions:PartB / v.14B no.3 s.113 / pp.223-230 / 2007
The purpose of developing an automated scoring system for English composition is to score English sentence-writing tests and give feedback on them without human effort. This paper presents an automated system for scoring English composition whose input is a single sentence rather than an essay. Taking a single sentence as input makes it easier to compare the input with the answers supplied by human teachers and to give detailed feedback to test takers. The system was developed and tested with real test data collected from English tests given to third-grade junior high school students. Scoring a single sentence requires two steps. The first step analyzes the input sentence to detect possible errors, such as spelling errors and syntactic errors. The second step compares the input sentence with the given answer and identifies the differences as errors. The results produced by the system were then compared with those provided by human raters.
https://doi.org/10.3745/KIPSTB.2007.14-B.3.223
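
The second step described in the abstract, comparing the input sentence with the given answer, might be sketched as a token-level diff. This is only an illustration built on Python's standard `difflib`, not the authors' method; the function name `diff_against_answer` and the example sentences are hypothetical, and the first step (spelling and syntactic analysis) is not covered.

```python
import difflib


def diff_against_answer(student, answer):
    """Return the token-level differences between a student sentence and a model answer.

    Each difference is (tag, student_tokens, answer_tokens), where tag is one of
    'replace', 'delete', or 'insert' as reported by difflib.
    """
    s_tokens = student.lower().split()
    a_tokens = answer.lower().split()
    matcher = difflib.SequenceMatcher(a=s_tokens, b=a_tokens, autojunk=False)
    return [
        (tag, s_tokens[i1:i2], a_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical test item: the differences become candidate errors for feedback.
for tag, got, expected in diff_against_answer("He go to school", "He goes to school"):
    print(tag, got, expected)
```

A real system would need to compare against several acceptable answers and decide which differences count as errors; this sketch only surfaces the raw differences.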

A Study on Validity, Reliability and Practicality of a Concept Map as an Assessment Tool of Biology Concept Understandings (생물 개념 이해의 평가 도구로서 개념도의 타당도, 신뢰도 그리고 현실 적용 가능성에 대한 연구)

Cho, Jung-II;Kim, Jung
- Journal of The Korean Association For Science Education / v.22 no.2 / pp.398-409 / 2002
The purpose of this study was to investigate the validity, reliability, and practicality of the concept map as an assessment tool in the context of biology concept learning. Forty undergraduate students participated in concept mapping, and the maps were scored by preservice science teachers using one of three scoring methods: those developed by Burry-Stock, by Novak & Gowin, and by McClure & Bell. Two scorers were assigned to each scoring method. With regard to validity, two of the three methods were found to be highly valid, while Burry-Stock's scoring method showed little validity. As for the internal consistency of the methods, considerably high consistency was found between every pair of scorers, judging from the high correlation coefficients between the two scorers for each scoring method. Assessing a map took from 1.13 to 3.70 minutes on average. This shows that concept mapping could be used in school classrooms with limited time and personnel. These findings suggest that concept mapping can be an appropriate tool for assessing understanding of biology concepts.

Design and Implementation of Short-Essay Marking System by Using Semantic Kernel and WordNet (의미 커널과 워드넷을 이용한 주관식 문제 채점 시스템의 설계 및 구현)

Cho, Woo-Jin;Chu, Seung-Woo;O, Jeong-Seok;Kim, Han-Saem;Kim, Yu-Seop;Lee, Jae-Young
- Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.1027-1030 / 2005
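Existing short-answer scoring systems that apply a semantic kernel have tried to address natural language processing problems by representing, as vectors, the correlations between multiple answers and index terms extracted from a corpus. In this paper, to address the possibility of similarity calculation errors caused by the limited representation of answers and index terms in the existing system, we applied answer expansion by random extraction using a thesaurus. Noting that in descriptive short-answer assessment the vocabulary used carries more scoring weight than the context of the sentence, we sought to improve scoring performance by expanding both the examiner's and the examinee's answers into groups of synonyms and near-synonyms. First, index terms are extracted from the two answers using a morphological analyzer and then expanded into synonym and near-synonym groups using WordNet. These are composed into vectors for measuring the correlations among words using the corpus index, and a semantic kernel is applied to compute the similarity to the correct answer. When the correlation coefficients between the examiner's scores and each model's scores were calculated, the ELSA model showed the highest similarity.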
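
The idea of expanding both answers with synonyms before measuring similarity can be sketched as follows. This is a simplified illustration under stated assumptions: a tiny hand-written synonym table stands in for WordNet, whitespace splitting stands in for the morphological analyzer, and plain cosine similarity is swapped in for the semantic kernel. All names and data are hypothetical.

```python
import math
from collections import Counter

# Hypothetical synonym table standing in for a thesaurus such as WordNet.
SYNONYMS = {
    "big": {"large"},
    "large": {"big"},
}


def to_vector(text, expand=True):
    """Build a bag-of-words vector, optionally adding each token's synonyms."""
    vector = Counter()
    for token in text.lower().split():
        vector[token] += 1
        if expand:
            for synonym in SYNONYMS.get(token, ()):
                vector[synonym] += 1
    return vector


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


model_answer = "big dog"
student_answer = "large dog"
plain = cosine(to_vector(model_answer, False), to_vector(student_answer, False))
expanded = cosine(to_vector(model_answer), to_vector(student_answer))
print(plain, expanded)
```

In this toy case the expanded vectors coincide, so the similarity rises once synonyms are added, which is the effect the abstract aims for when vocabulary differs but meaning does not.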

Automated Scoring System for Korean Short-Answer Questions Using Predictability and Unanimity (기계학습 분류기의 예측확률과 만장일치를 이용한 한국어 서답형 문항 자동채점 시스템)

Cheon, Min-Ah;Kim, Chang-Hyun;Kim, Jae-Hoon;Noh, Eun-Hee;Sung, Kyung-Hee;Song, Mi-Young
- KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.527-534 / 2016
The emerging information society requires talent for creative thinking grounded in problem-solving skills and comprehensive thinking rather than simple memorization. Accordingly, the Korean curriculum has also shifted toward creative thinking by increasing the number of short-answer questions, which can gauge students' overall thinking. However, the scoring results are somewhat inconsistent because scoring short-answer questions depends on the subjective judgment of human raters. To alleviate this problem, automated scoring systems based on machine learning have been used as scoring tools overseas. Linguistically, Korean and English differ completely in sentence structure, so automated scoring systems built for English cannot be applied to Korean. In this paper, we introduce an automated scoring system for Korean short-answer questions using predictability and unanimity. We also verify the practicality of the system through the correlation between its results and those of human raters. In the experiment, the proposed system is evaluated on constructed-response items in Korean language, social studies, and science from the National Assessment of Educational Achievement. The analysis used Pearson correlation coefficients and the Kappa coefficient. The experimental results showed a strong positive correlation, with all correlation coefficients at 0.7 or higher. Thus, the scoring results of the proposed system are similar to those of human raters, and the automated scoring system is expected to be useful as a scoring tool.
https://doi.org/10.3745/KTSDE.2016.5.11.527
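
The two agreement statistics named in the abstract can be computed as in the sketch below. It assumes the Kappa coefficient refers to the unweighted Cohen's kappa for two raters, which the abstract does not specify; the function names and the sample scores are hypothetical.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical scores on a 0-2 scale for six responses.
human = [0, 1, 2, 2, 1, 0]
machine = [0, 1, 2, 1, 1, 0]
print(pearson(human, machine), cohen_kappa(human, machine))
```

Pearson correlation treats the scores as numeric and rewards linear association, while kappa treats them as categories and only counts exact matches, so reporting both gives complementary views of human-machine agreement.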

An improvement of decathlon current scoring system (10종경기 점수체계의 개선)

Lee, Jang-Taek
- Journal of the Korean Data and Information Science Society / v.21 no.6 / pp.1031-1039 / 2010
The decathlon is an athletic event consisting of ten track and field events. The events are held over two consecutive days, and the winner is determined by combined performance across all of them. Performance is measured in meters, centimeters, minutes, and seconds. However, how to convert results into points is a difficult and controversial issue. We explored the distribution of decathlon results from 1991 to 2009 using the top 200 decathlons in the Olympic Games and world championships. We conclude that results from top-level decathlon competition are normally distributed, and that the current scoring system lacks the property that performances of equal difficulty should earn equal points. A new model for evaluating decathlon scores is applied that displays uniform characteristics across all events, in order to meet the notion of all-roundness. The proposed model is uniform across events and supports self-stabilization.
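
To make the conversion problem concrete, the sketch below shows the general shape of the existing points formula, in which each event has its own constants A, B, and C. The abstract does not give the formula or the proposed model; the constants here are the commonly published values for two events and are an assumption that should be checked against the official scoring tables. The names `EVENT_PARAMS` and `points` are hypothetical.

```python
import math

# (A, B, C, is_track) per event. Assumed values as commonly published;
# verify against the official scoring tables before relying on them.
EVENT_PARAMS = {
    "100m": (25.4347, 18.0, 1.81, True),          # performance in seconds
    "long_jump": (0.14354, 220.0, 1.4, False),    # performance in centimeters
}


def points(event, performance):
    """Convert a performance into points with the A * margin**C power formula.

    For track events a lower time is better, so margin = B - performance;
    for field events a larger mark is better, so margin = performance - B.
    """
    a, b, c, is_track = EVENT_PARAMS[event]
    margin = (b - performance) if is_track else (performance - b)
    if margin <= 0:
        return 0
    return math.floor(a * margin ** c)


print(points("100m", 10.5), points("long_jump", 750))
```

Because every event uses different constants and a different exponent, equally difficult performances in two events need not earn equal points, which is the property the abstract questions.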

Effects of Consistency Criterion for Scoring on the Reliability and the Validity of Polygraph Test for Crime Suspects (범죄 용의자의 거짓말탐지검사의 신뢰도와 타당도에 대한 일관성 채점기준의 효과)

Han, Yu-Hwa;Jeong, Je-Young;Park, Kwang-Bai
- Science of Emotion and Sensibility / v.12 no.4 / pp.557-564 / 2009
For scoring polygraph charts, the Prosecutors' Office of the Republic of Korea uses a consistency criterion in which an elevated signal on one physiological channel is scored as a deceptive response only if the signal is also elevated on other channels. In the current study, the effects of this scoring criterion on reliability and accuracy (validity) of polygraph scores were assessed. Polygraph tests on 26 suspects were evaluated twice by the same examiners. The examiners used the consistency criterion in the first evaluation. In the second evaluation, the examiners were prevented from using the criterion; the signals from each physiological channel were separated and randomly arranged before they were rescored by the same examiner. Reliability was assessed by the variation among the scores for each suspect. Accuracy was assessed by establishing a standard, based on a Latent Class Analysis model, using the results of polygraph tests on each of 182 additional suspects. Reliability and accuracy were both improved by the use of the consistency criterion which therefore was recommended.

Proposal of Automated Essay Scoring Method based on Deep-Learning (딥러닝 기반의 에세이 자동 평가 방법 제안)

Kim, Yujin;Park, Chanjun;Lee, Seolhwa;Lim, HeuiSeok
- Annual Conference on Human and Language Technology / 2021.10a / pp.384-390 / 2021
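This paper proposes a new deep-learning-based evaluation methodology for the automated scoring of English essays. Automated essay evaluation is made possible through an evaluation process consisting of lexical, morphological, syntactic, and semantic stages. To verify the objectivity and reliability of the proposed method, we analyzed the correlation between human-assigned scores and the scores from each stage; the results showed that the proposed evaluation method is meaningful.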

A Grading System of Word Processor Practical Skill Using HWPML (HWPML을 이용한 워드프로세서 실기 채점 시스템)

Ha, Jin-Seok;Jin, Min
- Journal of The Korean Association of Information Education / v.7 no.1 / pp.37-47 / 2003
A grading system for practical word processor skills is designed and implemented using HWPML (Hangul Word Processor Markup Language), a product of Hangul and Computer Co Ltd. HWPML is a markup tag structure for Hangul files that allows them to be edited in other application programs. Authorized users can create questions. However, only the manager is allowed to register answers to the questions, in order to maintain grading correctness. Test results are stored in the database, and pass/fail statistics can be shown interactively. The number of tests taken and the scores for each user are stored in the database and can be accessed whenever the user wants. The manager provides comments on the test results so that learners can strengthen their weak points.

The Analysis of the Ability to Control Variables and the Types of Controlling Variables by Junior High School Students (중학생들의 변인 통제 논리력과 변인 통제 유형 분석)

Lee, Yoon-Ha;Kang, Soon-Hee
- Journal of The Korean Association For Science Education / v.31 no.1 / pp.32-47 / 2011
The purpose of this study was to analyze the ability to control variables and the ways in which variables are controlled. First, assessment criteria for evaluating the ability to control variables were developed for 8th grade students. Second, the ways in which variables are controlled were classified from student activity reports. The students' answers were categorized into six types (type A to type F). Type A is defined as the group that excelled in recognizing the importance of controlling variables, eliminating unnecessary variables, and identifying manipulated, dependent, and controlled variables. Third, scores for the ability to control variables (CV score) and scores on the classroom test of scientific reasoning (Lawson SRT) were measured. The results indicated that the CV score was highly correlated with the Lawson SRT score (r=.67, p<.01). Therefore, the assessment criteria developed in this study can be used to evaluate the ability to control variables (CV score) and to measure students' scientific reasoning.
https://doi.org/10.14697/jkase.2011.31.1.032

Search Result 68, Processing Time 0.033 seconds
