• Title/Summary/Keyword: scoring methods


An Analysis on Reliabilities of Scoring Methods and Rubric Ratings Number for Performance Assessments of Middle School Students' Science Investigation Activities (중학생 과학탐구활동 수행평가 시 채점 방식 및 척도의 수에 따른 신뢰도 분석)

  • Kim, Hyung-Jun;Yoo, June-Hee
    • Journal of The Korean Association For Science Education, v.30 no.2, pp.275-290, 2010
  • This study analyzed the reliabilities of holistic and analytic scoring methods in performance assessments of middle school students' science investigation activities. Reliabilities of 2-, 3-, and 4~7-level rubric ratings for analytic scoring were compared to determine the optimal number of rubric levels. Two trained raters rated four activity sheets from 60 students using the two scoring methods and the three kinds of rubric ratings. Internal consistency reliabilities of holistic scoring were higher than those of analytic scoring, while intra-rater reliabilities of analytic scoring were higher than those of holistic scoring. Internal consistency and intra-rater reliabilities of the 3-level rubric rating showed patterns similar to those of the 4~7-level rubric ratings, and student discrimination, item difficulties, and item-response curves showed that the 3-level rubric rating was reliable. These results suggest that the holistic scoring method could be adopted to increase internal consistency reliability, with intra-rater reliability improved through rater conferences. In addition, a 3-level rubric rating would be sufficient for good reliability when the analytic scoring method is adopted.

Analysis of Assessment Types, Scoring Methods and Reliability of Science Performance Assessment in Middle and High School (중등학교 과학 수행평가의 평가 유형과 채점 방식 및 신뢰도 분석)

  • Lee, Ki-Young;An, Hui-Soo
    • Journal of The Korean Association For Science Education, v.25 no.2, pp.173-183, 2005
  • This study asked which assessment types and scoring methods of science performance assessment (SPA) were being used in middle and high schools, and how reliable (generalizable) the resulting SPA scores were. To answer these questions, SPA data obtained from seven schools were classified by assessment type and scoring method. Based on this classification, reliability was analyzed by applying generalizability theory. The classification showed that the SPAs of the seven schools fell into two types: a paper-pencil type and a task type. The paper-pencil type consisted solely of answer (content)-restricted essay-type tests. The task type had two parts: process assessment and outcome assessment. Two scoring arrangements were found across the seven schools: in one, a single teacher scored all essay-type items and performance tasks; in the other, two teachers scored the performance tasks assigned to them. No cases were found of teachers scoring assigned essay-type items or of cross-scoring by two or more teachers. The findings of the reliability analysis are as follows: (1) The effect of essay-type items on the SPA score was larger than that of performance tasks. (2) The person-by-rater interaction effect in scoring performance tasks differed remarkably among the seven schools. (3) Most of the generalizability (reliability) coefficients of SPA for the seven schools were smaller than the acceptable generalizability coefficient (0.80). Therefore, the numbers of items, tasks, and raters should be increased to approach the acceptable generalizability level.
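
The last point of this abstract, that adding raters raises the generalizability coefficient toward 0.80, can be illustrated with a minimal sketch. It assumes a simplified single-facet persons x raters design, which is not necessarily the design used in the study, and the variance components in the usage note are hypothetical placeholders rather than values from the paper. The function names are also illustrative.

```python
def g_coefficient(var_person, var_residual, n_raters):
    """Relative generalizability coefficient for a persons x raters design.

    var_person:   variance due to true differences between persons
    var_residual: person-by-rater interaction variance (confounded with error)
    n_raters:     number of raters whose scores are averaged
    """
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    # Averaging over more raters shrinks the relative error variance.
    relative_error = var_residual / n_raters
    return var_person / (var_person + relative_error)


def min_raters_for(target, var_person, var_residual, max_raters=50):
    """Smallest number of raters reaching the target coefficient, or None."""
    for n in range(1, max_raters + 1):
        if g_coefficient(var_person, var_residual, n) >= target:
            return n
    return None
```

With hypothetical components `var_person=1.0` and `var_residual=0.9`, one rater gives a coefficient of about 0.53, and `min_raters_for(0.80, 1.0, 0.9)` reports that four raters would be needed. A design with item and task facets would add further error terms to the denominator in the same way.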

A Study on the Severity Scoring Systems of Atopic Dermatitis: Comparison, Analysis and Establishment (아토피 피부염의 평가방법에 대한 연구 : 비교 분석 및 설립)

  • 윤화정;윤정원;윤소원;고우신
    • The Journal of Korean Medicine, v.23 no.4, pp.15-26, 2002
  • There is much confusion in the field of atopic dermatitis (AD) about how best to measure disease severity objectively. We therefore aimed to establish a new, adequate scoring system for AD based on a comparison and analysis of various existing scoring systems. Methods: We searched Entrez PubMed for data on severity scoring systems for atopic dermatitis from 1990 to 2001. Results and Conclusions: 1. The properties of severity scoring systems were validity, reliability, sensitivity to change, and ease of use. 2. The essential items of severity scoring systems were extent, intensity, and subjective symptoms. 3. The surface extent of the lesions was evaluated by the percentage of involvement of each of 10 areas. 4. The severity criteria were divided into intensity and subjective symptoms. The intensity items are erythema, papulation, lichenification, oozing, dryness, excoriations, and pigmentation. The subjective symptom is pruritus, evaluated according to sleep loss. 5. The significant items of a severity scoring system were symptoms rather than areas. That is, we assumed that extent accounts for around 30% of the total score, with intensity and subjective symptoms representing 70%.


Comparative Study of Exposure Potential and Toxicity Factors used in Chemical Ranking and Scoring System (화학물질 우선순위선정 시스템에서 고려되는 노출.독성인자 비교연구)

  • An, Youn-Joo;Jeong, Seung-Woo;Kim, Min-Jin;Yang, Chang-Yong
    • Environmental Analysis Health and Toxicology, v.24 no.2, pp.95-105, 2009
  • A Chemical Ranking and Scoring (CRS) system is a useful tool for screening priority chemicals from a large body of substances. The relative ranking of chemicals produced by a CRS system has served as a decision-making support tool. Exposure potential and toxicity are significant parameters in CRS systems, and each CRS system evaluates those parameters differently. In this study, the parameters of exposure potential, human toxicity, and ecotoxicity were extensively compared. In addition, the scoring methods for each parameter were analyzed. The CRS systems considered in this study include CHEMS-1 (Chemical Hazard Evaluation for Management Strategies), SCRAM (Scoring and Ranking Assessment Model), EURAM (European Union Risk Ranking Method), ARET (Accelerated Reduction/Elimination of Toxics), and CRS-Korea. A comparative analysis of these CRS systems is presented based on their assessment parameters and scoring methods.

A Study on design of The Internet-based scoring system for constructed responses (서답형 문항의 인터넷 기반 채점시스템 설계 연구)

  • Cho, Ji-Min;Kim, Kyung-Hoon
    • The Journal of Korean Association of Computer Education, v.10 no.2, pp.89-100, 2007
  • Scoring constructed responses in large-scale assessments requires great effort and time to reduce the various types of error that arise in paper-based training and scoring. To eliminate the complexities and problems of paper-and-pencil-based training and scoring, many countries, including the U.S.A. and England, have already adopted online scoring systems. However, there have been few studies on developing a scoring system for constructed-response items in Korea. The purpose of this study is to develop the basic design of an Internet-based scoring system for constructed responses. The study proposes algorithms for assigning scorers to constructed responses, methods for monitoring reliability, and related procedures. Such a system can ensure reliable, quick scoring by monitoring scorer consistency through ongoing reliability checks and by assessing the quality of scorer decision making through frequent and varied checking procedures.


Machine Scoring Methods Highly-correlated with Human Ratings in Speech Recognizer Detecting Mispronunciation of Foreign Language (한국인의 외국어 발화오류검출 음성인식기에서 청취판단과 상관관계가 높은 기계 스코어링 기법)

  • Bae, Min-Young;Kwon, Chul-Hong
    • Speech Sciences, v.11 no.2, pp.217-226, 2004
  • An automatic pronunciation correction system provides users with correction guidelines for each pronunciation error. For this purpose, we develop a speech recognition system that automatically classifies pronunciation errors when Koreans speak a foreign language. In this paper, we propose a machine scoring method for automatic assessment of pronunciation quality by the speech recognizer. Scores obtained from an expert human listener are used as the reference to evaluate the different machine scores and to provide targets when training some of the algorithms. We use a log-likelihood score and a normalized log-likelihood score as machine scoring methods. Experimental results show that the normalized log-likelihood score had a higher correlation with human scores than the log-likelihood score did.
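
The abstract does not define the two machine scores, so the following is only a sketch under a common assumption: the log-likelihood score is the sum of per-frame acoustic log-likelihoods over a segment, and the normalized score divides that sum by the number of frames. The function names and inputs are hypothetical, and the paper's actual definitions may differ.

```python
def log_likelihood_score(frame_log_likelihoods):
    """Sum of per-frame log-likelihoods for one segment (assumed definition)."""
    return sum(frame_log_likelihoods)


def normalized_log_likelihood_score(frame_log_likelihoods):
    """Per-frame average, so the score does not grow with segment duration."""
    if not frame_log_likelihoods:
        raise ValueError("segment must contain at least one frame")
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)
```

Under this assumption, two segments with the same per-frame fit but different lengths receive different raw scores and the same normalized score, which is one plausible reason a normalized score could track human ratings more closely.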


Evaluation of Rotator Cuff Repair Using Korean Shoulder Scoring System

  • Shin, Sang-Jin;Lee, Juyeob;Ko, Young-Won;Park, Min-Gyue
    • Clinics in Shoulder and Elbow, v.18 no.4, pp.206-210, 2015
  • Background: Assessing clinical outcomes after rotator cuff repair is essential for determining the effectiveness of treatment. The Korean Shoulder and Elbow Society devised the Korean Shoulder Scoring System (KSS) for patients with rotator cuff disorder. The purpose of this study was to evaluate the availability of the KSS for assessment of clinical outcomes in patients after arthroscopic rotator cuff repair, and to compare it with other appraisal scoring systems. Methods: A total of 130 patients with partial-thickness or full-thickness rotator cuff tear who underwent arthroscopic repair using a single row or double row suture bridge technique were enrolled. The average follow-up period was 25.9 months. All patients were classified according to various factors. Comparison within corresponding categories was performed, and the correlation between the KSS and other shoulder assessment methods, including the University of California Los Angeles (UCLA), Constant, and American Shoulder and Elbow Surgeons (ASES) scores, was analyzed. Results: The total KSS score increased from 59.6 preoperatively to 88.96 at the last follow-up. All KSS domains, including function, pain, satisfaction, range of motion, and muscle power, had improved up to 24 months postoperatively. Statistical significance was observed mainly in preoperative measurements, for the number and size of torn tendons and for fatty infiltration of grade 3 or higher. The KSS was best correlated with the UCLA scoring system in both preoperative (r=0.785) and postoperative (r=0.951) measurements. Conclusions: The KSS was highly reliable and valid as a discriminative instrument, and it showed strong correlation with the ASES and UCLA scoring systems.

Scoring Methods of Polysomnography for Diagnosis of Sleep Apnea in Adolescents (청소년에서 수면 무호흡 진단을 위한 수면 다원 검사의 판독 방법)

  • Lee, Keu Sung;Sheen, Seung Soo;Lee, Il Jae;Choi, Byung-Joo;Choi, Ji Ho;Park, Do-Yang;Kim, Han Tai;Kim, Hyun Jun
    • Korean Journal of Otorhinolaryngology-Head and Neck Surgery, v.61 no.11, pp.593-599, 2018
  • Background and Objectives: Respiratory scoring guidelines for children and adults have been used for evaluating adolescents in both the 2007 and 2012 American Academy of Sleep Medicine (AASM) scoring manuals. We compared the scoring methods of polysomnography used in these scoring manuals, where pediatric and adult scoring rules were adopted for the diagnosis of sleep apnea in adolescents. Subjects and Method: A total of 106 Korean subjects aged between 13 and 18 years were enrolled. All subjects underwent overnight polysomnography in a sleep laboratory. Data were scored according to both pediatric and adult guidelines in the 2007 and 2012 AASM scoring manuals. Results: Both pediatric and adult apnea hypopnea index (AHI) values using the 2012 method were significantly higher than those using the 2007 method. The difference in AHI between pediatric and adult scores was markedly smaller with the 2012 AASM scoring system than with the 2007 method. There was a significant discordance in sleep apnea diagnosis between pediatric and adult scoring rules in the 2012 method. Conclusion: Both pediatric and adult rules were used for the diagnosis of adolescent sleep apnea in the 2012 method. However, there was significant discordance in the diagnosis between pediatric and adult scoring guidelines in the 2012 AASM manual, probably due to the different AHI cut-off values for the diagnosis of sleep apnea in pediatric (≥1) and adult (≥5) patients. Further studies are needed to determine a more reasonable cut-off value for the diagnosis of sleep apnea in adolescents.
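
The discordance described in this abstract follows from applying two different cut-offs. The sketch below assumes each subject has one AHI value scored under pediatric rules and one under adult rules, and uses the cut-offs quoted in the abstract (≥1 and ≥5). The function and constant names are illustrative, and the code is not a clinical tool.

```python
PEDIATRIC_AHI_CUTOFF = 1.0  # events per hour, from the abstract
ADULT_AHI_CUTOFF = 5.0      # events per hour, from the abstract


def has_sleep_apnea(ahi, cutoff):
    """Diagnosis is positive when the AHI meets or exceeds the cut-off."""
    return ahi >= cutoff


def is_discordant(pediatric_ahi, adult_ahi):
    """True when pediatric and adult rules disagree on the diagnosis."""
    pediatric_positive = has_sleep_apnea(pediatric_ahi, PEDIATRIC_AHI_CUTOFF)
    adult_positive = has_sleep_apnea(adult_ahi, ADULT_AHI_CUTOFF)
    return pediatric_positive != adult_positive
```

For example, a subject whose AHI is 2.0 under both rule sets is positive by the pediatric cut-off and negative by the adult one, so `is_discordant(2.0, 2.0)` is true even though the two AHI values agree.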

Estimating Methods on Exponential Regression Models with Censored Data

  • Ha, Il-Do;Lee, Youngjo;Song, Jae-Kee
    • Journal of the Korean Statistical Society, v.28 no.2, pp.195-210, 1999
  • We consider a large class of exponential regression models with censored data and propose two modified Fisher scoring methods with corresponding algorithms. The proposed methods improve on the Newton-Raphson method in estimating the model parameters. Simulated and real examples illustrate the convergence of the methods.
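
For orientation, here is a minimal sketch of plain Fisher scoring in the simplest related setting. It is not the authors' modified method and it ignores censoring: it assumes uncensored exponential survival times with rate exp(b0 + b1 * x) and a single covariate. In that case the expected information is X^T X, which stays fixed across iterations, whereas Newton-Raphson would recompute the observed information at every step. All names are illustrative.

```python
import math


def fisher_scoring_exponential(times, xs, max_iter=50, tol=1e-10):
    """Fit rate_i = exp(b0 + b1 * x_i) to uncensored exponential times."""
    n = len(times)
    # Expected (Fisher) information is X^T X for this model, with X = [1, x].
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("covariate must take at least two distinct values")
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # Score contributions: 1 - t_i * rate_i for each observation.
        resid = [1.0 - t * math.exp(b0 + b1 * x) for t, x in zip(times, xs)]
        u0 = sum(resid)
        u1 = sum(r * x for r, x in zip(resid, xs))
        # Solve the 2x2 system (X^T X) * step = score by the explicit inverse.
        step0 = (sxx * u0 - sx * u1) / det
        step1 = (n * u1 - sx * u0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1
```

With a binary covariate the maximum likelihood estimate has a closed form (each group's rate is the reciprocal of its mean time), which gives a convenient check: for times `[1.0, 3.0, 2.0]` at `x = 0` and `[0.25, 0.75, 0.5]` at `x = 1`, the fit should approach `b0 = -log(2)` and `b1 = 2 * log(2)`.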


Usefulness of the Clock Drawing Test as a Cognitive Screening Instrument for Mild Cognitive Impairment and Mild Dementia: an Evaluation Using Three Scoring Systems

  • Kim, Sangsoon;Jahng, Seungmin;Yu, Kyung-Ho;Lee, Byung-Chul;Kang, Yeonwook
    • Dementia and Neurocognitive Disorders, v.17 no.3, pp.100-109, 2018
  • Background and Purpose: Although the clock drawing test (CDT) is a widely used cognitive screening instrument, there have been inconsistent findings regarding its utility with various scoring systems in patients with mild cognitive impairment (MCI) or dementia. The present study aimed to identify whether patients with MCI or dementia exhibited impairment on the CDT using three different scoring systems, and to determine which scoring system is more useful for detecting MCI and mild dementia. Methods: Patients with amnestic mild cognitive impairment (aMCI), vascular mild cognitive impairment (VaMCI), mild Alzheimer's disease (AD), mild vascular dementia (VaD), and cognitively normal older adults (CN) were included. All participants were administered the CDT, the Korean-Mini Mental State Examination (K-MMSE), and the Clinical Dementia Rating scale. The CDT was scored using the 3-, 5-, and 15-point scoring systems. Results: On all three scoring systems, all patient groups demonstrated significantly lower scores than the CN. However, while there were no significant differences among patients with aMCI, VaMCI, and AD, those with VaD exhibited the lowest scores. Area under the Receiver Operating Characteristic curves revealed that the three CDT scoring systems were comparable with the K-MMSE in differentiating aMCI, VaMCI, and VaD from CN. In differentiating AD from CN, however, the CDT using the 15-point scoring system demonstrated the most comparable discriminability with K-MMSE. Conclusions: The results demonstrated that the CDT is a useful cognitive screening tool that is comparable with the Mini-Mental State Examination, and that simple CDT scoring systems are sufficient for differentiating patients with MCI and mild dementia from CN.