• Title/Summary/Keyword: Inter-Rater reliability


The Development of Assessment Tool on Student's Character Competence based on Collaborative Problem-Solving Instruction Model (협력적 문제해결 중심 교수모델에 기반 한 학생 인성 역량 평가 도구 개발)

  • Jeon, RanYeong;Kim, HeeHwa;Nam, Jeonghee;Kang, EuGene;Son, Jeongwoo;Park, Jongseok
    • Journal of The Korean Association For Science Education / v.38 no.3 / pp.419-430 / 2018
  • The purpose of this study was to develop an assessment tool to evaluate student character competence when applying a Collaborative Problem-Solving Instruction Model in science education. Through literature analysis, nine elements of character were extracted: openness, empathy, tolerance, caring, integrity, self-regulation, honesty, responsibility, and cooperation. Based on existing measures of character competence, experts discussed and developed items for evaluating a student's character competence under the Collaborative Problem-Solving Instruction Model. The initial 88 preliminary items were reviewed, corrected, and supplemented based on the results of the first survey. A second validity survey was conducted with 71 middle and high school science teachers to determine the content validity of the items. Inter-rater reliability was calculated across the assessors to verify the reliability of the items. Overall, the inter-rater reliability and content validity of the assessment items were good, and 53 items were ultimately selected based on the analysis results. The assessment tool developed in this study could be used to explore changes in student character competence through a Collaborative Problem-Solving Instruction Model, as well as to evaluate student character competence in science education.

Rating criteria to evaluate student performance in digital wax-up training using multi-purpose software

  • Mino, Takuya;Kurosaki, Yoko;Tokumoto, Kana;Higuchi, Takaharu;Nakanoda, Shinichi;Numoto, Ken;Tosa, Ikue;Kimura-Ono, Aya;Maekawa, Kenji;Kim, Tae Hyung;Kuboki, Takuo
    • The Journal of Advanced Prosthodontics / v.14 no.4 / pp.203-211 / 2022
  • PURPOSE. The aim of this study was to introduce rating criteria to evaluate student performance in a newly developed, digital wax-up preclinical program for computer-aided design (CAD) of full-coverage crowns and to preliminarily investigate the reliability and internal consistency of the rating system. MATERIALS AND METHODS. This study, conducted in 2017, enrolled 47 fifth-year dental students of Okayama University Dental School. Digital wax-up training included a fundamental practice using computer graphics (CG), multipurpose CAD software programs, and an advanced practice to execute a digital wax-up of the right mandibular second molar (#47). Each student's digital wax-up work (stereolithography data) was evaluated by two instructors using seven qualitative criteria. The total qualitative score (0-90) of the criteria was calculated. The total volumetric discrepancy between each student's digital wax-up work and a reference prepared by an instructor was automatically measured by the CAD software. The inter-rater reliability of each criterion was analyzed using a weighted kappa index. The relationship between the total volumetric discrepancy and the total qualitative score was analyzed using Spearman's correlation. RESULTS. The weighted kappa values for the seven qualitative criteria ranged from 0.62 to 0.93. The total qualitative score and the total volumetric discrepancy were negatively correlated (ρ = -0.27, P = .09); however, this was not statistically significant. CONCLUSION. The established qualitative criteria to evaluate students' work showed sufficiently high inter-rater reliability; however, the digitally measured volumetric discrepancy could not sufficiently predict the total qualitative score.
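The weighted kappa index mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not say whether linear or quadratic weights were used, so the `weights` parameter, the function name `weighted_kappa`, and the example scores are all assumptions made for illustration, not the authors' procedure or data.

```python
def weighted_kappa(r1, r2, categories, weights="linear"):
    """Cohen's weighted kappa for two raters over ordered categories.

    r1, r2     : equal-length lists of ratings from the two raters
    categories : the ordered list of possible rating values
    weights    : "linear" or "quadratic" disagreement weights (assumed option)
    """
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(r1)

    # Observed joint proportions of (rater 1 category, rater 2 category).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += w * obs[i][j]
            expected_disagreement += w * row[i] * col[j]

    # Undefined (division by zero) if both raters use a single category only.
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical scores from two instructors on a 0-2 ordinal criterion.
instructor_a = [0, 0, 1, 1, 2, 2]
instructor_b = [0, 1, 1, 1, 2, 1]
print(weighted_kappa(instructor_a, instructor_b, categories=[0, 1, 2]))
```

Unlike the unweighted statistic, this version penalizes a disagreement of two scale steps more heavily than a disagreement of one step, which is why it suits ordinal rating criteria.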

Reliability and Validity of the Postural Balance Application Program Using the Movement Accelerometer Principles in Healthy Young Adults

  • Park, Seong-Doo;Kim, Ji-Seon;Kim, Suhn-Yeop
    • Physical Therapy Korea / v.20 no.2 / pp.52-59 / 2013
  • The purpose of this study was to determine the reliability and validity of a postural balance program based on movement accelerometer principles, using posture balance training and evaluation equipment and a smartphone movement accelerometer program (SMAP), in healthy young adults. A total of 34 healthy young adults were recruited as subjects. Postural balance was evaluated using the Biodex stability system (BSS) and SMAP. For test-retest reliability, SMAP showed intra-class correlation coefficients (ICC) of .62~.91 and a standard error of measurement (SEM) of .01~.08, while BSS showed moderate to high reliability, with ICC of .88~.93 and SEM of .02~.20. For inter-rater reliability, SMAP showed moderate reliability (ICC: .59~.73) in eyes open stability all (EOSA), eyes open stability anterior posterior (EOSAP), eyes open stability medial lateral (EOSML), eyes open dynamic all (EODA), eyes open dynamic anterior posterior (EODAP), and eyes open dynamic medial lateral (EODML); in the other movements, the ICC was below .59. BSS showed high reliability (ICC: .70~.75), with ICC below .59 in the other movements. In the correlation of balance by posture between SMAP and BSS, EOSML (r=.62), EODA (r=.75), EODML (r=.72), ECDAP (r=.64), and ECDML (r=.69) were significantly correlated (p<.05), whereas the correlations in the other movements were not significant. Therefore, SMAP and BSS can be useful for assessing postural balance under static and dynamic conditions with eyes open and closed.

Multiple Average Ratings of Auditory Perceptual Analysis for Dysphonia

  • Choi, Seong-Hee;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.1 no.4 / pp.165-170 / 2009
  • This study compared a single rating with average ratings from multiple presentations of the same stimulus for measuring the voice quality of dysphonia using a 7-point equal-appearing interval (EAI) rating scale. Overall severity of voice quality for 46 /a/ vowel stimuli (23 stimuli from dysphonia, 23 stimuli from control) was rated by 3 experienced speech-language pathologists (averaging 19 years of experience; range = 7 to 40 years). For average ratings, each stimulus was rated five times in random order, and averages were computed over two to five ratings. Although inter-rater reliability was higher for average ratings than for a single rating, there were no significant differences in rating scores between single and multiple average ratings judged by experienced listeners, suggesting that auditory perceptual ratings by well-trained listeners agree relatively well across repeated judgments of the same stimulus. Larger variations in perceptual ratings were observed for moderate voices than for mild or severe voices, even in the average ratings.


Inter-Rater Reliability of the Gross Motor Performance Measure (대동작 운동 수행능력 측정 도구의 측정자간 신뢰도)

  • Yi, Chung-Hwi;Park, So-Yeon;Ko, Myung-Suk
    • Physical Therapy Korea / v.10 no.4 / pp.17-22 / 2003
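  • The Gross Motor Performance Measure (GMPM) is a tool developed to evaluate the quality of movement in children with cerebral palsy. The purpose of this study was to examine the inter-rater reliability of the GMPM. The GMPM was administered to 10 children with cerebral palsy (mean age 5.6 years, range 4~8 years). The assessments were video-recorded, and agreement among three raters was examined for each attribute item using the intraclass correlation coefficient. Overall, inter-rater reliability fell in the 'poor to fair' range. These results indicate that assessments made without sufficient training are difficult to trust. Further research is needed to examine how inter-rater reliability changes when the GMPM is used for assessment in clinical practice.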


Grammatical Structure Oriented Automated Approach for Surface Knowledge Extraction from Open Domain Unstructured Text

  • Tissera, Muditha;Weerasinghe, Ruvan
    • Journal of information and communication convergence engineering / v.20 no.2 / pp.113-124 / 2022
  • News in the form of web data generates increasingly large amounts of information as unstructured text. The capability of understanding the meaning of news is limited to humans, which causes information overload and hinders the effective use of the knowledge embedded in such texts. Therefore, Automatic Knowledge Extraction (AKE) has become an integral part of the Semantic Web and Natural Language Processing (NLP). Although recent literature shows that AKE has progressed, the results still fall short of expectations. This study proposes a method to auto-extract surface knowledge from English news into a machine-interpretable semantic format (triple). The proposed technique was designed using the grammatical structure of the sentence, and 11 original rules were discovered. The initial experiment extracted triples from the Sri Lankan news corpus, of which 83.5% were meaningful. The experiment was extended to the British Broadcasting Corporation (BBC) news dataset to demonstrate its generic nature, and yielded a higher meaningful triple extraction rate of 92.6%. These results were validated using the inter-rater agreement method, which indicated high reliability.

Reliability of Measured Popliteal Angle by Traditional and Stabilized Active-Knee-Extension Test

  • Kim, Min-Hee;Kim, Yong-Wook;Jung, Doh-Heon;Yi, Chung-Hwi
    • Physical Therapy Korea / v.16 no.4 / pp.1-7 / 2009
  • The active-knee-extension (AKE) test has been used to measure hamstring muscle length. The traditional AKE test measures the popliteal angle to the point of resistance with a 90-degree flexion of the hip fixed by straps, while the stabilized AKE test measures the popliteal angle to the point of resistance with a 90-degree flexion of the hip stabilized using a pressure biofeedback unit providing lumbopelvic stabilization. The purpose of this study was to determine the test-retest reliability of the traditional AKE test and the stabilized AKE test. Twenty healthy adults participated in the study. The popliteal angles were measured with a digital inclinometer during each test. To assess the test-retest reliability between the 2 test sessions, intraclass correlation coefficients (ICCs) were calculated. The intrasubject coefficient of variation ($CV_{intra}$) was also calculated. To compare the traditional and stabilized AKE tests for changes in pressure, paired t-tests were applied. The results of this study were as follows: 1) The ICC(3,1) value for test-retest reliability was .96 in the traditional AKE test and .98 in the stabilized AKE test. 2) The maximal $CV_{intra}$ was 33.7% in the traditional AKE test and 15.7% in the stabilized AKE test. 3) Differences of 6.1 ± 2.1 mmHg in pressure were measured in the traditional AKE test, and differences of 1.2 ± 1.0 mmHg in pressure were measured in the stabilized AKE test. The results show that both the traditional and the stabilized AKE tests have high test-retest reliability. However, the stabilized AKE test showed less variation and more stabilization than the traditional AKE test. Further study is needed to measure the inter-rater reliability of the stabilized AKE test for generalization and clinical application.
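The ICC(3,1) reported in this abstract is conventionally computed from a two-way ANOVA decomposition as (BMS - EMS) / (BMS + (k - 1) EMS), where BMS is the between-subjects mean square, EMS the residual mean square, and k the number of sessions or raters. The following Python sketch assumes that standard formula; the function name `icc_3_1` and the example table are illustrative and are not taken from the study.

```python
def icc_3_1(data):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    data : list of rows, one per subject, each holding k measurements
           (for example one value per test session or per rater).
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # number of sessions or raters

    grand_mean = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_sessions = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_sessions

    bms = ss_subjects / (n - 1)               # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))      # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


# Illustrative table: 6 subjects measured 4 times (values are not study data).
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_3_1(scores), 3))
```

Because the session effect is removed before the ratio is formed, a constant shift between sessions does not lower ICC(3,1); it measures consistency rather than absolute agreement.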


Analysis of Clinical Indicators related to Pattern-Identification in Acute Cerebral Infarction Patient (급성기 뇌경색 환자에 있어 변증형별 유의한 임상지표의 분석)

  • Lee, Eun-chan;Hyun, Sang-ho;Kwak, Seung-hyuk;Woo, Su-kyung;Park, Ju-young;Jung, Woo-sang;Moon, Sang-kwan;Cho, Ki-ho;Park, Sung-wook;Ko, Chang-nam
    • The Journal of the Society of Stroke on Korean Medicine / v.13 no.1 / pp.33-42 / 2012
  • Objective: The aim of this study was to assess the clinical indicators related to Pattern-Identification (PI) in acute cerebral infarction patients. Methods: We studied hospitalized patients within 30 days after ictus who were admitted to the Korean Medicine Center of Kyung-Hee University from January 2010 to October 2012 (n=290). Two Traditional Korean Medicine (TKM) physicians evaluated the patients independently and diagnosed PI. Inter-rater reliability was measured using simple percentage agreement and Cohen's kappa (κ) coefficient. To assess the clinical indicators closely related to each PI, we analysed the average score of each indicator in each group. Results: Simple percentage agreement of PI between raters was 64.83% and Cohen's kappa (κ) coefficient was 0.526 (95% CI: 0.451-0.600). The inter-rater reliability level was fair to good. We analysed the clinical indicators in each group. Significant indicators for Fire-Heat Pattern (FHP) were reddened complexion and strong pulse power, and meaningful indicators for FHP were halitosis and thick tongue fur. The significant indicator for Dampness-Phlegm Pattern (DPP) was overweight, and there was no meaningful indicator. The significant indicator for Yin-Deficiency Pattern (YDP) was dry tongue fur, and the meaningful indicator for YDP was thirst. There was no significant indicator for Qi-Deficiency Pattern (QDP); pale complexion and faint low voice were meaningful indicators for QDP. Conclusions: This study reveals the significant and meaningful clinical indicators related to each Pattern-Identification in acute cerebral infarction patients. It will contribute to the standardization of Korean Medical Diagnosis and Treatment in acute cerebral infarction patients.
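The two agreement statistics named in this abstract differ in that Cohen's kappa corrects the simple percentage agreement for the agreement expected by chance, κ = (p_o - p_e) / (1 - p_e). A minimal Python sketch follows; the function names and the short lists of diagnoses are hypothetical and do not reproduce the study's 290 patients.

```python
from collections import Counter


def percent_agreement(r1, r2):
    """Share of cases in which both raters chose the same category."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohens_kappa(r1, r2):
    """Unweighted Cohen's kappa for two raters and nominal categories."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every category either rater used.
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / (n * n)

    # Undefined (division by zero) when p_e equals 1.
    return (p_o - p_e) / (1 - p_e)


# Hypothetical pattern diagnoses from two physicians for eight patients.
physician_1 = ["FHP", "FHP", "DPP", "DPP", "YDP", "QDP", "FHP", "DPP"]
physician_2 = ["FHP", "DPP", "DPP", "DPP", "YDP", "QDP", "FHP", "YDP"]
print(percent_agreement(physician_1, physician_2))
print(cohens_kappa(physician_1, physician_2))
```

Kappa is lower than raw agreement whenever some matches would be expected by chance alone, which is why a percentage agreement in the mid 60s can correspond to a kappa near 0.5.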


Real Time Versus Photographic Assessment of Stool Consistency Using the Brussels Infant and Toddler Stool Scale: Are They Telling Us the Same?

  • Aman, Berthold Albert;Levy, Elvira Ingrid;Hofman, Benjamine;Vandenplas, Yvan;Huysentruyt, Koen
    • Pediatric Gastroenterology, Hepatology & Nutrition / v.24 no.1 / pp.38-44 / 2021
  • Purpose: Digital communication is becoming increasingly important in clinical practice and research. The finding that stool consistency can be evaluated similarly using either "in vivo" or photographic material by health care professionals will decrease subjective interpretation by parents. The primary outcome of this study was the reliability of stool consistency scoring using the Brussels Infant and Toddler Stool Scale (BITSS) between fresh stools and their photos; the secondary outcome was the inter-rater reliability based on the fresh stools. Methods: Fresh stool samples from healthy children were collected in a day care center. These stools, and one month later the corresponding photos presented in a random order, were presented to 14 observers. Reliabilities were analyzed using absolute agreements and weighted and unweighted Cohen's κ. Results: In total, 202 samples were rated 576 times. Absolute agreement between photographic and real time assessment ranged between 71.1% and 83.3% among observers. This corresponded with substantial agreement (unweighted κ=0.70 [95% CI, 0.61-0.78]; weighted κ=0.86 [95% CI, 0.78-0.88]). The inter-observer agreement showed similar percentages of absolute agreement (81.4-82.0%) and κ-values corresponding with fair-to-moderate agreement. Conclusion: Our findings suggest that the assessment of fresh stool consistency can also reliably be done on photographic material when using the BITSS. This opens opportunities in scientific surroundings and in our daily life communication with parents and caretakers.

The Reliability and Validity of Patient-Generated Subjective Global Assessment (PG-SGA) in Stroke Patients (뇌졸중 환자에서 '환자 주도적 총체적 영양사정' 도구의 신뢰도 및 타당도 평가)

  • Yoo, Sung-Hee;Oh, Eui-Guem;Youn, Mi-Jung
    • Korean Journal of Adult Nursing / v.21 no.6 / pp.559-569 / 2009
  • Purpose: This study examined the reliability and validity of the Patient-Generated Subjective Global Assessment (PG-SGA) as a nutritional measurement for stroke patients. Methods: This was a methodological study performed from May 6 to June 10, 2009 at a tertiary university hospital in Seoul. The reliability of the PG-SGA was assessed as inter-rater reliability. For concurrent validity, BMI and biomarkers were compared between PG-SGA 0 ~ 8 and ≥ 9. In addition, the sensitivity, specificity, and predictive value of the PG-SGA compared with the SGA were calculated using a contingency table. For predictive validity, hospital days, complications, and readmission within 1 month after discharge were compared between PG-SGA 0 ~ 8 and ≥ 9. Results: The correlation of PG-SGA scores between two observers was 0.83, and the kappa value for agreement on severe malnutrition was 0.78 (all ps < .001). The scored PG-SGA showed high sensitivity and specificity (100% and 96.7%, respectively). Severely undernourished patients (PG-SGA ≥ 9) had significantly lower TLC, protein, albumin, and prealbumin (all ps < .01) than non-undernourished patients (PG-SGA 0 ~ 8). In severely undernourished patients, complications and readmission were also more frequent (all ps = 0.01), and hospital stays were significantly longer (p = .013). Conclusion: The PG-SGA is a reliable and valid measurement for assessing nutritional status in stroke patients.
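Sensitivity, specificity, and predictive values of the kind reported here are read off a 2x2 contingency table that crosses the index test (PG-SGA) with the reference (SGA). The Python sketch below shows the arithmetic under that assumption; the function name `diagnostic_metrics` and the cell counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of index test versus reference.

    tp : index test positive, reference positive
    fp : index test positive, reference negative
    fn : index test negative, reference positive
    tn : index test negative, reference negative
    """
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly ruled out
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for 100 patients.
metrics = diagnostic_metrics(tp=18, fp=3, fn=2, tn=77)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test against the reference, while the predictive values also depend on how common the condition is in the sample.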
