• Title/Summary/Keyword: Cohen's Kappa value


A New Measure of Agreement to Resolve the Two Paradoxes of Cohen's Kappa (COHEN의 합치도의 두 가지 역설을 해결하기 위한 새로운 합치도의 제안)

  • Park, Mi-Hee;Park, Yong-Gyu
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.117-132 / 2007
  • In a $2\times2$ table showing binary agreement between two raters, Cohen's $\kappa$, a chance-corrected measure of agreement, is known to exhibit two paradoxes. $\kappa$ is highly sensitive to the raters' classification probabilities (marginal probabilities) and does not satisfy the conditions expected of a chance-corrected measure of agreement. However, $\kappa$ and other established measures take reasonable and similar values when each marginal distribution is close to 0.5. The objectives of this paper are to present a new measure of agreement, H, which resolves the paradoxes of $\kappa$ by adjusting for unbalanced marginal distributions, and to compare the proposed measure with established measures through examples.
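The sensitivity of $\kappa$ to the marginal distributions can be illustrated with a minimal sketch. The function name `cohen_kappa_2x2` and both tables are hypothetical and are not taken from the paper; the two tables are chosen only so that they share the same observed agreement (0.85) while differing in how balanced the marginals are.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa for a 2x2 table [[a, b], [c, d]].

    Rows are rater 1's categories, columns are rater 2's categories.
    """
    n = a + b + c + d
    p_o = (a + d) / n  # observed agreement
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over both categories.
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical counts; both tables have observed agreement 0.85.
balanced = cohen_kappa_2x2(40, 9, 6, 45)    # marginals close to 0.5
unbalanced = cohen_kappa_2x2(80, 10, 5, 5)  # marginals far from 0.5
print(round(balanced, 3), round(unbalanced, 3))
```

With these illustrative counts, $\kappa$ is about 0.70 for the balanced table and about 0.32 for the unbalanced one, even though the raters agree on 85% of cases in both.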

Comparison between denture wearer's evaluation and clinician's rating for complete denture (총의치 사용에 대한 환자와 술자간 평가 비교)

  • Byun, Jin-Soo;Huh, Yoon-Hyuk;Cho, Lee-La;Park, Chan-Jin
    • The Journal of Korean Academy of Prosthodontics / v.54 no.4 / pp.364-369 / 2016
  • Purpose: The aim of this study was to compare denture wearers' evaluations with clinicians' technical ratings of complete dentures used by edentulous patients. Materials and methods: A total of 43 edentulous patients whose complete dentures had been fabricated more than one year earlier were recalled. A questionnaire based on the literature was modified and administered to patients for subjective assessment. Functional aspects related to retention, stability, occlusion and denture condition were included in the operator's evaluation. In addition, correlations between the patients' subjective assessments and the operator's objective assessments were evaluated. The Friedman test and Cohen's kappa value were used for statistical analysis. Results: Denture wearers' evaluations showed slight to fair agreement with the clinician's ratings of the complete dentures. More differences were found for maxillary dentures than for mandibular dentures, and moderate differences were found in the esthetic and occlusion aspects. Conclusion: There was slight to fair agreement between the subjective and objective evaluations.

Reliability and Validity of the Side-lying Instability and Prone Instability Tests in Patients with Lumbar Segmental Instability

  • Kim, Bo-Eon;Lee, Kwan-Woo;Park, Dae-Sung
    • Journal of the Korean Society of Physical Medicine / v.16 no.1 / pp.1-7 / 2021
  • PURPOSE: The purpose of this study was to test the inter-rater and intra-rater reliability of the prone instability test (PIT) and side-lying instability test (SIT) in patients with low back pain (LBP). For validity, we analyzed correlations with the Korean version of the Oswestry Disability Index (K-ODI) and radiographic findings (RF). METHODS: Individuals (n = 51; mean age 40.27 ± 13.28 years) with LBP lasting more than a week were recruited, together with two physical therapist examiners. The measurements consisted of PIT, SIT, K-ODI, and RF. Sensitivity (Sn), specificity (Sp), positive predictive value, negative predictive value, prevalence index, agreement %, Cohen's kappa, and prevalence-adjusted bias-adjusted kappa (PABAK) were calculated. The PIT and SIT were compared with RF for the validity analysis, while PIT, SIT, K-ODI, and RF were used in the correlation analysis. RESULTS: The intra-rater reliability measured for the PIT (kappa = .79, PABAK = .88) and SIT (kappa = .73, PABAK = .84), and the inter-rater reliability measured for the SIT (kappa = .80, PABAK = .88), showed good agreement. The validities of the PIT (Sn = .65, Sp = .63) and SIT (Sn = .68, Sp = .70) were assessed against RF, and significant correlations were found between PIT and RF (r = .69), SIT and RF (r = .73), and PIT and K-ODI (r = .53). CONCLUSION: The SIT places patients in a more comfortable test position than the PIT. Both PIT and SIT have acceptable reliability and validity.
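For readers unfamiliar with the indices listed in the methods, the following is a minimal sketch of how percent agreement, prevalence index, bias index, and PABAK are commonly computed for a binary 2x2 agreement table. The function name `agreement_indices` and the counts are hypothetical, and the absolute-value form of the prevalence and bias indices is an assumption; the abstract does not state which definitions the authors used.

```python
def agreement_indices(a, b, c, d):
    """Agreement indices for a 2x2 table [[a, b], [c, d]].

    a and d are the concordant cells (both positive, both negative);
    b and c are the discordant cells.
    """
    n = a + b + c + d
    p_o = (a + d) / n  # observed agreement
    return {
        "percent_agreement": 100 * p_o,
        "prevalence_index": abs(a - d) / n,  # imbalance between concordant cells
        "bias_index": abs(b - c) / n,        # imbalance between discordant cells
        "pabak": 2 * p_o - 1,                # kappa with chance agreement fixed at 0.5
    }


# Hypothetical counts, not data from the study.
idx = agreement_indices(80, 10, 5, 5)
print(idx)
```

Under this binary definition, PABAK depends only on observed agreement, so a PABAK of .88 would correspond to an observed agreement of (0.88 + 1) / 2 = 0.94.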

Reliability of Modified Ashworth Scale Using a Haptic Robot Finger Simulating Finger Spasticity (손가락 경직을 모사하는 로봇 시뮬레이터를 이용한 경직도 검진의 신뢰도 평가)

  • Ha, Dokyeong;Park, Hyung-Soon
    • Transactions of the Korean Society of Mechanical Engineers B / v.41 no.2 / pp.125-133 / 2017
  • This paper presents the inter-rater reliability of finger spasticity assessment, tested using a finger simulator that mimics the finger spasticity of patients after a stroke. To control the simulator torque, finger spasticity was modeled, and the model parameters were obtained by measuring quantitative data while grading based on the Modified Ashworth Scale (MAS). A robotic finger simulator was designed to mimic finger spasticity. Evaluation of this simulator with the help of seven rehabilitation doctors showed a Cohen's kappa value of 0.619 for the metacarpophalangeal joint and 0.514 for the proximal interphalangeal joint. Fleiss' kappa between raters was 0.513 for the metacarpophalangeal joint and 0.486 for the proximal interphalangeal joint. Therefore, spasticity assessment using the MAS grading system is not reliable, owing to the subjectivity of the assessment. The proposed robotic simulator can be used as a training tool for improving the reliability of spasticity assessment.
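Fleiss' kappa extends chance-corrected agreement to more than two raters. The sketch below shows the standard calculation from a subjects-by-categories count matrix; the function name `fleiss_kappa` and the ratings are hypothetical (seven raters and three grades are used purely for illustration) and do not reproduce the paper's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must have the same number of ratings.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number of ratings")

    # Per-subject agreement: share of rater pairs that agree.
    p_i = [
        (sum(x * x for x in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall category proportions.
    total = n_subjects * n_raters
    n_categories = len(counts[0])
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1 - p_e)


# Hypothetical ratings: 4 subjects, 7 raters, 3 grades.
ratings = [[7, 0, 0], [0, 7, 0], [3, 4, 0], [0, 2, 5]]
print(round(fleiss_kappa(ratings), 3))
```

Note that the formula is undefined when every rating falls in a single category, because chance agreement is then 1.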

The Value of Periapical Radiograph in the Diagnosis of Interproximal Caries (구내방사선사진의 인접면 치아우식 진단에 대한 유용성 평가)

  • Kim Young-Hee;Kang Byung-Cheol
    • Imaging Science in Dentistry / v.30 no.1 / pp.49-54 / 2000
  • Purpose: To compare the diagnostic performance of clinical and radiologic examination for interproximal caries on intraoral periapical radiographs and to evaluate the value of periapical radiographs. Methods: One hundred seven dental patients were examined clinically, with a mouth mirror and an explorer, by a dentist at the department of oral medicine, and the presence or absence of interproximal caries lesions was recorded. The patients were prescribed one or more dental periapical radiographs. The radiographs were assessed for the presence of interproximal caries by three oral and maxillofacial radiologists independently. Two thousand sixty interproximal surfaces were included in this study. The diagnostic accuracies of clinical and radiologic examinations for interproximal caries were calculated. To assess the degree of agreement between clinical and radiologic examinations, Cohen's coefficient of agreement was computed. Results: The specificities of clinical and radiologic examination were 0.991 and 0.997, and the sensitivities were 0.279 and 0.985, respectively. The diagnostic accuracy of radiologic examination was statistically significantly higher than that of clinical examination (P<0.05). Cohen's kappa values for clinical and radiologic examination were 0.335 and 0.942, respectively. These results suggested that clinical examination shows only fair agreement, whereas radiologic examination shows perfect agreement. Conclusion: The diagnostic performance of dental periapical radiographs for interproximal caries was higher than that of clinical examination; thus, this study showed the validity of periapical radiographs for detecting interproximal caries lesions without bitewing radiographs.
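The pattern reported here, high specificity combined with low sensitivity, can be made concrete with a short sketch of the standard definitions. The function name `diagnostic_accuracy` and the counts are hypothetical; they are chosen only to mimic a test that rarely raises false alarms but misses most lesions, and they are not the study's data.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity and overall accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased surfaces detected
        "specificity": tn / (tn + fp),  # share of sound surfaces cleared
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts: few true lesions, most of them missed.
m = diagnostic_accuracy(tp=30, fp=5, fn=70, tn=895)
print(m)
```

Because sound surfaces dominate in this illustration, overall accuracy stays above 0.9 even though only 30% of lesions are detected, which is one reason sensitivity and specificity are reported separately.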


Validity of Self-reported Smoking Using Urinary Cotinine among Vocational High School Students

  • Park, Soon-Woo;Kim, Jong-Yeon
    • Journal of Preventive Medicine and Public Health / v.42 no.4 / pp.223-230 / 2009
  • Objectives: This study was conducted to validate self-reported smoking among high school students using urinary cotinine. Methods: A self-report of smoking behavior was collected, together with a urine sample for cotinine analysis, from 130 male and female students at two vocational high schools in November 2007. Validity and agreement between self-reported smoking and urinary cotinine were analyzed with STATA 9.0 for different definitions of current smokers, and for frequent and daily smokers. Urinary cotinine concentration was measured by the DRI Cotinine Assay for urine (Microgenics Corp., Fremont, CA) on a Toshiba 200FR. The cut-off point of urinary cotinine was 50 ng/dl. Results: The concentrations of urinary cotinine differed significantly according to the frequency and amount of smoking. Sensitivity and specificity were 90.9% and 91.8%, respectively, and the Cohen's kappa value was 0.787 among current smokers defined as those who smoked on at least one day during the month preceding the survey. Comparably high sensitivity, specificity, and kappa values were also found for the other definitions of current smokers, that is, subjective smokers and weekly smokers. Conclusions: The results showed the high validity of self-reported smoking among high school students. However, because of the small sample size and the limited range of participants, caution is needed in generalizing the results to high school students overall.
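Validating a self-report against a biomarker involves dichotomizing the biomarker at a cut-off and cross-tabulating the result with the self-report. The sketch below shows that step only. The function name `cross_tabulate`, the sample values, and the choice to count a value equal to the cut-off as positive are all assumptions for illustration; the abstract does not give the individual measurements or the boundary rule.

```python
def cross_tabulate(self_reported, biomarker, cutoff):
    """Return (tp, fp, fn, tn) with the biomarker as the reference standard.

    A biomarker value at or above the cut-off counts as positive
    (an assumption; the boundary rule is not stated in the abstract).
    """
    tp = fp = fn = tn = 0
    for reported, level in zip(self_reported, biomarker):
        positive = level >= cutoff
        if reported and positive:
            tp += 1
        elif reported and not positive:
            fp += 1
        elif not reported and positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical participants: self-reported smoking status and cotinine level.
reported = [True, True, False, False, True, False]
levels = [120, 30, 10, 80, 55, 5]
table = cross_tabulate(reported, levels, cutoff=50)
print(table)
```

The resulting four counts are the inputs from which sensitivity, specificity, and kappa are then calculated.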

Inter-rater Reliability Study on Pattern Identification Using Nasal Endoscopy for Rhinitis (비내시경 활용 비염 변증 지표의 평가자 간 신뢰도 연구)

  • Min, Kyung-Jin;Son, Mi-Ju;Kim, Young-Eun;Kim, Jeong-Hun;Lee, Dong-Hyo
    • The Journal of Korean Medicine Ophthalmology and Otolaryngology and Dermatology / v.30 no.4 / pp.97-103 / 2017
  • Objectives: To determine whether pattern identification using nasal endoscopy for rhinitis can be applied as a tool for evaluating rhinitis in routine care settings, we performed an inter-rater reliability study of this pattern identification. Methods: Two Korean medicine doctors assessed 290 left/right nasal endoscopy photograph cases of rhinitis patients with pattern identification using nasal endoscopy. This pattern identification consists of four assessment items: nasal membrane color (pale/hyperemia), nasal membrane humidity (dryness/dampness), rhinorrhea (watery/yellow), and turbinate membrane edema (atrophic/edematous). Cohen's kappa statistic and percentage agreement were used to evaluate inter-rater reliability. Results: Inter-rater percentage agreement and the kappa coefficient for left nasal endoscopy photograph cases ranged from 'slight' to 'moderate' (% agreement: 40.00-67.59% / Kappa: 0.06-0.407). Only the agreement on the 'rhinorrhea (watery/yellow)' item was moderate (% agreement: 67.59% / Kappa: 0.407). Inter-rater percentage agreement and the kappa coefficient for right nasal endoscopy photograph cases also ranged from 'slight' to 'moderate' (% agreement: 42.41-68.97% / Kappa: 0.109-0.465). Only the agreement on the 'rhinorrhea (watery/yellow)' item was moderate (% agreement: 68.97% / Kappa: 0.465). Conclusions: It is necessary to resolve problems such as cut-off value setting, bipolar evaluation values (pale/hyperemia, dryness/dampness, watery/yellow, atrophic/edematous) and the weighting of items. Further rigorous studies that overcome the limitations of the current research are warranted.
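The gap between percentage agreement and kappa in these results can be illustrated by computing both from two raters' label lists. This is a minimal sketch: the function name `percent_agreement_and_kappa` and the ten ratings are hypothetical, and the labels merely borrow the pale/hyperemia item as an example.

```python
from collections import Counter


def percent_agreement_and_kappa(rater_a, rater_b):
    """Percentage agreement and Cohen's kappa from two equal-length label lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(rater_a)
    p_o = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: for each label, the product of both raters' proportions.
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / n**2
    return 100 * p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of ten photographs on one bipolar item.
a = ["pale", "pale", "hyperemia", "hyperemia", "pale",
     "hyperemia", "pale", "pale", "hyperemia", "pale"]
b = ["pale", "hyperemia", "hyperemia", "hyperemia", "pale",
     "pale", "pale", "pale", "hyperemia", "hyperemia"]
pct, kappa = percent_agreement_and_kappa(a, b)
print(round(pct, 1), round(kappa, 3))
```

In this illustration the raters agree on 70% of cases, yet kappa is only about 0.4 because half of that agreement would be expected by chance.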

A Validation Study of the Abbreviated Self-Rated Korean Version of MINI (MINI Patient Health Survey) (한국판 단축된 자기보고형 MINI (MINI 정신건강 평가)의 타당도 연구)

  • Lim, Se-Won;Song, Han-Soo;Oh, Yun-Hee;Shin, Ho-Chul;Cho, Kwang-Hyun;Chung, Sang-Keun;Oh, Kang-Seob
    • Anxiety and mood / v.3 no.1 / pp.32-40 / 2007
  • Objectives: To investigate the validity of an abbreviated self-rated Korean version of the MINI (Mini International Neuropsychiatric Interview) patient health survey, which screens for social anxiety disorder, panic disorder, generalized anxiety disorder, and major depressive disorder. Methods: 115 subjects completed the MINI and the MINI patient health survey. The validity of the MINI patient health survey was assessed by whether its results were consistent with the results from the MINI. Cohen's kappa value, specificity, sensitivity, positive predictive value, and negative predictive value were calculated for this purpose. Results: The kappa values for social anxiety disorder (0.60), panic disorder (0.49), generalized anxiety disorder (0.60) and major depressive disorder without other co-morbid disorders (0.59) indicated at least moderate strength of agreement. Conclusion: The abbreviated self-rated Korean version of the MINI patient health survey has moderate to good validity for social anxiety disorder, panic disorder, generalized anxiety disorder, and major depressive disorder without other co-morbid disorders. Our results suggest that this instrument might be useful for screening for the above four disorders if it is used under the careful supervision of experienced clinicians.
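Verbal labels such as "moderate" are usually read off a benchmark scale. The sketch below encodes the widely used Landis and Koch bands; that the authors used this particular scale is an assumption, and the function name `landis_koch_label` is hypothetical.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch verbal benchmark."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Kappa values reported in the abstract above.
for value in (0.60, 0.49, 0.60, 0.59):
    print(value, landis_koch_label(value))
```

On this scale all four reported values fall in the "moderate" band (0.41 to 0.60), which matches the abstract's description of at least moderate agreement.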


Development Cut-off Value for Yin-deficiency Questionnaire and Diagnostic Ability of Yin-deficiency in Xerostomia (구강건조증 환자에서 음허 측정 설문지 절단점 개발 및 진단능 평가)

  • Jang, Seung-Won;Kim, Jin-Sung
    • The Journal of Internal Korean Medicine / v.35 no.4 / pp.483-497 / 2014
  • Objectives: The aims of this study were to develop a cut-off value of the Yin-deficiency questionnaire (YDQ) for the diagnosis of Yin-deficiency (YD) and to compare the diagnostic ability of the YDQ and the Yin-deficiency scale score (YDS) in xerostomia patients. Methods: We recruited 58 xerostomia patients. They were diagnosed as YD or non-YD by three Korean medicine doctors (KMD). We assessed YD using the YDQ and YDS. We evaluated xerostomia using a VAS, the Dry Mouth Symptom Questionnaire (DMSQ), Salivary Flow Rate (SFR), and oral moisture on the buccal mucosa and tongue surface (OMB and OMT). We surveyed tongue coatings using the Winkel Tongue Coating Index (WTCI). Results: We diagnosed 23 patients as YD and 35 patients as non-YD. There were no significant differences in age, sex or body mass index between the YD and non-YD groups. Using receiver operating characteristic curve analysis, the optimal cut-off value of the YDQ was defined as 304. The sensitivity, specificity and Youden index of the YDQ were 86.96%, 71.43% and 1.5839, respectively. Using Cohen's coefficient of agreement, we found that the degree of agreement between the KMD and YDQ diagnoses was moderate ($\kappa$ = 0.524, p<0.001). Using Pearson's correlation analysis, we found that the YDQ and YDS were significantly correlated, supporting concurrent validity. Using the area under the curve, we found that the diagnostic abilities of the YDQ and YDS were not significantly different (p=0.505), but the correlation between DMSQ-symptoms and the YDQ (r=0.731, p<0.001) was stronger than that between DMSQ-symptoms and the YDS (r=0.418, p<0.01). Conclusions: The cut-off value of the YDQ can be used to diagnose YD in xerostomia, and the diagnostic ability of the YDQ in xerostomia is better than that of the YDS.
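Choosing an optimal cut-off from an ROC analysis is often done by maximizing the Youden index, conventionally defined as J = sensitivity + specificity - 1. The sketch below applies that conventional definition; whether the authors used exactly this criterion is an assumption, and the function names `youden_index` and `best_cutoff` and the sample scores are hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Conventional Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def best_cutoff(scores, has_condition):
    """Scan observed scores as cut-offs and return (cutoff, J) with the largest J.

    A score at or above the cut-off counts as test-positive (an assumption).
    Ties keep the lowest cut-off.
    """
    best = None
    for cutoff in sorted(set(scores)):
        pairs = list(zip(scores, has_condition))
        tp = sum(1 for s, y in pairs if s >= cutoff and y)
        fn = sum(1 for s, y in pairs if s < cutoff and y)
        tn = sum(1 for s, y in pairs if s < cutoff and not y)
        fp = sum(1 for s, y in pairs if s >= cutoff and not y)
        j = youden_index(tp / (tp + fn), tn / (tn + fp))
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Hypothetical questionnaire scores and reference diagnoses.
scores = [250, 280, 300, 310, 320, 350]
diagnosed = [False, False, True, False, True, True]
print(best_cutoff(scores, diagnosed))

# Conventional J from the sensitivity and specificity reported in the abstract.
print(round(youden_index(0.8696, 0.7143), 4))
```

With the reported sensitivity and specificity, the conventional formula gives J = 0.5839; the reported figure of 1.5839 equals the sum of sensitivity and specificity without subtracting 1.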

Diagnostic Performance of Simulated Abbreviated MRI for Early-Stage Hepatocellular Carcinoma Screening: A Comparison to Conventional Dynamic Contrast-Enhanced MRI (초기 간암 선별 검사로서 단축 자기공명영상 검사의 진단능: 고식적 역동학적 자기공명영상검사와의 비교)

  • Eun Sol Lim;Sung Mo Kim;Sang Soo Shin;Suk Hee Heo;Jong Eun Lee;Yong Yeon Jeong
    • Journal of the Korean Society of Radiology / v.82 no.5 / pp.1218-1230 / 2021
  • Purpose: To compare the per-patient diagnostic performance of simulated abbreviated MRI (AMRI) to that of conventional MRI (CMRI) with full-sequence dynamic gadoxetic acid (GA) enhancement for early-stage hepatocellular carcinoma (HCC) screening in high-risk patients. Materials and Methods: A total of 201 consecutive patients at high risk for HCC, who underwent 3T liver MRI, were included in this retrospective study. The AMRI protocol comprised T2-weighted imaging, hepatobiliary phase imaging after GA injection, and diffusion-weighted imaging. For each patient, the two image sets (AMRI and CMRI) were independently reviewed by two radiologists. Inter-reader agreement was assessed using Cohen's kappa value. A composite reference standard was used to determine the diagnostic performance of each image set for each reader. Results: A total of 93 HCCs were detected in 79 patients. The inter-reader agreement was almost perfect for both image sets (κ = 0.839, 0.948). In AMRI, the per-patient sensitivity and negative predictive value (NPV) were 94.9% and 96.4%, respectively. In CMRI, the per-patient sensitivity and NPV were 96.2% and 97.5%, respectively. Conclusion: AMRI, using only three sequences, had diagnostic performance comparable to that of CMRI in screening for early-stage HCC. AMRI could be an alternative HCC screening tool for high-risk patients.
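Unlike sensitivity, the negative predictive value depends on how common the disease is in the screened group. The sketch below shows this dependence using Bayes' rule. The function name `predictive_values` and all input values are hypothetical; in particular, the specificity is an assumed figure, since the abstract does not report one.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from test characteristics.

    Works with expected proportions of the screened population.
    """
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return {"ppv": tp / (tp + fp), "npv": tn / (tn + fn)}


# Hypothetical test characteristics at two different prevalences.
high_risk = predictive_values(sensitivity=0.95, specificity=0.90, prevalence=0.40)
low_risk = predictive_values(sensitivity=0.95, specificity=0.90, prevalence=0.05)
print(high_risk)
print(low_risk)
```

With the same assumed test characteristics, NPV rises and PPV falls as prevalence drops, so predictive values from a high-risk cohort should not be carried over unchanged to a lower-risk population.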