• Title/Summary/Keyword: Inter-Rater reliability


A Method for Developing Items to Assess Earth Science Creativity (지구과학 창의력 평가 문항 개발 방법에 관한 연구)

  • Lee, Hang-Ro
    • Journal of the Korean earth science society / v.24 no.3 / pp.150-159 / 2003
  • This study proposes methods for assessing scientific creativity and for developing test items; such assessment is possible when earth science knowledge and general creativity are applied at the same time. The operational definitions of the terms clearly distinguished the cognitive abilities involved in general creativity from those involved in scientific creativity. Four factors in the subcategory of scientific creativity (fluency, flexibility, elaboration, and originality) were selected, and it was found that items could be developed from these factors. Operational definitions of the four factors were given, and the criteria for assessment and scoring were set. The validity, reliability, discrimination, and difficulty required of an assessment instrument were verified through three field trials of the scientific creativity instrument. The instrument consisted of 8 items, with 2 items for each factor. The average item fitness index was 0.99, the internal consistency of the items (Cronbach's α) was 0.79, the inter-rater reliability of each item was 0.78, the inter-rater reliability of each factor was 0.75, the item discrimination power was 0.19, and the item difficulty was 0.00. Because these results were within the permitted limits for assessment instruments, the instrument developed for scientific creativity in this study can be said to be very favorable.
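
The abstract above reports internal consistency as Cronbach's α. As a minimal sketch of how that coefficient is usually computed from an item-score matrix, the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores) can be written as follows. The function name `cronbach_alpha` and the example scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: 5 respondents x 4 items (not data from the study).
example = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(example), 3))
```

Values closer to 1 indicate that the items vary together; the 0.79 reported in the abstract would come from the authors' own score matrix, which is not reproduced here.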

A Study on Validation by the Development of a Science Process Skills Test with Science Experiments (과학 실험 평가 도구 개발을 통한 탐구 능력 평가의 타당화에 관한 연구)

  • Woo, Jong-Ok;Lee, Hang-Ro;Kim, Seung-Hun
    • Journal of The Korean Association For Science Education / v.17 no.1 / pp.65-73 / 1997
  • The purpose of this study is to develop a valid and reliable instrument applicable to high school Earth Science class experiments. Before developing the items, 14 inquiry process skills were selected and evaluative objectives were specified for each of them in order to develop scales and criteria. Twenty-eight evaluation items were developed for 5 experiment subjects from the high school Earth Science class. The first field trial was performed using a sample of 5 high school students, and the second using a sample of 25 high school students. The results are as follows. (1) The content validity and reliability (Cronbach $\alpha$) of the developed items were 82.7% and .86, respectively, so the instrument developed in this study is considered valid and reliable. (2) The average difficulty index was .69 and the discrimination index was .30. (3) Answer sheets based on the reported results were rated by 5 teachers, and inter-rater reliability and inter-rater consistency were analyzed; the indices were .80 and .76, respectively. (4) The developed items show a low correlation coefficient of .45 with TESIS, a set of paper-and-pencil test items developed by Lee, Hang-Ro (1991). One of the major problems raised about experiment assessment has been that it depends solely on the rater's viewpoint. This research, however, shows that a set of more specific scales and criteria for the evaluation will make it more valid, reliable, and efficient.


Reliability and Validity of Dual Probe-Fixing Frame for Rehabilitative Ultrasound Imaging for Exercises with Visual Feedback

  • Na-eun Byeon;Jang-hoon Shin;Wan-hee Lee
    • Physical Therapy Rehabilitation Science / v.12 no.3 / pp.259-267 / 2023
  • Objective: Rehabilitative ultrasound imaging is a safe and noninvasive technique for evaluating muscle thickness. A dual probe-fixing frame (DPF) can provide visual feedback during exercises targeting specific muscles. The purpose of this research was to verify the reliability and validity of the DPF for dual-probe ultrasound (DPU)-based visual feedback exercises, which allows users to use both hands freely. Design: This cross-sectional study used repeated measures to compare muscle thickness measurements obtained using the handheld device and the DPF with DPU. Methods: Twenty healthy adults participated in the study. Measurements were taken over two sessions, with a two-day interval between the sessions. The thicknesses of the rectus abdominis (RA) and transverse abdominis (TrA) muscles were measured using DPU. The DPF with DPU, developed by the research team, was used along with a laptop-based muscle viewer. Bland-Altman analysis and intraclass correlation coefficient (ICC) calculations were used in the statistical analyses to evaluate agreement and reliability, respectively. Results: The Bland-Altman analysis showed small average differences between the handheld and DPF methods for both RA and TrA muscle thicknesses. Inter-rater reliability analysis showed high ICC values for DPF measurements of both RA (0.908-0.912) and TrA (0.892-741) muscle thicknesses. Intra-rater reliability analysis also showed good ICC values for measurements taken by a single examiner over two days. Conclusion: The findings of this study demonstrate that the DPF provides reliable and valid measurements of muscle thickness during visual feedback exercises using the DPU.
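
The abstract above uses Bland-Altman analysis to compare two measurement methods. A minimal sketch of the usual calculation, assuming paired measurements and the conventional 1.96 standard-deviation limits of agreement, might look like this. The function name `bland_altman` and the thickness values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    # Difference between the two methods for each subject.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)  # average difference between methods
    sd = stdev(diffs)   # sample standard deviation of the differences
    # Conventional 95% limits of agreement: bias +/- 1.96 * SD.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical muscle thicknesses in mm (handheld vs. frame-mounted probe).
handheld = [10.2, 11.5, 9.8, 12.1, 10.9]
frame = [10.0, 11.7, 9.9, 11.8, 10.8]
bias, lower, upper = bland_altman(handheld, frame)
print(f"bias={bias:.3f} mm, limits of agreement=({lower:.3f}, {upper:.3f})")
```

A bias near zero with narrow limits corresponds to the "small average differences" the abstract describes; what counts as narrow enough is a clinical judgment, not something the calculation decides.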

Comparison of Femoral Anteversion Angle and Determination of Reliability Measured at Three Different Anatomical References of the Tibial Crest During the Trochanteric Prominence Angle Test

  • Lee, Ji-Hyun;Yoon, Tae-Lim;Choi, Sil-Ah;Cynn, Heon-Seock
    • Physical Therapy Korea / v.19 no.4 / pp.55-60 / 2012
  • The trochanteric prominence angle test (TPAT) has been used to measure the femoral anteversion angle between the tibial crest and the vertical line. However, the exact anatomical reference of the tibial crest has not yet been identified in the literature. Thus, the purposes of this research were twofold: first, to compare the femoral anteversion angle measured at three different anatomical references of the tibial crest (the proximal tibial crest, the proximal third of the tibial crest, and the proximal half of the tibial crest) and, second, to determine the inter- and intra-rater reliabilities of the femoral anteversion angle measured at these three anatomical references during the TPAT. We recruited 14 healthy subjects, and a total of 28 legs were examined. The TPAT was measured using a digital inclinometer. A one-way repeated-measures analysis of variance was used to compare the femoral anteversion angle measured at the three anatomical references of the tibial crest, and intraclass correlation coefficients (ICCs) were calculated to determine reliability. The femoral anteversion angle measured at the proximal tibial crest was significantly higher than that at the proximal third and the proximal half of the tibial crest. The inter- and intra-rater reliabilities of the femoral anteversion angle measured at the three anatomical references of the tibial crest were all found to be high during the TPAT (ICC=.90~.98). In conclusion, clinicians should recognize that different femoral anteversion angles may be measured when different anatomical references of the tibial crest are used, and that reliabilities were high when an exact anatomical reference of the tibial crest was used during the TPAT.
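
Several abstracts in this list report intraclass correlation coefficients without stating which ICC form was used. As an illustration only, the sketch below computes one common form, ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single rater), from the mean squares of a two-way ANOVA. The choice of this form, the function name `icc_2_1`, and the example angles are assumptions, not details from the studies.

```python
def icc_2_1(ratings):
    """ICC(2,1); rows are subjects, columns are raters."""
    n = len(ratings)     # number of subjects
    k = len(ratings[0])  # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    # Mean squares.
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical angles in degrees: 5 legs measured by 2 raters.
angles = [
    [12.0, 12.5],
    [15.5, 15.0],
    [9.0, 9.5],
    [18.0, 17.0],
    [11.0, 11.5],
]
print(round(icc_2_1(angles), 3))
```

Other ICC forms (for example consistency rather than absolute agreement, or averaged ratings) use different denominators, so a reported ICC can only be reproduced once the form is known.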

A Feasibility Study on Adopting Individual Information Cognitive Processing as Criteria of Categorization on Apple iTunes Store

  • Zhang, Chao;Wan, Lili
    • The Journal of Information Systems / v.27 no.2 / pp.1-28 / 2018
  • Purpose: More than 7.6 million mobile apps have been approved on the Apple iTunes Store and Google Play. To manage these apps, Apple Inc. established twenty-four primary categories, and Google Play had thirty-three. However, both categorizations have shown more and more problems in managing and classifying numerous apps, such as miscategorized apps, cross-attribution problems, and the lack of a categorization keyword index. This study introduces individual information cognitive processing as the classification criterion for updating the current categorization on the Apple iTunes Store, and observes the effectiveness of the new criterion in a classification process on the Apple iTunes Store. Design/Methodology/Approach: A research approach with four stages was performed, and a series of mixed methods was developed to assess the feasibility of adopting individual information cognitive processing as the categorization criterion. Keyword lists were extracted using machine-learning techniques with Term Frequency-Inverse Document Frequency (TF-IDF) and Singular Value Decomposition (SVD). Individual information cognitive processing was developed from prior research results on the categorization of car apps. A further keyword extraction process was then performed on the extracted keyword lists. Findings: Using TF-IDF and SVD, keyword lists were extracted from more than five thousand apps. Furthermore, we developed individual information cognitive processing that included a categorization teaching process and a learning process. The top three keywords for each category were extracted. When the extracted results were compared with prior studies, the inter-rater reliability between the two methods was significant, which indicated that individual information cognitive processing is reliable as a criterion of categorization on the Apple iTunes Store. Suggestions for updating the Apple iTunes Store are discussed in this paper, and the results may be useful for app store hosts in improving the current categorizations on app stores as well as in increasing the efficiency of the app discovery and location process for both app developers and users.

Evaluating the Validity and Reliability of the Korean Version of Upper Extremity Performance Test for the Elderly (TEMPA) (한국판 TEMPA의 신뢰도 및 타당도 연구)

  • Lee, Chang-Dae;Jung, Min-Ye;Park, Ji-Hyuk;Kim, Jongbae
    • Therapeutic Science for Rehabilitation / v.8 no.4 / pp.65-76 / 2019
  • Objective: This study aimed to verify the validity and reliability of the Upper Extremity Performance Test for the Elderly (TEMPA) after modifying its items to reflect cultural differences. Methods: This study included 171 healthy adults and older adults and 41 individuals with impaired upper extremity function. Content validity, discriminant validity, test-retest reliability, and inter-rater reliability were analyzed. Results: The following items, which exhibited cultural differences, were modified: "open a lock and take the top off a pillbox" and "write and affix a postage stamp." The discriminant validity results indicated that participants with normal upper extremity function performed better than those with impaired upper extremity function (p<.001). The test-retest reliability was .71-.94 for execution speed (intraclass correlation coefficient; ICC), 1.0 for functional rating (kappa), and 1.0 for task analysis (ICC). The inter-rater reliability was 1.0 for speed of execution, .79-1.0 for functional rating, and .94-1.0 for task analysis. Conclusion: TEMPA has a moderate to high level of reliability and is an assessment tool that can clearly distinguish individuals with upper extremity impairment from those without impairment.

Development of Comprehensive Oro-Facial Function Scale (포괄적 구강안면기능척도(Comprehensive Oro-Facial Function Scale; COFFS)의 개발)

  • Son, Yeong Soo;Min, Kyoung Chul;Woo, Hee-Soon
    • Therapeutic Science for Rehabilitation / v.11 no.1 / pp.69-85 / 2022
  • Objective: This study aimed to develop a Comprehensive Oro-Facial Function Scale (COFFS) that can evaluate oro-facial function in patients with dysphagia. Methods: To verify the item composition and reliability of the COFFS, preliminary items were collected by selecting and analyzing four previous studies, and the Content Validity Ratio (CVR) was derived through a second survey of experts. Cronbach's α was calculated for the internal consistency of the evaluation items, and the test-retest reliability and inter-rater reliability were calculated using intraclass correlation coefficients (ICC). Results: The content validity ratio of all items was 0.67. Cronbach's α for each domain was 0.849 for the communication domain, -0.224 for oro-facial structure and shape, 0.831 for the ability to perform oro-facial movements, and 0.946 for mastication and swallowing function. The test-retest reliability was 0.974 and the inter-rater reliability was 0.937, showing high reliability. Conclusion: In this study, the final COFFS evaluation tool was composed of 34 items in four areas, rated on a 3- to 5-point scale depending on the item. In future studies, additional research is needed to establish its validity through correlation with other evaluation tools that measure oro-facial function.
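
The abstract above derives a Content Validity Ratio from an expert survey. A minimal sketch of the ratio as it is commonly defined (Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating an item essential and N is the panel size) is given below. The abstract does not state the panel size or which formula variant was used, so the function name `content_validity_ratio` and the panel numbers are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR for one item: ranges from -1 to 1."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("need 0 <= n_essential <= n_experts and n_experts > 0")
    half = n_experts / 2
    # Positive when more than half of the panel rates the item essential.
    return (n_essential - half) / half


# Hypothetical panel: 9 of 12 experts rate an item as essential.
print(content_validity_ratio(9, 12))
```

A CVR of 0 means exactly half of the panel considered the item essential; the minimum acceptable value depends on panel size and is usually read from a published critical-value table.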

The Image of Nursing projected in Newspapers (신문에 나타난 간호의 이미지에 관한 연구)

  • 정면숙;강영실
    • Journal of Korean Academy of Nursing / v.23 no.1 / pp.16-28 / 1993
  • The purpose of this study was to identify the image of nursing, that is, to see how nursing is viewed in newspapers. Articles about nursing from two Korean daily newspapers from Jan. 1, 1987 to Dec. 31, 1991 were examined for subject, type, attitude, and authorship. The inter-rater reliability was 0.89 (by the Holsti method). The major findings were as follows: 1. The total number of articles was 110. 2. As for subject matter, articles related to professional nursing activities appeared most frequently (29.6%), followed by those about labor issues and activities to improve nurses' job climate (19.4%) and those about official activities of nursing (11.2%). 3. Commentary articles appeared most frequently (41.2%); other article forms were straight news (27.1%), contributions (17.6%), and interviews (10.6%). 4. Feature stories accounted for 62.4% and news articles for 37.6%. Most of the articles were of national interest (96.5%); the rest (3.5%) were news from abroad. 5. Articles favorable toward nursing accounted for 54.1%, neutral articles for 28.2%, and negative articles for 17.6%. 6. Most articles were written by reporters (66.3%).

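
The newspaper study above reports inter-rater reliability by the Holsti method. As a minimal sketch, Holsti's coefficient is usually given as 2M / (N1 + N2), where M is the number of coding decisions on which two coders agree and N1 and N2 are the numbers of decisions each coder made. The function name `holsti_reliability` and the attitude codes below are hypothetical, and the sketch assumes both coders coded the same articles in the same order.

```python
def holsti_reliability(codes_a, codes_b):
    """Holsti's coefficient of reliability for two coders."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same units")
    # M: number of units on which the two coders assigned the same code.
    agreements = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    # 2M / (N1 + N2); with paired units this equals simple percent agreement.
    return 2 * agreements / (len(codes_a) + len(codes_b))


# Hypothetical attitude codes for 8 articles (not data from the study).
coder_1 = ["pos", "pos", "neu", "neg", "pos", "neu", "pos", "neg"]
coder_2 = ["pos", "neu", "neu", "neg", "pos", "neu", "pos", "pos"]
print(holsti_reliability(coder_1, coder_2))
```

Because the coefficient does not correct for agreement expected by chance, it tends to run higher than chance-corrected statistics such as kappa on the same data.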

The reliability of the nonradiologic measures of thoracic spine rotation in healthy adults

  • Hwang, Donggi;Lee, Ju Hyeong;Moon, Seongyeon;Park, Soon Woo;Woo, Juha;Kim, Cheong
    • Physical Therapy Rehabilitation Science / v.6 no.2 / pp.65-70 / 2017
  • Objective: The purpose of this study was to examine the intertester reliability and validity of four nonradiologic measurements of thoracic spine rotation in healthy adults. Design: Descriptive laboratory study. Methods: This study was conducted on 20 male and 20 female university students aged between 19 and 26. To measure thoracic rotation, a goniometer, a bubble inclinometer, a dual inclinometer, and a smartphone clinometer application were used. The measurement was performed twice for each device, and the same measurement was performed by two examiners. The measurements were performed in the lumbar locked position. The arm on the side of the rotation was taken back and placed onto the back of the lumbar region. During right and left trunk rotation, the head rotated with the trunk but remained on the center line so that axial rotation was maintained. Both examiners performed the measuring procedures and directly handled the measuring instrument. All measurement results were recorded by a recorder. Results: The range of motion (ROM) of thoracic rotation in the lumbar locked position for all four devices was 47 degrees. The intra-rater reliability estimates ranged from 0.738 to 0.906 (p<0.05). The inter-rater reliability estimates ranged from 0.736 to 0.853 (p<0.05). The goniometer, bubble inclinometer, dual inclinometer, and smartphone clinometer showed high validity (p<0.05). This result indicates that all four devices may be used by the same examiner and by other examiners obtaining follow-up measurements. Conclusions: The goniometer, bubble inclinometer, dual inclinometer, and smartphone clinometer, used in the lumbar locked posture, are reliable and valid nonradiologic measures of thoracic rotational ROM in healthy adults.

A Pilot Study of Evaluating the Reliability and Validity of Pattern Identification Tool for Insomnia and Analyzing Correlation with Psychological Tests (불면증 변증도구 신뢰도와 타당도 평가 및 심리검사와의 상관성에 대한 초기연구)

  • Jeong, Jin-Hyung;Lee, Ji-Yoon;Kim, Ju-Yeon;Kim, Si-Yeon;Kang, Wee-Chang;Lim, Jung Hwa;Kim, Bo Kyung;Jung, In Chul
    • Journal of Oriental Neuropsychiatry / v.31 no.1 / pp.1-12 / 2020
  • Objectives: The purpose of this study was to evaluate the reliability and validity of the instrument on pattern identification for insomnia (PIT-Insomnia) and to verify the correlation between the PIT-Insomnia and psychological tests. Methods: Two evaluators examined the pattern identification of participants who met the insomnia disorder diagnostic criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) and scored over 15 on the Insomnia Severity Index (ISI), once manually and twice using the PIT-Insomnia, to measure the inter-rater and test-retest reliability. We also conducted the following surveys to measure concurrent validity and the correlation between the PIT-Insomnia and psychological tests: the Pittsburgh Sleep Quality Index (PSQI), the Korean version of Beck's Depression Inventory (K-BDI), the Korean version of the State-Trait Anxiety Inventory (STAI-K), the Korean Symptom Checklist-95 (KSCL-95), and the EuroQol-5 Dimension (EQ-5D). Results: 1. The test-retest reliability analysis of the pattern identification results showed moderate agreement, and the test-retest reliability analysis of each pattern identification score showed poor to moderate agreement. 2. The inter-rater reliability analysis of the manual pattern identification results showed slight agreement; when the analysis was performed with calibration, it showed fair agreement. 3. The concordance analysis between the manual results and the PIT-Insomnia showed poor agreement; when the analysis was performed with calibration, it showed fair agreement. 4. The concordance analysis between the PIT-Insomnia and the PSQI showed a positive linear correlation. 5. The concordance analysis between the PIT-Insomnia and the PSQI, K-BDI, STAI-K, KSCL-95, and EQ-5D showed that non-interaction between the heart and kidney had a positive linear correlation with the K-BDI and the anxiety item of the KSCL-95; dual deficiency of the heart-spleen had a positive linear correlation with the somatization and paranoia items of the KSCL-95; heart deficiency with timidity had a positive linear correlation with the stress vulnerability and paranoia items of the KSCL-95; phlegm-fire harassing the heart had a positive linear correlation with the K-BDI and the paranoia item of the KSCL-95; depressed liver qi transforming into fire had a positive linear correlation with the anxiety and paranoia items of the KSCL-95; and all pattern identifications had a negative linear correlation with the EQ-5D. Conclusions: The PIT-Insomnia has moderate reliability and reflects the severity of insomnia, since it has some concurrent validity with the PSQI. There are some correlations between the PIT-Insomnia and specific psychological tests, so we suggest that it can be used appropriately in clinical settings.
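
The abstract above describes agreement as poor, slight, fair, or moderate but does not name the statistic. Those words match the Landis and Koch labels often attached to Cohen's kappa, so the sketch below assumes kappa for two raters assigning categorical patterns. That assumption, the function names `cohen_kappa` and `landis_koch_label`, and the example pattern codes are all illustrative and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per subject."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    n = len(rater_a)
    # Observed proportion of subjects on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used a single identical category
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch_label(kappa):
    """Verbal label for kappa using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical pattern codes assigned by two evaluators to 8 participants.
evaluator_1 = ["A", "B", "A", "C", "B", "A", "C", "B"]
evaluator_2 = ["A", "B", "B", "C", "A", "A", "C", "C"]
k = cohen_kappa(evaluator_1, evaluator_2)
print(round(k, 3), landis_koch_label(k))
```

The labels are conventions rather than statistical tests, so the same kappa can reasonably be described differently under other banding schemes.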