• Title/Summary/Keyword: content-generalizability

Analysis of Korea Earth Science Olympiad Items for the Enhancement of Item Quality (한국 지구과학 올림피아드 문항 분석을 통한 문항의 질 향상 방안)

  • Lee Ki-Young;Kim Chan-Jong
    • Journal of the Korean earth science society / v.26 no.6 / pp.511-523 / 2005
  • The purpose of this study is to analyze the items of the 1st and 2nd Korea Earth Science Olympiad (KESO) in order to find information for enhancing item quality. To do this, internal and external item classification frameworks were developed. Item difficulty (P), discrimination index (DI), correlation, and reliability were estimated using classical test theory, and generalizability was estimated by applying generalizability theory. The results of item classification are as follows: (1) ‘Geology’ and ‘astronomy’ are dominant in the content domain, and ‘data analysis and interpretation’ is dominant in the inquiry process domain. Nearly every item has a textbook context. (2) There is no difference between the preliminary and final tests in terms of their thinking skills sections. (3) As a whole, the ratio of items with pictures is high in item representation. However, multiple-choice and short-answer items are more common in the preliminary competition, and essay-type items are found more often in the final competition. The ratio of simple items is high in the middle school section and the preliminary competition, but composite items are dominant in the high school section and the final competition. The findings of item analysis are as follows: (1) In the middle school section, P is low and DI is moderate, whereas in the high school section there are considerable differences between science high schools and general high schools. (2) In middle school, the highest correlation is between the meteorology domain score and the total score, whereas in high school it is between the astronomy domain score and the total score. (3) The general high school section shows the highest Cronbach $\alpha$ and generalizability. (4) The general high school section shows an acceptable generalizability coefficient (> 0.80), but the middle school and science high school sections need more items to reach an acceptable generalizability level.
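The classical test theory statistics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes dichotomously scored (0/1) items, an upper-lower discrimination index based on the top and bottom 27% of examinees (a common convention that the abstract does not confirm), and a made-up toy score matrix. All function names are hypothetical.

```python
import numpy as np

def item_difficulty(scores):
    """P: proportion of examinees who answered each item correctly."""
    scores = np.asarray(scores, dtype=float)
    return scores.mean(axis=0)

def discrimination_index(scores, frac=0.27):
    """Upper-lower DI: P in the top group minus P in the bottom group.

    Groups are the top and bottom `frac` of examinees by total score
    (27% is an assumed convention, not taken from the abstract).
    """
    scores = np.asarray(scores, dtype=float)
    totals = scores.sum(axis=1)
    order = np.argsort(totals, kind="stable")  # examinees from low to high
    k = max(1, int(round(frac * len(totals))))
    lower, upper = scores[order[:k]], scores[order[-k:]]
    return upper.mean(axis=0) - lower.mean(axis=0)

def cronbach_alpha(scores):
    """Cronbach's alpha for a persons x items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical toy data: 5 examinees (rows) x 4 items (columns).
toy = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
])
print("P     :", item_difficulty(toy))
print("DI    :", discrimination_index(toy))
print("alpha :", cronbach_alpha(toy))
```

A low P means a hard item, and a DI near zero means the item does not separate high and low scorers.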

An Application of Generalizability Theory to Self-introduction Letter and Teacher's Recommendation Letter Used in Identification of Mathematical Gifted Students by Observations and Nominations (관찰.추천에 의한 수학영재 선발 시 사용되는 자기소개서와 교사추천서 평가에 대한 일반화가능도 이론의 활용)

  • Kim, Sung-Chan;Kim, Sung-Yeun;Han, Ki-Soon
    • Communications of Mathematical Education / v.26 no.3 / pp.251-271 / 2012
  • The purpose of this study is: 1) to determine error sources and the effects of each error source, 2) to investigate optimal measurement conditions for holistic and analytic scoring methods, and 3) to compare reliability as measured by Cronbach's alpha and by the generalizability coefficient, for the self-introduction letter and the teacher's recommendation letter used in identifying mathematically gifted students by observation and nomination, based on generalizability theory. Data were collected from a science education institute for the gifted attached to a university located in a capital city for the 2011 academic year. Scores from two raters using holistic and analytic scoring methods in both assessment types were used. The results of this study were as follows. First, for both assessment types, the person effect was relatively large regardless of scoring method. However, the rater effect had a greater impact under holistic scoring than under analytic scoring. Second, regarding optimal measurement conditions under holistic scoring with the number of raters fixed at 2, at least 5 content domains were needed for the self-introduction letter and at least 10 for the teacher's recommendation letter. In addition, under analytic scoring, the teacher's recommendation letter needed more than 3 items when the number of content domains was fixed at 4, and the self-introduction letter needed more than 8 items when the number of content domains was fixed at 6. Third, Cronbach's alpha, which reflects only a single source of error, was higher than the generalizability coefficient regardless of assessment type and scoring method. Hence we recommend that a generalizability coefficient based on multiple error sources, such as raters, content domains, and items, be considered in order to maintain a satisfactory level of reliability in both assessment types.
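To illustrate how rater error enters a generalizability coefficient, the sketch below estimates variance components for the simplest fully crossed persons × raters design with one score per cell. It is not the design used in the study, which also involves content domains and items. The expected-mean-square estimators are the standard ones for this one-facet design; the function names and the toy scores are hypothetical, and negative estimates are clipped to zero as a common convention.

```python
import numpy as np

def g_study_p_by_r(x):
    """Variance components for a crossed persons x raters design."""
    x = np.asarray(x, dtype=float)
    n_p, n_r = x.shape
    grand = x.mean()
    person_means = x.mean(axis=1)
    rater_means = x.mean(axis=0)

    ms_p = n_r * ((person_means - grand) ** 2).sum() / (n_p - 1)
    ms_r = n_p * ((rater_means - grand) ** 2).sum() / (n_r - 1)
    resid = x - person_means[:, None] - rater_means[None, :] + grand
    ms_pr = (resid ** 2).sum() / ((n_p - 1) * (n_r - 1))

    return {
        "p": max((ms_p - ms_pr) / n_r, 0.0),   # person (universe score) variance
        "r": max((ms_r - ms_pr) / n_p, 0.0),   # rater severity variance
        "pr,e": ms_pr,                          # interaction confounded with error
    }

def g_coefficient(vc, n_raters):
    """Relative (norm-referenced) generalizability coefficient."""
    return vc["p"] / (vc["p"] + vc["pr,e"] / n_raters)

def phi_coefficient(vc, n_raters):
    """Absolute index of dependability; rater severity counts as error."""
    return vc["p"] / (vc["p"] + (vc["r"] + vc["pr,e"]) / n_raters)

# Hypothetical scores: 4 applicants (rows) rated by 2 raters (columns).
ratings = np.array([[3, 4], [5, 5], [2, 4], [4, 6]])
vc = g_study_p_by_r(ratings)
print(vc)
print("G  :", g_coefficient(vc, n_raters=2))
print("Phi:", phi_coefficient(vc, n_raters=2))
```

Because the rater main effect enters only the absolute error term, the index of dependability can never exceed the relative coefficient for the same data.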

An Application of Multivariate Generalizability Theory to Teacher Recommendation Letters and Self-introduction Letters Used in Selection of Mathematically Gifted Students by Observation and Nomination (관찰·추천제에 의한 수학영재 선발 시 사용되는 교사추천서와 자기소개서 평가에 대한 다변량 일반화가능도 이론의 활용)

  • Kim, Sung Yeun;Han, Ki Soon
    • Journal of Gifted/Talented Education / v.23 no.5 / pp.671-695 / 2013
  • This study provides an illustrative example of using multivariate generalizability theory. Specifically, it investigates the relative effects of each error source and finds the optimal number of items within each content domain, that is, the measurement conditions that maximize reliability-like coefficients such as the generalizability coefficient and the index of dependability. The method is applied to teacher recommendation letters and self-introduction letters scored with an analytic scoring method in the context of selecting mathematically gifted students by observation and nomination. This study analyzed data from the 2011 academic year at a science education institute for the gifted attached to a university located in the Seoul metropolitan area. It should be noted that the optimal scoring structures found in this study are not generalizable to other selection instruments. However, the methodology applied in this study can be used to find optimal measurement conditions for the number of raters, the number of content domains, and the number of items in other selection instruments developed in-house by many institutions, including the education institutes for the gifted at provincial offices of education, gifted classes, and the science education institutes for the gifted attached to universities in general. In addition, the methodology will provide a basis for making informed decisions about selection instruments for the gifted based on measurement traits.

Analysis of Assessment Types, Scoring Methods and Reliability of Science Performance Assessment in Middle and High School (중등학교 과학 수행평가의 평가 유형과 채점 방식 및 신뢰도 분석)

  • Lee, Ki-Young;An, Hui-Soo
    • Journal of The Korean Association For Science Education / v.25 no.2 / pp.173-183 / 2005
  • In this study, we asked which assessment types and scoring methods of science performance assessment (SPA) were being used in middle and high schools, and how reliable (generalizable) the resulting SPA scores were. To answer these questions, SPA data obtained from seven schools were classified according to assessment type and scoring method. Based on this classification, we analyzed reliability by applying generalizability theory. The classification showed that the SPA of the seven schools fell into two types: a paper-pencil type and a task type. The paper-pencil type consisted solely of answer (content)-restricted essay-type tests. The task type had two parts: process assessment and outcome assessment. The analysis of scoring methods found two cases: in one, a single teacher scored all essay-type items and performance tasks; in the other, two teachers each scored the performance tasks assigned to them. No case was found in which essay-type items were divided among teachers or in which two or more teachers cross-scored the same work. The findings of the reliability analysis are as follows: (1) The effect of essay-type items on the SPA score was larger than that of performance tasks. (2) The person-by-rater interaction effect in scoring performance tasks differed markedly among the seven schools. (3) Most generalizability (reliability) coefficients of SPA for the seven schools were smaller than the acceptable generalizability coefficient (0.80). Therefore, the numbers of items, tasks, and raters should be increased in order to approach the acceptable generalizability level.
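The recommendation to add items, tasks, or raters until the coefficient reaches 0.80 corresponds to a decision (D) study. The sketch below projects the relative generalizability coefficient for a fully crossed persons × tasks × raters design and searches for the smallest number of tasks that meets a target. The variance components are invented for illustration and are not taken from the study; the function names are hypothetical.

```python
def projected_g(var_p, var_pt, var_pr, var_ptr_e, n_tasks, n_raters):
    """Relative G coefficient for a crossed p x t x r design in a D study.

    Each error component is divided by the number of conditions
    over which a person's score is averaged.
    """
    rel_error = (var_pt / n_tasks
                 + var_pr / n_raters
                 + var_ptr_e / (n_tasks * n_raters))
    return var_p / (var_p + rel_error)

def min_tasks_for_target(var_p, var_pt, var_pr, var_ptr_e,
                         n_raters, target=0.80, max_tasks=50):
    """Smallest number of tasks reaching `target`, or None if not reached."""
    for n_tasks in range(1, max_tasks + 1):
        g = projected_g(var_p, var_pt, var_pr, var_ptr_e, n_tasks, n_raters)
        if g >= target:
            return n_tasks
    return None

# Hypothetical variance components (person, person x task,
# person x rater, residual); illustrative values only.
components = dict(var_p=1.0, var_pt=0.5, var_pr=0.2, var_ptr_e=0.6)

for n_raters in (1, 2, 3):
    needed = min_tasks_for_target(**components, n_raters=n_raters)
    print(f"raters={n_raters}: tasks needed for 0.80 -> {needed}")
```

With a single rater the person-by-rater component is never averaged down, so adding tasks alone may need far more tasks than is practical; this is one reason D studies vary several facets at once.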

A Comparative Study of a New Approach to Keyword Analysis: Focusing on NBC (키워드 분석에 대한 최신 접근법 비교 연구: 성경 코퍼스를 중심으로)

  • Ha, Myoungho
    • Journal of Digital Convergence / v.19 no.7 / pp.33-39 / 2021
  • This paper aims to analyze the lexical properties of keyword lists extracted from the NLT Old Testament Corpus (NOTC), the NLT New Testament Corpus (NNTC), and the NLT Bible Corpus (NBC), and to show that text dispersion keyness is more effective than corpus frequency keyness. For this purpose, NOTC, comprising around 570,000 running words, and NNTC, comprising about 200,000, were compiled after downloading the files from the NLT website of Bible Hub. Scott's (2020) WordSmith 8.0 was used to extract keyword lists by comparing a target corpus with a reference corpus. The results demonstrated that text dispersion keyness reflected the lexical properties of keyword lists better than corpus frequency keyness, and that it was the superior measure for generating optimal keyword lists that fully meet content-generalizability and content distinctiveness.

Exploring the Reliability of an Assessment based on Automatic Item Generation Using the Multivariate Generalizability Theory (다변량일반화가능도 이론을 적용한 자동문항생성 기반 평가에서의 신뢰도 탐색)

  • Jinmin Chung;Sungyeun Kim
    • Journal of Science Education / v.47 no.2 / pp.211-224 / 2023
  • The purpose of this study is to suggest how to investigate the reliability of the assessment, which consists of items generated by automatic item generation using empirical example data. To achieve this, we analyzed the illustrative assessment data by applying the multivariate generalizability theory, which can reflect the design of responding to different items for each student and multiple error sources in the assessment score. The result of the G-study showed that, in most designs, the student effect corresponding to the true score of the classical test theory was relatively large after residual effects. In addition, in the design where the content domain was fixed, the ranking of students did not change depending on the item types or items. Similarly, in the design where the item format was fixed, the difficulty showed little variation depending on the content domains. The result of the D-study indicated that the original assessment data achieved a sufficient level of reliability. It was also found that higher reliability than the original assessment data could be obtained by reducing the number of items in the content domains of operation, geometry, and probability and statistics, or by assigning higher weights to the domains of letters and formulas, and function. The efficient measurement conditions presented in this study are limited to the illustrative assessment data. However, the method applied in this study can be utilized to determine the reliability and to find efficient measurement conditions for the various assessment situations using automatic item generation based on measurement traits.

Development and Validation of Life Safety Awareness Scale of High School Students and Analysis of Interindividual Differences

  • Lee, Soon-Beom;Kim, Eun-Mi;Kong, Ha-Sung
    • International journal of advanced smart convergence / v.11 no.4 / pp.104-119 / 2022
  • Diagnosing the level of life safety awareness is necessary for customized safety education and continuous safety awareness. Although high school is the starting stage of life-cycle safety education, a scale with verified reliability and validity for high school students' life safety awareness has not yet been developed. In this context, the purpose of this study is to develop and validate a life safety awareness scale for high school students and to analyze interindividual differences. Questionnaire data were collected from April to June 2022 from 834 students in the first, second, and third grades of high schools in △△ city in Jeollabuk-do. A final 25-item scale was developed using a preliminary survey, a preliminary test, the main test, descriptive statistical analysis, and exploratory and confirmatory factor analysis. The scale consists of four sub-factors: 'safety prevention', 'safety knowledge', 'safety preparation', and 'safety protection'. Good reliability and validity were verified by analysis of content validity and construct validity. The generalizability of the scale was verified by cross-validation between the exploratory group and the cross-validation group. The analysis of interindividual differences showed a difference in life safety awareness between genders, but no difference by grade level or academic achievement. This study is significant in developing the first valid scale that can measure high school students' life safety awareness and in providing the necessity and rationale for life-cycle safety education that considers individual gender differences.

A study on the manager's job satisfaction in franchise restaurant. (프랜차이즈 레스토랑 점장의 직무만족에 관한 연구)

  • 박대섭
    • Culinary science and hospitality research / v.6 no.1 / pp.225-252 / 2000
  • This study aims to examine the theoretical framework of franchise restaurants, the characteristics of the store manager's job, and the level of store managers' job satisfaction through an empirical investigation. The job satisfaction survey shows that store managers consider all of their duties important, with service management at the top. It also finds that the majority of store managers consider aptitude the most important job satisfaction factor, and that those who are satisfied with their job content, advancement, and prospects are more proactive in delivering quality service and more willing to commit themselves to their duties. Regarding demographic variables, store managers with higher educational attainment and higher pay are more likely to be satisfied with their job, but married men are generally not satisfied with the work environment. Businesses should therefore respond by capitalizing on store managers who are content with their duties, collecting additional information from them and providing opportunities to contribute further to the business. For dissatisfied individuals, businesses should determine their demands and supply motivation through education and training, making it possible to convert them into satisfied store managers who participate actively in business management. As with any study, this one has a number of limitations that constrain the generalizability of the empirical findings. Franchise restaurants were established in the domestic market only recently, and there have been few studies on this topic. Furthermore, managers are not willing to release operation-related data. Further studies are therefore needed to overcome these limitations and should examine other dimensions of job satisfaction, such as the relationship between revenue and profit and the level of store managers' job satisfaction.

Assessing the Validity of the Preclinical Objective Structured Clinical Examination Using Messick's Validity Framework (Messick의 타당도 틀을 활용한 임상실습 전 실기시험의 타당도 평가)

  • Lee, Hye-Yoon;Yune, So-Jung;Lee, Sang-Yeoup;Im, Sunju
    • Korean Medical Education Review / v.23 no.3 / pp.185-193 / 2021
  • Students must be familiar with clinical skills before starting clinical practice to ensure patients' safety and enable efficient learning. However, performance is mainly tested in the third or fourth year of medical school, and studies using the validity framework have not been reported in Korea. We analyzed the validity of a performance test conducted among second-year students in terms of content, response process, internal structure, relationships with other variables, and consequences, according to Messick's framework. The analysis showed that content validity was secured by developing cases according to a pre-determined blueprint. The quality of the response process was controlled by training and calibrating raters. The internal structure showed that (1) reliability estimated by generalizability theory was acceptable (coefficients of 0.724 and 0.786 for day 1 and day 2, respectively), and (2) the relevant domains had proper correlations, while the clinical performance examination (CPX) and objective structured clinical examination (OSCE) showed weaker relationships. OSCE/CPX scores were correlated with other variables, especially grade point average and oral structured exam scores. The consequences of this assessment were (1) making students learn clinical skills and study by themselves, while causing too much stress for students due to lack of motivation; (2) reminding educators of the need to apply practical teaching methods and to give feedback on the test results; and (3) providing an opportunity for faculty to consider developing support programs. It is necessary to develop the blueprint more precisely according to students' level and to verify the validity of the response process with statistical methods.

The Development and Validation of Instrument for Measuring High School Students' STEM Career Motivation (고등학생들을 위한 이공계 진로동기 검사도구 개발 및 타당화)

  • Shin, Sein;Ha, Minsu;Lee, Jun-Ki
    • Journal of The Korean Association For Science Education / v.36 no.1 / pp.75-86 / 2016
  • The purpose of the present study is to develop and validate an instrument to assess STEM career motivation. We developed 32 items for 7 constructs (i.e., education experience, career value, academic self-efficacy, career self-efficacy, career interest, parents' support, and career motivation) of STEM career motivation based on Social Cognitive Career Theory (SCCT; Lent et al., 1994). A total of 767 first-year high school students participated in this study. The items were validated using Messick's framework (1995). We examined the validity of the items in four aspects (i.e., the content, substantive, structural, and generalizability aspects of validity). Methodologically, we used Rasch analysis, exploratory factor analysis, and confirmatory factor analysis based on structural equation modelling. We confirmed that our 32-item instrument is valid and reliable for measuring STEM career motivation. In addition, we tested the STEM career motivation model based on SCCT. Our model explained the data well, suggesting that external factors (education experience and parents' support) and cognitive factors (perception of value, self-efficacy, and interest) were significantly related to STEM career motivation.