• Title/Abstract/Keywords: Item response theory

95 search results (processing time: 0.026 s)

An Investigation of a Country-Level Diagnostic Assessment Model for the TIMSS

  • 박찬호
    • 비교교육연구
    • /
    • Vol. 28, No. 5
    • /
    • pp.1-19
    • /
    • 2018
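  • The purpose of educational assessments such as the Trends in International Mathematics and Science Study (TIMSS) is to compare groups such as countries. When the unit of assessment is at a higher level than the student, a group-level diagnostic assessment model based on multilevel item response theory can be considered for comparing groups. This study proposes a new group-level diagnostic assessment model based on multilevel item response theory by modifying the item characteristics model proposed by Park and Bolt (2008). The modified model was applied to the eighth-grade mathematics assessment data from TIMSS 2015. As a result, country profiles on nine cognitive components were obtained. The results can be used for comparisons between countries or between components. The model, modified so that it can account for the residual term, is expected to have higher reliability and to provide more accurate information for cross-country comparisons. An example illustrates how the results can be interpreted, and the limitations of the study and directions for future research are discussed.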

Design and Implementation of a Learner Testing System with Item Response Theory

  • 송은하;박복자;하태령;정영식
    • 컴퓨터교육학회논문지
    • /
    • Vol. 6, No. 2
    • /
    • pp.1-8
    • /
    • 2003
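  • Existing learner testing systems have the drawback that the difficulty of each item is determined by the instructor's subjective viewpoint and opinion. This paper aims to develop a learner testing system that can assess each learner's individual ability and that enables individualized item-based assessment by presenting learners with items suited to their individual learning level, selected using the difficulty, discrimination, and guessing parameters. The system is developed using the three-parameter logistic model of item response theory within a computerized adaptive testing (CAT) approach.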
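
The three-parameter logistic model named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system: the function names are hypothetical, the optional scaling constant of 1.7 is omitted, and the maximum-information selection rule is an assumption (a common CAT choice that the abstract does not specify).

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty; c: guessing.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_correct_3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def pick_next_item(theta, bank, administered):
    """Return the index of the unused item with maximum information.

    ASSUMPTION: maximum-information selection; the source abstract only
    says items are matched to the learner's level.
    """
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information_3pl(theta, *bank[i]))


# Hypothetical item bank: (a, b, c) per item.
bank = [(1.0, -2.0, 0.2), (1.0, 0.0, 0.2), (1.0, 2.0, 0.2)]
next_item = pick_next_item(0.0, bank, administered=set())
```

For a learner with an estimated ability of 0, the item whose difficulty is closest to 0 carries the most information in this small bank, so it is presented first.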

A Comparative Study of Oswestry Back Pain Disability Questionnaire Versus Computer Adaptive Testing for Measuring Back Pain

  • Choi, Bong-Sam
    • 한국전문물리치료학회지
    • /
    • Vol. 20, No. 4
    • /
    • pp.22-31
    • /
    • 2013
  • The aim of the present study was to compare the measurement precision of the Oswestry Back Pain Disability Questionnaire (ODQ) and a computer adaptive testing (CAT) method. The ODQ has been regarded for decades as one of the most reliable condition-specific measures for back pain. A cross-sectional study was carried out with two independent convenience samples from two out-patient rehabilitation clinics: a back pain group ($n_1=42$) and a non-back pain group ($n_2=42$). Participants were asked to fill out the ODQ and the CAT of the International Classification of Functioning, Disability and Health-Activity Measure (ICF-AM). A series of Rasch analyses were performed to calculate person ability measures. In comparisons of relative precision, the CAT measures discriminated between the groups with greater precision than the ODQ measure did. The CAT measure therefore appears to be more effective than the ODQ measure in terms of measurement precision. By administering calibrated test items, CAT measures based on item response theory may offer measurement precision as well as efficiency.

Algorithm Generating Item Response Data Based on Multidimensional Item Response Theory

  • 김병욱;이원규
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2014 Spring Conference
    • /
    • pp.526-528
    • /
    • 2014
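  • This paper aims to develop an algorithm that generates examinees' item response data for simulation, based on a multidimensional item response theory model. The algorithm reads the parameters of the items that make up the test and generates normally distributed random variables representing the examinees' ability levels on each dimension. Based on the multidimensional item response theory model, it computes the probability that each examinee answers each item correctly and compares that probability with a uniformly distributed random number that determines the examinee's response. If the probability is greater than the random number, the examinee is taken to have answered correctly; otherwise, the response is taken to be incorrect. The program allows the number of examinees and the number of items to be adjusted. The algorithm is expected to contribute to simulation studies in educational measurement that use learners' item response data under multidimensional item response theory.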
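
The steps described in this abstract can be sketched as below. The abstract does not state which multidimensional model is used, so this sketch assumes a compensatory multidimensional two-parameter logistic form, P = 1 / (1 + exp(-(a·θ + d))), with independent standard normal abilities. The function name `generate_responses` and the parameter layout are hypothetical.

```python
import math
import random


def generate_responses(item_params, n_examinees, seed=None):
    """Simulate a 0/1 response matrix (examinees x items).

    item_params: list of (a, d) pairs, where a is a tuple of
    discriminations (one per dimension) and d is an intercept.
    ASSUMPTION: compensatory multidimensional 2PL model.
    """
    rng = random.Random(seed)
    n_dims = len(item_params[0][0])
    data = []
    for _ in range(n_examinees):
        # One normally distributed ability per dimension.
        theta = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        row = []
        for a, d in item_params:
            logit = sum(ak * tk for ak, tk in zip(a, theta)) + d
            p = 1.0 / (1.0 + math.exp(-logit))
            # Compare with a uniform random number: correct if p is larger.
            u = rng.random()
            row.append(1 if p > u else 0)
        data.append(row)
    return data


# Hypothetical two-dimensional test with three items.
params = [((1.2, 0.3), 0.0), ((0.4, 1.1), -0.5), ((0.8, 0.8), 0.5)]
responses = generate_responses(params, n_examinees=100, seed=42)
```

The number of examinees is set by `n_examinees` and the number of items by the length of `item_params`, which mirrors the adjustable settings mentioned in the abstract.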

Effectiveness of Medical Education Assessment Consortium Clinical Knowledge Mock Examination (2011-2016)

  • 이상엽;이예리;김미경
    • 의학교육논단
    • /
    • Vol. 20, No. 1
    • /
    • pp.20-31
    • /
    • 2018
  • Good assessment is crucial for providing feedback on the curriculum and for motivating students to learn. This study was conducted to perform item analysis on the Medical Education Assessment Consortium clinical knowledge mock examination (MEAC CKME) (2011-2016) and to evaluate the effects of several efforts to improve item quality, using both classical test theory and item response theory. The estimated difficulty index (P) and discrimination index (D) were calculated according to each course, item type, A (single best answer)/R (extended matching) type, and grading of item quality. The cut-off values used to evaluate P were: >0.8 (easy); 0.6-0.8 (moderate); and <0.6 (difficult). The cut-off value for D was 0.3. Appropriate items were defined as those with P between 0.25 and 0.75 and $D \geq 0.25$. Cronbach's $\alpha$ was used to assess reliability and was compared with that of the Korean Medical Licensing Examination (KMLE). The results showed that the recent mean difficulty and discrimination indices were 0.62 and 0.20 for the first MEAC CKME and 0.71 and 0.19 for the second MEAC CKME, respectively. Higher grade items evaluated by a self-checklist system had better D values than lower grade items, and the number of higher grade items gradually increased. The preview and editing process by experts maintained P, decreased the number of recall items, increased the number of appropriate items with better D values, and yielded higher reliability. In conclusion, the MEAC CKME (2011-2016) is deemed appropriate as an assessment to evaluate students' competence and prepare year four medical students for the KMLE. In addition, the self-checklist system for writing good items was useful in improving item quality.
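
The classical test theory quantities in this abstract can be illustrated with a short sketch. The abstract does not say how D was computed, so the upper-lower group method with 27% groups used here is an assumption, as are the function names. The thresholds for an appropriate item (P between 0.25 and 0.75, D at least 0.25) are taken from the abstract.

```python
from statistics import pvariance


def item_analysis(responses, group_fraction=0.27):
    """Return (P, D) lists for a 0/1 response matrix (examinees x items).

    P: proportion of examinees answering each item correctly.
    D: ASSUMPTION, upper-group minus lower-group proportion correct,
       with groups formed from the top and bottom 27% by total score.
    """
    n, k = len(responses), len(responses[0])
    totals = [sum(row) for row in responses]
    order = sorted(range(n), key=lambda i: totals[i])
    g = max(1, round(n * group_fraction))
    lower, upper = order[:g], order[-g:]
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    d = [
        (sum(responses[i][j] for i in upper)
         - sum(responses[i][j] for i in lower)) / g
        for j in range(k)
    ]
    return p, d


def cronbach_alpha(responses):
    """Cronbach's alpha; requires at least two items and nonzero total variance."""
    k = len(responses[0])
    item_var = sum(pvariance([row[j] for row in responses]) for j in range(k))
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - item_var / total_var)


def is_appropriate(p, d):
    """Thresholds from the abstract: 0.25 <= P <= 0.75 and D >= 0.25."""
    return 0.25 <= p <= 0.75 and d >= 0.25


# Hypothetical toy data: 4 examinees, 3 items.
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
P, D = item_analysis(scores)
alpha = cronbach_alpha(scores)
flags = [is_appropriate(p, d) for p, d in zip(P, D)]
```

Other definitions of D, such as a point-biserial correlation, would give different values, so reported cut-offs should be read against the method actually used.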

Computer-Based Testing and Construction of an Item Bank Database for Medical Education in Korea

  • 허선
    • 의학교육논단
    • /
    • Vol. 16, No. 1
    • /
    • pp.11-15
    • /
    • 2014
  • A number of medical schools in Korea have been using computer-based testing (CBT) to evaluate their students' scientific and/or clinical performance since the early 1990s. Introducing CBT to medical education has several advantages: first, presenting figures and audio-video files of clinical content is simple with CBT, making it possible to evaluate medical students' competency in navigating more realistic clinical situations at minimum cost; second, CBT enables automatic item analysis and score reporting. To establish CBT, an item bank must be constructed with item parameters such as difficulty and discrimination. To select more psychometrically sound items, the items must be analyzed according to item response theory. CBT has already been introduced in high-stakes tests such as the United States Medical Licensing Examination and the Medical Council of Canada Qualifying Examination. The National Health Personnel Examination Board in Korea is also planning to introduce a CBT-based version of the National Medical Examination soon. Thus, all medical schools in Korea will need to introduce CBT and construct item banks to prepare their students for their licensing examinations and to measure the students' competency more accurately.

A Construction Method for Personalized e-Learning System Using Dynamic Estimations of Item Parameters and Examinees' Abilities

  • Oh, Yong-Sun
    • International Journal of Contents
    • /
    • Vol. 4, No. 2
    • /
    • pp.19-23
    • /
    • 2008
  • This paper presents a novel method for constructing a personalized e-Learning system based on dynamic estimation of item parameters and learners' abilities, where the learning content objects are of the same intrinsic quality or homogeneously distributed and the estimations are carried out using item response theory (IRT). The system dynamically connects the test and the corresponding learning procedures. Test results are applied directly to estimate the examinee's ability and are used to modify the item parameters and the difficulties of learning content objects while the learning procedure is in operation. We define the learning unit 'Node' as the amount of learning objects operated before new parameters can be re-estimated. A Node contains various content objects, and the parameters estimated at the end of the current Node are applied directly to the next Node. We offer the most appropriate learning Node for a person's ability through the IRT estimation processes. As a result, this scheme improves learning efficiency in web-based e-Learning environments by offering the most appropriate learning objects and items to individual students according to their estimated abilities. The scheme can be applied to any e-Learning subject that has homogeneous learning objects and unidimensional test items. To construct the system, we present an operation scenario using the proposed system architecture with the essential databases and agents.

Investigating the Hierarchical Nature of Content and Cognitive Domains in the Mathematics Curriculum for Korean Middle School Students via Assessment Items

  • 송미영;김선희
    • 대한수학교육학회지:학교수학
    • /
    • Vol. 9, No. 2
    • /
    • pp.223-240
    • /
    • 2007
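  • This study used empirical data from a national-level assessment of middle school students' mathematics achievement to investigate whether the content of the Korean middle school mathematics curriculum and the cognitive behaviors in mathematics are organized hierarchically. Overall, the order in which content is presented in the curriculum showed no statistically significant correlation with the difficulty ranking, whereas the hierarchy of cognitive behaviors was significantly correlated with the difficulty ranking. These results indicate that the difficulty ranking of test items is related more to the level of cognitive behavior the items demand than to the order in which mathematical content is taught in school. In addition, the correlation between the content hierarchy and the cognitive behavior hierarchy was significant, showing that content appearing later in the curriculum tends to demand higher-level cognitive behavior. Items that showed unusual patterns in the correlation analysis between the content and cognitive behavior hierarchies and the difficulty ranking were analyzed for their characteristics.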

Predictors of Sun-Protective Practices among Iranian Female College Students: Application of Protection Motivation Theory

  • Dehbari, Samaneh Rooshanpour;Dehdari, Tahereh;Dehdari, Laleh;Mahmoudi, Maryam
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 16, No. 15
    • /
    • pp.6477-6480
    • /
    • 2015
  • Purpose: Given the importance of sun protection in the prevention of skin cancer, this study was designed to determine predictors of sun-protective practices among a sample of Iranian female college students based on protection motivation theory (PMT) variables. Materials and Methods: In this cross-sectional study, a total of 201 female college students at Iran University of Medical Sciences were selected. Demographic and PMT variables were assessed with a 67-item questionnaire. Multiple linear regression was used to identify demographic and PMT variables that were associated with sun-protective practices and intention. Results: One percent of participants always wore a hat with a brim, 3.5% gloves, and 15.9% sunglasses while outdoors. Only 10.9% regularly had their skin checked by a doctor. Perceived rewards, response efficacy, fear, self-efficacy, and marital status were the five variables that together explained 39% of the variance in participants' intention to perform sun-protective practices. Also, intention and response cost explained 31% of the variance in sun-protective practices. Conclusions: These predictive variables may be used to develop theory-based education interventions to prevent skin cancer among college students.

Development of an Instrument based on the Protection Motivation Theory to Measure Factors Influencing Women's Intention to First Pap Test Practice

  • Hassani, Laleh;Dehdari, Tahereh;Hajizadeh, Ebrahim;Shojaeizadeh, Davoud;Abedini, Mehrandokht;Nedjat, Saharnaz
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 15, No. 3
    • /
    • pp.1227-1232
    • /
    • 2014
  • Background: Given that there are many Iranian women who have never had a Pap smear, this study was designed to develop and validate a measurement tool based on the Protection Motivation Theory to assess factors influencing Iranian women's intention to undergo a first Pap test. Materials and Methods: In this psychometric research, a panel of experts (n=10) reviewed scale items to determine the Content Validity Index (CVI) and the Content Validity Ratio (CVR). Reliability was estimated through the Intraclass Correlation Coefficient (n=30) and internal consistency (n=240). Also, factor analysis (exploratory and confirmatory) was performed on the data of the sample of women who had never had a Pap smear test (n=240). Results: A 26-item questionnaire was developed. The CVI and CVR scores of the scale were 0.89 and 0.90, respectively. Exploratory factor analysis yielded a 26-item questionnaire with seven factors (perceived vulnerability and severity, fear, response costs, response efficacy, self-efficacy, and protection motivation (or intention)) that jointly accounted for 72.76% of the observed variance. Confirmatory factor analysis indicated a good fit for the data. Internal consistency (range 0.70-0.93) and test-retest reliability (range 0.72-0.96) of the sub-scales were acceptable. Conclusions: This study showed that the designed instrument was a valid and reliable tool for measuring the factors influencing women's intention to undergo their first Pap test.