• Title/Summary/Keyword: Assessment item

A Method for Developing Items to Assess Earth Science Creativity (지구과학 창의력 평가 문항 개발 방법에 관한 연구)

  • Lee, Hang-Ro
    • Journal of the Korean earth science society
    • /
    • v.24 no.3
    • /
    • pp.150-159
    • /
    • 2003
  • This study suggests methods for assessing scientific creativity and for developing items, which becomes possible when earth science knowledge and general creativity are applied together. According to the results, the gap in cognitive ability between creativity and scientific creativity was clearly delineated by the operational definitions of the terms. Four factors in the subcategory of scientific creativity (fluency, flexibility, elaboration, and originality) were selected, and developing items from these factors was found to be feasible. Operational definitions of the four factors were given, and criteria for assessment and scoring were set. The validity, reliability, discrimination, and difficulty required of assessment instruments were verified through three field trials of the assessment instruments for scientific creativity. The assessment instruments were composed of 8 items, with 2 items for each factor. The average item fitness index was 0.99; Cronbach's α, the internal consistency of the items, was 0.79; the inter-rater reliability of each item was 0.78; the inter-rater reliability of each factor was 0.75; the item discrimination power was 0.19; and the item difficulty was 0.00. Because these results were within the permitted limits of the conditions required for assessment instruments, the assessment instruments developed for scientific creativity in this study can be regarded as highly satisfactory.
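
As an illustration of the internal-consistency statistic reported in the abstract above, the following is a minimal sketch of Cronbach's α computed from an examinee-by-item score matrix. The function name and the sample data are hypothetical and are not taken from the study; the sketch assumes complete data and uses sample variances.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per examinee).

    Assumes at least two items, at least two examinees, no missing scores,
    and a nonzero variance of total scores.
    """
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across examinees.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of each examinee's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 3 examinees, 2 perfectly consistent items.
example = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(example)
```

With the study's design, each row would hold one examinee's scores on the 8 items; the 0.79 reported above would be the value such a computation returns on the actual data.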

Exploring the Reliability of an Assessment based on Automatic Item Generation Using the Multivariate Generalizability Theory (다변량일반화가능도 이론을 적용한 자동문항생성 기반 평가에서의 신뢰도 탐색)

  • Jinmin Chung;Sungyeun Kim
    • Journal of Science Education
    • /
    • v.47 no.2
    • /
    • pp.211-224
    • /
    • 2023
  • The purpose of this study is to suggest how to investigate the reliability of the assessment, which consists of items generated by automatic item generation using empirical example data. To achieve this, we analyzed the illustrative assessment data by applying the multivariate generalizability theory, which can reflect the design of responding to different items for each student and multiple error sources in the assessment score. The result of the G-study showed that, in most designs, the student effect corresponding to the true score of the classical test theory was relatively large after residual effects. In addition, in the design where the content domain was fixed, the ranking of students did not change depending on the item types or items. Similarly, in the design where the item format was fixed, the difficulty showed little variation depending on the content domains. The result of the D-study indicated that the original assessment data achieved a sufficient level of reliability. It was also found that higher reliability than the original assessment data could be obtained by reducing the number of items in the content domains of operation, geometry, and probability and statistics, or by assigning higher weights to the domains of letters and formulas, and function. The efficient measurement conditions presented in this study are limited to the illustrative assessment data. However, the method applied in this study can be utilized to determine the reliability and to find efficient measurement conditions for the various assessment situations using automatic item generation based on measurement traits.

What Pre-service Elementary School Teachers Focus on When Developing Assessment Items: Focusing on the Unit 'Weather and Our Lives' (초등 예비교사가 평가 문항 제작 시 주목하는 것은 무엇인가? : 날씨와 우리 생활 단원을 중심으로)

  • Sung-Man Lim;Seong-Un Kim
    • Journal of the Korean Society of Earth Science Education
    • /
    • v.17 no.2
    • /
    • pp.181-193
    • /
    • 2024
  • Summative assessment provides information on how well students have achieved learning objectives, making the development of high-quality assessment items essential for accurate evaluation. This is one of the competencies that teachers must possess. This study aims to analyze summative assessment items created by pre-service elementary teachers, examining their intentions and the difficulties encountered in the item development process. The study involved 45 second-year students enrolled in an elementary teacher training university. They were grouped into teams of three and tasked with developing ten items, documenting the purpose of each item, the answer key, and the challenges faced during item creation. The collected summative assessment items were analyzed using a two-dimensional purpose classification table that includes Klopfer's taxonomy of educational objectives. The intentions behind the summative assessments and the difficulties faced during item development were inductively organized and analyzed through qualitative data analysis. The results revealed that pre-service elementary teachers adequately reflected scientific content elements but did not evenly cover assessment domains. The most challenging aspect for them was adjusting the difficulty level. Although they considered most factors that should be taken into account during item development, these considerations were not reflected in the actual items. These findings suggest that knowledge and experience are crucial in developing summative assessment items, and systematic lectures are necessary for pre-service elementary teachers.

A Comparison of Free Response Items and Multiple Choice Items in Terms of Effectiveness of Estimating Mathematical Ability (수행형 문항과 선다형 문항의 수학적 능력 추정 효율성 비교)

  • Park, Jung;Park, Kyung-Mi
    • The Mathematical Education
    • /
    • v.43 no.2
    • /
    • pp.151-162
    • /
    • 2004
  • For the past several years, performance assessment has been widely used by mathematics teachers. The superiority of performance assessment items compared to multiple choice items has been discussed by many researchers; however, these discussions tend to lack empirical data. Thus, this study aims to examine the effectiveness of free response items in comparison with multiple choice items. Using the information function in Item Response Theory (IRT), the item information of free response items and multiple choice items from the Third International Mathematics and Science Study-Repeat (TIMSS-R) was obtained and compared. Test information for the whole mathematics area as well as for each content area of mathematics was computed. On average, free response items yielded more information than multiple choice items, especially in measurement and data interpretation. This study also revealed that free response items estimated students' mathematics ability more accurately than multiple choice items, with a smaller number of items.

Issues Related to the Objectivity of Student Assessment in Medical Education (의학교육 학생평가의 객관성에 대한 쟁점)

  • Min, Kyung-Seok;Yang, Kil-Seok
    • Korean Medical Education Review
    • /
    • v.15 no.3
    • /
    • pp.105-111
    • /
    • 2013
  • This paper addressed various issues related to the objectivity of student assessment in medical education. The objectivity of assessment was related to all the steps of test development, administration, and results reporting in terms of reliability and validity. Specifically, the objectivity of item formats, representativeness of test content, standardization of test administration, consistency of scoring procedures, and appropriateness of reporting test results were discussed by comparing performance assessment with traditional paper-and-pencil tests. The conclusions were derived from current measurement theories such as standards-based assessment, evidence-based design, and outcome-based assessment. Further, based on Shepard's propositions (2006), the objectivity of student assessment could be achieved by improving the concordance between educational objectives and assessment components such as item types, test content, test administration, scoring, and reporting.

Estimation of Validity for Item Selecting of Landscape Impact Assessment (경관영향평가 항목선정을 위한 타당성 평가)

  • Oh, Myung-Sung;Cho, Hyun-Ju;Lee, Hyun-Taek;Ra, Jung-Hwa
    • Current Research on Agriculture and Life Sciences
    • /
    • v.26
    • /
    • pp.7-15
    • /
    • 2008
  • This research is significant in that it estimates the validity of evaluation items, based on a questionnaire, from a perspective that integrates not only the aesthetic and visual areas but also the natural ecological areas. The results are as follows. 1) From the literature review, 17 items such as variety, the landscape character of the site, the beauty of the landscape, visibility, and ratio of green visibility were selected. In addition, 21 items such as variety of animal and plant species, size of green area, and ecological naturalness were selected in the area of landscape ecological resources. 2) In the questionnaire survey of a group of landscape experts, the animal and plant ecology area scored 5.6341, the highest in the importance analysis by assessment area. In the importance analysis of items within each area, the skyline analysis item, for example, scored 6.0488, the highest in the area of visual resources. 3) In the correlation analysis of item meaning for landscape impact assessment, the psychological assessment item and the landscape site item, for example, showed the highest correlation coefficient, 0.710. 4) As for the critical assessment items by project type, for point-type projects, items such as minimizing damage to original land features, ratio of green visibility, and variety of animal and plant species scored above 8.0 and were rated as highly important. For linear projects, minimizing damage to original land features and preserving valuable biotopes scored high, and the landscape character of the site, minimizing damage to original land features, the size of green area, and skyline analysis scored above 8.5 as highly important items. In contrast, items for climate and soil scored relatively low.

An Analysis of Characteristic and Factor about Middle School Science Descriptive Assessment Items (중학교 과학과 서술형 평가의 문항 특성 및 요인 분석)

  • Kim, Sungki;Choi, Eunju;Paik, Seounghey
    • Journal of the Korean Chemical Society
    • /
    • v.59 no.5
    • /
    • pp.445-453
    • /
    • 2015
  • In 2005, descriptive assessment was introduced to increase students' higher mental abilities, such as problem-solving ability and creativity. The proportion of descriptive assessment increases every year, and it is regarded as an alternative to multiple-choice items, which measure simple knowledge. Outwardly, descriptive assessment has taken root in schools, but it cannot be said with certainty that it meets its original goal. In this paper, science descriptive assessment items from 5 middle schools in Gyeonggi-do were analyzed, and the examiners were interviewed about how well they understood the characteristics of the assessment items. According to the analysis, the characteristics of the items are ① unequal distribution across units, ② differences in item type by unit, and ③ the disappearance of the measurement of higher mental ability. Several factors are considered to underlie these characteristics: teachers' lack of ability to write assessment items and their lack of understanding of the assessment instrument. These factors can be explained by a lack of assessment expertise. Therefore, a society-wide effort is needed to raise teachers' ability in descriptive assessment.

Review of Compositional Evaluation Items for Environmental Conservation Value Assessment Map(ECVAM) of National Land in Korea (국토환경성평가지도 평가항목 구성의 적정성 검토)

  • Jeon, Seong Woo;Lee, Moung Jin;Song, Won Kyong;Sung, Hyun Chan;Park, Wook
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.11 no.1
    • /
    • pp.1-13
    • /
    • 2008
  • This study reviews the composition of the evaluation items for the Environmental Conservation Value Assessment Map (ECVAM) of national land in Korea. The ECVAM is composed of legal assessment items and environmental/ecological assessment items, and it basically adopts an overlay method for the environmental/ecological assessment items. The objective of this study is to suggest supplementary items for the ECVAM. To this end, the overlapping rates among the assessment items in the ECVAM were calculated in order to understand the grade distribution of the environmental conservation value assessment; as a result, various items were found to overlap one another. To reflect each assessment item effectively in the ECVAM, we analyzed the degree of overlap among the environmental/ecological items and investigated the grade distribution by field survey. We assessed the ECVAM using five methods: Method 1 examines the Grade 1 areas of each administrative district; Method 2 compares the overlapping Grade 1 and 2 areas of each assessment item, permitting duplication among items; Method 3 examines Grade 1 and 2 areas determined by a single assessment item only; Method 4 examines only the Grade 1 areas of Method 2; and Method 5 examines only the Grade 2 areas of Method 2. Method 1 showed that Seoul and other metropolitan cities have a high proportion of Grade 1 regions by the legal assessment items, whereas Kangwon-Do shows a high proportion of Grade 1 regions by the environmental/ecological assessment items. Method 2 showed 93.4% for diameter Grade 2 (the standard for stability); the forest diameter item accounted for 99.9% under Method 3 and 95.7% under Method 4; and forest density accounted for 66.4% under Method 5. This study will help reduce the complexity of producing the ECVAM of national land and increase flexibility in managing and updating the map.

An Assessment of the Usefulness of Time of Flight in Magnetic Resonance Angiography Covering the Aortic Arch

  • Yoo, Yeong-Jun;Choi, Sung-Hyun;Dong, Kyung-Rae;Ji, Yun-Sang;Choi, Ji-Won;Ryu, Jae-Kwang
    • Journal of Radiation Industry
    • /
    • v.12 no.4
    • /
    • pp.325-332
    • /
    • 2018
  • Carotid angiography covering the aortic arch includes contrast-enhanced magnetic resonance angiography (CEA), which is applied to a large region and usually employs contrast media. However, the use of contrast media can be dangerous in infants, pregnant women, and patients with chronic renal failure (CRF). Follow-up patients informed of a lesion may also want to avoid constant exposure to contrast media. We aimed to apply time-of-flight (TOF) angiography to a large region and compare its usefulness with that of CEA. Ten patients (mean age, 58 years; range, 45~75 years) who visited our hospital for magnetic resonance angiography (MRA) participated in this study. A 3.0 Tesla Achieva magnetic resonance imaging (MRI) system (Philips, Netherlands) and the SENSE NeuroVascular 16-channel coil were employed for both methods. Both methods were applied simultaneously to the same patient. Three TOF stacks were connected to cover the aortic arch through the circle of Willis, and CEA was applied in the same manner. For the quantitative assessment, the acquired images were used to set the regions of interest (ROIs) in the common carotid artery (CCA) bifurcation, internal carotid artery, external carotid artery, middle cerebral artery, and vertebral artery, and to obtain the signal-to-noise ratio (SNR) and the contrast-to-noise ratio (CNR) for the soft tissues. Three radiologists and one radiological resident performed the qualitative assessment on a 5-point scale (1 point, "very bad"; 2 points, "bad"; 3 points, "average"; 4 points, "good"; and 5 points, "very good") with regard to 4 items: (1) sharpness, (2) distortion, (3) vein contamination, and (4) expression of peripheral vessels. For the quantitative assessment, we estimated the mean SNR and CNR in each of the 5 ROIs. In general, the mean SNR was higher in TOF angiography (166.1, 205.2, 154.39, 172.23, and 161.95) than in CEA (92.05, 95.43, 84.76, 73.69, and 88.3). Both methods had a similar mean CNR: 67.62, 106.71, 55.9, 73.74, and 63.46 for TOF angiography, and 67.82, 71.19, 60.52, 49.45, and 64.07 for CEA. In all ROIs, the difference in mean SNR was statistically significant (p<0.05), whereas the difference in mean CNR was not (p>0.05). The mean values of TOF angiography and CEA for each item in the qualitative assessment were 4.2 and 4.28, respectively, for item 1; 2.93 and 4.55, respectively, for item 2; 4.6 and 3.13, respectively, for item 3; and 2.88 and 4.65, respectively, for item 4. Therefore, TOF angiography had a higher mean for item 3, and CEA had a higher mean for items 2 and 4; there was no significant difference between the two methods for item 1. The results for item 1 were statistically insignificant (p>0.05), whereas the results for items 2~4 were statistically significant (p<0.05). Both methods have advantages and disadvantages, and they complement each other. However, CEA is usually applied to a large region covering the aortic arch. Time-of-flight angiography may be useful for people such as infants, pregnant women, CRF patients, and follow-up patients for whom the use of contrast media can be dangerous or unnecessary, depending on the circumstances.

Development and Evaluation of Criterion-Referenced Performance Assessment Items Based on the 7th National Science Curriculum -Subject Unit of Reproduction and Biological Accumulation- (제7차 교육과정에 근거한 준거지향적 수행평가 문항의 개발과 평가 -고등학교 과학 "생식"과 "생물 농축" 단원을 중심으로-)

  • Chung, Young-Lan;Park, Jin-Joo
    • Journal of The Korean Association For Science Education
    • /
    • v.24 no.3
    • /
    • pp.519-531
    • /
    • 2004
  • In recent years, there has been an increased emphasis on performance assessment to evaluate students' abilities, and Korea has introduced changes in testing and assessment. With the implementation of the 7th National Science Curriculum, additional work on efficacy, reliability, and comparability has been needed in order to develop performance assessment items, as have criteria for professional and technical standards. The purpose of this study was to extract key concepts and to develop achievement standards, assessment standards, and performance assessment items based on the 7th National Science Curriculum for the units on reproduction (chapter 13) and biological accumulation (chapter 17). This study also examined the validity of the completed performance assessment items based on classical test theory and polytomous item response theory. Twelve key concepts were extracted from chapter 13 (reproduction) and four from chapter 17 (biological accumulation). Twenty-six achievement standards were developed for chapter 13 and nine for chapter 17. The achievement standards were specified in terms of knowledge (K), process skill (P), and attitude (A). Twenty-five assessment standards were developed for chapter 13 and nine for chapter 17. Based on the developed achievement standards and assessment standards, twenty-two performance assessment items (seventeen open-ended questions, three essays, and two portfolios) with concrete grading criteria were developed. Eight open-ended items, four per chapter, were administered to 240 10th graders to evaluate the reliability of the test. The results suggest that the administered items were valid for performance assessment because item difficulties and item discriminations were appropriate. There was little difference in item discrimination between the interpretation from classical test theory and that from polytomous item response theory. However, there were some differences in item difficulties between the interpretations of the two theories because the characteristics of the examinees are reflected in classical test theory.
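
The abstract above does not give the formulas behind its classical test theory indices. As a sketch under common conventions, the following computes item difficulty as the mean score divided by the maximum possible score (suitable for open-ended items scored in several categories) and item discrimination as the correlation between an item and the total of the remaining items. The function names and data are hypothetical, and the study may have used different definitions.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_difficulty(item_scores, max_score):
    """Classical difficulty index: mean score as a fraction of the maximum."""
    return mean(item_scores) / max_score


def item_discrimination(scores, j):
    """Corrected item-total correlation for item j (item vs. rest of test)."""
    item = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]
    return pearson(item, rest)


# Hypothetical scores: 3 examinees, 2 items (item 0 scored 0 to 2).
scores = [[0, 1], [1, 2], [2, 3]]
difficulty_0 = item_difficulty([row[0] for row in scores], max_score=2)
discrimination_0 = item_discrimination(scores, 0)
```

Because both indices are computed from the observed scores of one particular group, they depend on who took the test, which is the sample dependence of classical test theory that the abstract contrasts with item response theory.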