• Title/Summary/Keyword: writing rater

Search results: 9

A FACETS Analysis of Rater Characteristics and Rater Bias in Measuring L2 Writing Performance

  • Shin, You-Sun
    • English Language & Literature Teaching / v.16 no.1 / pp.123-142 / 2009
  • The present study used multi-faceted Rasch measurement to explore the characteristics and bias patterns of non-native raters when they scored L2 writing tasks. Three raters scored 254 writing tasks written by Korean university students on two topics adapted from the TOEFL Test of Written English (TWE). The written products were assessed using a five-category rating scale (Content, Organization, Language in Use, Grammar, and Mechanics). The raters differed in severity only across rating categories, not across task types. Overall, the raters scored Grammar most harshly and Organization most leniently. The results also indicated several bias patterns with regard to the rating categories and task types. In rater-task bias interactions, each rater showed recurring bias patterns between the two writing tasks. Analysis of rater-category bias interactions showed that the three raters exhibited biased patterns across all the rating categories, though they were relatively consistent in their rating. The study has implications for the importance of rater training and task selection in L2 writing assessment. (A sketch of the many-facet Rasch model underlying this kind of FACETS analysis follows this entry.)

  • PDF
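For context, a FACETS analysis like the one above rests on the many-facet Rasch model. The equation below is a minimal, conventional statement of that model, not the exact parameterization reported in the paper: P_nijk is the probability that examinee n receives category k rather than k-1 from rater j on task i, B_n is examinee ability, D_i task difficulty, C_j rater severity, and F_k the difficulty of rating-scale step k.

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Rater severity comparisons and rater-task or rater-category bias interactions are read off from the estimated C_j terms and the patterns in their residuals.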

A Study on the Features of Writing Rater in TOPIK Writing Assessment (한국어능력시험(TOPIK) 쓰기 평가의 채점 특성 연구)

  • Ahn, Su-hyun;Kim, Chung-sook
    • Journal of Korean language education / v.28 no.1 / pp.173-196 / 2017
  • Writing is a subjective, performative activity, and writing ability is multi-faceted and complex. To assess examinees' writing ability accurately and provide meaningful writing scores, raters must first be competent in assessment. This study therefore serves as fundamental research on rater characteristics in the TOPIK writing assessment. 150 scripts from examinees of the 47th TOPIK were randomly selected and rated independently by 20 raters. The many-facet Rasch model was used to generate individualized feedback reports on each rater's relative severity and consistency with respect to particular categories of the rating scale; the analysis was conducted with the FACETS ver. 3.71.4 program. Overfit and misfit raters had considerable difficulty distinguishing between assessment factors and interpreting the criteria. Writing raters appeared to be confused when interpreting the assessment criteria, and overfit and misfit teachers in particular interpreted the criteria arbitrarily. The main cause of overfit and misfit was confusion about assessment factors and criteria when looking for a basis for scoring. Therefore, more rater training and research grounded in these characteristics of writing assessment is needed. This study is significant in that it comprehensively examined the assessment characteristics of writing raters and visually confirmed patterns of assessment error.
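The overfit and misfit judgments above come from rater fit statistics of the kind FACETS reports. The sketch below shows, with purely illustrative data, how infit and outfit mean squares can be computed for one rater from observed scores and the expectations and variances of a fitted Rasch model; the function name and values are assumptions for illustration, not taken from the study.

```python
import numpy as np

def rater_fit(observed, expected, variance):
    """Infit/outfit mean-square statistics for one rater.

    observed, expected, variance: 1-D arrays over all responses the rater
    scored; expected values and variances come from a fitted Rasch model.
    """
    resid = observed - expected
    z_sq = resid ** 2 / variance                 # squared standardized residuals
    outfit = z_sq.mean()                         # unweighted mean square
    infit = (resid ** 2).sum() / variance.sum()  # information-weighted mean square
    return infit, outfit

# Illustrative values only: mean squares far outside roughly 0.5-1.5
# are commonly flagged as overfit (too predictable) or misfit (erratic).
obs = np.array([3.0, 2.0, 4.0, 1.0, 3.0])
exp = np.array([2.6, 2.2, 3.5, 1.8, 2.9])
var = np.array([0.8, 0.9, 0.7, 0.6, 0.8])
print(rater_fit(obs, exp, var))
```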

A Case Study on Rater Training for Pre-service Korean Language Teacher of Native Speakers and Chinese Speakers (한국인과 중국인 예비 한국어 교사 대상 채점자 교육 사례)

  • Lee, Duyong
    • Journal of Korean language education / v.29 no.1 / pp.85-108 / 2018
  • This study points out that many novice Korean language teachers who lack rater training are nevertheless scoring learners' writing. To explore the possibility of promoting rater training within a Korean language teacher training course, the study conducted and analyzed a case in which pre-service teachers received such training. Pre-service teachers majoring in Korean language education at graduate school scored TOPIK compositions and received feedback generated by the FACETS program, which was then discussed in rater meetings. Across three scoring rounds, the raters scored with awareness of their own rating patterns and showed either positive change or overcorrection caused by excessive self-consciousness. Consequently, ongoing training can improve rating ability, and given that professional rater training is difficult to arrange, the method combining FACETS analysis and rater meetings showed positive effects. The rater training that included native Korean and non-native (Chinese) speakers together showed no significant difference by mother tongue, only individual differences. This can be read as a positive implication for the rating reliability of non-native speakers with advanced Korean proficiency, although the finding must be confirmed through extended research.

Investigating Learners' Perception on Their Engagement in Rating Procedures

  • Lee, Ho
    • English Language & Literature Teaching / v.13 no.2 / pp.91-108 / 2007
  • This study investigates learners' perceptions of their engagement in rating activities in the EFL essay-writing context. It addresses two research questions: 1) What attitude do students show toward their participation in rating tasks? and 2) Which of three factors (degree of rating experience, exposure to English composition instruction and learning, and proficiency level) significantly influences learners' rating activities? 104 EFL learners participated in a rater training session; after completing it, they rated three sample essays and peer essays using the given scoring guide. Analysis of the survey responses showed that students had a positive attitude toward their engagement in the rating tasks. For research question 2, only L2 writing proficiency substantially affected students' perceptions of the rating tasks. Advanced-level participants did not feel stressed about being graded by peers, as lower-level participants did. They were also critical of the benefits of self- and peer-assessment, suggesting that peer feedback on their own essays was not very useful and that self-rating did not fully help learners identify their writing proficiency.

  • PDF

Direct Instruction and Use of Online English Writing Software on EMI Class-Takers' Self-Efficacy

  • Murdoch, Yvette Denise;Kang, Alin
    • International Journal of Contents / v.15 no.4 / pp.97-106 / 2019
  • EMI (English as a Medium of Instruction) classes are now accepted policy at Korean universities, yet students often struggle with the academic English writing they require. The present study examined an EMI class that used direct instruction and access to online assistive English writing software. In a preliminary analysis, 26 students expressed interest in how an EMI academic writing class could help improve their English writing skills. Study participants completed a survey on self-efficacy and learning needs as well as the assignments for an EMI academic writing class. To establish inter-rater reliability, three trained raters assessed students' essays before and after the instructional intervention; Fleiss' kappa statistics showed moderate reliability. Students' opinions on the use of the online software were also analysed. A paired t-test was run on the quality of students' pre- and post-instruction assignments, and there was a significant difference in the rated scores. Self-efficacy showed a moderate positive association with improved post-essay writing scores.
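As a rough illustration of the two statistics named above, the sketch below computes Fleiss' kappa for a set of essays scored by three raters and a paired t-test on pre- versus post-instruction scores, using standard SciPy and statsmodels functions. All data values are invented for illustration and are not the study's data.

```python
import numpy as np
from scipy.stats import ttest_rel
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Illustrative data only: six essays, each scored by three raters on a 1-5 band.
ratings = np.array([
    [3, 3, 4],
    [2, 2, 2],
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
])

# Fleiss' kappa expects a subjects-by-categories count table.
table, _ = aggregate_raters(ratings)
print("Fleiss' kappa:", fleiss_kappa(table, method="fleiss"))

# Paired t-test on each student's mean pre- vs post-instruction score
# (hypothetical values).
pre = np.array([2.7, 3.0, 2.3, 3.7, 2.0, 3.3])
post = np.array([3.3, 3.7, 3.0, 4.0, 2.7, 3.7])
print(ttest_rel(pre, post))
```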

Measuring plagiarism in the second language essay writing context (영작문 상황에서의 표절 측정의 신뢰성 연구)

  • Lee, Ho
    • English Language & Literature Teaching / v.12 no.1 / pp.221-238 / 2006
  • This study investigates the reliability of plagiarism measurement in the ESL essay-writing context. It addresses two research questions: 1) How does plagiarism measurement affect test reliability from a psychometric viewpoint? and 2) How do raters conceive of plagiarism in their analytic scoring? The study uses a mixed methodology combining quantitative and qualitative techniques. Thirty-eight international students took an ESL placement writing test offered by the University of Illinois. Two native expert raters rated students' essays on five analytic features (organization, content, language use, source use, plagiarism) and assigned a holistic score using a scoring benchmark. For research question 1, the study, using G-theory and the many-facet Rasch model, found that plagiarism measurement threatened test reliability. For research question 2, two native raters and one non-native rater responded in email correspondence that plagiarism was not a valid analytic area to measure in a large-scale writing test; they viewed plagiarism as a difficult area to measure. In conclusion, this study proposes that a systematic training program on avoiding plagiarism be given to students, and suggests that plagiarism can be measured more reliably in small-scale classroom tests. (A sketch of the persons-by-raters G-study calculation referred to here follows this entry.)

  • PDF
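For readers unfamiliar with G-theory, the sketch below shows the core calculation of a simple persons-by-raters G-study: variance components are estimated from a two-way crossed design and combined into a generalizability coefficient, which drops as rater-related variance grows. The design, function name, and data are illustrative assumptions, not the analysis reported in the paper.

```python
import numpy as np

def g_study_p_x_r(scores):
    """Variance components and generalizability coefficient for a
    persons x raters crossed design (one score per cell)."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    ms_p = n_r * ((person_means - grand) ** 2).sum() / (n_p - 1)
    ms_r = n_p * ((rater_means - grand) ** 2).sum() / (n_r - 1)
    resid = scores - person_means[:, None] - rater_means[None, :] + grand
    ms_pr = (resid ** 2).sum() / ((n_p - 1) * (n_r - 1))

    var_pr_e = ms_pr                        # person x rater interaction + error
    var_p = max((ms_p - ms_pr) / n_r, 0.0)  # person (true-score) variance
    var_r = max((ms_r - ms_pr) / n_p, 0.0)  # rater (severity) variance

    g_coef = var_p / (var_p + var_pr_e / n_r)  # relative G coefficient
    return var_p, var_r, var_pr_e, g_coef

# Illustrative scores: five essays x two raters on a holistic 1-6 scale.
scores = np.array([[4, 5], [3, 3], [5, 6], [2, 3], [4, 4]], dtype=float)
print(g_study_p_x_r(scores))
```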

Development of the Korean Handwriting Assessment for Children Using Digital Image Processing

  • Lee, Cho Hee;Kim, Eun Bin;Lee, Onseok;Kim, Eun Young
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4241-4254 / 2019
  • The efficiency and accuracy of handwriting measurement could be improved by adopting digital image processing. This study developed a computer-based Korean Handwriting Assessment tool. Second graders participated by performing writing tasks involving consonants, vowels, words, and sentences. We extracted boundary parameters for each letter using digital image processing and calculated the variables of size, size coefficient of variation (CV), misalignment, inter-letter space, inter-word space, and the ratio of inter-letter space to inter-word space. The children were also administered traditional handwriting and visuomotor tests, and the digital variables were correlated with these tests. Using these correlations, we established a three-point scoring system that computed a test score for each variable. We analyzed inter-rater reliability between the computer rater and a human rater, and test-retest reliability between the first and second performances. Validity was examined by analyzing the relationship between the Korean Handwriting Assessment and the existing handwriting and visuomotor tests. We propose the Korean Handwriting Assessment for measuring size, size consistency, misalignment, inter-letter space, inter-word space, and space ratio using digital image processing. The tool proved reliable and valid and is expected to be useful for assessing children's handwriting.
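The "boundary parameters" mentioned above are essentially per-letter bounding boxes. The sketch below illustrates one plausible way to obtain such boxes and derive size, size CV, and inter-letter spacing with OpenCV; the thresholding choices and function name are assumptions for illustration, not the authors' implementation, and a real pipeline would also filter out noise contours and separate letters within words.

```python
import cv2
import numpy as np

def letter_metrics(image_path):
    """Rough sketch: extract letter bounding boxes from a scanned line of
    handwriting and derive size, size CV, and inter-letter spacing."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)

    # (x, y, w, h) boxes, ordered left to right.
    boxes = sorted(cv2.boundingRect(c) for c in contours)
    heights = np.array([h for _, _, _, h in boxes], dtype=float)

    size = heights.mean()                    # mean letter height (pixels)
    size_cv = heights.std() / size           # size coefficient of variation
    gaps = [boxes[i + 1][0] - (boxes[i][0] + boxes[i][2])
            for i in range(len(boxes) - 1)]  # inter-letter spaces (pixels)
    return size, size_cv, gaps
```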

Engaging pre-service English teachers in the rubric development and the evaluation of a creative English poetry (예비 영어교사 주도에 의한 영미시 평가표 제작 및 평가 수행에 관한 연구)

  • Lee, Ho;Jun, So-Yeon
    • English Language & Literature Teaching / v.17 no.4 / pp.339-356 / 2011
  • This study explored pre-service English teachers' participation in developing a rubric and examined their evaluation of their own English poetry. It investigated: 1) the pre-service English teachers' perceptions of themselves as rubric developers and self-evaluators, 2) the number of analytic areas the participants included in their rubrics and the scoring schemes they designed, and 3) the inter-rater differences between self-assessment and expert assessment across analytic areas. Twenty-four EFL learners participated. The researchers analyzed the learners' English poems, their field notes documenting the writing process, their rubrics, their self-assessment scores, and the expert raters' scores. The results revealed that the learners responded positively to learner-directed assessment, that they considered 'content' the most important area, and that inter-rater differences were small across all analytic areas.

  • PDF

Building an Automated Scoring System for a Single English Sentences (단문형의 영작문 자동 채점 시스템 구축)

  • Kim, Jee-Eun;Lee, Kong-Joo;Jin, Kyung-Ae
    • The KIPS Transactions: Part B / v.14B no.3 s.113 / pp.223-230 / 2007
  • The purpose of developing an automated scoring system for English composition is to score tests of English sentence writing and to give feedback on them without human effort. This paper presents an automated system for scoring English composition whose input is a single sentence rather than an essay. Taking a single sentence as input has advantages in comparing the input with the answers provided by human teachers and in giving detailed feedback to test takers. The system was developed and tested with real test data collected from English tests given to third-grade junior high school students. Scoring a single sentence requires two steps. The first is analyzing the input sentence to detect possible errors, such as spelling and syntactic errors. The second is comparing the input sentence with the given answer and identifying the differences as errors. The results produced by the system were then compared with those provided by human raters.
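As a toy illustration of the two-step process described above, the sketch below first flags tokens missing from a deliberately tiny, hypothetical vocabulary and then aligns the input with a reference answer using difflib to surface the remaining differences as errors. It is not the authors' system; their error detection also includes syntactic analysis.

```python
import difflib

# A tiny stand-in vocabulary; a real system would use a full dictionary
# and a syntactic checker in step 1.
VOCAB = {"she", "goes", "go", "to", "school", "every", "day"}

def detect_spelling_errors(sentence):
    """Step 1: flag tokens not found in the vocabulary."""
    return [w for w in sentence.lower().split() if w.strip(".,!?") not in VOCAB]

def diff_against_answer(sentence, answer):
    """Step 2: align the input with a reference answer and report differences."""
    sm = difflib.SequenceMatcher(None, sentence.lower().split(), answer.lower().split())
    return [(tag, sentence.split()[i1:i2], answer.split()[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]

student = "She go to school every day."
reference = "She goes to school every day."
print(detect_spelling_errors(student))          # [] -- "go" is a real word
print(diff_against_answer(student, reference))  # flags "go" vs "goes"
```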