A Method for Spelling Error Correction in Korean Using a Hangul Edit Distance Algorithm  

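Bak, Seung Hyeon (Department of Software Convergence Engineering, Chosun University)
Lee, Eun Ji (Department of Computer Engineering, Chosun University)
Kim, Pan Koo (Department of Computer Engineering, Chosun University)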
Publication Information
Smart Media Journal, v.6, no.1, 2017, pp. 16-21
Abstract
A long time has passed since computers, once used mainly as research tools, were commercialized and became available to the general public. Before then, people wrote with writing instruments; today a growing number of them write on computers instead. Computerized word processing is faster and causes less hand fatigue than writing by hand, which makes it better suited to producing long texts. However, word processing programs are more prone to spelling errors caused by user mistakes. Many spelling errors distort the shape of a word, so the writer can easily find and correct them directly, but errors that stem from the user's lack of knowledge, or that are hard to spot, can make it almost impossible to produce a document free of spelling errors. Spelling errors in important documents such as theses or business proposals can undermine the document's credibility. Consequently, research on high-level spelling error correction programs for the general public is needed. This study was designed to build a system that corrects sentence-level spelling errors into normal words using a Korean alphabet (Hangul) similarity algorithm. Based on findings in the related literature that corrected words are significantly similar in form to the misspelled words, spelling errors were extracted from a corpus. The misspelled words were then replaced with the extracted corrected words, using a spelling error detection algorithm to correct the spelling errors.
Keywords
Spelling error; Hangul edit distance; Spelling error correction; Natural language processing
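
The abstract relies on the observation that a misspelled Korean word is usually close in form to its correction. One common way to measure that closeness is to split each Hangul syllable into its jamo (initial consonant, vowel, optional final consonant) and compute an ordinary edit distance over the jamo sequence. The following is a minimal sketch of that idea, not the algorithm from the paper: the function names `decompose`, `jamo_edit_distance`, and `closest_word` are hypothetical, all edit operations cost 1, and initial and final consonants are treated as distinct symbols, which is an assumption made here for simplicity.

```python
# Sketch of a jamo-level edit distance for Hangul (illustrative only;
# names and unit costs are assumptions, not taken from the paper).

HANGUL_BASE = 0xAC00   # first precomposed syllable, U+AC00
HANGUL_LAST = 0xD7A3   # last precomposed syllable, U+D7A3
NUM_MEDIAL = 21        # number of vowels
NUM_FINAL = 28         # number of final consonants, including "none"


def decompose(text):
    """Split text into a list of symbols.

    A Hangul syllable becomes ("I", i), ("M", m) and, if present, ("F", f),
    where i, m, f are the Unicode jamo indices. Any other character is kept
    as ("X", char).
    """
    symbols = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            index = code - HANGUL_BASE
            initial = index // (NUM_MEDIAL * NUM_FINAL)
            medial = (index % (NUM_MEDIAL * NUM_FINAL)) // NUM_FINAL
            final = index % NUM_FINAL
            symbols.append(("I", initial))
            symbols.append(("M", medial))
            if final != 0:  # index 0 means the syllable has no final consonant
                symbols.append(("F", final))
        else:
            symbols.append(("X", ch))
    return symbols


def edit_distance(a, b):
    """Levenshtein distance between two sequences, using two rows."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution or match
        prev = cur
    return prev[-1]


def jamo_edit_distance(word_a, word_b):
    """Edit distance computed over decomposed jamo symbols."""
    return edit_distance(decompose(word_a), decompose(word_b))


def closest_word(word, candidates):
    """Return the candidate with the smallest jamo edit distance to word."""
    return min(candidates, key=lambda c: jamo_edit_distance(word, c))


if __name__ == "__main__":
    print(jamo_edit_distance("한글", "한굴"))                # 1: only the vowel differs
    print(closest_word("한굴", ["한글", "한국어", "서울"]))   # 한글
```

Working at the jamo level makes a one-keystroke slip count as a single edit. A real corrector would also need a detection step and a candidate dictionary drawn from a corpus, which this sketch does not attempt.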