• Title/Summary/Keyword: Fake Mobile Colloquial Corpus


An Implementation of a Lightweight Spacing-Error Correction System for Korean (한국어 경량형 띄어쓰기 교정 시스템의 구현)

  • Song, Yeong-Kil; Kim, Hark-Soo
    • The Journal of Korean Association of Computer Education / v.12 no.2 / pp.87-96 / 2009
  • We propose a Korean spacing-error correction system that requires little memory even though it combines rule-based and statistical methods. In addition, to make the model robust on mobile colloquial sentences, in which spelling errors and omissions of function words occur frequently, we propose a method that automatically transforms a typical colloquial corpus into a mobile colloquial corpus. The proposed system first uses statistical information on syllable unigrams to increase coverage of new syllable patterns. It then applies error-correction rules over sequences of two or more syllables to increase accuracy. In experiments on fake mobile colloquial sentences, the proposed system achieved a relatively high accuracy of 92.10% (93.80% on a typical colloquial corpus, 94.07% on a typical balanced corpus) despite a small memory footprint of about 1 MB.

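The two-stage idea in the abstract (syllable unigram statistics for coverage, then rules over longer syllable sequences for accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the model details, so the function names `train_unigram` and `correct`, the 0.5 threshold, the bigram-only rule format, and the toy training sentences are all assumptions made for illustration.

```python
from collections import Counter


def train_unigram(sentences):
    """Estimate, for each syllable, how often a space follows it.

    Hypothetical helper: the abstract only says that syllable unigram
    statistics are used, not how they are estimated.
    """
    total = Counter()
    space_after = Counter()
    for sent in sentences:
        for i, ch in enumerate(sent):
            if ch == " ":
                continue
            total[ch] += 1
            if i + 1 < len(sent) and sent[i + 1] == " ":
                space_after[ch] += 1
    return {s: space_after[s] / total[s] for s in total}


def correct(text, probs, rules, threshold=0.5):
    """Re-space text: rules on syllable pairs override the unigram guess.

    `rules` maps a (left, right) syllable pair to True (insert a space)
    or False (do not). The rule format and threshold are assumptions.
    """
    syllables = [c for c in text if c != " "]
    out = []
    for i, s in enumerate(syllables):
        out.append(s)
        if i == len(syllables) - 1:
            break  # never add a trailing space
        pair = (s, syllables[i + 1])
        if pair in rules:
            insert = rules[pair]
        else:
            # Unseen syllables default to "no space".
            insert = probs.get(s, 0.0) > threshold
        if insert:
            out.append(" ")
    return "".join(out)


# Toy training data (assumed, not from the paper's corpora).
probs = train_unigram(["나는 학교에 간다", "나는 집에 간다"])
rules = {("는", "데"): False}

print(correct("나는학교에간다", probs, {}))
print(correct("가는데", probs, {}))
print(correct("가는데", probs, rules))
```

In this toy setup the unigram statistics alone over-split "가는데" because "는" was always followed by a space in training, and the pair rule suppresses that split. It shows why rules over longer syllable contexts can raise accuracy on top of a small unigram table.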