Automatic Transcription of the Union Symbols in Korean Texts (한국어 텍스트에 사용된 이음표의 자동 전사)

  • 윤애선;권혁철
    • Language and Information, v.7 no.1, pp.23-40, 2003
  • This paper proposes Auto-TUS, a module that automatically transcribes three union symbols, the hyphen (‘-’), the dash (‘―’), and the tilde (‘∼’), into Korean alphabetic letters on the basis of their linguistic contexts. Few previous studies have addressed the ambiguities that arise when such symbols are transcribed into Korean alphabetic letters. We classify six different reading formulae of the union symbols, analyze the left and right contexts of the symbols, and investigate the selection rules and distributions that hold between the symbols and their contexts. Based on these linguistic features, Auto-TUS uses 86 stereotyped patterns, 78 rules, and 8 heuristics to determine the type of reading formula. The module operates in three steps. A pilot test was conducted with three test suites containing 418, 987, and 1,014 clusters of words, respectively, each cluster containing a union symbol. The module achieved encouraging accuracies of 97.36%, 98.48%, and 96.55% on the three suites, respectively. The next phases are to develop a guessing routine for unknown contexts of the union symbols using statistical information, to refine the module that detects proper nouns and terminology, and to apply Auto-TUS on a larger scale.
