Automatic Transcription of the Union Symbols in Korean Texts (한국어 텍스트에 사용된 이음표의 자동 전사)

  • 윤애선;권혁철
    • 한국언어정보학회지:언어와정보 (Language and Information, Journal of the Korean Society for Language and Information), Vol. 7, No. 1, pp. 23-40, 2003
  • This paper proposes Auto-TUS, an automatic transcription module for three union symbols (hyphen ‘-’, dash ‘―’, and tilde ‘∼’) that uses their linguistic contexts. Few previous studies have discussed the ambiguities that arise when transcribing symbols into Korean alphabetic letters. We classified six different reading formulae for the union symbols, analyzed the left and right contexts of the symbols, and investigated the selection rules and distributions between the symbols and their contexts. Based on these linguistic features, we suggest 86 stereotyped patterns, 78 rules, and 8 heuristics that determine the type of reading formula in Auto-TUS. The module operates in three steps. A pilot test was conducted with three test suites containing 418, 987, and 1,014 clusters of words with a union symbol, respectively, and yielded encouraging accuracies of 97.36%, 98.48%, and 96.55%. The next phases are to develop a guessing routine for unknown contexts of the union symbols using statistical information, to refine the module that detects proper nouns and terminology, and to apply Auto-TUS on a larger scale.