• Title/Summary/Keyword: Part-of-Speech Set

Efficient Part-of-Speech Set for Knowledge-based Word Sense Disambiguation of Korean Nouns (한국어 명사의 지식기반 의미중의성 해소를 위한 효과적인 품사집합)

  • Kwak, Chul-Heon;Seo, Young-Hoon;Lee, Chung-Hee
    • The Journal of the Korea Contents Association / v.16 no.4 / pp.418-425 / 2016
  • This paper presents a part-of-speech set that is highly effective for knowledge-based word sense disambiguation of Korean nouns. The test set consists of 174,000 sentences extracted from the Sejong semantically tagged corpus, whose senses are based on the Standard Korean Dictionary. We disambiguate selected nouns in the test set using the glosses and examples in the Standard Korean Dictionary. We select 15 parts of speech that give the best performance over the whole test set and 17 parts of speech that give the best average accuracy over the selected nouns. These part-of-speech sets yield 12% better performance than the full set of 45 parts of speech.

Generating a Category Set of Words Using a Hierarchical Part-of-speech System and Tagged Corpus

  • Kojima, Takeyuki;Kotani, Yoshiyuki
    • Proceedings of the Korean Society for Language and Information Conference / 2002.02a / pp.217-226 / 2002
  • In this paper, we propose a method of generating a proper categorization of morphemes, given a hierarchical part-of-speech system and a corpus tagged with that system. Our method uses hierarchical information in the part-of-speech system and statistical information in the corpus to generate a category set. The statistical information is based on the contexts in which categories occur. First, we specify the format of the given information. Then, we describe an algorithm to generate a proper categorization. Finally, we present the results of our experiments in applying this method. We obtained a moderately proper categorization and found several candidates for improvement.

Prediction of Prosodic Boundary Strength by means of Three POS(Part of Speech) sets (품사셋에 의한 운율경계강도의 예측)

  • Eom Ki-Wan;Kim Jin-Yeong;Kim Seon-Mi;Lee Hyeon-Bok
    • MALSORI / no.35_36 / pp.145-155 / 1998
  • This study aims to determine the most appropriate POS (Part of Speech) set for predicting prosodic boundary strength efficiently. We used the three-level POS sets devised by Kim (1997), one of the authors. The three POS sets differ in how much grammatical information they carry: the first set has the maximal syntactic and morphological information that may affect prosodic phrasing, and the third set has the minimal amount. We hand-labelled 150 sentences using each of the three POS sets and conducted a perception test. Based on the results of the test, a stochastic language modeling method was used to predict prosodic boundary strength. The results showed that prediction efficiency did not differ greatly among the POS sets, but the second set was slightly more efficient than the other two. As far as the complexity of stochastic language modeling is concerned, however, the third set may also be preferable.

Synthesis-by-rule of Korean: Part II - Speech Synthesis Using the Units of Demisyllables (우리말 규칙합성에 관한 연구 (II) - 반음절 단위의 음성합성)

  • Cheon, Kang-Sik;Lee, Sung-Jun;Lee, Jae-Hong
    • Proceedings of the KIEE Conference / 1988.07a / pp.29-32 / 1988
  • A new set of demi-syllable units is presented for Korean speech synthesis. The performance of the set of demi-syllable units is compared with that of the set of syllable units in terms of the quality of the speech synthesized with each set and the amount of computer memory each set occupies. The set of demi-syllable units achieves comparable speech quality and occupies less memory than the set of syllable units.

A study on the perception of Korean phonation types by Aymara subjects (아이마라어 화자들의 한국어 발성유형 인지 연구)

  • Park, Hansang
    • Phonetics and Speech Sciences / v.8 no.4 / pp.49-61 / 2016
  • The present study investigates the perception of Korean phonation types by native speakers of Aymara. Perception tests were conducted on two sets of Korean speech materials to determine the correspondence between the Korean and Aymara 3-way contrasts and to find out which part of the syllable, consonantal or vocalic, is more influential in the perception of Korean phonation types. A set of manipulated stimuli, as well as a set of 12 spontaneous words, was prepared for the tests. The first syllables of the 12 Korean bisyllabic words, covering 3 series of phonation types (Lenis, Aspirated, and Fortis) in 4 places of articulation, were split into consonantal and vocalic parts. The two parts were then recombined to form 9 tokens of CV sequences for each place of articulation. Native speakers of Aymara were required to match each Korean stimulus with one of 15 Aymara words representing 3 series of consonant types (plain, aspirated, and ejective) in 5 places of articulation (bilabial, alveolar, palatal, velar, and uvular). Results showed that the consonantal part is more influential than the vocalic part in the Aymara subjects' perception of Korean phonation types when the consonantal part is Aspirated, but the vocalic part is more influential than the consonantal part when the consonantal part is Lenis or Fortis. Response analysis showed that Aymara subjects tend to match Korean stops to Aymara ones in such a way that Lenis corresponds to aspirated, Aspirated to aspirated, and Fortis to plain.

An Automatic Tagging System and Environments for Construction of Korean Text Database

  • Lee, Woon-Jae;Choi, Key-Sun;Lim, Yun-Ja;Lee, Yong-Ju;Kwon, Oh-Woog;Kim, Hiong-Geun;Park, Young-Chan
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.1082-1087 / 1994
  • A text database is indispensable to probabilistic models for speech recognition, linguistic modeling, and machine translation. We introduce an environment for constructing text databases: an automatic tagging system and a set of tools for lexical knowledge acquisition, which provide facilities for automatic part-of-speech recognition and guessing.

A Study on the Design and the Construction of a Korean Speech DB for Common Use (공동이용을 위한 음성DB의 설계 및 구축에 관한 연구)

  • Kim, Bong-Wan;Kim, Jong-Jin;Kim, Sun-Tae;Lee, Yong-Ju
    • The Journal of the Acoustical Society of Korea / v.16 no.4 / pp.35-41 / 1997
  • A speech database is an indispensable part of speech research. It is needed in speech research and development and for evaluating the performance of various speech-processing systems. For a speech database to be shared for common use, its utterance list must cover all possible phonetic events in a minimal number of words and must be independent of any particular task. To meet these requirements, this paper extracts a PBW set from a large text corpus. The paper describes the speech database constructed using the PBW set as its utterance list, along with its properties.

Text-Independent Speaker Identification System Using Speaker Decision Network Based on Delayed Summing (지연누적에 기반한 화자결정회로망이 도입된 구문독립 화자인식시스템)

  • 이종은;최진영
    • Journal of the Korean Institute of Intelligent Systems / v.8 no.2 / pp.82-95 / 1998
  • In this paper, we propose a text-independent speaker identification system whose classifier is composed of two parts: one calculates the degree of likeness of each speech frame, and the other selects the most probable speaker over the entire speech duration. The first part is realized using an RBFN that is self-organized through learning, and in the second part the speaker is determined using a combination of MAXNET and delayed summing. We use features from a linear speech production model and features from fractal geometry. Closed-set speaker identification experiments on 13 homogeneous male speakers show that the proposed techniques can achieve an identification rate of 100% as the number of delays increases.

A Study on Exceptional Pronunciations For Automatic Korean Pronunciation Generator (한국어 자동 발음열 생성 시스템을 위한 예외 발음 연구)

  • Kim Sunhee
    • MALSORI / no.48 / pp.57-67 / 2003
  • This paper presents a systematic description of exceptional pronunciations for automatic Korean pronunciation generation. An automatic pronunciation generator for Korean is an essential part of a Korean speech recognition system and a TTS (Text-To-Speech) system. It is composed of a set of regular rules and an exceptional pronunciation dictionary. The exceptional pronunciation dictionary is created by extracting the words that have exceptional pronunciations, based on the characteristics of such words identified through phonological research and a systematic analysis of the entries of Korean dictionaries. The method thus contributes to improving the performance of the automatic Korean pronunciation generator, as well as that of Korean speech recognition and TTS systems.

Study about Windows System Control Using Gesture and Speech Recognition (제스처 및 음성 인식을 이용한 윈도우 시스템 제어에 관한 연구)

  • 김주홍;진성일;이남호;이용범
    • Proceedings of the IEEK Conference / 1998.10a / pp.1289-1292 / 1998
  • HCI (human computer interface) technologies have often been implemented using a mouse, keyboard, and joystick. Because the mouse and keyboard can be used only in limited situations, more natural HCI methods, such as speech-based and gesture-based methods, have recently attracted wide attention. In this paper, we present a multi-modal input system for controlling the Windows system, aimed at practical use of multimedia computers. Our multi-modal input system consists of three parts. The first is a virtual-hand mouse, which replaces mouse control with a set of gestures. The second is a Windows control system using speech recognition. The third is a Windows control system using gesture recognition. We use neural network and HMM methods to recognize speech and gestures. The outputs of the three parts interface to the CPU both directly and through Windows.
