• Title/Abstract/Keywords: Part-of-Speech Set

37 results (processing time: 0.023 s)

한국어 명사의 지식기반 의미중의성 해소를 위한 효과적인 품사집합 (Efficient Part-of-Speech Set for Knowledge-based Word Sense Disambiguation of Korean Nouns)

  • 곽철헌;서영훈;이충희
    • 한국콘텐츠학회논문지 / Vol. 16, No. 4 / pp. 418-425 / 2016
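  • This paper presents part-of-speech sets that are useful for knowledge-based word sense disambiguation of Korean nouns. We extracted 174,000 sentences from the Sejong morpho-semantically analyzed corpus as a test set and disambiguated each sentence using the definitions and usage examples of the Standard Korean Language Dictionary (표준국어대사전). The result is a set of 15 parts of speech that gives the best performance over the whole test set and a set of 17 parts of speech that gives the highest per-word average. In the experiments, accuracy improved by up to 12% compared with using the full set of 45 parts of speech.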
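
To illustrate why the choice of part-of-speech set can matter in a knowledge-based approach, the sketch below uses a simplified Lesk-style gloss overlap. This is an assumption made for illustration, not necessarily the method used in the paper. The function name `disambiguate`, the English toy glosses, and the tags `NOUN`, `VERB`, and `DET` are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]  # (surface form, POS tag); the tag names are hypothetical


def disambiguate(context: List[Token],
                 glosses: Dict[str, List[str]],
                 allowed_pos: Set[str]) -> str:
    """Return the sense whose gloss shares the most words with the context,
    counting only context words whose POS tag is in `allowed_pos`."""
    kept = {word for word, pos in context if pos in allowed_pos}

    def score(sense: str) -> int:
        return len(kept & set(glosses[sense]))

    # On ties, max() keeps the first sense in dictionary order.
    return max(glosses, key=score)


# Toy data (invented for illustration, not taken from any dictionary or corpus).
context = [("the", "DET"), ("river", "NOUN"), ("deposit", "VERB")]
glosses = {
    "bank_finance": ["financial", "institution", "deposit", "the", "money"],
    "bank_river": ["land", "beside", "river"],
}

all_tags = disambiguate(context, glosses, {"NOUN", "VERB", "DET"})
nouns_only = disambiguate(context, glosses, {"NOUN"})
print(all_tags, nouns_only)
```

With every tag allowed, the function word and the verb pull the toy decision toward `bank_finance`. Restricting the context to nouns yields `bank_river`, which shows how a smaller part-of-speech set can change the outcome.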

Generating a Category Set of Words Using a Hierarchical Part-of-speech System and Tagged Corpus

  • Kojima, Takeyuki;Kotani, Yoshiyuki
    • 한국언어정보학회:학술대회논문집 / 한국언어정보학회 2002년도 Language, Information, and Computation Proceedings of The 16th Pacific Asia Conference / pp.217-226 / 2002
  • In this paper, we propose a method of generating a proper categorization of morphemes from a hierarchical part-of-speech system and a corpus tagged with that system. Our method uses the hierarchical information in the part-of-speech system and statistical information in the corpus to generate a category set. The statistical information is based on the contexts in which categories occur. First, we specify the format of the given information. Then, we describe an algorithm to generate a proper categorization. Finally, we present the results of our experiments in applying this method. We obtained a moderately proper categorization and found several candidates for improvement.

품사셋에 의한 운율경계강도의 예측 (Prediction of Prosodic Boundary Strength by means of Three POS(Part of Speech) sets)

  • 엄기완;김진영;김선미;이현복
    • 대한음성학회지:말소리 / No. 35-36 / pp. 145-155 / 1998
  • This study aimed to determine the most appropriate POS (Part of Speech) set for predicting prosodic boundary strength efficiently. We used the three-level POS sets devised by Kim (1997), one of the authors. The three POS sets differ in how much grammatical information they carry: the first set has the maximal syntactic and morphological information that could affect prosodic phrasing, and the third set has the least. We hand-labelled 150 sentences using each of the three POS sets and conducted a perception test. Based on the results of the test, a stochastic language modeling method was used to predict prosodic boundary strength. The results showed that the three POS sets gave broadly similar prediction efficiency, with the second set slightly more efficient than the other two. As far as the complexity of stochastic language modeling is concerned, however, the third set may also be preferable.

우리말 규칙합성에 관한 연구 (II) - 반음절 단위의 음성합성 (Synthesis-by-rule of Korean: Part II - Speech Synthesis Using the Units of Demisyllables)

  • 천강식;이성준;이재홍
    • 대한전기학회:학술대회논문집 / 대한전기학회 1988년도 전기.전자공학 학술대회 논문집 / pp.29-32 / 1988
  • A new set of demi-syllable units is presented for Korean speech synthesis. The performance of the demi-syllable unit set is compared with that of the syllable unit set in terms of the quality of the speech synthesized with each set and the amount of computer memory each set occupies. The demi-syllable unit set achieves comparable speech quality and occupies less memory than the syllable unit set.

아이마라어 화자들의 한국어 발성유형 인지 연구 (A study on the perception of Korean phonation types by Aymara subjects)

  • 박한상
    • 말소리와 음성과학 / Vol. 8, No. 4 / pp. 49-61 / 2016
  • The present study investigates the perception of Korean phonation types by native speakers of Aymara. Perception tests were conducted on two sets of Korean speech materials to determine the correspondence between the Korean and Aymara 3-way contrasts and to find out which part of the syllable, consonantal or vocalic, is more influential in the perception of Korean phonation types. A set of manipulated stimuli, as well as a set of 12 spontaneous words, was prepared for the tests. The first syllables of the 12 Korean bisyllabic words, covering 3 series of phonation types (Lenis, Aspirated, and Fortis) in 4 places of articulation, were split into consonantal and vocalic parts. The two parts were then combined to form 9 tokens of CV sequences for each place of articulation. Native speakers of Aymara were required to match each Korean stimulus with one of 15 Aymara words representing 3 series of consonant types (plain, aspirated, and ejective) in 5 places of articulation (bilabial, alveolar, palatal, velar, and uvular). Results showed that the consonantal part is more influential than the vocalic part in the Aymara subjects' perception of Korean phonation types when the consonantal part is Aspirated, but the vocalic part is more influential than the consonantal part when the consonantal part is Lenis or Fortis. Response analysis showed that Aymara subjects tend to match Korean stops to Aymara ones in such a way that Lenis corresponds to aspirated, Aspirated to aspirated, and Fortis to plain.

An Automatic Tagging System and Environments for Construction of Korean Text Database

  • Lee, Woon-Jae;Choi, Key-Sun;Lim, Yun-Ja;Lee, Yong-Ju;Kwon, Oh-Woog;Kim, Hiong-Geun;Park, Young-Chan
    • 한국음향학회:학술대회논문집 / 한국음향학회 1994년도 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA / pp.1082-1087 / 1994
  • A set of text databases is indispensable to probabilistic models for speech recognition, linguistic modeling, and machine translation. We introduce an environment for constructing text databases: an automatic tagging system and a set of tools for lexical knowledge acquisition, which provide facilities for automatic part-of-speech recognition and guessing.

공동이용을 위한 음성DB의 설계 및 구축에 관한 연구 (A Study on the Design and the Construction of a Korean Speech DB for Common Use)

  • 김봉완;김종진;김선태;이용주
    • 한국음향학회지 / Vol. 16, No. 4 / pp. 35-41 / 1997
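  • Recording, archiving, and releasing large volumes of speech data for common use is necessary both for research and development and for evaluating the performance of speech information processing systems. Building such a shared speech database requires designing target words or sentences that cover every phonological environment that can occur and are not concentrated on a particular task. To this end, this paper extracts PBWs (Phonetically Balanced Words) from a text corpus of about 1.2 million eojeols collected from newspapers, novels, and other spoken-language material, and presents the result of building a speech DB with these words as the utterance list, together with the characteristics of the resulting DB.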

지연누적에 기반한 화자결정회로망이 도입된 구문독립 화자인식시스템 (Text-Independent Speaker Identification System Using Speaker Decision Network Based on Delayed Summing)

  • 이종은;최진영
    • 한국지능시스템학회논문지 / Vol. 8, No. 2 / pp. 82-95 / 1998
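  • This paper proposes a structure that splits the classifier, the most important component of a text-independent speaker identification system, into two stages: the first computes the degree to which each short segment belongs to each speaker, and the second uses those results to select the most likely speaker for the whole speech segment. The first stage is implemented with an RBFN that self-organizes through learning, and the second stage determines the speaker with a combination of MAXNET and delayed summing. Simulations confirm that the recognition rate reaches 100% as the number of delayed sums increases. The paper also examines whether the fractal characteristics of speech can be used for speaker identification. The simulations were run as a closed-set task on the voices of 13 adult males from a homogeneous group, with linear prediction coefficients (LPC) and LPC-cepstrum as the conventional features.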
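
The second stage described above can be sketched as follows, under several assumptions: per-segment speaker scores are taken as given (the RBFN is not modeled), delayed summing is read as adding each speaker's scores over the most recent segments, and the MAXNET winner-take-all step is replaced by a plain argmax. The function name `decide_speaker` and the toy scores are hypothetical.

```python
from typing import List


def decide_speaker(segment_scores: List[List[float]], num_delays: int) -> int:
    """Sum each speaker's scores over the last `num_delays` segments and
    return the index of the speaker with the largest total.

    segment_scores[t][s] is the score of speaker s for short segment t.
    """
    if num_delays < 1:
        raise ValueError("num_delays must be at least 1")
    window = segment_scores[-num_delays:]
    totals = [sum(column) for column in zip(*window)]
    # Plain argmax stands in for the MAXNET winner-take-all network.
    return max(range(len(totals)), key=totals.__getitem__)


# Toy scores for 2 speakers over 4 segments (invented for illustration).
scores = [
    [0.7, 0.3],
    [0.8, 0.2],
    [0.6, 0.4],
    [0.4, 0.6],  # a single misleading segment
]

last_segment_only = decide_speaker(scores, num_delays=1)
all_segments = decide_speaker(scores, num_delays=4)
print(last_segment_only, all_segments)
```

In this toy case the last segment alone favors speaker 1, while summing over all four segments favors speaker 0. This is the intuition behind accumulating more segments before deciding.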

한국어 자동 발음열 생성 시스템을 위한 예외 발음 연구 (A Study on Exceptional Pronunciations For Automatic Korean Pronunciation Generator)

  • 김선희
    • 대한음성학회지:말소리 / No. 48 / pp. 57-67 / 2003
  • This paper presents a systematic description of exceptional pronunciations for automatic Korean pronunciation generation. An automatic pronunciation generator for Korean is an essential part of a Korean speech recognition system and a TTS (Text-To-Speech) system. It is composed of a set of regular rules and an exceptional pronunciation dictionary. The exceptional pronunciation dictionary is created by extracting the words that have exceptional pronunciations, based on the characteristics of such words identified through phonological research and a systematic analysis of the entries of Korean dictionaries. The method thus contributes to improving the performance of the automatic Korean pronunciation generator as well as that of Korean speech recognition and TTS systems.

제스처 및 음성 인식을 이용한 윈도우 시스템 제어에 관한 연구 (Study about Windows System Control Using Gesture and Speech Recognition)

  • 김주홍;진성일;이남호;이용범
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 1998년도 추계종합학술대회 논문집 / pp.1289-1292 / 1998
  • HCI (human computer interface) technologies have often been implemented using a mouse, keyboard, and joystick. Because the mouse and keyboard can be used only in limited situations, more natural HCI methods such as speech-based and gesture-based methods have recently attracted wide attention. In this paper, we present a multi-modal input system for controlling the Windows system, aimed at practical use of multimedia computers. Our multi-modal input system consists of three parts. The first is a virtual-hand mouse, which replaces mouse control with a set of gestures. The second is a Windows control system using speech recognition. The third is a Windows control system using gesture recognition. We use neural network and HMM methods to recognize speech and gestures. The results of the three parts interface directly to the CPU and through Windows.
