• Title/Summary/Keyword: 어휘데이터베이스 (lexical database)

Search Result 79, Processing Time 0.022 seconds

The Effect of Syllable Frequency, Syllable Type and Final Consonant on Hangeul Word and Pseudo-word Lexical Decision: An Analysis of the Korean Lexicon Project Database (한글 두 글자 단어와 비단어의 어휘판단에 글자 빈도, 글자 유형, 받침이 미치는 영향: KLP 자료의 분석)

  • Myong Seok Shin;ChangHo Park
    • Korean Journal of Cognitive Science / v.34 no.4 / pp.277-297 / 2023
  • This study examined how lexical decision for two-syllable words and pseudo-words is affected by syllabic information, namely syllable frequency, syllable (i.e., vowel) type, and the presence of a final consonant (i.e., batchim), through an analysis of the Korean Lexicon Project Database (KLP-DB). Hierarchical regression of RT data showed that lexical decision for words was influenced by the frequency of the first syllable, the syllable type of the first and second syllables, and batchim for the first and second syllables, as well as by the interaction of the two syllable types and the interaction of syllable frequency and batchim of the second syllable. For pseudo-words, lexical decision was influenced by the frequency of the first and second syllables, the syllable type of the first syllable, and batchim for the first and second syllables, as well as by the interaction of the two syllable frequencies, the interaction of the two syllable types, and the interaction of syllable frequency and batchim of the first syllable. Word frequency had a strong effect on lexical decision for words, while syllabic information had a stable effect on lexical decision for pseudo-words. These results indicate that syllabic information should be seriously considered when constructing word and pseudo-word lists and interpreting lexical decision times. Understanding the effect of syllabic information will also contribute to the understanding of the word recognition process.

Speech Recognition Performance Improvement using a convergence of GMM Phoneme Unit Parameter and Vocabulary Clustering (GMM 음소 단위 파라미터와 어휘 클러스터링을 융합한 음성 인식 성능 향상)

  • Oh, SangYeob
    • Journal of Convergence for Information Technology / v.10 no.8 / pp.35-39 / 2020
  • Although DNNs yield lower error than conventional speech recognition systems, they are difficult to train in parallel, computationally expensive, and require large amounts of data. To address these problems efficiently, this paper estimates GMM parameters per phoneme unit from the parameters of each phoneme model. It also suggests a way to improve performance by clustering a specific vocabulary so that the parameters can be applied effectively. To this end, vocabulary models were built from three types of word speech databases, and features extracted after noise processing with Wiener filters were used in the speech recognition experiments. The proposed method achieved a recognition rate of 97.9%. Further study is needed to address overfitting.

A Study on the Construction of a Korean Concept Dictionary (한국어 개념사전의 구축에 관한 연구)

  • 김수정;김태수
    • Proceedings of the Korean Society for Information Management Conference / 1998.08a / pp.239-242 / 1998
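  • Lexical databases that provide conceptual information, such as WordNet, CYC, and EDR, have emerged. This study aims to construct a Korean concept dictionary following WordNet's approach to describing concepts. First, an appropriate classification scheme for concepts was established. Then the 300 most frequent nouns were extracted from the Yonsei Corpus, and each concept was expressed by presenting together the nouns that appear in its dictionary definition and the nouns marked as related to it. Such a Korean concept dictionary may help resolve semantic ambiguity.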

Development of a Pseudomorpheme-Based Large Vocabulary Continuous Speech Recognizer (의사형태소 단위 대어휘 연속 음성 인식기 개발)

  • 권오욱
    • Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.320-327 / 1998
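  • This paper describes a pseudomorpheme-based recognizer developed for large vocabulary continuous speech recognition. We first define the pseudomorpheme and briefly describe the pseudomorpheme tagger. We then describe a method for determining recognition units by merging pseudomorphemes, acoustic modeling issues that need particular attention in a pseudomorpheme-based recognizer, ways of applying a language model that uses part-of-speech information together with eojeol rules, and a new search structure for pseudomorpheme-based recognition. In a preliminary recognition experiment on a conversational continuous speech database in the travel-planning domain, with a recognition vocabulary of about 5,500 eojeols, the pseudomorpheme-based recognizer achieved a word recognition rate of 66.4% and an eojeol recognition rate of 60.0%.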

A Model of a Korean-Chinese-Japanese-English Multilingual Lexical Database (한중일영 다국어 어휘 데이터베이스의 모형)

  • 차재은;강범모
    • Proceedings of the Korean Society for Language and Information Conference / 2002.06a / pp.48-67 / 2002
  • This paper reports part of the results of a research project entitled "Research and Model Development for a Multi-Lingual Lexical Database". It is a six-year project in which we aim to construct a model of a multilingual lexical database of Korean, Chinese, Japanese, and English. We have now finished the first two-year stage of the project. In this paper, we present the goal of the project, the construction model of items in the lexical database, and the possible (semi-)automatic methods of acquiring lexical information. As an appendix, we present some sample items of the database as an illustration.

Natural Language Interface with Combinatory Categorial Grammar (결합범주문법을 이용한 자연언어 인터페이스)

  • 이호동;박종철
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.173-175 / 2000
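  • This study implements a natural language query interface for an e-commerce database using combinatory categorial grammar. To this end, we analyze query sentences and discuss how to represent them. We also present the lexical representations and the derivation method used to translate queries into the SQL formal language. The proposed method derives SQL queries directly during parsing and skips the intermediate logical language translation step proposed in earlier work, which simplifies the process and may improve system performance. The system is implemented as a web-based system with a client/server architecture.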

Design of the Linguistic Contents of Speech Corpus for Speech Recognition and Synthesis (인식 및 합성용 음성 코퍼스의 발성 목록 설계)

  • 김형주;김봉완;이용주
    • Proceedings of the Korea Multimedia Society Conference / 2002.05c / pp.330-335 / 2002
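  • With the recent development of speech information technology, which uses speech as a means of dialogue between computers and humans, research is under way to advance large vocabulary continuous speech recognition and unlimited vocabulary speech synthesis. In speech recognition, the development of statistical methods typified by HMMs means that large amounts of speech data are needed to train systems. In speech synthesis, too, good synthesis quality has recently been obtained by selecting speech segments of arbitrary length from a large speech database and concatenating them. This paper designs the utterance list of a speech database to be shared by speech recognition and synthesis, and discusses the resulting design.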

Methodology of Online Survey Questionnaire based on Webgame towards Spacial Color Combination and Affective Word (웹게임 기반 온라인 설문조사 방법론 -공간배색과 감성언어를 중심으로-)

  • Kang, Seung-Mook;Kim, Hae-Yoon;Park, Kyeong-Su;Park, Young-Sung
    • The Journal of the Korea Contents Association / v.10 no.7 / pp.133-141 / 2010
  • This paper proposes an effective online questionnaire method that uses web-based games. It examines the interrelation between spatial design elements and emotional language in backgrounds actually used in web games, and suggests a new questionnaire method to overcome insincere answers, a known limitation of online questionnaires. The paper reviews the related literature and compares the merits and demerits of printed and online text-based questionnaires. It then proposes a web-game-based online questionnaire method that can reduce the errors arising from those demerits. The database of emotional words and phrases is built from the interrelation between the emotional words and four types of space (dwelling, traditional, commercial, and fantasy), based on the position decision value of a Gaussian distribution. The paper suggests that the method be used in population calculability systems such as consumer preference tests.

The Development of a System for Product Search Using a Sensibility and Configuration Database on Designing Men's Jackets (신사복 재킷디자인의 감성 및 형상 데이터베이스를 이용한 제품검색 시스템 개발에 관한 연구)

  • Park, Yun-A
    • Journal of the Korean Home Economics Association / v.44 no.4 s.218 / pp.133-144 / 2006
  • The contemporary period is called "the age of sensibility", in which each individual consumer seeks to have her or his own products. Businesses are in need of design developments with an emphasis on customer sensitivity, and at the same time consumers must understand their own sensitivity to acquire information on designs that suit them. This research established a sensitivity and configuration database on designing men's jackets using the sensitivity engineering approach to clothing design information. The user interface was created on the Internet. Sixty-seven sensitivity vocabulary terms appropriate for the assessment of men's jacket design were selected, and the different designs were classified into six items and 24 categories. Thirty men's jackets with different designs were produced for sensory testing and the results were analyzed in accordance with general linear I statistics. A sensitivity database was established for each category. MySQL, PHP, JavaScript, and HTML were used for the configuration database work. The configuration of items/categories, with the most appropriate sensitivity database information assigned to the selected sensitivity vocabulary, was programmed for display on the computer screen. The sensitivity vocabulary of a customer's choice for each factor was selected for the program to run, while the category and product configuration of the men's jacket most suitable for the search was displayed based on the user interface.

A Study on Constructing Korean Language Thesaurus (우리말 시소러스 작성(作成)에 관한 연구(硏究))

  • Jun, Tae-Jung
    • Journal of Information Management / v.21 no.1 / pp.53-75 / 1990
  • In information storage and retrieval systems, controlled vocabularies are generally used to improve the recall ratio and to guide indexers/users in selecting correct indexing/searching terms by regulating their forms as well as their meanings. The thesaurus, a type of controlled vocabulary, is nowadays accepted by most database producers. The objective of this study is to develop a method of Korean language thesaurus construction. This study covers 1) the definition of a thesaurus, 2) a literature survey on term relations and thesaurus construction methods, 3) a suggestion for a practical construction method, 4) the display format of thesauri, and 5) tests and their results.
