• Title/Summary/Keyword: 어휘 분포 (vocabulary distribution)

Search Result 77, Processing Time 0.02 seconds

AN ANALYSIS OF HONORIFIC MINIMAL FORMS IN KOREAN (굴곡가지의 높임법 (존대법) 최소형 형성론)

  • Kim, Suk-Deuk
    • Annual Conference on Human and Language Technology / 1989.10a / pp.77-80 / 1989
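  • Establishing the minimal forms of honorification is of absolute importance not only for recognizing grammatical categories but also for setting up lexical entries in a dictionary. There are two requirements for recognizing an honorific unit among the inflectional affixes, that is, a minimal form: first, in terms of distribution, the inflectional affix must attach directly to the stem; second, the element that attaches directly to the stem must carry honorific meaning. A simple morpheme that attaches directly to the stem and carries honorific meaning is a simple minimal form, while an element that obligatorily combines with another element and then attaches directly to the stem is a compound minimal form. If an obligatory compound minimal form combines in turn with a non-obligatory element that can stand on its own and produces a new honorific meaning, the result is also a compound minimal form. Honorific minimal forms are two-dimensional in nature, involving both honorification and sentence mood. Accordingly, the full distribution of honorific minimal forms spans the honorific levels as well as the mood system. It should also be noted that minimal forms differ by type of predicate. Tense-aspect marking is merely a constituent of honorification and, being independent in itself, belongs to a different dimension from honorification and mood.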

  • PDF

Extension of Verb Patterns Using Passive Affixes (피동 접사를 이용한 동사패턴의 확장)

  • Kim, Chang-Hyun;Yang, Sung-Il;Choi, Sung-Kwon
    • Proceedings of the Korea Information Processing Society Conference / 2002.11a / pp.619-622 / 2002
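  • A verb pattern describes, for source-language analysis, a verb together with its case arguments and semantic constraints, and, for target-language generation, the verb's translation equivalent and the positions at which its case arguments are generated. Building such verb patterns is costly in time and money, so there is strong demand for automating or semi-automating their construction. For the affixes '-하-, -되-, -받-, -당하-, -드리-', which combine with predicative nouns to form verbs, this paper automatically generates new verb patterns from manually built ones by using conversion rules among the affixes. The conversion rules require, for each noun, information on which affixes it combines with, together with syntactic information on the resulting derived verbs. Existing dictionaries, however, describe the distribution and syntactic information of predicative nouns only for '-하다' and '-되다', and not for '-받다, -당하다, -드리다'. This paper therefore carries out a two-step process: it identifies the affix distribution and syntactic information of predicative nouns, and then derives conversion rules among the affixes to generate new verb patterns.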

  • PDF
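
The affix-based expansion described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the `VerbPattern` type, the role names, the particles in the passive frame, and the single active-to-passive rule are all hypothetical, and the sample noun and affix distribution are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbPattern:
    noun: str     # predicative noun, e.g. "체포"
    affix: str    # verb-forming affix, e.g. "하"
    frame: tuple  # case frame as (role, particle) pairs


def to_passive(pattern, target_affix):
    """Hypothetical rule: the theme becomes the subject, the agent an oblique."""
    roles = dict(pattern.frame)
    if "agent" not in roles or "theme" not in roles:
        return None  # the rule only applies to agent/theme frames
    frame = (("theme", "가"), ("agent", "에게"))
    return VerbPattern(pattern.noun, target_affix, frame)


def expand(patterns, affix_distribution, passive_affixes=("되", "당하")):
    """Generate new patterns only for affixes the noun is known to take."""
    generated = []
    for p in patterns:
        if p.affix != "하":
            continue  # this sketch converts from "-하-" patterns only
        for affix in passive_affixes:
            if affix in affix_distribution.get(p.noun, ()):
                q = to_passive(p, affix)
                if q is not None:
                    generated.append(q)
    return generated


# Hand-built seed pattern and per-noun affix distribution (toy data).
seed = [VerbPattern("체포", "하", (("agent", "가"), ("theme", "를")))]
distribution = {"체포": {"하", "되", "당하"}}
generated = expand(seed, distribution)
```

The per-noun affix distribution acts as a filter here: a rule fires only when the noun is recorded as combining with the target affix, which mirrors the abstract's point that distribution information is needed alongside the rules.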

A Study on Developing Sensibility Model for Visual Display (시각 디스플레이에서의 감성 모형 개발 -움직임과 색을 중심으로-)

  • 임은영;조경자;한광희
    • Korean Journal of Cognitive Science / v.15 no.2 / pp.1-15 / 2004
  • A structure of sensibility from motion was developed to understand the relationship between sensibilities and physical factors and to apply it to dynamic visual displays. Seventy adjectives were collected by assessing their adequacy for expressing sensibilities from motion and by gathering reports of the sensibilities recalled from dynamic displays in achromatic color. Various motion displays of a single moving dot were rated on the degree of sensibility corresponding to each adjective, using the Semantic Differential (SD) method. Factor analysis of the ratings reduced the 70 words to 19 fundamental sensibilities from motion. Multidimensional Scaling (MDS) was used to construct a sensibility space for motion, in which the 19 sensibilities were distributed along two dimensions, active-passive and bright-dark. Motion types that varied systematically in kinematic factors were placed in this two-dimensional space to identify the variables that most affect sensibility from motion. The placement patterns indicate that speed, together with the cycle and amplitude of the trajectory, tends to partially determine sensibility. Although color and motion both affected sensibility along the two dimensions, combining them appeared to give each a dominant effect in one dimension: motion on active-passive and color on bright-dark.

  • PDF

Rapid Speaker Adaptation Based on Eigenvoice Using Weight Distribution Characteristics (가중치 분포 특성을 이용한 Eigenvoice 기반 고속화자적응)

  • 박종세;김형순;송화전
    • The Journal of the Acoustical Society of Korea / v.22 no.5 / pp.403-407 / 2003
  • Recently, the eigenvoice approach has been widely used for rapid speaker adaptation. However, even with eigenvoices, the performance improvement obtained from a very small amount of adaptation data is small compared with that from a somewhat larger amount, because the eigenvoice weights are difficult to estimate reliably. In this paper, we propose a rapid speaker adaptation method based on eigenvoices that uses the characteristics of the weight distribution to improve performance with a small amount of adaptation data. In experiments on a vocabulary-independent word recognition task (using the PBW 452 database), the weight threshold method alleviates the relatively low performance seen with very small amounts of adaptation data. When a single adaptation word is used, the word error rate is reduced by about 9-18% with the weight threshold method.
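
One way to picture a weight threshold on eigenvoice weights is sketched below. The abstract does not specify how the thresholds are chosen, so this sketch assumes they are set at the mean plus or minus `k` standard deviations of the weights observed for training speakers; the function names, `k`, and the toy numbers are all hypothetical.

```python
from statistics import mean, pstdev


def weight_bounds(training_weights, k=2.0):
    """Per-eigenvoice (low, high) bounds from training speakers' weights.

    Assumption: bounds are mean +/- k population standard deviations.
    """
    bounds = []
    for column in zip(*training_weights):  # one column per eigenvoice
        m, s = mean(column), pstdev(column)
        bounds.append((m - k * s, m + k * s))
    return bounds


def clamp_weights(weights, bounds):
    """Pull unreliable weight estimates back inside the allowed range."""
    return [min(max(w, lo), hi) for w, (lo, hi) in zip(weights, bounds)]


def adapt(mean_voice, eigenvoices, weights):
    """Adapted supervector = mean voice + sum_k w_k * eigenvoice_k."""
    adapted = list(mean_voice)
    for w, e in zip(weights, eigenvoices):
        adapted = [a + w * x for a, x in zip(adapted, e)]
    return adapted


# Toy data: two training speakers, two eigenvoices of dimension 2.
training = [[1.0, -1.0], [3.0, 1.0]]
bounds = weight_bounds(training, k=1.0)
clamped = clamp_weights([10.0, 0.5], bounds)  # 10.0 is an outlier estimate
adapted = adapt([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], clamped)
```

The idea is that a weight estimated from a single word can land far outside the range seen in training, and limiting it keeps the adapted model close to plausible speakers.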

In Out-of Vocabulary Rejection Algorithm by Measure of Normalized improvement using Optimization of Gaussian Model Confidence (미등록어 거절 알고리즘에서 가우시안 모델 최적화를 이용한 신뢰도 정규화 향상)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of the Korea Society of Computer and Information / v.15 no.12 / pp.125-132 / 2010
  • In vocabulary recognition, tri-phones that were not seen during training can appear at recognition time. For these, the system cannot produce initial estimates of the model parameters, so no model can be built for the phoneme data and the accuracy of the Gaussian model cannot be ensured. To address this, the paper proposes a method that optimizes the parameters of the Gaussian model using probability distributions. Optimizing the probability distribution of the Gaussian model improves confidence, provides accuracy, and supports the search of phoneme data. In a system performance comparison, the out-of-vocabulary rejection algorithm using normalized confidence improved recognition by 1.7%.

A Study on the Classification System of KDC for School Libraries - Focused on Vocabulary Analysis of Elementary Materials - (학교도서관을 위한 KDC 분류체계에 관한 연구 - 초등학생관련 문헌의 어휘분석을 중심으로 -)

  • Kim, Jeong-Hyen
    • Journal of Korean Library and Information Science Society / v.35 no.4 / pp.171-191 / 2004
  • This study presents a revision scheme for the Korean Decimal Classification (KDC) suited to classifying children-related materials, centered on social science (300) and pure science (400), which account for the majority of children-related materials in school libraries. Toward this goal, I studied the development and use of classification systems for children-related materials in domestic and overseas school libraries and children's libraries, and surveyed how well 4th, 5th, and 6th grade elementary school students understand the classification item terms and children-related material terms used in KDC's social science and pure science classes. Based on the results of the analysis, I present a revision scheme for the KDC item terms and class numbers for children-related materials.

  • PDF

Methodology of Online Survey Questionnaire based on Webgame towards Spacial Color Combination and Affective Word (웹게임 기반 온라인 설문조사 방법론 -공간배색과 감성언어를 중심으로-)

  • Kang, Seung-Mook;Kim, Hae-Yoon;Park, Kyeong-Su;Park, Young-Sung
    • The Journal of the Korea Contents Association / v.10 no.7 / pp.133-141 / 2010
  • The purpose of this paper is to suggest an effective online questionnaire method that uses web-based games. The research examines the interrelation between space design elements and emotional language in the backgrounds actually used in web games, and suggests a new questionnaire method to overcome insincere answers, a limitation of online questionnaires. The paper reviews the related literature and compares the merits and demerits of printed and online text-based questionnaires. It then suggests an online questionnaire method based on web games that can reduce the errors arising from those demerits. A database of emotional words and phrases is built from the interrelation between the emotional words and four spaces (dwelling, tradition, commerce, and fantasy), based on position decision values from a Gaussian distribution. The paper suggests that the method can be used in a population calculability system such as a consumer preference test.

Syllable-Based Korean Morphological Analyzer (음절에 기반한 한국어 형태소 분석기)

  • Jang, Dong-Su;Seo, Young-Hoon
    • Annual Conference on Human and Language Technology / 1993.10a / pp.331-339 / 1993
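  • This paper presents a Korean morphological analyzer that exploits the syllable characteristics of Korean. The analyzer performs morphological analysis syllable by syllable, using syllable information for each part of speech, for irregular conjugations, for inflected word phrases, and for prefinal endings. Compared with phoneme-based methods, the syllable-based method greatly reduces the incorrect intermediate results that can be generated during analysis, which minimizes the dictionary lookup burden. The system dictionary is organized using the combination properties of each part of speech and the length distribution of dictionary headwords, and it contains about 160,000 words. This organization provides efficient dictionary lookup and is flexible and efficient enough to accommodate directly the needs of various applications such as spelling checkers and automatic indexing.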

  • PDF
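
The two ideas in the abstract above, filtering candidate splits by syllable information before any dictionary lookup and bucketing the dictionary by headword length, can be sketched as follows. This is only an illustration under simplifying assumptions: the tiny lexicon, the tag names, and the single "first syllable of an ending" filter are hypothetical, and each precomposed Hangul syllable is assumed to be one character in the string.

```python
from collections import defaultdict


def build_lexicon(entries):
    """Index (word, tag) entries by headword length in syllables."""
    by_length = defaultdict(dict)
    for word, tag in entries:
        by_length[len(word)][word] = tag
    return by_length


def analyze(eojeol, stems, endings, ending_first_syllables):
    """Return (stem, ending) analyses for one word phrase."""
    results = []
    for i in range(1, len(eojeol)):
        head, tail = eojeol[:i], eojeol[i:]
        if tail[0] not in ending_first_syllables:
            continue  # syllable filter: reject the split without a lookup
        stem_tag = stems.get(len(head), {}).get(head)
        ending_tag = endings.get(len(tail), {}).get(tail)
        if stem_tag and ending_tag:
            results.append(((head, stem_tag), (tail, ending_tag)))
    return results


# Toy lexicon (hypothetical entries and tags).
stems = build_lexicon([("학교", "noun"), ("학", "noun")])
endings = build_lexicon([("에서", "particle")])
analyses = analyze("학교에서", stems, endings, {"에"})
```

In this example only the split after the second syllable passes the syllable filter, so the other two candidate splits never reach the dictionary.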

Automatic Processing of Predicative Nouns for Korean Semantic Recognition. (한국어 의미역 인식을 위한 서술성 명사의 자동처리 연구)

  • Lee, Sukeui;Im, Su-Jong
    • Korean Linguistics / v.80 / pp.151-175 / 2018
  • This paper proposes a method of semantic recognition to improve the extraction of correct answers in a Q&A system through machine learning. To this end, the semantic recognition method is described based on the distribution of predicative nouns. Predicative noun vocabulary and sentences were collected from Wikipedia documents. The predicative nouns are classified into types by analyzing the environments in which they appear in sentences. This paper proposes a semantic recognition method for predicative nouns to which rules can be applied. Chapter 2 reviews previous studies on predicative nouns. Chapter 3 explains how predicative nouns are distributed. Predicative nouns that cannot be processed by rules are excluded; accordingly, predicative noun forms combined with the case marker '의' were excluded. In Chapter 4, we extracted 728 sentences composed of 10,575 words from Wikipedia. ETRI's semantic analysis engine was used, and the predicative noun forms that can be handled in semantic recognition are presented.

Analysis of Keywords in national river occupancy permits by region using text mining and network theory (텍스트 마이닝과 네트워크 이론을 활용한 권역별 국가하천 점용허가 키워드 분석)

  • Seong Yun Jeong
    • Smart Media Journal / v.12 no.11 / pp.185-197 / 2023
  • This study used text mining and network theory to extract, from the permit contents recorded in the permit register, information useful for occupancy applications and permit-related tasks; the register is currently used only for the simple purpose of recording occupancy permit information. Using text mining, we applied normalization steps such as stopword removal and morpheme analysis, and analyzed and compared vocabulary frequency and topic models in five regions, including Seoul, Gyeonggi, Gyeongsang, Jeolla, Chungcheong, and Gangwon. By applying four centrality measures widely used in network theory (degree, closeness, betweenness, and eigenvector), we examined keywords that occupy a central position or act as intermediaries in the network. A comprehensive analysis of vocabulary frequency, topic modeling, and network centrality found that the keyword 'installation' was the most influential in all regions. This is believed to result from the Ministry of Environment's permit management office issuing many permits for constructing facilities or installing structures. In addition, keywords related to road facilities, flood control facilities, underground facilities, power/communication facilities, sports/park facilities, and similar categories were found to be central or to act as intermediaries in the topic models and networks. Most keywords followed a Zipf's law distribution, with low frequency of occurrence and a low share of the total.
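
A small sketch of the kind of keyword analysis the abstract above describes: counting keyword frequencies and computing a normalized degree centrality on a co-occurrence network. It assumes the documents are already tokenized (stopword removal and morpheme analysis done elsewhere); the function name and the toy permit records are hypothetical, and only one of the several centrality measures is shown.

```python
from collections import Counter
from itertools import combinations


def keyword_stats(documents):
    """Return (frequency counter, normalized degree centrality per keyword).

    Two keywords are linked if they co-occur in at least one document.
    Degree centrality is the number of neighbors divided by (n - 1).
    """
    freq = Counter(tok for doc in documents for tok in doc)
    neighbors = {tok: set() for tok in freq}
    for doc in documents:
        for a, b in combinations(sorted(set(doc)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(freq)
    degree = {
        tok: (len(nb) / (n - 1) if n > 1 else 0.0)
        for tok, nb in neighbors.items()
    }
    return freq, degree


# Toy, pre-tokenized permit records (made-up data).
records = [
    ["installation", "bridge"],
    ["installation", "pipe"],
    ["installation", "bridge", "road"],
]
freq, degree = keyword_stats(records)
ranked = freq.most_common()  # rank-frequency list, useful for a Zipf plot
```

Plotting the rank against the frequency from `ranked` on log-log axes is a common way to check informally whether a keyword distribution looks Zipf-like; a roughly straight line suggests that it does.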