• Title/Summary/Keyword: 어휘 자질 (lexical features)


A Study on Automatic Expansion of Dialogue Examples Using Logs of a Dialogue System (대화시스템의 로그를 이용한 대화예제의 자동 확충에 관한 연구)

  • Hong, Gum-Won;Lee, Jeong-Hoon;Shin, Jung-Hwi;Lee, Do-Gil;Rim, Hae-Chang
    • 한국HCI학회:학술대회논문집
    • /
    • 2009.02a
    • /
    • pp.257-262
    • /
    • 2009
  • This paper studies the automatic expansion of dialogue examples using the logs of an example-based dialogue system. Conventional approaches to example-based dialogue systems construct dialogue examples between humans and a chatbot manually, which is labor-intensive and time-consuming. The proposed method automatically classifies natural utterance pairs and adds them to the dialogue example database. Experimental results show that lexical, POS, and modality features are useful for classifying natural utterance pairs, and that dialogue examples can be expanded automatically using the logs of a dialogue system.


Component Analysis for Constructing an Emotion Ontology (감정 온톨로지의 구축을 위한 구성요소 분석)

  • Yoon, Aesun;Kwon, Hyuk-Chul
    • Annual Conference on Human and Language Technology
    • /
    • 2009.10a
    • /
    • pp.19-24
    • /
    • 2009
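  • In communication, understanding the emotions of the dialogue participants is as important as the content of the message. Although non-verbal elements convey more information about emotion, texts also contain a rich variety of linguistic markers that express the speaker's emotions. The purpose of this study is to design an emotion ontology that can be used in human language technology. Previous work on text-based emotion processing classified emotions, compiled a list of descriptive vocabulary for each emotion, and searched for those words in texts, so the accuracy of the extracted emotions was not high. By contrast, the emotion ontology proposed in this study has the following advantages. First, it classifies the categories of emotion expression by description target (verbal vs. non-verbal) and method (expressive, descriptive, iconic) and establishes correspondences among the six heterogeneous categories, so it can be applied to multimodal environments. Second, to allow fine-grained classification while keeping emotions distinct from one another, 24 emotion specifications were selected, and intensity and polarity were set as attributes for classifying emotions more finely. Third, to distinguish emotion expressions in text explicitly, attributes for the experiencer, the description target and method, and linguistic features were introduced. The ontology was also designed with extensibility in mind, so that it is not limited to Korean processing and can be used for multilingual processing.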


Age-related Changes in Word Defining Abilities in Concrete and Abstract Nouns with Normal Elderly (노화에 따른 구체명사와 추상명사의 단어정의하기 능력 변화)

  • Kim, Soo Ryon;Kim, HyangHee
    • 재활복지
    • /
    • v.21 no.3
    • /
    • pp.187-207
    • /
    • 2017
  • The purpose of this study was to explore how the elderly define concrete and abstract nouns. A total of 382 elderly people participated and were classified into four age groups (over 55 to under 64, over 65 to under 74, over 75 to under 84, and over 85 years old). They performed a word definition task composed of five concrete and five abstract nouns. The total scores and the numbers and ratios of core and supplementary meanings were compared among the four groups. The frequency and ratio of error types were also examined. The results showed statistically significant differences among all four groups in total scores and in the numbers and ratios of core and supplementary meanings on the concrete noun definition task. Abstract noun definition performance also differed among groups, except between the two oldest groups (over 75 to under 84 and over 85 years old). The oldest group showed a sharp increase in error production. The most frequent error type was personal experience in the over 55 to under 64 and over 65 to under 74 groups, target word repetition in the over 75 to under 84 group, and no response in the over 85 group. In conclusion, the ability to define both concrete and abstract words deteriorated with age. This decline results from impaired spreading of semantic knowledge within the semantic network, which is vulnerable to aging. The characteristics of word definition in the elderly can provide basic information for understanding various age-related neurolinguistic disorders.

KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)

  • Park, Sang-Min;Na, Chul-Won;Choi, Min-Seong;Lee, Da-Hee;On, Byung-Won
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.219-240
    • /
    • 2018
  • Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Sentiment analysis methods have recently been widely used in many fields. For example, data-driven surveys analyze the subjectivity of text data posted by users, and market research analyzes users' review posts to quantify a target product's reputation among users. The basic method of sentiment analysis uses a sentiment dictionary (or lexicon), a list of sentiment vocabulary with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' indicates a negative meaning in many fields but not for a movie. To perform accurate sentiment analysis, we need to build a sentiment dictionary for the given domain. However, building such a sentiment lexicon is time-consuming, and many sentiment words are missed unless a general-purpose sentiment lexicon is used. To address this problem, several studies have constructed sentiment lexicons for specific domains based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer in service, and SentiWordNet does not work well because of language differences in the process of converting Korean words into English words. These restrictions limit the use of such general-purpose sentiment lexicons as seed data for building a domain-specific sentiment lexicon. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons.
The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built so that a sentiment dictionary for a target domain can be constructed quickly. Specifically, it builds its sentiment vocabulary by analyzing the glosses in the Standard Korean Language Dictionary (SKLD) with the following procedure. First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having either positive or negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive, and negative words and phrases are extracted from the glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that are used mainly on the Web. KNU-KSL contains a total of 14,843 sentiment entries, each of which is a 1-gram, a 2-gram, a phrase, or a sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, and the importance of developing sentiment dictionaries has gradually declined. However, a recent study shows that the words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This result indicates that a sentiment dictionary is useful not only for sentiment analysis itself but also as a source of features that improve the accuracy of deep learning models.
The proposed dictionary can be used as basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.
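
As a rough illustration of how a lexicon whose entries include 1-grams and 2-grams might be applied, the sketch below scores a text by summing the polarity of matched entries, preferring the longest match at each position. The lexicon contents, the scores, and the names `TOY_LEXICON` and `score_text` are hypothetical and are not taken from KNU-KSL; the abstract does not specify a scoring scheme.

```python
from typing import Dict, List

# Hypothetical toy lexicon: entries and scores are illustrative only,
# not taken from KNU-KSL.
TOY_LEXICON: Dict[str, int] = {
    "thank you": 1,
    "worthy": 1,
    "impressed": 1,
    "waste": -1,
    "not worthy": -1,
}


def score_text(text: str, lexicon: Dict[str, int], max_n: int = 2) -> int:
    """Sum lexicon scores, preferring the longest n-gram match at each position."""
    tokens: List[str] = text.lower().split()
    total = 0
    i = 0
    while i < len(tokens):
        matched = False
        for n in range(max_n, 0, -1):  # try longer n-grams first
            window = tokens[i:i + n]
            if len(window) < n:
                continue  # not enough tokens left for an n-gram of this size
            gram = " ".join(window)
            if gram in lexicon:
                total += lexicon[gram]
                i += n  # skip the tokens consumed by this match
                matched = True
                break
        if not matched:
            i += 1
    return total


if __name__ == "__main__":
    print(score_text("thank you I am impressed", TOY_LEXICON))
    print(score_text("not worthy", TOY_LEXICON))
```

Matching longer entries first keeps a 2-gram such as "not worthy" from being scored as the positive 1-gram "worthy".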

Bioactivities and Isolation of Functional Compounds from Decay-Resistant Hardwood Species (고내후성 활엽수종의 추출성분을 이용한 신기능성 물질의 분리 및 생리활성)

  • 배영수;이상용;오덕환;최돈하;김영균
    • Journal of Korea Foresty Energy
    • /
    • v.19 no.2
    • /
    • pp.93-101
    • /
    • 2000
  • Wood of Robinia pseudoacacia and bark of Populus alba $\times$ P. glandulosa, Fraxinus rhynchophylla, and Ulmus davidiana var. japonica were collected and extracted with acetone-water (7:3, v/v) in glass jars to examine whether they contain bioactive compounds. The concentrated extracts were fractionated with hexane, chloroform, ethyl acetate, and water, and then freeze-dried for column chromatography and bioactivity tests. The isolated compounds were sakuranetin-5-O-$\beta$-D-glucopyranoside from Populus alba $\times$ P. glandulosa, 4--ethoxy-(+)-leucorobinetinidin from R. pseudoacacia, and fraxetin from F. rhynchophylla; they were characterized by $^{1}$H and $^{13}$C NMR and positive FAB-MS. Decay-resistant activity was expressed as the weight loss ratio and as hyphae growth inhibition in a wood dust agar medium inoculated with wood rot fungi. R. pseudoacacia showed the best anti-decay property in both tests, and its methanol-untreated samples showed higher activity than the methanol-treated samples in the hyphae growth test. In the antioxidative test, $\alpha$-tocopherol, a natural antioxidant, and BHT, a synthetic antioxidant, were used as references for comparison with the antioxidant activities of the extracted fractions. The ethyl acetate fraction of F. rhynchophylla bark showed the highest activity in this test, and all fractions of R. pseudoacacia extractives also showed higher activities than the other fractions. Among the isolated compounds, aesculetin isolated from F. rhynchophylla bark showed the best activity, followed by robinetinidin from R. pseudoacacia.


ASPECT SHIFT AND DURATIVE ADVERBIALS

  • 고희정
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2001.06a
    • /
    • pp.97-109
    • /
    • 2001
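  • This paper discusses the interaction between accomplishment verbs and durative adverbials in English and Korean. According to Smith (1991), when an accomplishment verb combines with a durative adverbial, an aspect shift of the accomplishment verb occurs to avoid a clash between the semantic features of the verb and the adverbial. This paper shows how this aspect shift is realized in English and Korean and discusses how it can be described in formal semantics. When an accomplishment verb combines with a durative adverbial, a viewpoint shift from the closed perfective viewpoint to an open viewpoint occurs in both English and Korean. However, the aspect shift in Korean and the aspect shift in English are not identical. It is argued that the viewpoint shift in Korean is a shift to a neutral viewpoint, which is neither perfective nor imperfective, whereas the viewpoint shift in English is simply a shift to the imperfective viewpoint. This claim is supported by the fact that the Korean viewpoint-shift construction allows a heterogeneous sequential reading of multiple events, whereas the English viewpoint-shift construction allows only a homogeneous simultaneous reading of a single event. This paper presents a formal semantic analysis, in the framework of Heim & Kratzer (1998), of the aspect shift that occurs when an accomplishment verb combines with a durative adverbial. The shift from a closed viewpoint to an open viewpoint is explained by positing IMP, a covert temporal-modal operator, as an argument of the durative adverbial. IMP is a modification of PROG, which Dowty (1979) posited to resolve the imperfective paradox, adapted to the framework of Heim & Kratzer (1998). IMP changes the evaluation world from the actual world to a possible inertia world, thereby moving the ending point of the accomplishment verb from the actual world to a possible future world. As a result, IMP converts the closed perfective viewpoint of the accomplishment verb into an open imperfective viewpoint in the actual world while keeping it a closed viewpoint in the possible inertia world. The difference between the Korean and English viewpoint-shift constructions is explained by a difference in the presuppositions in the lexical entries of each language's durative adverbials. In this paper, the English durative adverbial ... the subintervals of its argument ...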


Korean Semantic Role Labeling Using Semantic Frames and Synonym Clusters (의미 프레임과 유의어 클러스터를 이용한 한국어 의미역 인식)

  • Lim, Soojong;Lim, Joon-Ho;Lee, Chung-Hee;Kim, Hyun-Ki
    • Journal of KIISE
    • /
    • v.43 no.7
    • /
    • pp.773-780
    • /
    • 2016
  • Semantic information and features are very important for Semantic Role Labeling (SRL), although many machine-learning-based SRL systems mainly adopt lexical and syntactic features. Few previous SRL studies rely on semantic information because its use is very restricted. We propose an SRL system that adopts semantic information such as named entities, word sense disambiguation, sense-based filtering of adjunct roles, synonym clusters, frame extension based on a synonym dictionary, joint rules over syntactic and semantic information, and modified verb-specific numbered roles. In our experiments, the proposed method outperforms lexical-syntactic approaches by about 3.77 (Korean Propbank) to 8.05 (Exobrain Corpus) F1 points.

Opinion Mining of Product Reviews using Sentiment Phrase Patterns considered the Endings of Declinable Words (어미변화를 고려한 감성 구문 패턴을 이용한 상품평 의견 분류)

  • Kim, Jung-Ho;Cha, Myung-Hoon;Kim, Myung-Kyu;Chae, Soo-Hoan
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2010.06c
    • /
    • pp.285-290
    • /
    • 2010
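  • As the Internet has become widespread, anyone can easily express opinions online. As a result, the volume of opinion data expressing thoughts and feelings has grown rapidly, and with the emergence of applications that use such data, efficient retrieval and automatic classification techniques are in demand. In line with this trend, many studies on opinion data classification have been carried out. These studies use as classification features not single words but N-grams of two or more words, lexical phrase patterns, syntactic phrase patterns, and the like. Patterns in particular have been a major research topic because they are more flexible than single words or N-grams and can express linguistically rich information. Nevertheless, these studies have mainly concerned English, and studies that apply patterns to Korean to classify subjective sentences or polarity are still scarce. A characteristic of Korean is its well-developed conjugation of predicates (declinable words): endings vary widely, and meaning changes subtly with them. However, existing studies of Korean opinion classification remove the endings and keep only the stems in order to capture the core meaning of words, so they fail to account for the changes in meaning carried by endings, and their classification accuracy is lower than the results reported for English. This study therefore reviews existing pattern-based methods applied to English and applies one of them, polarity-bearing sentence-constituent patterns, to Korean. It also extracts patterns of ending changes and analyzes how these changes affect opinion classification performance.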


Sentiment Classification considering Korean Features (한국어 특성을 고려한 감성 분류)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility
    • /
    • v.13 no.3
    • /
    • pp.449-458
    • /
    • 2010
  • As efficient access to information in the many documents and reviews on the Internet is needed in many fields, automatic classification of opinions or thoughts is required. This automatic classification is called sentiment classification, and it can be divided into three steps: subjective expression classification, which extracts subjective sentences from documents; sentiment classification, which classifies whether the polarity of documents is positive or negative; and strength classification, which classifies whether the documents have weak or strong polarity. Recent studies in opinion mining have used N-gram words, lexical phrase patterns, syntactic phrase patterns, and similar features rather than single words as features for classification. Patterns in particular have been used frequently as features because they are more flexible than N-gram words and more deterministic than single words. These studies mainly concern English; studies using patterns for Korean are still at an early stage. Although in Korean the meaning of a predicate changes slightly with the ending ('Eomi' in Korean) of the declinable word, earlier studies of Korean opinion classification removed endings from predicates and extracted only stems. This study reviews earlier studies and methods that use patterns for English, extracts sentiment patterns from Korean documents, and classifies the polarities of these documents. It also analyzes the influence of changes in endings on the performance of opinion classification.


Component Analysis for Constructing an Emotion Ontology (감정 온톨로지의 구축을 위한 구성요소 분석)

  • Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Korean Journal of Cognitive Science
    • /
    • v.21 no.1
    • /
    • pp.157-175
    • /
    • 2010
  • In human communication, understanding a dialogue participant's emotion is as important as decoding the explicit message. It is well known that non-verbal elements are more suitable than verbal elements for conveying a speaker's emotions. Written texts, however, contain a variety of linguistic units that express emotions. This study aims to analyze the components needed to construct an emotion ontology, which has numerous applications in Human Language Technology. Most previous work in text-based emotion processing focused on classifying emotions, constructing a dictionary describing emotions, and retrieving those lexical items in texts through keyword spotting and/or syntactic parsing techniques. The emotions retrieved or computed by that process did not show good accuracy. This study therefore proposes a more sophisticated component analysis and introduces linguistic factors. (1) Five linguistic types of emotion expression are differentiated in terms of target (verbal/non-verbal) and method (expressive/descriptive/iconic). The correlations among them, as well as their correlation with the non-verbal expressive type, are also determined. This characteristic is expected to make our ontology more adaptable to multi-modal environments. (2) As emotion-related components, this study proposes 24 emotion types, a 5-scale intensity (-2~+2), and a 3-scale polarity (positive/negative/neutral), which can describe a variety of emotions in more detail and in a standardized way. (3) We introduce components related to verbal expression, such as 'experiencer', 'description target', 'description method', and 'linguistic features', which can appropriately classify and tag verbal expressions of emotion. 
(4) Adopting the linguistic tag sets proposed by ISO and TEI and providing the mapping table between our classification of emotions and Plutchik's, our ontology can be easily employed for multilingual processing.
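
The components listed in (1) to (3) can be pictured as a single annotation record. The sketch below is only one possible encoding, assuming a Python dataclass; the class and field names are hypothetical, and because the abstract does not name the 24 emotion types, `emotion_type` is left as a free string.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


class Target(Enum):
    VERBAL = "verbal"
    NON_VERBAL = "non-verbal"


class Method(Enum):
    EXPRESSIVE = "expressive"
    DESCRIPTIVE = "descriptive"
    ICONIC = "iconic"


@dataclass
class EmotionAnnotation:
    """Hypothetical record combining the components named in the abstract."""
    emotion_type: str          # one of the 24 types; names are not given in the abstract
    intensity: int             # 5-scale intensity, -2 to +2
    polarity: Polarity         # 3-scale polarity
    experiencer: str           # who experiences the emotion
    target: Target             # description target
    method: Method             # description method
    linguistic_features: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the 5-scale intensity range stated in the abstract.
        if not -2 <= self.intensity <= 2:
            raise ValueError("intensity must be between -2 and +2")


if __name__ == "__main__":
    example = EmotionAnnotation(
        emotion_type="joy",  # illustrative label, not from the paper
        intensity=2,
        polarity=Polarity.POSITIVE,
        experiencer="speaker",
        target=Target.VERBAL,
        method=Method.DESCRIPTIVE,
        linguistic_features=["adjective"],
    )
    print(example)
```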
