• Title/Summary/Keyword: word-net

Search Result 258, Processing Time 0.027 seconds

The Review about the Development of Korean Linguistic Inquiry and Word Count (언어적 특성을 이용한 '심리학적 한국어 글분석 프로그램(KLIWC)' 개발 과정에 대한 고찰)

  • Lee Chang H.;Sim Jung-Mi;Yoon Aesun
    • Korean Journal of Cognitive Science / v.16 no.2 / pp.93-121 / 2005
  • A substantial body of research has accumulated that uses linguistic style as the dependent measure in psychological studies. This research was conducted to develop a Korean text analysis program (KLIWC) based on the English text analysis program LIWC (Linguistic Inquiry and Word Count); the program reflects Korean linguistic characteristics and language-related culture. Linguistic tagging made it possible to analyze agglutinative phrases composed of many morphemes, and a base-form dictionary and inflection rules were built. In addition, face-saving words and emotional words were included as analysis variables. The development process and the characteristics of Korean text analysis are reviewed, and future directions for improving the program are discussed.
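
The category-counting idea behind LIWC-style analysis can be sketched as follows. This is a minimal illustration, not the KLIWC implementation: the `CATEGORIES` dictionary and the `category_rates` function are hypothetical, and the tokenizer handles only plain English words, whereas Korean text would first need the morphological tagging and base-form lookup the abstract describes.

```python
import re
from collections import Counter

# Hypothetical category dictionary; the real KLIWC dictionary is not given in the abstract.
CATEGORIES = {
    "positive_emotion": {"happy", "glad", "love"},
    "negative_emotion": {"sad", "angry", "hate"},
}


def category_rates(text, categories=CATEGORIES):
    """Return, per category, the percentage of tokens that belong to it."""
    # Assumption: simple lowercase English tokenization stands in for morphological analysis.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    if total == 0:
        return {name: 0.0 for name in categories}
    return {name: 100.0 * counts[name] / total for name in categories}
```

Calling `category_rates("I love it but I hate rain")` reports each emotion category as a share of the seven tokens in the sentence.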


A Study of Word Sense Ambiguation which Affects Efficiency of the Internet-based Information Retrieval (어휘의미 중의성이 인터넷 정보검색 효율에 미치는 영향에 관한 연구)

  • 황상규;오경묵;변영태
    • Journal of the Korean Society for Information Management / v.16 no.3 / pp.65-82 / 1999
  • Internet users are often frustrated when they try to find the "right" piece of information quickly. The reason is that discovering available, quality resources becomes harder for end users as the Internet continues to expand rapidly. Not only incorrect keywords and query expressions but also word sense ambiguity reduce the efficiency of Internet search. This paper analyzes this loss of efficiency in Internet search and discusses ways to reduce users' frustration and improve their search strategies.


Automatic extraction of similar poetry for study of literary texts: An experiment on Hindi poetry

  • Prakash, Amit;Singh, Niraj Kumar;Saha, Sujan Kumar
    • ETRI Journal / v.44 no.3 / pp.413-425 / 2022
  • The study of literary texts is one of the earliest disciplines practiced around the globe. Poetry is artistic writing in which words are carefully chosen and arranged for their meaning, sound, and rhythm. Poetry usually has a broad and profound sense that makes it difficult even for humans to interpret. The essence of poetry is Rasa, which signifies mood or emotion. In this paper, we propose a poetry classification-based approach to automatically extract similar poems from a repository. Specifically, we perform a novel Rasa-based classification of Hindi poetry. For the task, we primarily used lexical features in a bag-of-words model trained using the support vector machine classifier. In the model, we employed Hindi WordNet, Latent Semantic Indexing, and Word2Vec-based neural word embedding. To extract the rich feature vectors, we prepared a repository containing 37,717 poems collected from various sources. We evaluated the performance of the system on a manually constructed dataset containing 945 Hindi poems. Experimental results demonstrated that the proposed model attained satisfactory performance.
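
The bag-of-words representation mentioned above can be illustrated with a small sketch. It is not the authors' pipeline: the abstract describes a support vector machine classifier with WordNet, LSI, and Word2Vec features, while this example swaps in plain cosine similarity over raw word counts. The helper names `bag_of_words`, `cosine`, and `most_similar` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a text to word counts, ignoring word order (whitespace tokenization assumed)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def most_similar(query, repository):
    """Return the repository text whose bag of words is closest to the query."""
    q = bag_of_words(query)
    return max(repository, key=lambda poem: cosine(q, bag_of_words(poem)))
```

A classifier-based system would replace the cosine ranking with predicted class labels, but the count vectors it consumes are built in the same way.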

Building Domain Ontology for Semantic Web (시맨틱 웹에서의 도메인 온톨로지 구축 및 적용)

  • Kong, Hyun-Jang;Jung, Kwan-Ho;Shin, Ju-Hyun;Kim, Won-Pil;Kim, Pan-Koo
    • Proceedings of the Korea Information Processing Society Conference / 2003.05b / pp.919-922 / 2003
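  • Since the mid-1990s, interest in the Semantic Web has grown and much research has been under way. On the Internet, which holds a virtually unlimited supply of information resources, efficient handling of those resources is increasingly emphasized. However, it is hard to draw clear conclusions about the Semantic Web, and current studies concentrate on its overall vision while research on detailed techniques remains insufficient. Until recently, research has focused mainly on developing and applying various markup languages, from XML and XML Schema to RDF, RDF Schema, and DAML+OIL. This work has produced many results on markup languages for representation in the Semantic Web, but the semantic representation of information, which is the core of the Semantic Web, requires more research. This paper studies ontologies, which occupy a central part of the Semantic Web. It presents a method for resolving the problems that arise when WordNet, one of the most widely used ontologies, is applied as an ontology for the Semantic Web. The construction and application of a WordNet-based domain ontology is the core of the approach this paper takes to these problems.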


Measurement of WSD based Document Similarity using U-WIN (U-WIN을 이용한 WSD 기반의 문서 유사도 측정)

  • Shim, Kang-Seop;Bae, Young-Jun;Ock, Cheol-Young;Choe, Ho-Seop
    • Annual Conference on Human and Language Technology / 2008.10a / pp.90-95 / 2008
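  • Outside Korea, much research is already under way on measuring document similarity using semantic language resources such as WordNet. In Korea, however, language resources like WordNet are still scarce, so research on document similarity measures based on them, and on ways of using their results, is lacking. Most document similarity measures previously used in Korea relied not on the meanings of the words appearing in a document but on simple matching of those words, frequency-based weighting, or extraction of important words using weights. As a result, existing measures do not capture a document's context, depend on large document collections to obtain word frequencies, and handle poorly those similar documents that express a particular concept (meaning) with different words or that use similar or related words. With this in mind, this paper proposes a document similarity measure that uses U-WIN, a Korean lexical semantic network, together with overlap information for the words used in context, so that it relies not simply on surface words but on basic context information and word meaning.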
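
The general idea of overlap-based word sense disambiguation followed by sense-level document comparison can be sketched as below. This is a simplified, Lesk-style illustration under stated assumptions, not the method of the paper: the `SENSES` inventory is a hypothetical stand-in for a lexical semantic network such as U-WIN, and Jaccard overlap of sense sets is an assumed similarity measure.

```python
# Hypothetical sense inventory: word -> {sense id: related words}.
# A real system would draw these from a lexical semantic network.
SENSES = {
    "bank": {
        "bank/finance": {"money", "loan", "account"},
        "bank/river": {"river", "water", "shore"},
    },
}


def disambiguate(word, context, senses=SENSES):
    """Pick the sense whose related words overlap most with the context words."""
    candidates = senses.get(word)
    if not candidates:
        return word  # words without listed senses stand for themselves
    # Ties resolve to the first listed sense.
    return max(candidates, key=lambda sense: len(candidates[sense] & context))


def sense_set(text, senses=SENSES):
    """Map a text to the set of senses chosen for its words."""
    words = text.lower().split()
    context = set(words)
    return {disambiguate(w, context - {w}, senses) for w in words}


def similarity(doc_a, doc_b, senses=SENSES):
    """Jaccard overlap of the two documents' sense sets (an assumed measure)."""
    a, b = sense_set(doc_a, senses), sense_set(doc_b, senses)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

With this inventory, "bank loan" and "river bank" share the surface word "bank" but resolve to different senses, so their sense-level similarity is zero, whereas plain word matching would count them as partly similar.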


A Study on Sentiment Trend Analysis Method Using Ant Colony Optimization Algorithm and SentiWordNet (개미 군집 최적화 알고리즘과 센티워드넷을 이용한 사용자 감성 동향 분석 방법 연구)

  • Kwon, Kyunglag;Kang, Daehyun;Choi, Subong;Park, Hansaem;Chung, In-Jeong
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.948-951 / 2014
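  • This paper proposes a sentiment analysis method using the ant colony optimization algorithm and SentiWordNet. First, in the data collection stage, data is collected from the social web (e.g., Facebook) in the form of RDF (Resource Description Framework), consisting of three elements: subject, predicate, and object. Next, the collected RDF tuples are quantified using the ant colony optimization algorithm, and a pheromone value is computed for the user's sentiment using the proposed formula. The sentiment scores obtained from SentiWordNet are then applied to compute an overall sentiment index over the multiple pheromone values computed in the previous step. To validate the method, a comparison with the user's actual life shows that the user's sentiment trend, computed from the overall sentiment index, is analyzed appropriately.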

A Study on the Application of WebInterphone Under the .NET Environment (.NET하에서의 웹인터폰 응용에 대한 연구)

  • Lee, Jung-Hoon;Kang, Sung-Chun;Lee, Yun-Ho;Noh, Yong-Deok
    • The KIPS Transactions:PartD / v.14D no.2 / pp.235-240 / 2007
  • WebInterphone is a coined compound of "Web" and "Interphone". The WebInterphone system has been developed as a new hardware/software solution under the .NET environment to remove a drawback of the interphone, which can be used only when the host and the visitor are in the same place. In the WebInterphone system, a WebInterphone is connected to a home PC that is in turn connected to the Internet, so that a host and a visitor can communicate in real time even when nobody is at home. This paper discusses the structure of the WebInterphone system and its operation.

Semantic Representation of Moving Object in Video Data Using Motion Ontology (Motion Ontology를 이용한 비디오내 객체 움직임의 의미표현)

  • Shin, Ju-Hyun;Kim, Pan-Koo
    • Journal of Korea Multimedia Society / v.10 no.1 / pp.117-127 / 2007
  • As the value of multimedia data rises, research on semantic recognition and retrieval of multimedia information is in strong demand. In this paper, we build a motion ontology and use it to represent the meaning of moving objects in video data. Referring to the WordNet structure, we extend its semantics by reclassifying the motion verbs used to represent the meaning of moving objects. The represented information is recorded in OWL/RDF(S). Because ontologies are used, 'Is-A' and 'Equivalent' reasoning over the data can be expected, and semantic representation of moving objects becomes possible through ontology-based video annotation. We tested the accuracy of the system against a keyword-based system and obtained an improvement of approximately 10% in system performance.


E-commerce data based Sentiment Analysis Model Implementation using Natural Language Processing Model (자연어처리 모델을 이용한 이커머스 데이터 기반 감성 분석 모델 구축)

  • Choi, Jun-Young;Lim, Heui-Seok
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.33-39 / 2020
  • In the field of Natural Language Processing, research on tasks such as translation, POS tagging, Q&A, and sentiment analysis is being carried out worldwide. Sentiment analysis with pretrained sentence embedding models shows high classification performance on English single-domain datasets. In this study, classification performance is compared on a Korean e-commerce online dataset with various domain attributes, and six neural network models are built: BOW (Bag of Words), LSTM[1], Attention, CNN[2], ELMo[3], and BERT (KoBERT)[4]. The results confirm that pretrained sentence embedding models outperform word embedding models. In addition, a practical neural network model composition is proposed after comparing classification performance on a dataset with 17 categories. Finally, compressing the sentence embedding model is mentioned as future work, considering inference time against model capacity in a real-time service.

Construction of Korean Wordnet "KorLex 1.5" (한국어 어휘의미망 "KorLex 1.5"의 구축)

  • Yoon, Ae-Sun;Hwang, Soon-Hee;Lee, Eun-Ryoung;Kwon, Hyuk-Chul
    • Journal of KIISE:Software and Applications / v.36 no.1 / pp.92-108 / 2009
  • The Princeton WordNet (PWN), developed over the past 20 years since the mid-1980s, aimed to represent the mental lexicon inside the human mind. Its potential, applicability, and portability have been appreciated more in the fields of NLP and KE than in cognitive psychology. Semantic and knowledge processing is indispensable for obtaining useful information from human language in CMC and HCI environments. The PWN can provide such NLP-based systems with 'concrete' semantic units and their network. With reference to the PWN, about 50 wordnets for different languages have been developed during the last 10 years, enabling a variety of multilingual processing applications. This paper describes the PWN-referenced Korean wordnet KorLex 1.5, which was developed from 2004 to 2007 and currently contains about 130,000 synsets and 150,000 word senses for nouns, verbs, adjectives, adverbs, and classifiers.