• Title/Summary/Keyword: vocabulary extraction (어휘추출)

Search Result 438, Processing Time 0.021 seconds

Semi-automatic Ontology Modeling for VOD Annotation for IPTV (IPTV의 VOD 어노테이션을 위한 반자동 온톨로지 모델링)

  • Choi, Jung-Hwa;Heo, Gil;Park, Young-Tack
    • Journal of KIISE:Software and Applications / v.37 no.7 / pp.548-557 / 2010
  • In this paper, we propose a semi-automatic ontology modeling approach for annotating VOD content to enable intelligent search on IPTV. The ontology is built by combining partial trees extracted from WordNet, each containing the hypernyms, hyponyms, and synonyms of keywords related to a service domain. We then add to the partial tree new keywords that WordNet does not define, such as foreign words and words written in Chinese characters. The ontology consists of two parts: a generic hierarchy and a specific hierarchy. The former is the semantic model of vocabulary such as keywords and the contents of keywords, which are defined as classes with property restrictions in the ontology. The latter is generated by a reasoning technique that infers the contents of keywords from the generic hierarchy. Annotation generates VOD metadata (i.e., contents and genre) based on the specific hierarchy. The generic hierarchy can be applied to other domains, and the specific hierarchy helps fit the ontology to the service domain. The approach is therefore suited to generating metadata independently of any specific domain. In an evaluation on 2,400 VOD annotation test items, the proposed method achieved around 82% precision.
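
The partial-tree construction described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `HYPERNYM_OF` table is a hypothetical stand-in for WordNet lookups, and the keyword `k-drama` with its manually assigned parent is an assumed example of a word missing from the lexicon.

```python
from collections import defaultdict

# Hypothetical hypernym links standing in for WordNet lookups (child -> parent).
HYPERNYM_OF = {
    "thriller": "film",
    "documentary": "film",
    "film": "show",
    "sitcom": "show",
    "show": "entertainment",
}


def hypernym_chain(keyword, hypernym_of):
    """Return the path from a keyword up to its topmost hypernym."""
    chain = [keyword]
    while chain[-1] in hypernym_of:
        chain.append(hypernym_of[chain[-1]])
    return chain


def build_partial_tree(keywords, hypernym_of, extra_links=None):
    """Merge the hypernym chains of all keywords into one parent -> children map.

    extra_links maps keywords missing from the lexicon to a manually chosen parent.
    """
    links = dict(hypernym_of)
    links.update(extra_links or {})
    tree = defaultdict(set)
    for keyword in keywords:
        chain = hypernym_chain(keyword, links)
        for child, parent in zip(chain, chain[1:]):
            tree[parent].add(child)
    return {parent: sorted(children) for parent, children in tree.items()}


tree = build_partial_tree(
    ["thriller", "sitcom", "k-drama"],
    HYPERNYM_OF,
    extra_links={"k-drama": "show"},  # assumed placement of a word not in the lexicon
)
```

Only the branches that lead to the requested keywords end up in `tree`, which is why `documentary` is absent even though the lexicon knows it.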

Automatic Generation of Voice Web Pages Based on SALT (SALT 기반 음성 웹 페이지의 자동 생성)

  • Ko, You-Jung;Kim, Yoon-Joong
    • Journal of KIISE:Software and Applications / v.37 no.3 / pp.177-184 / 2010
  • With the introduction of voice browsers, voice dialog applications have become available on the Web. A voice dialog application consists of voice Web pages, which require dialog scripts to be translated into SALT (Speech Application Language Tags). Current Web pages are designed for visual interaction, but they are potentially capable of supporting voice dialog. This paper therefore proposes an automated voice Web generation method that finds the elements usable for voice dialog in HTML-based Web pages and converts them into SALT. The automatic generation system consists of a lexical analyzer and a syntactic analyzer that convert a Web page written in HTML into a voice Web page written in HTML+SALT. The converted voice Web page handles not only the existing mouse and keyboard input but also voice dialog.
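
As a rough illustration of the HTML-to-SALT conversion idea, the sketch below scans a page for text inputs and appends simplified SALT elements for each one. It is an assumption-laden sketch, not the paper's analyzer: the prompt wording, the `ask_`/`rec_` id scheme, the `.grxml` grammar file names, and the XPath in `value` are all hypothetical, and the emitted markup is a simplified subset of SALT.

```python
from html.parser import HTMLParser


class InputFinder(HTMLParser):
    """Lexical pass: collect the names of text inputs found in an HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An <input> without a type attribute defaults to a text field.
        if tag == "input" and attributes.get("type", "text") == "text":
            name = attributes.get("name")
            if name:
                self.fields.append(name)


def to_salt(field):
    """Emit simplified SALT markup that prompts for one field and binds the reply."""
    return (
        f'<salt:prompt id="ask_{field}">Please say the {field}.</salt:prompt>\n'
        f'<salt:listen id="rec_{field}">\n'
        f'  <salt:grammar src="{field}.grxml"/>\n'  # hypothetical grammar file
        f'  <salt:bind targetelement="{field}" value="//{field}"/>\n'
        f"</salt:listen>"
    )


def add_voice_tags(html):
    """Return the page followed by SALT elements for every text input in it."""
    finder = InputFinder()
    finder.feed(html)
    return html + "\n" + "\n".join(to_salt(name) for name in finder.fields)


page = '<form><input type="text" name="city"><input type="submit" value="Go"></form>'
voice_page = add_voice_tags(page)
```

The original HTML is kept intact, so the page still accepts mouse and keyboard input while the appended elements describe the voice dialog.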

A Study on the Spoken Korean Citynames Using Multi-Layered Perceptron of Back-Propagation Algorithm (오차 역전파 알고리즘을 갖는 MLP를 이용한 한국 지명 인식에 대한 연구)

  • Song, Do-Sun;Lee, Jae-Gheon;Kim, Seok-Dong;Lee, Haing-Sei
    • The Journal of the Acoustical Society of Korea / v.13 no.6 / pp.5-14 / 1994
  • This paper reports an experiment on speaker-independent automatic recognition of spoken Korean words using a multi-layer perceptron (MLP) trained with the error back-propagation algorithm. The target words are 50 city names from the D.D.D. local numbers; 43 of them have 2 syllables and the remaining 7 have 3 syllables. The words were not segmented into syllables or phonemes. Instead, feature components extracted from each word at equal intervals were applied to the neural network, which made the result independent of speech duration. The PARCOR coefficients calculated from the frames by linear predictive analysis were used as the feature components. We sought the optimum conditions through 4 different experiments: a comparison between total and pre-classified training, the dependency of the recognition rate on the number of frames and the PARCOR order, the change in recognition due to the number of neurons in the hidden layer, and a comparison of methods for composing the output pattern of the output neurons. As a result, a recognition rate of 89.6% was obtained.

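
The idea of taking features at equal intervals, so that the network input has the same size regardless of how long a word was spoken, can be sketched as below. This is only an illustration under assumptions: each frame is a short list of made-up numbers standing in for PARCOR coefficients, and the function names are hypothetical.

```python
def sample_frames(frames, count):
    """Pick `count` frames at evenly spaced positions, whatever the utterance length."""
    if count < 1 or not frames:
        raise ValueError("need at least one frame and a positive count")
    if count == 1:
        return [frames[len(frames) // 2]]
    last = len(frames) - 1
    return [frames[round(i * last / (count - 1))] for i in range(count)]


def to_input_vector(frames, count):
    """Concatenate the sampled frames' coefficients into one fixed-length vector."""
    return [value for frame in sample_frames(frames, count) for value in frame]


# Two made-up utterances of different duration, two coefficients per frame.
short_word = [[0.1 * t, -0.1 * t] for t in range(10)]
long_word = [[0.1 * t, -0.1 * t] for t in range(25)]

short_vector = to_input_vector(short_word, 4)
long_vector = to_input_vector(long_word, 4)
```

Both vectors have 4 frames times 2 coefficients, so the same input layer can serve 2-syllable and 3-syllable words without segmentation.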

Evaluation of the Discordance between Sentence Polarities and Keyword Polarities by Using MUSE Sentiment-Annotated Corpora (MUSE 감성주석코퍼스를 활용한 문장 극성과 키워드 극성간의 불일치 현상에 대한 분석)

  • Cho, Donghee;Shin, Donghyok;Joo, Heejin;Chae, Byoungyeol;Cao, Wenkai;Nam, Jeesun
    • Proceedings of the Korean Language Information Society Conference (한국어정보학회:학술대회논문집) / 2016.10a / pp.195-200 / 2016
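  • This study uses the MUSE sentiment corpus to analyze how often sentence polarity and keyword polarity agree or disagree, and argues in particular for the need to study the cases in which the two disagree. From the MUSE sentiment-annotated corpus built by DICORA, we extracted 1,257 positive and 1,935 negative sentences from the IT review domain, and 2,418 positive and 432 negative sentences from the restaurant review domain. We built an LGG with UNITEX and applied it to these corpora. Keywords of the opposite polarity appeared in positive or negative sentences in about 4~16% of cases across the two domains, while polarity expressed at the phrase or sentence level rather than by a single keyword accounted for a relatively high 25~40% of cases. These results show that a reliable automatic opinion classification system becomes feasible only when, instead of relying on keyword polarity, meaningful research is carried out on the cases where sentence and keyword polarity disagree, such as those in which a polarity-shifting device (PSD) reverses the polarity of the whole sentence, or those in which the sentence contains no polarity word but expresses polarity at the phrase or sentence level.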

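
The kind of comparison this abstract describes, sentence polarity against keyword polarity, might be tallied along the following lines. The sketch is illustrative only: the tiny `LEXICON` is a hypothetical stand-in for the LGGs built with UNITEX, and the four sentences are made up rather than taken from the MUSE corpus.

```python
from collections import Counter

# Hypothetical polarity lexicon; it stands in for the LGGs built with UNITEX.
LEXICON = {"good": "pos", "fast": "pos", "bad": "neg", "slow": "neg"}


def keyword_polarities(sentence, lexicon):
    """Return the set of polarities carried by lexicon keywords in the sentence."""
    tokens = (token.strip(".,!?").lower() for token in sentence.split())
    return {lexicon[token] for token in tokens if token in lexicon}


def classify(sentence, label, lexicon):
    """Compare the annotated sentence polarity with the keyword polarities."""
    found = keyword_polarities(sentence, lexicon)
    if not found:
        return "no_keyword"  # polarity is expressed at phrase or sentence level
    opposite = "neg" if label == "pos" else "pos"
    return "opposite_keyword" if opposite in found else "match"


# Made-up annotated sentences (text, sentence polarity), for illustration only.
SAMPLE = [
    ("The screen is good.", "pos"),
    ("Not good at all.", "neg"),
    ("I want my money back.", "neg"),
    ("The battery is bad.", "neg"),
]

counts = Counter(classify(text, label, LEXICON) for text, label in SAMPLE)
rates = {kind: n / len(SAMPLE) for kind, n in counts.items()}
```

The `opposite_keyword` and `no_keyword` buckets correspond to the two discordance types the study reports; a keyword-only classifier would mislabel both.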

A Design and Implementation of Music & Image Retrieval Recommendation System based on Emotion (감성기반 음악.이미지 검색 추천 시스템 설계 및 구현)

  • Kim, Tae-Yeun;Song, Byoung-Ho;Bae, Sang-Hyun
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.1 / pp.73-79 / 2010
  • Emotionally intelligent computing can process human emotion through learning and adaptation, which makes interaction between humans and computers more efficient. Music and images, perceived through hearing and sight, are experienced in a short time but have a lasting effect, so understanding and interpreting human emotion is important for successful marketing. In this paper, we design a retrieval system that matches music and images to a user's emotion keyword (irritability, gloom, calmness, joy). The proposed system defines four emotional states. It uses music, image, and emotion ontologies to retrieve normalized music and images, and it obtains the desired result by sampling image feature information and measuring similarity. In addition, image emotion recognition information is classified by mapping it into a single space through paired correspondence analysis and factor analysis. In experiments, the proposed system showed an 82.4% matching rate for the four emotional states.

On the Implementation of a Facial Animation Using the Emotional Expression Techniques (FAES : 감성 표현 기법을 이용한 얼굴 애니메이션 구현)

  • Kim Sang-Kil;Min Yong-Sik
    • The Journal of the Korea Contents Association / v.5 no.2 / pp.147-155 / 2005
  • In this paper, we present FAES (Facial Animation with Emotion and Speech), a system for speech-driven face animation with emotions. We animate face cartoons not only from input speech but also from emotions derived from the speech signal. The system also ensures smooth transitions and exact representation in the animation. To do this, after collecting the training data, we built a database using an SVM (Support Vector Machine) to recognize four categories of emotion: neutral, dislike, fear, and surprise. This allows the system to produce speech-driven animation with emotions. We trained on young Korean speakers and focused only on Korean emotional facial expressions. Experimental results show that the emotional areas covered are expanded and that the accuracies of emotion recognition and continuous speech recognition increase by 7% and 5%, respectively, compared with the previous method.


Sentiment Analysis and Opinion Mining: literature analysis during 2007-2016 (감정분석과 오피니언 마이닝: 2007-2016)

  • Li, Jiapei;Li, Xiaomeng;Xiam, Xiam;Kang, Sun-kyung;Lee, Hyun Chang;Shin, Seong-yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.05a / pp.160-161 / 2017
  • Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. Opinion mining and sentiment analysis (OMSA) has emerged as a research discipline during the last 15 years and provides a methodology for computationally processing unstructured data, mainly to extract opinions and identify their sentiments. This relatively new but fast-growing discipline has changed a lot during these years. This paper presents a scientometric analysis of research work done on OMSA during 2007-2016. For the literature analysis, research publications indexed in the Web of Science (WoS) database are used as input data. The publication data is analyzed computationally to identify the year-wise publication pattern, the rate of growth of publications, and research areas. A more detailed manual analysis of the data is also performed to identify the popular approaches (machine learning and lexicon-based) used in these publications, the levels (document, sentence, or aspect level) of the sentiment analysis work done, and the major application areas of OMSA.

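
The year-wise publication pattern and growth rate mentioned in this abstract are simple to compute once publication years are available. The sketch below assumes a plain list of years and uses made-up data, not the WoS records analyzed in the paper; it also assumes every year in the range has at least one publication.

```python
from collections import Counter


def yearly_counts(years):
    """Count publications per year, ordered by year."""
    return dict(sorted(Counter(years).items()))


def growth_rates(counts):
    """Year-over-year growth: (this year - previous year) / previous year."""
    ordered = sorted(counts)
    return {
        year: (counts[year] - counts[previous]) / counts[previous]
        for previous, year in zip(ordered, ordered[1:])
    }


# Made-up publication years, for illustration only.
publication_years = [2007, 2007, 2008, 2008, 2008, 2009, 2009, 2009, 2009, 2009, 2009]

counts = yearly_counts(publication_years)
rates = growth_rates(counts)
```

Here the counts rise from 2 to 3 to 6, so the growth rate is 0.5 for 2008 and 1.0 for 2009.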

Evaluation of the Discordance between Sentence Polarities and Keyword Polarities by Using MUSE Sentiment-Annotated Corpora (MUSE 감성주석코퍼스를 활용한 문장 극성과 키워드 극성간의 불일치 현상에 대한 분석)

  • Cho, Donghee;Shin, Donghyok;Joo, Heejin;Chae, Byoungyeol;Cao, Wenkai;Nam, Jeesun
    • Annual Conference on Human and Language Technology / 2016.10a / pp.195-200 / 2016
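  • This study uses the MUSE sentiment corpus to analyze how often sentence polarity and keyword polarity agree or disagree, and argues in particular for the need to study the cases in which the two disagree. From the MUSE sentiment-annotated corpus built by DICORA, we extracted 1,257 positive and 1,935 negative sentences from the IT review domain, and 2,418 positive and 432 negative sentences from the restaurant review domain. We built an LGG with UNITEX and applied it to these corpora. Keywords of the opposite polarity appeared in positive or negative sentences in about 4~16% of cases across the two domains, while polarity expressed at the phrase or sentence level rather than by a single keyword accounted for a relatively high 25~40% of cases. These results show that a reliable automatic opinion classification system becomes feasible only when, instead of relying on keyword polarity, meaningful research is carried out on the cases where sentence and keyword polarity disagree, such as those in which a polarity-shifting device (PSD) reverses the polarity of the whole sentence, or those in which the sentence contains no polarity word but expresses polarity at the phrase or sentence level.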


An Analysis of Earth Science Vocabularies Used in the 10th Grade Science Textbooks (10학년 과학 교과서 지구과학 용어 분석)

  • Choi, Haeng-Im;Lee, Hyon-Yong;Cho, Hyun-Jun
    • Journal of the Korean earth science society / v.29 no.4 / pp.363-371 / 2008
  • The purposes of this study were to analyze the level of the Earth science vocabulary in 10th grade textbooks with the Science Word Analysis (SWA) program and to investigate which words 10th grade students select as difficult. For this purpose, we extracted the Earth science vocabulary from eleven textbooks and classified it into scientific and non-scientific words with the SWA program, based on the standard Korean language dictionary. In addition, we investigated the difficulty of each word by surveying five hundred sixty students with a questionnaire. The results showed that, in all textbooks, scientific words beyond the designated level were more frequent than words at any other level. Most of the words that students selected as difficult to understand were classified as beyond the designated level. These results suggest that students' cognitive level should be considered when developing science textbooks and that difficult words should be replaced with easier ones without changing the meaning.

Automatic Extractive Summarization of Newspaper Articles using Activation Degree of 5W1H (육하원칙 활성화도를 이용한 신문기사 자동추출요약)

  • 윤재민;정유진;이종혁
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.505-515 / 2004
  • In a newspaper, 5W1H information is the most fundamental and important element for writing and understanding articles. Focusing on this relation between a newspaper article and the 5W1H, we propose a summarization method based on the activation degree of 5W1H. To overcome the problems of the lead-based and title-based methods, both of which are known to be the most effective in newspaper summarization, sufficient 5W1H information is extracted from both the title and the lead sentence. Moreover, the weight of each sentence is computed by considering various factors, such as the activation degree of 5W1H, the number of 5W1H categories, and the sentence's length and position. These factors contribute greatly to the selection of more important sentences, and thus to the readability of the summarized texts. In an experimental evaluation, the proposed method achieved a precision of 74.7%, outperforming the lead-based method. In sum, our 5W1H approach was shown to be promising for automatic summarization of newspaper articles.
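
One way to picture the sentence weighting described above is a weighted sum of the listed factors. The abstract does not give the actual formula, so everything specific here is an assumption made for illustration: the linear combination, the weights `w_act`, `w_cat`, `w_pos`, the `min_words` length cutoff, and the precomputed `activation` and `categories` values.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    position: int      # 0-based index in the article
    activation: float  # assumed 5W1H activation degree in [0, 1]
    categories: int    # number of distinct 5W1H categories present (0 to 6)


def weight(sentence, total, w_act=0.5, w_cat=0.3, w_pos=0.2, min_words=5):
    """Hypothetical score combining activation, category count, length, and position."""
    if len(sentence.text.split()) < min_words:
        return 0.0  # very short sentences are treated as uninformative
    position_score = 1 - sentence.position / total  # earlier sentences score higher
    return (
        w_act * sentence.activation
        + w_cat * sentence.categories / 6
        + w_pos * position_score
    )


def summarize(sentences, k):
    """Keep the k highest-weighted sentences, restored to article order."""
    total = len(sentences)
    top = sorted(sentences, key=lambda s: weight(s, total), reverse=True)[:k]
    return [s.text for s in sorted(top, key=lambda s: s.position)]


article = [
    Sentence("The mayor opened the new bridge on Monday", 0, 0.9, 4),
    Sentence("It was sunny", 1, 0.2, 1),
    Sentence("Engineers finished the bridge after two years of work", 2, 0.6, 3),
]
summary = summarize(article, 2)
```

Restoring the selected sentences to their original order is a design choice that keeps the extract readable, in line with the readability goal the abstract mentions.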