• Title/Summary/Keyword: 어휘정보 (lexical information)

Search Result 1,062, Processing Time 0.024 seconds

A Document Sentiment Classification System Based on the Feature Weighting Method Improved by Measuring Sentence Sentiment Intensity (문장 감정 강도를 반영한 개선된 자질 가중치 기법 기반의 문서 감정 분류 시스템)

  • Hwang, Jae-Won;Ko, Young-Joong
    • Journal of KIISE:Software and Applications / v.36 no.6 / pp.491-497 / 2009
  • This paper proposes a new feature weighting method for document sentiment classification. The proposed method accounts for differences in sentiment intensity among the sentences of a document. Sentiment features consist of sentiment vocabulary words, and their sentiment intensity scores are estimated with the chi-square statistic. The sentiment intensity of each sentence is then measured from the chi-square values of the sentiment features it contains. Finally, the sentence intensity values are applied to the TF-IDF weights of all features in the document. We evaluate the proposed method using a support vector machine. Experimental results show that it performs about 2.0% better than a baseline that does not consider sentence sentiment intensity.
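
The weighting idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the combination rule below (each term occurrence counts as 1 plus the normalized intensity of its sentence) is an assumption made for illustration, as are the function names `chi_square`, `sentence_intensity`, and `weighted_tfidf`.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: positive docs containing the term, b: negative docs containing it,
    c: positive docs without it,          d: negative docs without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def sentence_intensity(tokens, chi):
    """Sum the chi-square scores of the sentiment words in one sentence."""
    return sum(chi.get(t, 0.0) for t in tokens)


def weighted_tfidf(sentences, chi, idf):
    """TF-IDF in which each occurrence is boosted by its sentence's intensity.

    ASSUMPTION: boost = 1 + intensity / max intensity in the document.
    The paper's actual combination rule is not stated in the abstract.
    """
    intensities = [sentence_intensity(s, chi) for s in sentences]
    top = max(intensities, default=0.0) or 1.0
    tf = Counter()
    for tokens, intensity in zip(sentences, intensities):
        boost = 1.0 + intensity / top
        for t in tokens:
            tf[t] += boost
    # Terms missing from the idf table default to an idf of 1.0.
    return {t: weight * idf.get(t, 1.0) for t, weight in tf.items()}
```

With this rule, a word that appears in a strongly opinionated sentence receives a larger weight than the same word in a neutral sentence, which matches the intuition described in the abstract.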

An Automatic Korean Word Spacing System for Devices with Low Computing Power (저사양 기기를 위한 한국어 자동 띄어쓰기 시스템)

  • Song, Yeong-Kil;Kim, Hark-Soo
    • The KIPS Transactions:PartB / v.16B no.4 / pp.333-340 / 2009
  • Most previous automatic word spacing systems are unsuitable for mobile devices with relatively low computing power because they require many system resources. We propose an automatic word spacing system for such devices that needs only modest memory and simple numerical computations. The proposed system is a two-step model that consists of a statistical system and a rule-based system. To reduce memory usage, the statistical system first corrects word spacing errors using a modified hidden Markov model based on character unigrams. Then, to increase accuracy, the rule-based system re-corrects miscorrected word spaces using lexical rules based on character bigrams or longer sequences. In the experiments, the proposed system achieved a relatively high accuracy of 94.14% with a small memory footprint of about 1MB.
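
The two-step shape of this pipeline (a cheap statistical pass followed by rule-based re-correction) can be sketched as follows. This is not the paper's model: a simple per-character probability threshold stands in for the modified hidden Markov model, and the function name `space_text`, the probability table, and the rule table are all hypothetical.

```python
def space_text(text, unigram_space_prob, bigram_rules, threshold=0.5):
    """Insert word spaces using a statistical pass and a rule-based pass.

    unigram_space_prob: char -> estimated probability that a space follows it
                        (a stand-in for the paper's modified HMM).
    bigram_rules:       two-char string -> True/False, forcing or forbidding
                        a space between the two characters.
    """
    chars = [ch for ch in text if ch != " "]  # discard existing spacing
    out = []
    for i, ch in enumerate(chars):
        out.append(ch)
        if i == len(chars) - 1:
            break
        # Step 1: statistical decision from the character unigram alone.
        insert = unigram_space_prob.get(ch, 0.0) >= threshold
        # Step 2: a lexical bigram rule, if present, overrides step 1.
        rule = bigram_rules.get(ch + chars[i + 1])
        if rule is not None:
            insert = rule
        if insert:
            out.append(" ")
    return "".join(out)
```

The unigram table keeps memory small, while the rule table only needs entries for the contexts in which the first pass is known to go wrong.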

Effective Foreign Language Learning with Situated Cognition in the MOO based Environments (상황인지(Situated Cognition)원리를 적용한 효과적인 외국어 학습 방안 연구: MOO 학습환경을 중심으로)

  • Lee, Seung-Hee;Seo, Yun-Kyoung
    • Journal of The Korean Association of Information Education / v.6 no.1 / pp.64-74 / 2002
  • The purpose of this paper is to review the importance of situated cognition and the features of MOO (Multi-user Object Oriented) environments for effective foreign language learning. Learning a foreign language goes beyond simply recalling the vocabulary or expressions of the target language. Just as children naturally acquire their mother tongue through active social interaction with the people around them, foreign languages should be taught in circumstances and contexts that allow authentic use of the language. The MOO, a text-based virtual reality built on spatial metaphors, has been gaining considerable attention in education thanks to its strong support for social context and learner interaction. This paper examines the features of MOO as a foreign language learning environment in terms of activity, context, and interaction.

An Experimental Study on Semantic Searches for Image Data Using Structured Social Metadata (구조화된 소셜 메타데이터를 활용한 이미지 자료의 시맨틱 검색에 관한 실험적 연구)

  • Kim, Hyun-Hee;Kim, Yong-Ho
    • Journal of the Korean Society for Library and Information Science / v.44 no.1 / pp.117-135 / 2010
  • We designed a structured folksonomy system in which queries can be expanded through tag control, binding equivalent, synonymous, or related tags together, in order to improve the retrieval efficiency (recall and precision) of image data. We then evaluated the proposed system by comparing it to a tag-based system without tag control in terms of recall, precision, and user satisfaction. We also investigated which query expansion method is the most efficient in terms of retrieval performance. The experimental results showed that the recall, precision, and user satisfaction rates of the proposed system are significantly higher than those of the tag-based system. Among the query expansion methods, precision rates differ significantly, whereas recall rates do not. The proposed system can serve as a guide on how to effectively index and retrieve the digital content of digital library systems in the Library 2.0 era.
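
Query expansion through tag control, and the recall and precision measures used to evaluate it, can be sketched in a few lines. The data structures here (a list of tag groups and a dictionary from image name to tag set) and the function names are assumptions for illustration, not the system described in the paper.

```python
def expand_query(tag, tag_groups):
    """Return the query tag plus every tag bound to it in a tag group."""
    expanded = {tag}
    for group in tag_groups:
        if tag in group:
            expanded |= group
    return expanded


def search(query_tags, index):
    """Return the images that carry at least one of the query tags."""
    return {image for image, tags in index.items() if tags & query_tags}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Expanding a query with bound tags can only add results, so recall never drops; precision depends on how closely the bound tags match the user's intent, which is consistent with precision being the measure on which the expansion methods differ.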

A Study on Fun Elements of Web 2.0 Blog Widget (Web 2.0 블로그 위젯의 재미 요소에 대한 연구)

  • Choi, Sung-Kyu;Kim, Kee-Sung;Jang, Seok-Hyun;Whang, Min-Cheol
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.785-790 / 2009
  • Widgets are a means of expressing a user's character and highlighting the value of a blog. The word "widget" is a compound of "Windows" and "gadget"; widgets are small functional programs displayed on screen as graphical user interface (GUI) tools that provide the services a user wants to see. On operating systems, the Web, and mobile platforms, widgets deliver information, convenience, and efficiency. However, widgets have not satisfied users because they have focused on transmitting information and representing status rather than on fun. This study identifies the fun elements that interest users and categorizes them by widget type. Because the fun elements of widgets have not been defined, we use fun elements from the design and product fields together with emotional words representative of affect. We then administered an online questionnaire to blog users. The widgets were selected by popularity from among domestic and Japanese widgets. The questionnaire used a 5-point scale based on user preferences to identify the elements that are fun.

A Study on the Speech Recognition for DDD Area - Name Using Vector Quantization with Time Information (시간 정보와 VQ를 이용한 DDD 지역명 인식에 관한 연구)

  • LEE S. K.;LEE K. S.;ANN T. O.;CHO H. J.;BYON Y. C.;KIM S. H.
    • The Journal of the Acoustical Society of Korea / v.8 no.5 / pp.102-112 / 1989
  • In this paper, we study speaker-independent isolated word recognition of DDD area names using vector quantization, and we chose a total of 146 DDD area names as the recognition vocabulary for a dialing system application. We built the codebooks from 12th-order LPC cepstrum coefficients, used the minsum and minimax methods to find the centroids, and applied a 3-splitting rule to codebook generation. Codebooks were generated from a single section and from multiple sections with time information, and an overlapped-section codebook was also used. The experimental results showed that the minsum method was better than the minimax method, and the system achieved an accuracy of about 90 percent in the speaker-independent case.
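
The difference between the two centroid criteria can be illustrated with a small sketch. The abstract does not define minsum and minimax, so the reading below is an assumption: the centroid is the cluster member that minimizes the sum of distances (minsum) or the largest distance (minimax) to the other members. The toy vectors are one-dimensional, whereas the paper uses 12th-order LPC cepstrum vectors.

```python
import math


def minsum_centroid(cluster):
    """Member with the smallest total distance to all cluster members."""
    return min(cluster, key=lambda c: sum(math.dist(c, x) for x in cluster))


def minimax_centroid(cluster):
    """Member whose farthest cluster member is as close as possible."""
    return min(cluster, key=lambda c: max(math.dist(c, x) for x in cluster))


def quantize(vector, codebook):
    """Index of the codeword nearest to the vector (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(vector, codebook[i]))


# Toy cluster with one outlier at 10.0.
cluster = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
codebook = [minsum_centroid(cluster), minimax_centroid(cluster)]
```

In this toy cluster the outlier pulls the minimax centroid toward itself, while the minsum centroid stays near the bulk of the points. That sensitivity to outliers is one plausible reason for minsum to perform better, although the abstract does not state the cause.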

A Word Embedding used Word Sense and Feature Mirror Model (단어 의미와 자질 거울 모델을 이용한 단어 임베딩)

  • Lee, JuSang;Shin, JoonChoul;Ock, CheolYoung
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.226-231 / 2017
  • Word representation, an important area of natural language processing (NLP) using machine learning, is a method of representing a word not as text but as a distinguishable symbol. Existing word embeddings employ large corpora so that words that appear near each other in text are positioned close together. However, corpus-based word embedding needs several corpora because of word occurrence frequency and the growing number of words. In this paper, word embedding is performed using dictionary definitions and semantic relationship information (hypernyms and antonyms). Words are trained with the feature mirror model (FMM), a modified Skip-Gram (Word2Vec). As a result, words with similar senses have similar vectors, and the vectors of antonyms can be distinguished.

Evaluation of the Discordance between Sentence Polarities and Keyword Polarities by Using MUSE Sentiment-Annotated Corpora (MUSE 감성주석코퍼스를 활용한 문장 극성과 키워드 극성간의 불일치 현상에 대한 분석)

  • Cho, Donghee;Shin, Donghyok;Joo, Heejin;Chae, Byoungyeol;Cao, Wenkai;Nam, Jeesun
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.195-200 / 2016
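  • This study uses the MUSE sentiment corpus to analyze how often the polarity of a sentence agrees or disagrees with the polarity of its keywords, and argues in particular for the need to study the types of cases in which sentence polarity and keyword polarity do not match. For this study, from the MUSE sentiment-annotated corpus built by DICORA, we extracted 1,257 positive and 1,935 negative sentences from the IT review domain, and 2,418 positive and 432 negative sentences from the restaurant review domain. After building an LGG with UNITEX and applying it to this corpus, we found that keywords of the opposite polarity occurred in positive or negative sentences at a rate of about 4 to 16% across the two domains, and that polarity expressed at the phrase or sentence level rather than by a single keyword occurred at a relatively high rate of about 25 to 40% across the two domains. These results show that a reliable automatic opinion classification system can be realized only when, rather than relying on keyword polarity, meaningful research is carried out on cases where sentence and keyword polarity disagree, such as cases in which a polarity shifting device (PSD) reverses the polarity of the whole sentence, or cases in which the sentence contains no polarity vocabulary but polarity is expressed at the phrase or sentence level.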

An Automatic Post-processing Method for Speech Recognition using CRFs and TBL (CRFs와 TBL을 이용한 자동화된 음성인식 후처리 방법)

  • Seon, Choong-Nyoung;Jeong, Hyoung-Il;Seo, Jung-Yun
    • Journal of KIISE:Software and Applications / v.37 no.9 / pp.706-711 / 2010
  • In applications of a human speech interface, reducing the recognition error rate is one of the main research issues. Many previous studies attempted to correct errors using post-processing, which is dependent on a manually constructed corpus and correction patterns. We propose an automatically learnable post-processing method that is independent of the characteristics of both the domain and the speech recognizer. We divide the entire post-processing task into two steps: error detection and error correction. We treat the error detection step as a classification problem, to which we apply a conditional random fields (CRFs) classifier. Furthermore, we apply transformation-based learning (TBL) to the error correction step. Our experimental results indicate that the proposed method corrects a speech recognizer's insertion, deletion, and substitution errors by 25.85%, 3.57%, and 7.42%, respectively.

A Study on the Improvement of Digital Library System for School Library (학교도서관업무지원시스템(DLS) 개선방안에 관한 연구)

  • Byun, Woo-Yeoul;Lee, Mihwa
    • Journal of the Korean Society for Information Management / v.34 no.1 / pp.31-50 / 2017
  • This study identifies the problems of the Digital Library System (DLS), which has handled library management and supported data building for resource sharing in school libraries since 2001, and suggests an improvement plan. As the research method, nine DLS committee members were interviewed about the current state of DLS use and the problems of the system in six areas: acquisition, cataloging, circulation and discharge, inventory, library statistics, and search interface. Based on the interviews, the following improvements were suggested. In acquisition, an acquisition system and online purchasing for users need to be developed. In cataloging, better data quality management and control of indexes and vocabularies are needed to upgrade the search function. Improvements are also needed in circulation speed, in the restoration of discarded data in inventory, and in the accuracy of statistical data in library statistics. This study should contribute to the improvement of DLS and to its increased use.