• Title/Summary/Keyword: Opinion Retrieval

A Study on the Characteristics of Opinion Retrieval Using Term Statistical Analysis in Opinion Documents (의견 문서의 단어 통계 분석을 통한 의견 검색 특성에 관한 연구)

  • Han, Kyoung-Soo
    • Journal of the Korea Society of Computer and Information / v.15 no.11 / pp.21-29 / 2010
  • Opinion retrieval, which searches for the opinions that users express in documents, does not yet significantly outperform traditional topical retrieval, which searches for facts. This paper therefore aims to identify statistical characteristics that can be applied to opinion retrieval by comparing and analyzing the term statistics of opinion and non-opinion documents in the blog domain. The experiments use the TREC Blogs06 collection and 150 TREC topics. The difference between term probability distributions in opinion documents is measured by Jensen-Shannon (JS) divergence, and how that difference varies with topic type and topic domain is also investigated. The term probabilities of opinion terms are also compared. The main findings are as follows: opinion detection needs to take topic-specific characteristics into account; it is effective to extract positive and negative opinion terms per topic; topic types are complementary to topic domains; and special attention must be paid to the usage of positive opinion terms.
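
For readers unfamiliar with the measure named in this abstract, the following is a minimal sketch of JS divergence between two term probability distributions. The function names, the whitespace tokenization, and the toy sentences are assumptions made for illustration; they are not taken from the paper, which works on the TREC Blogs06 collection.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary):
    """Maximum-likelihood term probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = sum(counts[term] for term in vocabulary)
    if total == 0:
        raise ValueError("no token of the document occurs in the vocabulary")
    return [counts[term] / total for term in vocabulary]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy documents (hypothetical), tokenized by whitespace.
opinion_tokens = "i love this phone love it".split()
factual_tokens = "this phone has a camera".split()
vocabulary = sorted(set(opinion_tokens) | set(factual_tokens))

p = term_distribution(opinion_tokens, vocabulary)
q = term_distribution(factual_tokens, vocabulary)
divergence = js_divergence(p, q)
```

Because the mixture distribution is nonzero wherever either input is nonzero, this form needs no smoothing, which is one reason JS divergence is convenient for comparing sparse term distributions.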

Efficient Retrieval of Short Opinion Documents Using Learning to Rank (기계학습을 이용한 단문 오피니언 문서의 효율적 검색 기법)

  • Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.4 / pp.117-126 / 2013
  • As Social Network Services (SNS) such as Twitter and Facebook have become more popular, much research has been done on opinion mining. However, related research has mostly focused on sentiment classification or feature selection, and few studies have addressed opinion document retrieval. In this paper, we propose a new retrieval method for short opinion documents. The proposed method builds on existing sentiment classification methodology and applies several document features to evaluate the quality of opinion documents. To generate the retrieval model, we adopt a learning-to-rank technique and integrate a sentiment classification model into it. Experimental results show that the proposed method can be applied successfully to opinion search.

Experimental Study for Effective Combination of Opinion Features (효과적인 의견 자질 결합을 위한 실험적 연구)

  • Han, Kyoung-Soo
    • Journal of the Korean Society for Information Management / v.27 no.3 / pp.227-239 / 2010
  • Opinion retrieval aims to retrieve items that are topically relevant to the user's information need and that contain opinions about the topic. This paper seeks a way to represent the user's information need for effective opinion retrieval and analyzes methods for combining opinion features through a range of experiments. The experiments are carried out in the inference network framework using the Blogs06 collection and 100 TREC test topics. The results show that the suggested representation method, based on a hidden 'opinion' concept, is effective, and that a compact model with a very small opinion lexicon performs comparably to the previous model on the same test data set.

Fusion Approach to Targeted Opinion Detection in Blogosphere (블로고스피어에서 주제에 관한 의견을 찾는 융합적 의견탐지방법)

  • Yang, Kiduk
    • Journal of Korean Library and Information Science Society / v.46 no.1 / pp.321-344 / 2015
  • This paper presents a fusion approach to sentiment detection that combines multiple sources of evidence to retrieve blogs that contain opinions on a specific topic. Our approach to finding opinionated blogs on a topic first applies traditional information retrieval methods to retrieve blogs on the given topic and then boosts the ranks of opinionated blogs based on opinion scores computed by multiple sentiment detection methods. The central idea of our sentiment detection strategy is to rely on a variety of complementary evidence rather than to optimize the use of a single source. The strategy includes the following modules: the High Frequency module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently in opinionated documents); the Low Frequency module, which makes use of uncommon or rare terms (e.g., "sooo good") that express strong sentiments; the IU module, which leverages n-grams with IU (I and you) anchor terms (e.g., "I believe", "You will love"); the Wilson's lexicon module, which uses a collection-independent opinion lexicon constructed from Wilson's subjectivity terms; and the Opinion Acronym module, which uses a small set of opinion acronyms (e.g., "imho"). The results show that combining multiple sources of opinion evidence is an effective way to improve opinion detection performance.
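
The two-stage idea in this abstract (retrieve on topic, then boost opinionated results) can be sketched as a weighted score combination. The abstract does not give the fusion formula, so the linear interpolation below, the module names used as dictionary keys, the weights, and the scores are all assumptions chosen for illustration.

```python
def fuse_scores(topic_score, opinion_scores, module_weights, opinion_weight=0.5):
    """Interpolate a topical score with a weighted sum of per-module opinion scores.

    Linear interpolation is an assumed fusion rule, not the paper's formula.
    """
    opinion = sum(module_weights[name] * score
                  for name, score in opinion_scores.items())
    return (1 - opinion_weight) * topic_score + opinion_weight * opinion


def rerank(results, module_weights, opinion_weight=0.5):
    """Re-order an initial topical result list by the fused score, best first."""
    return sorted(
        results,
        key=lambda r: fuse_scores(r["topic"], r["opinion"],
                                  module_weights, opinion_weight),
        reverse=True,
    )


# Hypothetical module weights and normalized scores in [0, 1].
module_weights = {"high_freq": 0.5, "iu": 0.5}
initial_results = [
    {"id": "blog-A", "topic": 0.8, "opinion": {"high_freq": 0.0, "iu": 0.0}},
    {"id": "blog-B", "topic": 0.7, "opinion": {"high_freq": 0.6, "iu": 0.4}},
]
reranked = rerank(initial_results, module_weights)
```

In this toy run, blog-B overtakes blog-A after fusion even though its topical score is lower, which is the boosting behavior the abstract describes.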

An Opinionated Document Retrieval System based on Hybrid Method (혼합 방식에 기반한 의견 문서 검색 시스템)

  • Lee, Seung-Wook;Song, Young-In;Rim, Hae-Chang
    • Journal of the Korean Society for Information Management / v.25 no.4 / pp.115-129 / 2008
  • With its growth and popularization, the Web has changed from a space for seeking information into a place where people express, share, and debate their opinions. Accordingly, the need to search for opinions expressed on the Web is also increasing. However, it is difficult to meet this need with a classical information retrieval system that considers only the relevance between the user's query and documents. Instead, a more advanced system that captures subjective information in documents is required. The proposed system retrieves opinionated documents effectively by building on an existing information retrieval system. This paper proposes a hybrid method that uses both a dictionary-based opinion analysis technique and a machine-learning-based opinion analysis technique. Experimental results show that the proposed method is effective in improving performance.

Automatic Extraction of Opinion Words from Korean Product Reviews Using the k-Structure (k-Structure를 이용한 한국어 상품평 단어 자동 추출 방법)

  • Kang, Han-Hoon;Yoo, Seong-Joon;Han, Dong-Il
    • Journal of KIISE:Software and Applications / v.37 no.6 / pp.470-479 / 2010
  • Most opinion word extraction methods proposed in English-language studies are difficult to apply directly to Korean. The manual method proposed in Korean studies is also problematic because it is time-consuming. Extracting Korean opinion words with an English thesaurus suffers from reduced precision because Korean and English words do not map one to one. Studies based on Korean phrase analyzers may fail because they select opinion words that occur with low frequency. This study therefore proposes the k-Structure (k=5 or 8) method, which may improve precision while complementing existing Korean studies, for automatically extracting opinion words from simple sentences in Korean product reviews. A simple sentence is defined as a sentence of at least 3 words that includes an opinion word within a distance of $\pm 2$ from the attribute name (e.g., the 'battery' of a camera) of an evaluated product (e.g., a 'camera'). In the performance experiment, opinion words for 8 previously given attribute names were automatically extracted with k-Structure from 1,868 product reviews collected from major domestic shopping malls, and their precision was estimated. With k=5, recall was 79.0% and precision 87.0%; with k=8, recall was 92.35% and precision 89.3%. A test using PMI-IR (Pointwise Mutual Information - Information Retrieval), one of the methods proposed in English-language studies, yielded a recall of 55% and a precision of 57%.
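
The $\pm 2$ distance window and the precision and recall figures reported here can be illustrated with a short sketch. This is not the k-Structure method itself, whose details the abstract does not give; it only collects tokens within two positions of an attribute name and scores them against a reference set. The English example sentence, the whitespace tokenization, and the gold set are hypothetical, and real Korean text would need morphological analysis first.

```python
def window_candidates(tokens, attribute, distance=2):
    """Collect tokens within +/- `distance` positions of each attribute occurrence."""
    candidates = []
    for i, token in enumerate(tokens):
        if token != attribute:
            continue
        lo = max(0, i - distance)
        hi = min(len(tokens), i + distance + 1)
        candidates.extend(
            t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
        )
    return candidates


def precision_recall(extracted, gold):
    """Set-based precision and recall of extracted words against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical review sentence and gold opinion words.
tokens = "the camera battery lasts very long".split()
candidates = window_candidates(tokens, "battery")
precision, recall = precision_recall(candidates, {"lasts", "long"})
```

Here "long" falls three positions from "battery" and is missed, which shows how a fixed window trades recall for precision.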

Development of Korean Opinion Analysis System using Semantic Dictionary and Inverse Opinion Processing (의미 사전과 반전 의견 처리를 이용한 한국어 의견 분석 시스템 개발)

  • Chang, Jae-Khun;Park, Jin-Soo;Ryoo, Seung-Taek
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.8 / pp.3070-3075 / 2010
  • In the Web 2.0 era, end users express their opinions and thoughts on blogs and in community spaces on the Internet. These opinions and thoughts inform purchasing decisions; however, users tend to consult only a few comments rather than the overall body of opinion. An Opinion Analysis System is an opinion search system, developed from natural language search, that analyzes positive and negative evaluations of products using opinions about products and services on the Internet. In this paper, we propose a syntactic analysis and inverse processing system that studies and processes 'Inverse' information in addition to 'Positive', 'Negative', and 'Neutral' information in order to determine whether the core of a sentence is 'positive' or 'negative' in an Opinion Analysis Service.

Automatic Retrieval of SNS Opinion Document Using Machine Learning Technique (기계학습을 이용한 SNS 오피니언 문서의 자동추출기법)

  • Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.5 / pp.27-35 / 2013
  • As Social Network Services (SNS) have become more popular, much research has been done on analyzing public opinion from SNS. One of the most important tasks in this area is separating opinion (subjective) documents from others (e.g., objective documents) in SNS. In this paper, we propose a new method for retrieving opinion documents from Twitter. Searching for or classifying opinion documents in Twitter is difficult because of a lack of publicly available Twitter documents for training. To tackle this problem, we first build a machine-learned model for sentiment classification using external documents similar to Twitter documents, and then modify the model to separate out the opinion documents in Twitter. Experimental results show that the proposed method can be applied successfully to opinion classification.

Opinion Retrieval in Twitter Considering Syntactic Relations of Sentiment Phrase (의견 어구의 구문 관계를 고려한 트위터 의견 검색)

  • Kim, Yoonsung;Yang, Min-Chul;Lee, Seung-Wook;Rim, Hae-Chang
    • KIISE Transactions on Computing Practices / v.20 no.9 / pp.492-497 / 2014
  • In this paper, we propose a method for retrieving opinionated tweets from Twitter, one of the popular Social Network Services, where a wide range of users share diverse opinions. Typical opinion retrieval systems may treat the presence of sentiment phrases (subjectivity) as an important factor even when the subjective phrases are unrelated to the given query or speaker. To alleviate this problem, we use the syntactic structure of a sentence to identify 1) the relationship between subjectivity and the query, 2) the relationship between subjectivity and the speaker, and 3) the syntactic role of subjectivity. In addition, our learning-to-rank approach is trained to retrieve opinionated tweets based on query relevance, textual features, user information, and Twitter-specific features. Experimental results on real-world data show that the proposed method achieves better performance than several baseline methods in terms of precision and nDCG.
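
The two evaluation measures named at the end of this abstract can be computed as in the sketch below. It assumes the common DCG form with linear gain and a log2 rank discount; the paper may use a different variant (for example an exponential gain), and the relevance lists are made up for illustration.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain with linear gain and log2(rank + 1) discount."""
    ranked = relevances[:k] if k is not None else relevances
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(ranked, start=1))


def ndcg(relevances, k=None):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def precision_at_k(relevances, k):
    """Fraction of the top-k results with nonzero relevance."""
    return sum(1 for rel in relevances[:k] if rel > 0) / k


# Hypothetical relevance judgments in ranked order (1 = opinionated and on topic).
ranking = [0, 1, 1, 0]
score = ndcg(ranking)
p_at_2 = precision_at_k(ranking, 2)
```

Unlike precision at k, nDCG rewards placing relevant tweets nearer the top, so the two measures can disagree about which of two rankings is better.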

A screening study of human factors variables in designing multimedia information retrieval systems (정보습득용 멀티미디어 시스템의 인간공학적 설계변수 선별)

  • 김미정;한성호
    • Proceedings of the ESK Conference / 1995.10a / pp.56-61 / 1995
  • Multimedia systems present information using various media, for example video, sound, music, animation, and movies, in addition to text, which has long been used to convey information. Among the many multimedia applications, multimedia information retrieval systems, commercialized in the form of multimedia encyclopedia CD-ROMs, benefit from using various media because they can present information efficiently and completely. However, using various media may confuse end users, and poor user-interface design often makes the problem worse. To design the user interface of multimedia information retrieval systems appropriately, we investigated the characteristics of such systems and listed 35 variables that might affect the usability of the user interface. We then selected 10 variables through procedures such as brainstorming, literature survey, expert opinion, relevance analysis, and feasibility analysis, in order to perform a screening study that will considerably reduce the cost and time of subsequent human factors experiments.
