• Title/Summary/Keyword: Intelligent Web

Search Result 814, Processing Time 0.025 seconds

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.95-110
    • /
    • 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news repots. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scrapping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.

The Effects of Customer Product Review on Social Presence in Personalized Recommender Systems (개인화 추천시스템에서 고객 제품 리뷰가 사회적 실재감에 미치는 영향)

  • Choi, Jae-Won;Lee, Hong-Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.115-130
    • /
    • 2011
  • Many online stores bring features that can build trust in their customers. More so, the number of products or content services on online stores has been increasing rapidly. Hence, personalization on online stores is considered to be an important technology to companies and customers. Recommender systems that provide favorable products and customer product reviews to users are the most commonly used features in this purpose. There are many studies to that investigated the relationship between social presence as an antecedent of trust and provision of recommender systems or customer product reviews. Many online stores have made efforts to increase perceived social presence of their customers through customer reviews, recommender systems, and analyzing associations among products. Primarily because social presence can increase customer trust or reuse intention for online stores. However, there were few studies that investigated the interactions between recommendation type, product type and provision of customer product reviews on social presence. Therefore, one of the purposes of this study is to identify the effects of personalized recommender systems and compare the role of customer reviews with product types. This study performed an experiment to see these interactions. Experimental web pages were developed with $2{\times}2$ factorial setting based on how to provide social presence to users with customer reviews and two product types such as hedonic and utilitarian. The hedonic type was a ringtone chosen from Nate.com while the utilitarian was a TOEIC study aid book selected from Yes24.com. To conduct the experiment, web based experiments were conducted for the participants who have been shopping on the online stores. Participants were a total of 240 and 30% of the participants had the chance of getting the presents. We found out that social presence increased for hedonic products when personalized recommendations were given compared to non.personalized recommendations. Although providing customer reviews for two product types did not significantly increase social presence, provision of customer product reviews for hedonic (ringtone) increased perceived social presence. Otherwise, provision of customer product reviews could not increase social presence when the systems recommend utilitarian products (TOEIC study.aid books). Therefore, it appears that the effects of increasing perceived social presence with customer reviews have a difference for product types. In short, the role of customer reviews could be different based on which product types were considered by customers when they are making a decision related to purchasing on the online stores. Additionally, there were no differences for increasing perceived social presence when providing customer reviews. Our participants might have focused on how recommendations had been provided and what products were recommended because our developed systems were providing recommendations after participants rating their preferences. Thus, the effects of customer reviews could appear more clearly if our participants had actual purchase opportunity for the recommendations. Personalized recommender systems can increase social presence of customers more than nonpersonalized recommender systems by using user preference. Online stores could find out how they can increase perceived social presence and satisfaction of their customers when customers want to find the proper products with recommender systems and customer reviews. In addition, the role of customer reviews of the personalized recommendations can be different based on types of the recommended products. Even if this study conducted two product types such as hedonic and utilitarian, the results revealed that customer reviews for hedonic increased social presence of customers more than customer reviews for utilitarian. Thus, online stores need to consider the role of providing customer reviews with highly personalized information based on their product types when they develop the personalized recommender systems.

A Collaborative Video Annotation and Browsing System using Linked Data (링크드 데이터를 이용한 협업적 비디오 어노테이션 및 브라우징 시스템)

  • Lee, Yeon-Ho;Oh, Kyeong-Jin;Sean, Vi-Sal;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.3
    • /
    • pp.203-219
    • /
    • 2011
  • Previously common users just want to watch the video contents without any specific requirements or purposes. However, in today's life while watching video user attempts to know and discover more about things that appear on the video. Therefore, the requirements for finding multimedia or browsing information of objects that users want, are spreading with the increasing use of multimedia such as videos which are not only available on the internet-capable devices such as computers but also on smart TV and smart phone. In order to meet the users. requirements, labor-intensive annotation of objects in video contents is inevitable. For this reason, many researchers have actively studied about methods of annotating the object that appear on the video. In keyword-based annotation related information of the object that appeared on the video content is immediately added and annotation data including all related information about the object must be individually managed. Users will have to directly input all related information to the object. Consequently, when a user browses for information that related to the object, user can only find and get limited resources that solely exists in annotated data. Also, in order to place annotation for objects user's huge workload is required. To cope with reducing user's workload and to minimize the work involved in annotation, in existing object-based annotation automatic annotation is being attempted using computer vision techniques like object detection, recognition and tracking. By using such computer vision techniques a wide variety of objects that appears on the video content must be all detected and recognized. But until now it is still a problem facing some difficulties which have to deal with automated annotation. To overcome these difficulties, we propose a system which consists of two modules. The first module is the annotation module that enables many annotators to collaboratively annotate the objects in the video content in order to access the semantic data using Linked Data. Annotation data managed by annotation server is represented using ontology so that the information can easily be shared and extended. Since annotation data does not include all the relevant information of the object, existing objects in Linked Data and objects that appear in the video content simply connect with each other to get all the related information of the object. In other words, annotation data which contains only URI and metadata like position, time and size are stored on the annotation sever. So when user needs other related information about the object, all of that information is retrieved from Linked Data through its relevant URI. The second module enables viewers to browse interesting information about the object using annotation data which is collaboratively generated by many users while watching video. With this system, through simple user interaction the query is automatically generated and all the related information is retrieved from Linked Data and finally all the additional information of the object is offered to the user. With this study, in the future of Semantic Web environment our proposed system is expected to establish a better video content service environment by offering users relevant information about the objects that appear on the screen of any internet-capable devices such as PC, smart TV or smart phone.

PIRS : Personalized Information Retrieval System using Adaptive User Profiling and Real-time Filtering for Search Results (적응형 사용자 프로파일기법과 검색 결과에 대한 실시간 필터링을 이용한 개인화 정보검색 시스템)

  • Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.21-41
    • /
    • 2010
  • This paper proposes a system that can serve users with appropriate search results through real time filtering, and implemented adaptive user profiling based personalized information retrieval system(PIRS) using users' implicit feedbacks in order to deal with the problem of existing search systems such as Google or MSN that does not satisfy various user' personal search needs. One of the reasons that existing search systems hard to satisfy various user' personal needs is that it is not easy to recognize users' search intentions because of the uncertainty of search intentions. The uncertainty of search intentions means that users may want to different search results using the same query. For example, when a user inputs "java" query, the user may want to be retrieved "java" results as a computer programming language, a coffee of java, or a island of Indonesia. In other words, this uncertainty is due to ambiguity of search queries. Moreover, if the number of the used words for a query is fewer, this uncertainty will be more increased. Real-time filtering for search results returns only those results that belong to user-selected domain for a given query. Although it looks similar to a general directory search, it is different in that the search is executed for all web documents rather than sites, and each document in the search results is classified into the given domain in real time. By applying information filtering using real time directory classifying technology for search results to personalization, the number of delivering results to users is effectively decreased, and the satisfaction for the results is improved. In this paper, a user preference profile has a hierarchical structure, and consists of domains, used queries, and selected documents. Because the hierarchy structure of user preference profile can apply the context when users perfomed search, the structure is able to deal with the uncertainty of user intentions, when search is carried out, the intention may differ according to the context such as time or place for the same query. Furthermore, this structure is able to more effectively track web documents search behaviors of a user for each domain, and timely recognize the changes of user intentions. An IP address of each device was used to identify each user, and the user preference profile is continuously updated based on the observed user behaviors for search results. Also, we measured user satisfaction for search results by observing the user behaviors for the selected search result. Our proposed system automatically recognizes user preferences by using implicit feedbacks from users such as staying time on the selected search result and the exit condition from the page, and dynamically updates their preferences. Whenever search is performed by a user, our system finds the user preference profile for the given IP address, and if the file is not exist then a new user preference profile is created in the server, otherwise the file is updated with the transmitted information. If the file is not exist in the server, the system provides Google' results to users, and the reflection value is increased/decreased whenever user search. We carried out some experiments to evaluate the performance of adaptive user preference profile technique and real time filtering, and the results are satisfactory. According to our experimental results, participants are satisfied with average 4.7 documents in the top 10 search list by using adaptive user preference profile technique with real time filtering, and this result shows that our method outperforms Google's by 23.2%.

Job Preference Analysis and Job Matching System Development for the Middle Aged Class (중장년층 일자리 요구사항 분석 및 인력 고용 매칭 시스템 개발)

  • Kim, Seongchan;Jang, Jincheul;Kim, Seong Jung;Chin, Hyojin;Yi, Mun Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.247-264
    • /
    • 2016
  • With the rapid acceleration of low-birth rate and population aging, the employment of the neglected groups of people including the middle aged class is a crucial issue in South Korea. In particular, in the 2010s, the number of the middle aged who want to find a new job after retirement age is significantly increasing with the arrival of the retirement time of the baby boom generation (born 1955-1963). Despite the importance of matching jobs to this emerging middle aged class, private job portals as well as the Korean government do not provide any online job service tailored for them. A gigantic amount of job information is available online; however, the current recruiting systems do not meet the demand of the middle aged class as their primary targets are young workers. We are in dire need of a specially designed recruiting system for the middle aged. Meanwhile, when users are searching the desired occupations on the Worknet website, provided by the Korean Ministry of Employment and Labor, users are experiencing discomfort to search for similar jobs because Worknet is providing filtered search results on the basis of exact matches of a preferred job code. Besides, according to our Worknet data analysis, only about 24% of job seekers had landed on a job position consistent with their initial preferred job code while the rest had landed on a position different from their initial preference. To improve the situation, particularly for the middle aged class, we investigate a soft job matching technique by performing the following: 1) we review a user behavior logs of Worknet, which is a public job recruiting system set up by the Korean government and point out key system design implications for the middle aged. Specifically, we analyze the job postings that include preferential tags for the middle aged in order to disclose what types of jobs are in favor of the middle aged; 2) we develope a new occupation classification scheme for the middle aged, Korea Occupation Classification for the Middle-aged (KOCM), based on the similarity between jobs by reorganizing and modifying a general occupation classification scheme. When viewed from the perspective of job placement, an occupation classification scheme is a way to connect the enterprises and job seekers and a basic mechanism for job placement. The key features of KOCM include establishing the Simple Labor category, which is the most requested category by enterprises; and 3) we design MOMA (Middle-aged Occupation Matching Algorithm), which is a hybrid job matching algorithm comprising constraint-based reasoning and case-based reasoning. MOMA incorporates KOCM to expand query to search similar jobs in the database. MOMA utilizes cosine similarity between user requirement and job posting to rank a set of postings in terms of preferred job code, salary, distance, and job type. The developed system using MOMA demonstrates about 20 times of improvement over the hard matching performance. In implementing the algorithm for a web-based application of recruiting system for the middle aged, we also considered the usability issue of making the system easier to use, which is especially important for this particular class of users. That is, we wanted to improve the usability of the system during the job search process for the middle aged users by asking to enter only a few simple and core pieces of information such as preferred job (job code), salary, and (allowable) distance to the working place, enabling the middle aged to find a job suitable to their needs efficiently. The Web site implemented with MOMA should be able to contribute to improving job search of the middle aged class. We also expect the overall approach to be applicable to other groups of people for the improvement of job matching results.

Bankruptcy Prediction Modeling Using Qualitative Information Based on Big Data Analytics (빅데이터 기반의 정성 정보를 활용한 부도 예측 모형 구축)

  • Jo, Nam-ok;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.2
    • /
    • pp.33-56
    • /
    • 2016
  • Many researchers have focused on developing bankruptcy prediction models using modeling techniques, such as statistical methods including multiple discriminant analysis (MDA) and logit analysis or artificial intelligence techniques containing artificial neural networks (ANN), decision trees, and support vector machines (SVM), to secure enhanced performance. Most of the bankruptcy prediction models in academic studies have used financial ratios as main input variables. The bankruptcy of firms is associated with firm's financial states and the external economic situation. However, the inclusion of qualitative information, such as the economic atmosphere, has not been actively discussed despite the fact that exploiting only financial ratios has some drawbacks. Accounting information, such as financial ratios, is based on past data, and it is usually determined one year before bankruptcy. Thus, a time lag exists between the point of closing financial statements and the point of credit evaluation. In addition, financial ratios do not contain environmental factors, such as external economic situations. Therefore, using only financial ratios may be insufficient in constructing a bankruptcy prediction model, because they essentially reflect past corporate internal accounting information while neglecting recent information. Thus, qualitative information must be added to the conventional bankruptcy prediction model to supplement accounting information. Due to the lack of an analytic mechanism for obtaining and processing qualitative information from various information sources, previous studies have only used qualitative information. However, recently, big data analytics, such as text mining techniques, have been drawing much attention in academia and industry, with an increasing amount of unstructured text data available on the web. A few previous studies have sought to adopt big data analytics in business prediction modeling. Nevertheless, the use of qualitative information on the web for business prediction modeling is still deemed to be in the primary stage, restricted to limited applications, such as stock prediction and movie revenue prediction applications. Thus, it is necessary to apply big data analytics techniques, such as text mining, to various business prediction problems, including credit risk evaluation. Analytic methods are required for processing qualitative information represented in unstructured text form due to the complexity of managing and processing unstructured text data. This study proposes a bankruptcy prediction model for Korean small- and medium-sized construction firms using both quantitative information, such as financial ratios, and qualitative information acquired from economic news articles. The performance of the proposed method depends on how well information types are transformed from qualitative into quantitative information that is suitable for incorporating into the bankruptcy prediction model. We employ big data analytics techniques, especially text mining, as a mechanism for processing qualitative information. The sentiment index is provided at the industry level by extracting from a large amount of text data to quantify the external economic atmosphere represented in the media. The proposed method involves keyword-based sentiment analysis using a domain-specific sentiment lexicon to extract sentiment from economic news articles. The generated sentiment lexicon is designed to represent sentiment for the construction business by considering the relationship between the occurring term and the actual situation with respect to the economic condition of the industry rather than the inherent semantics of the term. The experimental results proved that incorporating qualitative information based on big data analytics into the traditional bankruptcy prediction model based on accounting information is effective for enhancing the predictive performance. The sentiment variable extracted from economic news articles had an impact on corporate bankruptcy. In particular, a negative sentiment variable improved the accuracy of corporate bankruptcy prediction because the corporate bankruptcy of construction firms is sensitive to poor economic conditions. The bankruptcy prediction model using qualitative information based on big data analytics contributes to the field, in that it reflects not only relatively recent information but also environmental factors, such as external economic conditions.

Multi-Category Sentiment Analysis for Social Opinion Related to Artificial Intelligence on Social Media (소셜 미디어 상에서의 인공지능 관련 사회적 여론에 대한 다 범주 감성 분석)

  • Lee, Sang Won;Choi, Chang Wook;Kim, Dong Sung;Yeo, Woon Young;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.51-66
    • /
    • 2018
  • As AI (Artificial Intelligence) technologies have been swiftly evolved, a lot of products and services are under development in various fields for better users' experience. On this technology advance, negative effects of AI technologies also have been discussed actively while there exists positive expectation on them at the same time. For instance, many social issues such as trolley dilemma and system security issues are being debated, whereas autonomous vehicles based on artificial intelligence have had attention in terms of stability increase. Therefore, it needs to check and analyse major social issues on artificial intelligence for their development and societal acceptance. In this paper, multi-categorical sentiment analysis is conducted over online public opinion on artificial intelligence after identifying the trending topics related to artificial intelligence for two years from January 2016 to December 2017, which include the event, match between Lee Sedol and AlphaGo. Using the largest web portal in South Korea, online news, news headlines and news comments were crawled. Considering the importance of trending topics, online public opinion was analysed into seven multiple sentimental categories comprised of anger, dislike, fear, happiness, neutrality, sadness, and surprise by topics, not only two simple positive or negative sentiment. As a result, it was found that the top sentiment is "happiness" in most events and yet sentiments on each keyword are different. In addition, when the research period was divided into four periods, the first half of 2016, the second half of the year, the first half of 2017, and the second half of the year, it is confirmed that the sentiment of 'anger' decreases as goes by time. Based on the results of this analysis, it is possible to grasp various topics and trends currently discussed on artificial intelligence, and it can be used to prepare countermeasures. We hope that we can improve to measure public opinion more precisely in the future by integrating empathy level of news comments.

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.109-122
    • /
    • 2014
  • People are nowadays creating a tremendous amount of data on Social Network Service (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and now we live in the Age of Big Data. SNS Data is defined as a condition of Big Data where the amount of data (volume), data input and output speeds (velocity), and the variety of data types (variety) are satisfied. If someone intends to discover the trend of an issue in SNS Big Data, this information can be used as a new important source for the creation of new values because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and established to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) Provide the topic keyword set that corresponds to daily ranking; (2) Visualize the daily time series graph of a topic for the duration of a month; (3) Provide the importance of a topic through a treemap based on the score system and frequency; (4) Visualize the daily time-series graph of keywords by searching the keyword; The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including the removal of stop words, and noun extraction for processing various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to process rapidly a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational database. We built TITS based on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. In addition, MongoDB is an open source platform, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational database, there are no schema or tables with MongoDB, and its most important goal is that of data accessibility and data processing performance. In the Age of Big Data, the visualization of Big Data is more attractive to the Big Data community because it helps analysts to examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for the purpose of creating Data Driven Documents that bind document object model (DOM) and any data; the interaction between data is easy and useful for managing real-time data stream with smooth animation. In addition, TITS uses a bootstrap made of pre-configured plug-in style sheets and JavaScript libraries to build a web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system, and make the system available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.

The Ontology Based, the Movie Contents Recommendation Scheme, Using Relations of Movie Metadata (온톨로지 기반 영화 메타데이터간 연관성을 활용한 영화 추천 기법)

  • Kim, Jaeyoung;Lee, Seok-Won
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.25-44
    • /
    • 2013
  • Accessing movie contents has become easier and increased with the advent of smart TV, IPTV and web services that are able to be used to search and watch movies. In this situation, there are increasing search for preference movie contents of users. However, since the amount of provided movie contents is too large, the user needs more effort and time for searching the movie contents. Hence, there are a lot of researches for recommendations of personalized item through analysis and clustering of the user preferences and user profiles. In this study, we propose recommendation system which uses ontology based knowledge base. Our ontology can represent not only relations between metadata of movies but also relations between metadata and profile of user. The relation of each metadata can show similarity between movies. In order to build, the knowledge base our ontology model is considered two aspects which are the movie metadata model and the user model. On the part of build the movie metadata model based on ontology, we decide main metadata that are genre, actor/actress, keywords and synopsis. Those affect that users choose the interested movie. And there are demographic information of user and relation between user and movie metadata in user model. In our model, movie ontology model consists of seven concepts (Movie, Genre, Keywords, Synopsis Keywords, Character, and Person), eight attributes (title, rating, limit, description, character name, character description, person job, person name) and ten relations between concepts. For our knowledge base, we input individual data of 14,374 movies for each concept in contents ontology model. This movie metadata knowledge base is used to search the movie that is related to interesting metadata of user. And it can search the similar movie through relations between concepts. We also propose the architecture for movie recommendation. The proposed architecture consists of four components. The first component search candidate movies based the demographic information of the user. In this component, we decide the group of users according to demographic information to recommend the movie for each group and define the rule to decide the group of users. We generate the query that be used to search the candidate movie for recommendation in this component. The second component search candidate movies based user preference. When users choose the movie, users consider metadata such as genre, actor/actress, synopsis, keywords. Users input their preference and then in this component, system search the movie based on users preferences. The proposed system can search the similar movie through relation between concepts, unlike existing movie recommendation systems. Each metadata of recommended candidate movies have weight that will be used for deciding recommendation order. The third component the merges results of first component and second component. In this step, we calculate the weight of movies using the weight value of metadata for each movie. Then we sort movies order by the weight value. The fourth component analyzes result of third component, and then it decides level of the contribution of metadata. And we apply contribution weight to metadata. Finally, we use the result of this step as recommendation for users. We test the usability of the proposed scheme by using web application. We implement that web application for experimental process by using JSP, Java Script and prot$\acute{e}$g$\acute{e}$ API. In our experiment, we collect results of 20 men and woman, ranging in age from 20 to 29. And we use 7,418 movies with rating that is not fewer than 7.0. In order to experiment, we provide Top-5, Top-10 and Top-20 recommended movies to user, and then users choose interested movies. The result of experiment is that average number of to choose interested movie are 2.1 in Top-5, 3.35 in Top-10, 6.35 in Top-20. It is better than results that are yielded by for each metadata.

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.89-105
    • /
    • 2014
  • After emergence of Internet, social media with highly interactive Web 2.0 applications has provided very user friendly means for consumers and companies to communicate with each other. Users have routinely published contents involving their opinions and interests in social media such as blogs, forums, chatting rooms, and discussion boards, and the contents are released real-time in the Internet. For that reason, many researchers and marketers regard social media contents as the source of information for business analytics to develop business insights, and many studies have reported results on mining business intelligence from Social media content. In particular, opinion mining and sentiment analysis, as a technique to extract, classify, understand, and assess the opinions implicit in text contents, are frequently applied into social media content analysis because it emphasizes determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques and tools have been presented by these researchers. However, we have found some weaknesses from their methods which are often technically complicated and are not sufficiently user-friendly for helping business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to conduct opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using Social media content from the initial data gathering stage to the final presentation session. Our proposed approach to opinion mining consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts have to choose target social media. Each target media requires different ways for analysts to gain access. There are open-API, searching tools, DB2DB interface, purchasing contents, and so son. Second phase is pre-processing to generate useful materials for meaningful analysis. If we do not remove garbage data, results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase where the cleansed social media content set is to be analyzed. The qualified data set includes not only user-generated contents but also content identification information such as creation date, author name, user id, content id, hit counts, review or reply, favorite, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trends analysis, while sentiment analysis is utilized to conduct reputation analysis. There are also various applications, such as stock prediction, product recommendation, sales forecasting, and so on. The last phase is visualization and presentation of analysis results. The major focus and purpose of this phase are to explain results of analysis and help users to comprehend its meaning. Therefore, to the extent possible, deliverables from this phase should be made simple, clear and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the leading company, NS Food, with 66.5% of market share; the firm has kept No. 1 position in the Korean "Ramen" business for several decades. We collected a total of 11,869 pieces of contents including blogs, forum contents and news articles. After collecting social media content data, we generated instant noodle business specific language resources for data manipulation and analysis using natural language processing. In addition, we tried to classify contents in more detail categories such as marketing features, environment, reputation, etc. In those phase, we used free ware software programs such as TM, KoNLP, ggplot2 and plyr packages in R project. As the result, we presented several useful visualization outputs like domain specific lexicons, volume and sentiment graphs, topic word cloud, heat maps, valence tree map, and other visualized images to provide vivid, full-colored examples using open library software packages of the R project. Business actors can quickly detect areas by a swift glance that are weak, strong, positive, negative, quiet or loud. Heat map is able to explain movement of sentiment or volume in categories and time matrix which shows density of color on time periods. Valence tree map, one of the most comprehensive and holistic visualization models, should be very helpful for analysts and decision makers to quickly understand the "big picture" business situation with a hierarchical structure since tree-map can present buzz volume and sentiment with a visualized result in a certain period. This case study offers real-world business insights from market sensing which would demonstrate to practical-minded business users how they can use these types of results for timely decision making in response to on-going changes in the market. We believe our approach can provide practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in food industry but in other industries as well.