• Title/Summary/Keyword: Feature learning


Incorporating Social Relationship discovered from User's Behavior into Collaborative Filtering (사용자 행동 기반의 사회적 관계를 결합한 사용자 협업적 여과 방법)

  • Thay, Setha;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.1-20 / 2013
  • Social networks have become a huge communication platform that connects people and brings users together to share common interests, experiences, and daily activities. Users spend hours per day maintaining personal information and interacting with others through posts, comments, messages, games, social events, and applications. As the amount of user information distributed across social networks grows, there is great potential to use this social data to improve the quality of recommender systems. Some studies in social network analysis investigate how social networks can be used in the recommendation domain. Among these, we are interested in exploiting the interactions between a user and others in a social network, which can be identified as social relationships. Moreover, users' purchase decisions mostly depend on suggestions from people who have either the same preferences or a closer relationship. For this reason, we believe that users' relationships in social networks can provide an effective way to improve how well a recommender system predicts users' interests. Social relationships between users found in social networks are therefore a useful factor for improving preference prediction in the conventional approach. Recommender systems are rapidly increasing in popularity and are currently used by many e-commerce sites such as Amazon.com, Last.fm, and eBay.com. Collaborative filtering (CF) is one of the essential and powerful techniques in recommender systems for suggesting appropriate items to a user by learning the user's preferences. CF focuses on user data and automatically predicts a user's interests by gathering information from users who share a similar background and preferences. 
Specifically, the intention of CF is to find users who have similar preferences and to suggest to the target user the items most preferred by those nearest-neighbor users. CF considers two basic units, the user and the item. Each user provides rating values on items (e.g., movies, products, books) to indicate interest in those items. CF then uses the user-rating matrix to find a group of users whose ratings are similar to the target user's, and predicts the unknown rating values for items the target user has not rated. CF has been successfully implemented in both information filtering and e-commerce applications. However, important challenges remain, such as cold start, data sparsity, and scalability, which affect the quality and accuracy of prediction. To overcome these challenges, researchers have proposed various kinds of CF methods such as hybrid CF, trust-based CF, and social network-based CF. To improve the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method that integrates the traditional CF technique with social relationships between users discovered from users' behavior in a social network, namely Facebook. We identify a user's relationships from behavior such as posts and comments exchanged with friends on Facebook. We believe that social relationships implicitly inferred from user behavior can compensate for the limitations of the conventional approach. We therefore extract each user's posts and comments using the Facebook Graph API and calculate a feature score for each term to obtain a feature vector for computing user similarity. We then combine the result with the similarity value computed using the traditional CF technique. Finally, our system provides a list of recommended items based on the neighbor users who have the largest total similarity to the target user. 
To verify and evaluate the proposed method, we performed an experiment on data collected from our Movies Rating System. Prediction accuracy is evaluated in terms of MAE to show how correct our algorithm's recommendations are. Performance is then evaluated in terms of precision, recall, and F1-measure to show the effectiveness of our method. Coverage is also evaluated to assess the ability to generate recommendations. The experimental results show that the proposed method outperforms the benchmark methods and suggests items to users more accurately. In particular, using users' behavior in the social network improves recommendation accuracy by up to 6%. Moreover, the recommendation performance experiment shows that incorporating social relationships observed from user behavior into CF is beneficial, generating recommendations with a 7% improvement in performance compared with benchmark methods. Finally, we confirm that interaction between users in a social network can enhance accuracy and yield better recommendations than the conventional approach.
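
The combination step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which rating similarity or which combination rule is used, so Pearson correlation, the weighted blend, the `alpha` parameter, and all function names here are assumptions. The `mae` helper shows the accuracy metric named in the evaluation.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; 0.0 when undefined.

    Assumption: the abstract only says "traditional CF technique";
    Pearson is one common choice, used here for illustration.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    dev_a = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    dev_b = math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    den = dev_a * dev_b
    return num / den if den else 0.0


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse term-score vectors (dicts)."""
    dot = sum(w * vec_b.get(term, 0.0) for term, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    den = norm_a * norm_b
    return dot / den if den else 0.0


def combined_similarity(ratings_a, ratings_b, terms_a, terms_b, alpha=0.5):
    """Blend rating similarity with behavior-based similarity.

    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    return alpha * pearson(ratings_a, ratings_b) + (1 - alpha) * cosine(terms_a, terms_b)


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Neighbors would then be ranked by `combined_similarity` against the target user, with the term vectors built from each user's posts and comments.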

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze big social media data more rigorously. Although social media text analysis algorithms have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies add grammatical factors to the feature sets used to train classification models. The other applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to handle the broader semantic features that existing sentiment analysis has underestimated. The result of the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that, among the related words extracted, those expressing some emotion about the keyword are three times more numerous with Word2Vec than with co-occurrence analysis. The difference comes from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm can catch hidden related words that traditional analysis has not found. In addition, part-of-speech (POS) tagging for Korean is used to detect adjectives as "emotional words". The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence. 
The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords, professor, prosecutor, and doctor, because these keywords carry rich public emotion and opinion. Preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin Hae-chul Sky Hospital, drinking and plastic surgery, rebate), and Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25,720; Doctor: 35,110; Prosecutor: 43,225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say whether a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger are from Twitter, so the data have a number of distinct features that we did not deal with in this study. These features will be considered in further study.
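
The noun-selection step of Emotion Trigger can be illustrated with a small sketch. It assumes word vectors and POS tags are already available; the toy three-dimensional vectors, the tag names, and the function `emotion_triggers` are hypothetical stand-ins for a trained Word2Vec model and a Korean POS tagger, neither of which is reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / den if den else 0.0


def emotion_triggers(emotion_word, vectors, pos_tags, top_k=2):
    """Return the top_k nouns whose vectors are closest to the emotion word.

    vectors:  word -> vector (hypothetical; would come from Word2Vec)
    pos_tags: word -> tag    (hypothetical; would come from a POS tagger)
    """
    target = vectors[emotion_word]
    scored = [
        (cosine(target, vec), word)
        for word, vec in vectors.items()
        if word != emotion_word and pos_tags.get(word) == "NOUN"
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_k]]


# Toy data for illustration only.
toy_vectors = {
    "angry": [0.9, 0.1, 0.0],
    "furious": [0.95, 0.05, 0.0],
    "scandal": [0.8, 0.2, 0.1],
    "rebate": [0.7, 0.3, 0.0],
    "weather": [0.0, 0.1, 0.9],
}
toy_tags = {
    "angry": "ADJ",
    "furious": "ADJ",
    "scandal": "NOUN",
    "rebate": "NOUN",
    "weather": "NOUN",
}
print(emotion_triggers("angry", toy_vectors, toy_tags))
```

Adjectives such as "furious" are filtered out even though they are close in vector space, which mirrors the abstract's choice of nouns as candidate triggers.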

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.141-166 / 2019
  • Recently, channels such as social media and SNS have created enormous amounts of data. Among all kinds of data, the portion of unstructured data represented as text has increased geometrically. Because it is difficult to check all of this text, it is important to access it rapidly and grasp its key points. Driven by this need for efficient understanding, many studies on text summarization for handling and using tremendous amounts of text data have been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms, called "automatic summarization", have recently been proposed to generate summaries objectively and effectively. However, almost all text summarization methods proposed to date construct summaries based on the frequency of contents in the original documents. Such summaries are limited in that they tend to omit low-weight subjects that are mentioned less often in the original text. If a summary includes only the contents of major subjects, bias occurs and information is lost, so it is hard to ascertain every subject the documents contain. To avoid this bias, one can summarize with attention to the balance between the topics a document has so that every subject can be ascertained, but an imbalance in the distribution of those subjects still remains. To retain the balance of subjects in a summary, it is necessary to consider the proportion of every subject the documents originally have and also to allocate the portions of subjects equally, so that even sentences on minor subjects are sufficiently included in the summary. In this study, we propose a "subject-balanced" text summarization method that secures balance among all subjects and minimizes the omission of low-frequency subjects. For a subject-balanced summary, we use two summary evaluation metrics, "completeness" and "succinctness". 
Completeness means that the summary should fully include the contents of the original documents, and succinctness means that the summary has minimal duplication within itself. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, highly related terms can be identified for every topic, and the subjects of the documents can be found from topics composed of terms with similar meanings. A few terms that represent each subject well are then selected; in this method, they are called "seed terms". However, the seed terms are too few to explain each subject sufficiently, so enough terms similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for word expansion, finding terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and the similarity between all terms can be derived from those vectors using cosine similarity. The higher the cosine similarity between two terms, the stronger the relationship between them is taken to be. Terms with high similarity to the seed terms of each subject are selected, and after filtering these expanded terms, the subject dictionary is finally constructed. The next phase allocates subjects to every sentence in the original documents. To grasp the contents of all sentences, frequency analysis is first conducted on the specific terms that make up the subject dictionaries. The TF-IDF weight of each subject is calculated after the frequency analysis, which shows how much each sentence explains each subject. However, TF-IDF weights have the limitation that they can grow without bound, so the TF-IDF weights of every subject in each sentence are normalized to values between 0 and 1. 
Each sentence is then allocated to the subject with the maximum TF-IDF weight among all subjects, and a sentence group is finally constructed for each subject. The last phase is summary generation. Sen2Vec is used to measure the similarity between subject sentences, and a similarity matrix is formed. By repeatedly selecting sentences, it is possible to generate a summary that fully includes the contents of the original documents and minimizes duplication within the summary itself. To evaluate the proposed method, 50,000 TripAdvisor reviews are used to construct the subject dictionaries and 23,087 reviews are used to generate summaries. A comparison between the summaries of the proposed method and frequency-based summaries verifies that summaries from the proposed method better retain the balance of all subjects that the documents originally have.
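
The second phase (normalizing subject weights and allocating each sentence to its strongest subject) and an equal per-subject quota can be sketched as below. Several details are assumptions made for illustration: the abstract does not name the normalization, so min-max scaling per subject is used; the Sen2Vec similarity step is omitted; and the function names, the `per_subject` parameter, and the sample weights are hypothetical.

```python
def min_max_normalize(weights):
    """Scale each subject's weights across sentences into [0, 1].

    weights: list of dicts, one per sentence, mapping subject -> raw weight.
    Assumption: min-max scaling; the source only says values become 0 to 1.
    """
    subjects = sorted({s for row in weights for s in row})
    scaled = [dict() for _ in weights]
    for subject in subjects:
        column = [row.get(subject, 0.0) for row in weights]
        low, high = min(column), max(column)
        span = high - low
        for out_row, value in zip(scaled, column):
            out_row[subject] = (value - low) / span if span else 0.0
    return scaled


def allocate_subjects(scaled):
    """Group sentence indices by the subject with the largest weight."""
    groups = {}
    for index, row in enumerate(scaled):
        best = max(sorted(row), key=lambda s: row[s])
        groups.setdefault(best, []).append((row[best], index))
    return groups


def balanced_summary(groups, per_subject=1):
    """Pick the same number of top sentences from every subject group."""
    chosen = []
    for subject in sorted(groups):
        ranked = sorted(groups[subject], key=lambda pair: (-pair[0], pair[1]))
        chosen.extend(index for _, index in ranked[:per_subject])
    return sorted(chosen)


# Hypothetical raw weights for four review sentences and two subjects.
raw = [
    {"room": 4.0, "food": 0.2},
    {"room": 2.0, "food": 0.0},
    {"room": 0.0, "food": 0.5},
    {"room": 3.0, "food": 1.0},
]
print(balanced_summary(allocate_subjects(min_max_normalize(raw))))
```

A frequency-only selection would favor whichever subject dominates the corpus, whereas the fixed quota keeps a sentence from each subject in the output.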

A Comparative Study on Buddhist Painting, MokWooDo (牧牛圖: Painting of Bull Keeping) and Confucian/Taoist Painting, SipMaDo (十馬圖: Painting of Ten Horses) - Focused on SimBeop (心法: Mind Control Rule) of the Three Schools: Confucianism, Buddhism and Taoism - (불가(佛家) 목우도(牧牛圖)와 유·도(儒·道) 십마도(十馬圖) 비교 연구 - 유불도(儒佛道) 삼가(三家)의 심법(心法)을 중심으로 -)

  • Park, So-Hyun;Lee, Jung-Han
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.40 no.4 / pp.67-80 / 2022
  • SipWooDo (十牛圖: Painting of Ten Bulls), a Buddhist painting, is a kind of Zen Sect Buddhism painting that appears as a mural in the main halls of many Korean Buddhist temples. MokWooDo has been painted since the Song Dynasty of China. It depicts a cow, a metaphor for the mind, and a shepherd boy who controls the cow. It is also accompanied by many other types of works, such as the poetry called GyeSong and HwaWoonSi. That is, it appeared as a pan-cultural phenomenon beyond ideology and nation, not limited to the Chinese Buddhist ideology of one era. This study therefore selects MokWooDo chants that represent Confucianism, Buddhism, and Taoism and compares the writing purposes, mind discipline methods, and ultimate goals of this chant literature, in order to integrate and comprehend the ideologies of the three schools from an ideological and cultural perspective, which existing studies have not fully dealt with. The results are as follows. First, the SipWooDo of the Buddhist School is generally classified into Bo Myoung's MokWooDo and Kwak Ahm's SimWooDo (尋牛圖: Painting of Searching out a Bull). Zen Sect Buddhism moves toward nirvana through enlightenment. Both the MokWooDo and the SimWooDo of the Buddhist School follow the discipline method of JeomSu (漸修: Discipline by Steps). They were made for SuSimJeungDo (修心證道: Enlightenment of Truth by Mind Discipline), which appears differently in the HwaJe (畫題: Titles on Painting) and GyeSong (偈頌: Poetry Type of Buddhist Chant) of Zen Sect Buddhism and Doctrine Study Based Buddhism, whose viewpoints differ from each other. Second, Bo Myoung's MokWooDo presents the discipline process from MiMok (未牧: Before Tamed) to the JinGongMyoYu (眞空妙有: True Vacancy is not Separately Existing) of SsangMin (雙泯: the Level where Only Core Image Appears with Every Other Thing Faded out), which rests on the method called BangHalGiYong (棒喝機用: a Way of Using Rod to Scold). 
Its ultimate goal, however, is to overcome even the core image of SsangMin. Third, Kwak Ahm's SimWooDo shows the discipline process of JeomSu from SimWoo (尋牛: Searching out a Bull) to IpJeonSuSu (入鄽垂手: Entering into a Place to Exhibit Tools). That is, its ultimate goal is HwaGwangDongJin (和光同塵: Harmonized with Others not Showing your own Wisdom), in which one goes together with ordinary people after rising to the level of 'SangGuBori (上求菩提: Discipline to Go Up to Gain Truth) and HaHwaJungSaeng (下化衆生: Discipline to Go Down to Be with Ordinary People)' through SaGyoIpSeon (捨敎入禪: Entering into Zen Sect Buddhism after Completing a Certain Volume of Doctrine Study), leading all ordinary people to discover their Buddhist Nature. Fourth, Shimizu Shunryu (清水春流)'s painting YuGaSipMaDo (儒家十馬圖: Painting of Ten Horses of Confucian School) borrowed Bo Myoung's MokWooDo; that is, it borrowed the terms and pictures of the Buddhist School. However, it features 'WonBulIpYu (援佛入儒: Enlightenment of Buddhist Nature by Confucianism)', which is based on the process of becoming a greatly wise person through Confucian study and returning to one's original good nature. Accordingly, its goal is to become a GunJa, a greatly wise person completely harmonized with truth, through the study of HamYang (涵養: Mind Discipline by Widening Learning and Intelligence), which controls the outside mind to make the mind peaceful. Its ultimate goal accords with the words 'SangCheonJiJae, MuSeongMuChee (上天之載, 無聲無臭: Heaven Exists in the Sky Upward; It is Difficult to Get the Truth of Nature, which has neither sound nor smell)' from the Zhōngyōng. Fifth, WonMyeongNhoYin (圓明老人)'s painting SangSeungSuJinSamYo (上乘修真三要: Painting of Three Essential Things to Discipline toward Truth) borrowed Bo Myoung's MokWooDo, but it consists of 13 pictures in total to preach the painter's will and preference. 
That is, it features 'WonBulIpDo (援佛入道: Following Buddha to Enter into Truth)', preaching the painter's Taoist doctrine by borrowing the pictures and poetry-type chants of the Buddhist School. Taoism aims at becoming a miraculously powerful Taoist hermit who never dies, by means of Taoist healthcare methods. Taoists therefore practice the mind discipline called BanHwanSimSeong (返還心性: Returning Back to Original Mind Nature), which ultimately leads them toward JaGeumSeon (紫金仙), the original source, by fully abandoning the existing mind and body and changing into a saintly body newly conceived with the vital force of TaeGeuk. This is a unique feature of Taoism, whose ultimate goal is the way of BeopShinCheongJeong (法身淸淨: Pure and Clean Nature of Buddha), which accords with JiDoHoiHong (至道恢弘: Getting to Wide and Big Truth).