• Title/Summary/Keyword: Recommender system


Performance Improvement of Collaborative Filtering System Using Associative User's Clustering Analysis for the Recalculation of Preference and Representative Attribute-Neighborhood (선호도 재계산을 위한 연관 사용자 군집 분석과 Representative Attribute -Neighborhood를 이용한 협력적 필터링 시스템의 성능향상)

  • Jung, Kyung-Yong;Kim, Jin-Su;Kim, Tae-Yong;Lee, Jung-Hyun
    • The KIPS Transactions: Part B, v.10B no.3, pp.287-296, 2003
  • Much research has focused on collaborative filtering techniques in recommender systems. However, these studies have suffered from the first-rater problem and the sparsity problem, and the main purpose of this paper is to solve them. We propose a method for predicting user preference that uses a Bayesian estimated value together with associative user clustering for the recalculation of preference. To compensate for a shortcoming of this method, namely that it does not consider item attributes, we also use the Representative Attribute-Neighborhood method, which extracts the representative attributes that most affect preference and uses them for prediction when finding similar neighbors. We improve efficiency by applying associative user clustering analysis to the collaborative filtering algorithm, so that the preference for a specific item is calculated within the cluster's item vector. To address the sparsity and first-rater problems, associative users are clustered by genre using the Association Rule Hypergraph Partitioning algorithm, and new users are classified into one of these genres by a Naive Bayes classifier. In addition, to obtain the similarity between a new user and the users belonging to the classified genre, different estimated values are assigned, through Naive Bayes learning, to the items the user has rated. Applying preferences that carry these estimated values to the Pearson correlation coefficient yields higher accuracy because errors caused by missing values are reduced. We evaluate our method on a large collaborative filtering database of user ratings, and it significantly outperforms previously proposed methods.
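The role of the Pearson correlation coefficient in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the `estimate` dictionary are hypothetical, and the per-item estimate simply stands in for the Naive Bayes estimated value, which the abstract does not specify in detail.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation over the items for which both users have a value."""
    common = [item for item in u if item in v]
    if len(common) < 2:
        return 0.0  # too few shared items to measure correlation
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    den = sqrt(sum((u[i] - mean_u) ** 2 for i in common)) * \
          sqrt(sum((v[i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0

def fill_missing(ratings, items, estimate):
    """Replace missing ratings with an estimated value per item.

    `estimate` is a hypothetical placeholder for the estimated values
    described in the abstract; it must contain every item in `items`.
    """
    return {i: ratings.get(i, estimate[i]) for i in items}
```

With sparse data, two users may share too few rated items for `pearson` to be meaningful; filling the gaps with estimates first gives the coefficient more items to work with, which is the effect the abstract attributes to the estimated values.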

Relationship Analysis between Malware and Sybil for Android Apps Recommender System (안드로이드 앱 추천 시스템을 위한 Sybil공격과 Malware의 관계 분석)

  • Oh, Hayoung
    • Journal of the Korea Institute of Information Security & Cryptology, v.26 no.5, pp.1235-1241, 2016
  • Personalized app recommendation systems have recently become popular because the number of apps available on smartphones is increasing exponentially. However, users of the Google Play site who encounter malware have suffered severe damage, such as privacy exposure and extortion, in addition to the simpler harm of reduced satisfaction. In addition, Sybil attacks that manipulate the score (rating) of each app may also be present, owing to the development of social networks. Until now, Sybil detection studies and malicious app studies have been conducted independently. However, when considering intelligent attack types in real time, it is important to determine whether an intelligent attack involving Sybils and malware simultaneously exists. Therefore, in this paper we experimentally evaluate the relationship between malware and Sybils based on a real crawled dataset from Google Play. Extensive evaluations show that the correlation between malware and Sybils is low, because malware providers try to hide themselves from anti-virus (AV) software.

The Construction of Multiform User Profiles Based on Transaction for Effective Recommendation and Segmentation (효과적인 추천과 세분화를 위한 트랜잭션 기반 여러 형태 사용자 프로파일의 구축)

  • Koh, Jae-Jin;An, Hyoung-Keun
    • The KIPS Transactions: Part D, v.13D no.5 s.108, pp.661-670, 2006
  • With the development of e-commerce and the proliferation of easily accessible information, information filtering systems such as recommender and SDI systems have become popular as a way to prune large information spaces so that users are directed toward the items that best meet their needs and preferences. Many information filtering methods have been proposed to support such systems. Meanwhile, XML is emerging as a new standard for information, and filtering systems now need new approaches for dealing with XML documents. In this paper, we therefore propose a method for creating multiform user profiles that exploits XML's ability to represent structure. The system consists of two parts: an administrator profile definition part, in which an administrator defines profiles for analyzing users' purchase patterns before a transaction such as a purchase takes place, and a user profile creation module, which applies the defined profiles. Administrator profiles are built from DTD information and point to specific parts of a document conforming to the DTD. The proposed system builds user profiles more accurately so that it can adapt to users' buying behavior and, based on those profiles, provide useful product information without inefficient searching.

Social Tagging-based Recommendation Platform for Patented Technology Transfer (특허의 기술이전 활성화를 위한 소셜 태깅기반 지적재산권 추천플랫폼)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems, v.21 no.3, pp.53-77, 2015
  • Korea has witnessed an increasing number of domestic patent applications, but the majority of them are not utilized to their full potential and end up becoming obsolete. According to the 2012 National Congress' Inspection of Administration, about 73% of patents held by universities and publicly funded research institutions failed to create social value and remain latent. One of the main causes is that patent creators, such as individual researchers, universities, and research institutions, lack the ability to commercialize their patents into viable businesses with the enterprises that need them. On the enterprise side, it is hard to find appropriate patents by keyword search every time. This study proposes a patent recommendation system that can identify and recommend intellectual property rights matching users' fields of interest among a rapidly accumulating number of patent assets in an easier and more efficient manner. The proposed system extracts core contents and technology sectors from the existing pool of patents and combines them with secondary social knowledge, derived from tag information created by users, in order to find the best patents to recommend. That is, in the early stage, when no tag information has accumulated, recommendation relies on content characteristics, which are identified by analyzing the keywords contained in attributes such as 'Title of Invention' and 'Claim' among the various patent attributes. To do this, the system extracts only nouns from patents and assigns each noun a weight according to its importance across all patents by performing TF-IDF analysis. It then finds patents whose weights are similar to those of the patents a user prefers. In this paper, this similarity is called 'Domain Similarity'.
Next, the suggested system extracts the characteristics of the technology sector from patent documents by analyzing the International Patent Classification (IPC) code. Every patent has one or more IPC codes, and each user can attach one or more tags to the patents they like. Thus, each user has a set of IPC codes included in the patents they have tagged. The system manages this IPC set to analyze each user's technology preference and find well-fitting patents. To do this, it calculates a 'Technology_Similarity' between the user's set of IPC codes and the IPC codes contained in all other patents. After that, once tag information from multiple users has accumulated, the system expands its recommendations by considering other users' social tag information relating to the patents tagged by the user concerned. The similarity between the tag information of a user's preferred patents and that of other patents is called 'Social Similarity' in this paper. Lastly, a 'Total Similarity' is calculated by adding these three similarities, and the patents with the highest 'Total Similarity' are recommended to each user. The system was applied to a total of 1,638 Korean patents obtained from the Korea Industrial Property Rights Information Service (KIPRIS), run by the Korea Intellectual Property Office. However, since this original dataset does not include tag information, we created virtual tag information and used it to construct a semi-virtual dataset. The proposed recommendation algorithm was implemented in Java, and a prototype graphical user interface was also designed for this study. Because the proposed system has no dependent variables and uses virtual data, it is impossible to verify the recommendation system with a statistical method.
Therefore, the study uses a scenario test method to verify the operational feasibility and recommendation effectiveness of the system. The results of this study are expected to improve the chances of matching promising patents with the most suitable businesses. It is assumed that users' experiential knowledge can be accumulated, managed, and utilized in the as-is patent system, which currently manages only standardized patent information.
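The TF-IDF weighting and the summed 'Total Similarity' described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's Java implementation: the abstract does not say how weight vectors are compared, so cosine similarity is an assumption here, and all function names are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute a TF-IDF weight dictionary for each document (a list of nouns)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse weight dictionaries (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def total_similarity(domain, technology, social):
    """The abstract states that the three similarities are simply added."""
    return domain + technology + social
```

For example, with noun lists `[["battery", "electrode", "cell"], ["battery", "electrode", "anode"], ["antenna", "signal", "radio"]]`, the first two patents share weighted terms and score higher with each other than either does with the third, which shares none.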

Data-Driven Approach to Identify Research Topics for Science and Technology Diplomacy (과학외교를 위한 데이터기반의 연구주제선정 방법)

  • Yeo, Woon-Dong;Kim, Seonho;Lee, BangRae;Noh, Kyung-Ran
    • The Journal of the Korea Contents Association, v.20 no.11, pp.216-227, 2020
  • In science and technology diplomacy, major countries actively utilize their capabilities in science and technology for public diplomacy, especially for promoting diplomatic relations with politically sensitive regions and countries. Recently, with an increase in the influence of science and technology on national development, interest in science and technology diplomacy has increased. So far, science and technology diplomacy has relied on experts to find research topics that are of common interest to both countries. However, this method has various problems such as the bias arising from the subjective judgment of experts, the attribution of the halo effect to famous researchers, and the use of different criteria for different experts. This paper presents an objective data-based approach to identify and recommend research topics to support science and technology diplomacy without relying on the expert-based approach. The proposed approach is based on big data analysis that uses deep-learning techniques and bibliometric methods. The Scopus database is used to find proper topics for collaborative research between two countries. This approach has been used to support science and technology diplomacy between Korea and Hungary and has raised expectations of policy makers. This paper finally discusses aspects that should be focused on to improve the system in the future.

Developing a deep learning-based recommendation model using online reviews for predicting consumer preferences: Evidence from the restaurant industry (딥러닝 기반 온라인 리뷰를 활용한 추천 모델 개발: 레스토랑 산업을 중심으로)

  • Dongeon Kim;Dongsoo Jang;Jinzhe Yan;Jiaen Li
    • Journal of Intelligence and Information Systems, v.29 no.4, pp.31-49, 2023
  • With the growth of the food-catering industry, consumer preferences and the number of dine-in restaurants are gradually increasing. Thus, personalized recommendation services are required to select a restaurant suitable for consumer preferences. Previous studies have used questionnaires and star-rating approaches, which do not effectively depict consumer preferences. Online reviews are the most essential sources of information in this regard. However, previous studies have aggregated online reviews into long documents and applied traditional machine-learning methods to extract semantic representations; such approaches fail to consider the surrounding words or context. Therefore, this study proposes a novel review textual-based restaurant recommendation model (RT-RRM) that uses deep learning to effectively extract consumer preferences from online reviews. The proposed model concatenates consumer-restaurant interactions with the extracted high-level semantic representations and predicts consumer preferences accurately and effectively. Experiments on real-world datasets show that the proposed model exhibits excellent recommendation performance compared with several baseline models.

Building Hierarchical Knowledge Base of Research Interests and Learning Topics for Social Computing Support (소셜 컴퓨팅을 위한 연구·학습 주제의 계층적 지식기반 구축)

  • Kim, Seonho;Kim, Kang-Hoe;Yeo, Woondong
    • The Journal of the Korea Contents Association, v.12 no.12, pp.489-498, 2012
  • This paper consists of two parts. In the first part, we describe our work to build a hierarchical knowledge base of digital library patrons' research interests and learning topics in various scholarly areas by analyzing well-classified Electronic Theses and Dissertations (ETDs) from the NDLTD Union Catalog. Journal articles from ACM Transactions and conference web sites in computing areas are also added to the analysis to cover computing fields in more detail. This hierarchical knowledge base would be a useful tool for many social computing and information service applications, such as personalization, recommender systems, text mining, technology opportunity mining, and information visualization. In the second part, we compare four grouping algorithms to select the best one for our data mining research by testing each one with the hierarchical knowledge base described in the first part. Through these two studies, we intend to show that traditional verification methods for social community mining research, which are based on interviews and questionnaires and are expensive, slow, and privacy-threatening, can be replaced with systematic, consistent, fast, and privacy-protecting methods that use our suggested hierarchical knowledge base.

A Study on Utilization of Vision Transformer for CTR Prediction (CTR 예측을 위한 비전 트랜스포머 활용에 관한 연구)

  • Kim, Tae-Suk;Kim, Seokhun;Im, Kwang Hyuk
    • Knowledge Management Research, v.22 no.4, pp.27-40, 2021
  • Click-Through Rate (CTR) prediction is a key function in recommendation systems: it determines the ranking of candidate items so that high-ranking items are recommended, which reduces customer information overload and maximizes profit through sales promotion. The fields of natural language processing and image classification are achieving remarkable growth through the use of deep neural networks. Recently, a transformer model based on an attention mechanism, which differs from the mainstream models in these fields, has been proposed and has achieved state-of-the-art results. In this study, we present a method for improving the performance of a transformer model for CTR prediction. To analyze how the discrete and categorical characteristics of CTR data, which differ from natural language and image data, affect performance, we perform experiments on embedding regularization and transformer normalization. The experimental results confirm that the prediction performance of the transformer improved significantly when L2 regularization was applied in the embedding process for CTR data input and when batch normalization was applied to the transformer model instead of layer normalization, its default normalization method.
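The difference between the two normalization schemes compared in this abstract comes down to the axis over which statistics are computed. The following pure-Python sketch illustrates that difference only; it omits the learnable scale and shift parameters and the running statistics that real implementations use, and the function names are hypothetical rather than taken from the paper.

```python
def _standardize(values, eps=1e-5):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return [(x - mean) / (var + eps) ** 0.5 for x in values]

def layer_norm(batch):
    """Normalize each sample (row) across its own features."""
    return [_standardize(row) for row in batch]

def batch_norm(batch):
    """Normalize each feature (column) across the samples in the batch."""
    columns = [_standardize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]
```

Given `batch = [[1.0, 2.0, 3.0], [3.0, 6.0, 9.0]]`, `layer_norm` makes every row sum to roughly zero, whereas `batch_norm` makes every column sum to roughly zero, so the same input is rescaled differently depending on the scheme.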

A Generalized Adaptive Deep Latent Factor Recommendation Model (일반화 적응 심층 잠재요인 추천모형)

  • Kim, Jeongha;Lee, Jipyeong;Jang, Seonghyun;Cho, Yoonho
    • Journal of Intelligence and Information Systems, v.29 no.1, pp.249-263, 2023
  • Collaborative Filtering, a representative recommendation system methodology, consists of two approaches: neighbor methods and latent factor models. Among these, the latent factor model using matrix factorization decomposes the user-item interaction matrix into two lower-dimensional rectangular matrices, predicting the item's rating through the product of these matrices. Because the factor vectors inferred from rating patterns capture user and item characteristics, this method is superior in scalability, accuracy, and flexibility compared to neighbor-based methods. However, it has a fundamental drawback: it does not reflect the diversity of preferences of different individuals for items with no ratings. This limitation leads to repetitive and inaccurate recommendations. The Adaptive Deep Latent Factor Model (ADLFM) was developed to address this issue. This model adaptively learns the preferences for each item by using the item description, which provides a detailed summary and explanation of the item. ADLFM takes in the item description as input, calculates latent vectors of the user and item, and presents a method that can reflect personal diversity using an attention score. However, because it requires a dataset that includes item descriptions, the domains to which ADLFM can be applied are limited, which restricts its generalization. This study proposes a Generalized Adaptive Deep Latent Factor Recommendation Model, G-ADLFRM, to address the limitations of ADLFM. First, we use the item ID, commonly used in recommendation systems, as input instead of the item description. Additionally, we apply improved deep learning model structures such as Self-Attention, Multi-head Attention, and Multi-Conv1D. We conducted experiments on various datasets with input and model structure changes. The results showed that when only the input was changed, MAE increased slightly compared to ADLFM due to the accompanying information loss, resulting in decreased recommendation performance.
However, the average learning speed per epoch significantly improved as the amount of information to be processed decreased. When both the input and the model structure were changed, the best-performing Multi-Conv1D structure showed similar performance to ADLFM, sufficiently counteracting the information loss caused by the input change. We conclude that G-ADLFRM is a new, lightweight, and generalizable model that maintains the performance of the existing ADLFM while enabling fast learning and inference.
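The basic latent factor prediction mentioned in this abstract, and the MAE metric used to compare models, can be written as a short Python sketch. It covers only plain matrix factorization, not ADLFM or G-ADLFRM, and the names and sample factor values in the usage note are hypothetical.

```python
def predict_rating(user_vector, item_vector):
    """Predicted rating is the dot product of the two latent factor vectors."""
    return sum(p * q for p, q in zip(user_vector, item_vector))

def mean_absolute_error(observed, user_factors, item_factors):
    """MAE over observed ratings, keyed by (user, item) pairs."""
    errors = [abs(rating - predict_rating(user_factors[u], item_factors[i]))
              for (u, i), rating in observed.items()]
    return sum(errors) / len(errors)
```

As a usage example, with user factors `{"u1": [1.0, 0.5], "u2": [0.2, 1.0]}` and item factors `{"i1": [4.0, 0.0], "i2": [1.0, 3.0]}`, the prediction for `u1` on `i2` is 1.0 × 1.0 + 0.5 × 3.0 = 2.5. Note that each item has a single fixed vector shared by all users, which is the limitation the abstract says ADLFM was designed to address.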

Extension Method of Association Rules Using Social Network Analysis (사회연결망 분석을 활용한 연관규칙 확장기법)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems, v.23 no.4, pp.111-126, 2017
  • Recommender systems based on association rule mining contribute significantly to sellers' sales by reducing the time consumers spend searching for the products they want. Recommendations based on the frequency of transactions such as orders can effectively identify the products that are statistically marketable among multiple products. A product with a high possibility of sales, however, can be omitted from the recommendation if it records an insufficient number of transactions at the beginning of the sale. Products missing from the associated recommendations may lose the chance of exposure to consumers, which leads to a decline in the number of transactions. In turn, diminished transactions may create a vicious circle of lost opportunity to be recommended. Thus, initial sales are likely to remain stagnant for a certain period of time. Products that are susceptible to fashion or seasonality, such as clothing, may be greatly affected. This study aimed to expand association rules so that the list of recommendations includes products whose initial transaction frequency is low despite their high sales potential. The particular purpose is to predict the strength of the direct connection of two unconnected items through the properties of the paths located between them. An association between two items revealed in transactions can be interpreted as the interaction between them, which can be expressed as a link in a social network whose nodes are items. The first step calculates the centralities of the nodes in the middle of the paths that indirectly connect two nodes that have no direct connection. The next step identifies the number of such paths and the shortest among them. These extracted values are used as independent variables in the regression analysis to predict future connection strength between the nodes.
The strength of the connection between the two nodes of the model, which is defined by the number of nodes between the two nodes, is measured after a certain period of time. The regression analysis results confirm that the number of paths between the two products, the distance of the shortest path, and the number of neighboring items connected to the products are significantly related to their potential strength. This study used actual order transaction data collected over the three months from February to April 2016 from an online commerce company. To reduce the complexity of the analysis as the scale of the network grows, the analysis was performed only on miscellaneous goods. Two consecutively purchased items were chosen from each customer's transactions to obtain an antecedent-consequent pair, which provides a link for constituting the social network. The direction of the link was determined by the order in which the goods were purchased. The social network of associated items was built from the data excluding the last ten days of the collection period, and the independent variables were extracted from it. The model predicts the number of links to be connected in the next ten days from the explanatory variables. Of the 5,711 previously unconnected links, 611 were newly connected during the last ten days. In experiments, the proposed model demonstrated excellent predictions: of the 571 links that the model predicted, 269 were confirmed to have been connected. This is 4.4 times the average of 61 that would be found without any prediction model. This study is expected to be useful for industries in which new products launch quickly and have short life cycles, since their exposure time is critical. It can also be used to detect diseases that are rarely found in the early stages of medical treatment because of their low incidence.
Since the complexity of social network analysis is sensitive to the number of nodes and links that make up the network, this study was conducted on a particular category, miscellaneous goods. Future research should consider that this condition may limit the opportunity to detect unexpected associations between products belonging to different categories.
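Two pieces of this abstract lend themselves to a small Python sketch: the shortest-path distance between two unconnected items, which is one of the explanatory variables, and the no-model baseline against which the 269 confirmed links are compared. The function names are hypothetical, the graph is assumed to be a dictionary mapping each item to the items purchased directly after it, and breadth-first search is used as the standard way to find shortest paths in an unweighted graph; the abstract does not state which algorithm the author used.

```python
from collections import deque

def shortest_path_length(graph, source, target):
    """Breadth-first search over a directed item graph; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def random_baseline_hits(candidates, positives, selected):
    """Expected number of true links when `selected` candidates are picked at random."""
    return selected * positives / candidates

# Figures reported in the abstract: 5,711 candidate links, 611 newly connected,
# 571 predicted by the model, 269 of which were confirmed.
baseline = random_baseline_hits(5711, 611, 571)  # about 61 links
improvement = 269 / baseline                     # about 4.4 times the baseline
```

The arithmetic reproduces the reported numbers: picking 571 of the 5,711 candidate links at random would be expected to hit about 61 of the 611 real links, so 269 confirmed links is roughly 4.4 times that baseline.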