• Title/Summary/Keyword: Data sparsity

Search results: 172

A Stepwise Rating Prediction Method for Recommender Systems (추천 시스템을 위한 단계적 평가치 예측 방안)

  • Lee, Soojung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.4 / pp.183-188 / 2021
  • Collaborative filtering-based recommender systems are now an indispensable function of commercial systems in various fields, serving users by suggesting customized products they are likely to prefer. However, predictions of preferable products are likely to be inaccurate when a user's rating data are insufficient. To overcome this drawback, this study proposes a stepwise method for predicting product ratings: if the application conditions of the prediction method at one step are not satisfied, the method of the next step is applied. To evaluate the proposed method, experiments are conducted on a public dataset. The results show that the method significantly improves the prediction and precision performance of collaborative filtering systems employing various conventional similarity measures, and that it outperforms previous methods for addressing rating data sparsity.
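
The abstract does not state what the individual steps or their application conditions are, so the following is only a minimal sketch of the general fallback idea under assumed steps: a similarity-weighted neighbor average first, then the item mean, then the global mean. The function names, the `min_neighbors` threshold, and the choice of Pearson similarity are all illustrative assumptions, not the paper's method.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items both users have rated (0.0 if undefined)."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict_stepwise(ratings, user, item, min_neighbors=2):
    """Return (predicted rating, name of the step that produced it).

    ratings is a dict of the form {user: {item: rating}} and must not be empty.
    """
    own = ratings.get(user, {})

    # Step 1 (assumed): weighted average over positively correlated neighbors.
    # Condition: at least min_neighbors such neighbors have rated the item.
    neighbors = []
    for other, other_ratings in ratings.items():
        if other != user and item in other_ratings:
            sim = pearson(own, other_ratings)
            if sim > 0:
                neighbors.append((sim, other_ratings[item]))
    if len(neighbors) >= min_neighbors:
        total = sum(sim for sim, _ in neighbors)
        return sum(sim * r for sim, r in neighbors) / total, "neighbors"

    # Step 2 (assumed): mean rating of the item, if anyone has rated it.
    item_ratings = [r[item] for r in ratings.values() if item in r]
    if item_ratings:
        return sum(item_ratings) / len(item_ratings), "item_mean"

    # Step 3 (assumed): global mean as the last resort.
    all_ratings = [v for r in ratings.values() for v in r.values()]
    return sum(all_ratings) / len(all_ratings), "global_mean"
```

Returning the step name alongside the prediction makes it easy to count how often each fallback fires, which is one way to see how sparse the rating data are.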

A Signal Detection and Estimation Method Based on Compressive Sensing (압축 센싱 기반의 신호 검출 및 추정 방법)

  • Nguyen, Thu L.N.;Jung, Honggyu;Shin, Yoan
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.6 / pp.1024-1031 / 2015
  • Compressive sensing is a new data acquisition method that enables the reconstruction of sparse or compressible signals from fewer measurements than the Nyquist rate requires, as long as the signal is sparse and the measurement is incoherent. In this paper, we consider simple hypothesis testing in target detection and estimation problems using compressive sensing, where the performance depends on the sparsity level of the signals being detected. We provide theoretical analysis along with some experimental results.

A Study on Measurement Selection Algorithm for Power System State Estimation Under the Consideration of Dummy Buses (DUMMY모선을 고려한 상태추정 측정점선정 알고리즘에 관한 연구)

  • 문영현;이태식
    • The Transactions of the Korean Institute of Electrical Engineers / v.41 no.2 / pp.107-117 / 1992
  • This paper presents an improved algorithm for optimal measurement design, together with a reliability evaluation method, for a large power system. The proposed algorithm is developed to take dummy buses into account and to achieve the highest state-estimator accuracy within a limited investment cost. Measurement meters cannot be installed at a dummy bus in the power system, while real and reactive power measurements are considered in the proposed algorithm. In addition, a P/C model is developed that takes advantage of matrix sparsity. The improved program is successfully tested on the KEPCO system using a line-flow data package calculated with PSS/E.


A Fuzzy-based Inference Model for Web of Trust Using User Behavior Information in Social Network (사회네트워크에서 사용자 행위정보를 활용한 퍼지 기반의 신뢰관계망 추론 모형)

  • Song, Hee-Seok
    • Journal of Information Technology Applications and Management / v.17 no.4 / pp.39-56 / 2010
  • In social networks, we sometimes interact with people about whom we know nothing and face the difficult task of making decisions that involve risk. To reduce this risk, building a Web of trust is receiving considerable attention in social networks. The easiest way to build a Web of trust is to ask users to state explicitly their level of trust in other users. However, a Web of trust stated explicitly by users suffers from sparsity, and it is difficult to persuade users to express their level of trust. We propose a fuzzy-based inference model for the Web of trust that uses user behavior information in social networks. In an experiment on Epinions.com, the proposed model showed improved connectivity in the resulting Web of trust as well as reduced prediction error of trustworthiness compared with the existing computational model.


An ANFIS Model for Inferring a Latent Web of Trust in Social Networks (사회네트워크에서 잠재된 신뢰관계망 추론을 위한 ANFIS 모형)

  • Song, Hui-Seok
    • Proceedings of the Korea Database Society Conference / 2010.06a / pp.277-287 / 2010
  • In social networks, we sometimes interact with people about whom we know nothing and face the difficult task of making decisions that involve risk. To reduce this risk, building a Web of trust is receiving considerable attention in social networks. The easiest way to build a Web of trust is to ask users to state explicitly their level of trust in other users. However, a Web of trust stated explicitly by users suffers from sparsity, and it is difficult to persuade users to express their level of trust. We propose a fuzzy-based inference model for the Web of trust that uses user behavior information in social networks. In an experiment on Epinions.com, the proposed model showed improved connectivity in the resulting Web of trust as well as reduced prediction error of trustworthiness compared with the existing computational model.


Recommendation System using 2-Way Hybrid Collaborative Filtering in E-Business (전자상거래에서 2-Way 혼합 협력적 필터링을 이용한 추천 시스템)

  • 김용집;정경용;이정현
    • Proceedings of the IEEK Conference / 2003.11b / pp.175-178 / 2003
  • Two defects, sparsity and scalability, have been pointed out in existing user-based collaborative filtering, and research has progressed on remedying them with item-based collaborative filtering. Although this work has produced many results, the sparsity problem remains because the approach relies on explicit data. In addition, it has been pointed out that item attributes are not reflected in the recommendations. This paper proposes a recommendation method that applies the naive Bayesian algorithm to hybrid user- and item-based collaborative filtering in order to remedy these defects of existing item-based collaborative filtering. The method generates a similarity table for each user and item, and then uses the naive Bayesian algorithm to improve the accuracy of predictions and recommended items. Its accuracy was evaluated by comparison with the existing item-based collaborative filtering technique.


User-Item Matrix Reduction Technique for Personalized Recommender Systems (개인화 된 추천시스템을 위한 사용자-상품 매트릭스 축약기법)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Journal of Information Technology Applications and Management / v.16 no.1 / pp.97-113 / 2009
  • Collaborative filtering (CF) has been a very successful approach for building recommender systems, but its widespread use has exposed some well-known problems, including sparsity and scalability. To mitigate these problems, we propose two novel models that improve on the typical CF algorithm: ISCF (Item-Selected CF) and USCF (User-Selected CF). These modified models condense the original dataset by reducing the item or user dimension of the user-item matrix, which may improve both the prediction accuracy and the efficiency of the conventional CF algorithm. We use genetic algorithms to optimize the reduction of the user-item matrix. We believe that our approach may relieve the sparsity and scalability problems. To validate the applicability of ISCF and USCF, we applied them to the MovieLens dataset. Experimental results showed that both efficiency and accuracy were enhanced in the proposed models.
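
As a rough illustration of what condensing the user-item matrix along the item dimension could look like, the sketch below applies a binary item mask and measures sparsity before and after. The mask plays the role of a candidate solution that a genetic algorithm might evolve; the search itself is not shown, and the function names and data layout are hypothetical rather than taken from the paper.

```python
def sparsity(matrix, items):
    """Fraction of empty cells in a user-item matrix stored as {user: {item: rating}}."""
    total = len(matrix) * len(items)
    filled = sum(1 for row in matrix.values() for i in row if i in items)
    return 1.0 - filled / total if total else 0.0


def select_items(matrix, items, mask):
    """Keep only the items whose mask bit is 1 (item-selection style reduction)."""
    kept = [i for i, bit in zip(items, mask) if bit]
    reduced = {
        user: {i: r for i, r in row.items() if i in kept}
        for user, row in matrix.items()
    }
    return reduced, kept


# Toy data: three users, four items, six ratings.
items = ["a", "b", "c", "d"]
matrix = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4},
    "u3": {"a": 2, "b": 1, "d": 5},
}
reduced, kept = select_items(matrix, items, mask=[1, 1, 0, 0])
before = sparsity(matrix, items)   # half of the 12 cells are empty
after = sparsity(reduced, kept)    # one of the 6 remaining cells is empty
```

Dropping rarely rated items lowers the sparsity of what remains, but a real selection would also have to weigh the effect on prediction accuracy, which is the part the abstract assigns to the genetic algorithm.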


Location Inference of Twitter Users using Timeline Data (타임라인데이터를 이용한 트위터 사용자의 거주 지역 유추방법)

  • Kang, Ae Tti;Kang, Young Ok
    • Spatial Information Research / v.23 no.2 / pp.69-81 / 2015
  • Inferring the residential areas of SNS users by analyzing SNS big data could provide an alternative for spatial big data research, which is limited by location sparsity and ecological error. In this study, we developed a way to infer the residential areas of Twitter users from their daily life activity patterns, which can be found in the users' timeline data. We identified these patterns from each user's movement pattern and from the regional cognition words that users write in their tweets. The models based on user movement and on text are named the daily movement pattern model and the daily activity field model, respectively. We then selected the variables to be used in each model. We defined the dependent variable as 0 if the area from which a user mainly tweets is the user's home location (HL), and as 1 otherwise. According to our discriminant analysis, the hit ratios of the two models were 67.5% and 57.5%, respectively. We tested both models on the timeline data of stress-related tweets. As a result, we inferred the residential areas of 5,301 of 48,235 users and obtained 9,606 stress-related tweets with a residential area. This is about a 44-fold increase over the number of geo-tagged tweets. We believe that the methodology used in this study can serve not only to secure more location data in studies of SNS big data, but also to link SNS big data with regional statistics in order to analyze regional phenomena.

Comparison of Lasso Type Estimators for High-Dimensional Data

  • Kim, Jaehee
    • Communications for Statistical Applications and Methods / v.21 no.4 / pp.349-361 / 2014
  • This paper compares lasso-type estimators in various high-dimensional data situations with sparse parameters. The lasso, adaptive lasso, fused lasso, and elastic net, as lasso-type estimators, and the ridge estimator are compared via simulation in linear models with correlated and uncorrelated covariates and in binary regression models with correlated covariates and discrete covariates. Each method is shown to have advantages under different penalty conditions, depending on the sparsity pattern of the regression parameters. We applied the lasso-type methods to Arabidopsis microarray gene expression data to find the strongly significant genes that distinguish two groups.
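
To make the contrast between lasso-type and ridge penalties concrete, the sketch below uses the textbook special case of an orthonormal design, where both estimators have closed forms in terms of the least-squares coefficient z. This is a standard illustration and not the simulation design of the paper; the objective scaling (squared error halved, ridge penalty halved) and the function names are assumptions made here.

```python
def lasso_coef(z, lam):
    """Soft-thresholding: minimizer of 0.5 * (z - b)**2 + lam * abs(b)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small coefficients are set exactly to zero


def ridge_coef(z, lam):
    """Proportional shrinkage: minimizer of 0.5 * (z - b)**2 + 0.5 * lam * b**2."""
    return z / (1.0 + lam)


# Least-squares coefficients of a hypothetical sparse model.
ols = [3.0, -3.0, 0.5, -0.2]
lasso = [lasso_coef(z, 1.0) for z in ols]
ridge = [ridge_coef(z, 1.0) for z in ols]
```

With a penalty of 1.0, the lasso zeroes out the two small coefficients while the ridge estimator only shrinks them, which is why lasso-type estimators are the natural candidates when the true parameter vector is sparse.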

Dense Sub-Cube Extraction Algorithm for a Multidimensional Large Sparse Data Cube (다차원 대용량 저밀도 데이타 큐브에 대한 고밀도 서브 큐브 추출 알고리즘)

  • Lee Seok-Lyong;Chun Seok-Ju;Chung Chin-Wan
    • Journal of KIISE:Databases / v.33 no.4 / pp.353-362 / 2006
  • A data warehouse is a data repository that enables users to store large volumes of data and to analyze them effectively. In this research, we investigate an algorithm for building a multidimensional data cube, which is a powerful analysis tool for the contents of data warehouses and databases. A multidimensional data cube incurs an inevitable retrieval overhead because of its sparsity. In this paper, we propose a dense sub-cube extraction algorithm that identifies dense regions in a large sparse data cube and constructs sub-cubes from the dense regions found. It reduces the retrieval overhead remarkably by retrieving those small dense sub-cubes instead of scanning a large sparse cube. The algorithm uses bitmap- and histogram-based techniques to extract dense sub-cubes from the data cube, and its effectiveness is demonstrated experimentally.
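
The abstract names histogram-based techniques without giving details, so the following is only a simplified sketch of one way histograms could point to a dense region: count non-empty cells per coordinate value in each dimension, keep the values that reach a threshold, and measure the density of the region they span. The function name, the `min_count` threshold, and the representation of the cube as a set of non-empty cell coordinates are assumptions, and the bitmap part of the technique is not shown.

```python
from collections import Counter
from itertools import product


def dense_region(cells, min_count):
    """Find a candidate dense region in a sparse cube.

    cells is a set of coordinate tuples, one per non-empty cell.
    Returns (kept coordinate values per dimension, density of the region they span).
    """
    if not cells:
        return [], 0.0
    ndim = len(next(iter(cells)))
    # One histogram per dimension: number of non-empty cells per coordinate value.
    hists = [Counter(cell[d] for cell in cells) for d in range(ndim)]
    # Keep only the coordinate values that are populated often enough.
    kept = [sorted(v for v, n in hist.items() if n >= min_count) for hist in hists]
    # The candidate region is the cross product of the kept values.
    region = list(product(*kept))
    filled = sum(1 for cell in region if cell in cells)
    density = filled / len(region) if region else 0.0
    return kept, density


# A 2-D toy cube: a dense 2x2 block plus two isolated cells.
cells = {(0, 0), (0, 1), (1, 0), (1, 1), (5, 7), (9, 2)}
kept, density = dense_region(cells, min_count=2)
```

In this toy case the isolated cells fall below the threshold in both dimensions, so the candidate region is the fully populated 2x2 block, and a query confined to it would not need to scan the surrounding empty space.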