• Title/Summary/Keyword: sparsity


Stagewise Weak Orthogonal Matching Pursuit Algorithm Based on Adaptive Weak Threshold and Arithmetic Mean

  • Zhao, Liquan;Ma, Ke
    • Journal of Information Processing Systems / v.16 no.6 / pp.1343-1358 / 2020
  • In the stagewise arithmetic orthogonal matching pursuit algorithm, the weak threshold used in sparsity estimation is determined by the maximum number of iterations. Different maximum iteration counts correspond to different thresholds and affect the performance of the algorithm. To solve this problem, we propose an improved variable weak threshold based on the stagewise arithmetic orthogonal matching pursuit algorithm. Our proposed algorithm uses the residual error value to control the weak threshold. When the residual value decreases, the threshold value continuously increases, so that the atoms contained in the atomic set are closer to the real sparsity value, making it possible to improve the reconstruction accuracy. In addition, we improved the generalized Jaccard coefficient in order to replace the inner product method that is used in the stagewise arithmetic orthogonal matching pursuit algorithm. Our proposed algorithm uses the covariance to replace the joint expectation for two variables based on the generalized Jaccard coefficient. The improved generalized Jaccard coefficient can be used to generate a more accurate calculation of the correlation between the measurement matrices. In addition, the residual is more accurate, which can reduce the possibility of selecting the wrong atoms. We demonstrate through simulations that the proposed algorithm produces better reconstruction results for both one-dimensional signals and two-dimensional image signals.
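
The weak-threshold atom selection mentioned in this abstract can be illustrated with a minimal sketch. It assumes the conventional stagewise weak rule (keep every atom whose correlation with the residual is at least `alpha` times the largest correlation) and the plain inner product as the correlation measure. The function name `weak_select` and the fixed `alpha` values are hypothetical; the paper's residual-controlled threshold and its improved generalized Jaccard coefficient are not reproduced here.

```python
def weak_select(atoms, residual, alpha=0.7):
    """Return indices of atoms passing the weak threshold.

    atoms    : list of columns (each a list of floats) of the measurement matrix
    residual : current residual vector
    alpha    : weak threshold in (0, 1]; fixed here as an assumption
    """
    # Absolute inner product of each atom with the residual.
    corr = [abs(sum(a * r for a, r in zip(col, residual))) for col in atoms]
    peak = max(corr)
    if peak == 0.0:
        return []  # residual is orthogonal to every atom
    # Keep every atom whose correlation is within a factor alpha of the peak.
    return [i for i, c in enumerate(corr) if c >= alpha * peak]


atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
residual = [2.0, 1.5, 0.1]
selected_loose = weak_select(atoms, residual, alpha=0.5)
selected_strict = weak_select(atoms, residual, alpha=0.9)
```

A larger `alpha` admits fewer atoms per stage, which is consistent with the abstract's description of the threshold rising as the residual shrinks.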

A Stepwise Rating Prediction Method for Recommender Systems (추천 시스템을 위한 단계적 평가치 예측 방안)

  • Lee, Soojung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.4 / pp.183-188 / 2021
  • Collaborative filtering based recommender systems are currently an indispensable function of commercial systems in various fields, serving users by recommending customized products that they are likely to prefer. However, the prediction of preferable products is likely to be inaccurate when the user's rating data are insufficient. To overcome this drawback, this study suggests a stepwise method for predicting product ratings. If the application conditions of the prediction method at a given step are not satisfied, the method of the next step is applied. To evaluate the performance of the proposed method, experiments are conducted using a public dataset. The results show that our method significantly improves the prediction and precision performance of collaborative filtering systems employing various conventional similarity measures and outperforms previous methods for addressing rating data sparsity.

MP-Lasso chart: a multi-level polar chart for visualizing group Lasso analysis of genomic data

  • Min Song;Minhyuk Lee;Taesung Park;Mira Park
    • Genomics & Informatics / v.20 no.4 / pp.48.1-48.7 / 2022
  • Penalized regression has been widely used in genome-wide association studies for joint analyses to find genetic associations. Among penalized regression models, the least absolute shrinkage and selection operator (Lasso) method effectively removes some coefficients from the model by shrinking them to zero. To handle group structures, such as genes and pathways, several modified Lasso penalties have been proposed, including group Lasso and sparse group Lasso. Group Lasso ensures sparsity at the level of pre-defined groups, eliminating unimportant groups. Sparse group Lasso performs group selection as in group Lasso, but also performs individual selection as in Lasso. While these sparse methods are useful in high-dimensional genetic studies, interpreting the results with many groups and coefficients is not straightforward. Lasso's results are often expressed as trace plots of regression coefficients. However, few studies have explored the systematic visualization of group information. In this study, we propose a multi-level polar Lasso (MP-Lasso) chart, which can effectively represent the results from group Lasso and sparse group Lasso analyses. An R package to draw MP-Lasso charts was developed. Through a real-world genetic data application, we demonstrated that our MP-Lasso chart package effectively visualizes the results of Lasso, group Lasso, and sparse group Lasso.
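
How these penalties produce sparsity can be seen in their proximal (shrinkage) operators. The sketch below is illustrative only: the function names are hypothetical, the group penalty is unweighted (many formulations scale it by the square root of the group size), and it is not taken from the MP-Lasso R package.

```python
import math


def soft_threshold(beta, lam):
    """Lasso shrinkage: move each coefficient toward zero by lam, clipping at zero."""
    out = []
    for b in beta:
        if b > lam:
            out.append(b - lam)
        elif b < -lam:
            out.append(b + lam)
        else:
            out.append(0.0)  # small coefficients are removed entirely
    return out


def group_soft_threshold(beta, groups, lam):
    """Group Lasso shrinkage: scale each group by max(1 - lam / ||group||, 0)."""
    out = list(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        scale = max(1.0 - lam / norm, 0.0) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * beta[i]  # a whole group is zeroed when norm <= lam
    return out


beta = [3.0, 4.0, 0.3, -0.4]
groups = [[0, 1], [2, 3]]  # two pre-defined groups, e.g. two genes
lasso_result = soft_threshold(beta, 1.0)
group_result = group_soft_threshold(beta, groups, 1.0)
```

In this example the second group has norm 0.5, below the penalty of 1.0, so group Lasso drops it as a unit, while the first group is shrunk but kept.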

APMDI-CF: An Effective and Efficient Recommendation Algorithm for Online Users

  • Ya-Jun Leng;Zhi Wang;Dan Peng;Huan Zhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.11 / pp.3050-3063 / 2023
  • Recommendation systems provide personalized products or services to online users by mining their past preferences. Collaborative filtering is a popular recommendation technique because it is easy to implement. However, with the rapid growth in the number of users of recommendation systems, collaborative filtering suffers from serious scalability and sparsity problems. To address these problems, a novel collaborative filtering recommendation algorithm is proposed. The proposed algorithm partitions the users using affinity propagation clustering and searches for the k nearest neighbors within the partition to which the active user belongs, which reduces the search range and improves real-time performance. When predicting the ratings of the active user's unrated items, the mean deviation method is used to impute values for neighbors' missing ratings, so that sparsity is decreased and recommendation quality can be ensured. Experiments on two different datasets show that the proposed algorithm is excellent in terms of both real-time performance and recommendation quality.
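
The sparsity problem referred to here is commonly quantified as the fraction of missing entries in the user-item rating matrix. The following is a minimal sketch under that assumption; the function name `sparsity` and the toy matrix are hypothetical, and the paper's mean deviation imputation is not reproduced.

```python
def sparsity(ratings):
    """Fraction of missing (None) entries in a user-item rating matrix."""
    total = sum(len(row) for row in ratings)
    observed = sum(1 for row in ratings for r in row if r is not None)
    return 1.0 - observed / total


# Rows are users, columns are items; None marks an unrated item.
ratings = [
    [5, None, 3, None],
    [None, None, 4, None],
    [1, 2, None, None],
]
ratio = sparsity(ratings)  # 7 of 12 entries are missing
```

Imputing values for missing ratings, as the abstract describes, lowers this ratio for the neighbors used in prediction.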

Paper Recommendation Using SPECTER with Low-Rank and Sparse Matrix Factorization

  • Panpan Guo;Gang Zhou;Jicang Lu;Zhufeng Li;Taojie Zhu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.5 / pp.1163-1185 / 2024
  • With the sharp increase in the volume of literature data, researchers must spend considerable time and energy locating desired papers. Paper recommendation is a necessary means of solving this problem. Unfortunately, the large amount of data combined with sparsity makes personalized paper recommendation challenging. Traditional matrix decomposition models have cold-start issues. Most overlook the importance of information and fail to consider the introduction of noise when using side information, resulting in unsatisfactory recommendations. This study proposes a paper recommendation method (PR-SLSMF) using document-level representation learning with citation-informed transformers (SPECTER) and low-rank and sparse matrix factorization; it uses SPECTER to learn paper content representations. The model calculates the similarity between papers and constructs a weighted heterogeneous information network (HIN) that includes citation and content similarity information. This method combines the LSMF method with the HIN, effectively alleviating data sparsity and cold-start issues and avoiding topic drift. We validated the effectiveness of this method, and the necessity of adding side information, on two real datasets.

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to easily find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are increasingly important technologies that are essential to business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, limitations do exist. Sparsity occurs when user-item preference information is insufficient, and it is the main limitation of collaborative filtering. The evaluation values in the user-item matrix may be distorted by the popularity of the product, or there may be new users who have not yet provided any evaluations. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address this problem. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world score data are mostly biased toward high scores, resulting in severe imbalances. One cause of this imbalanced distribution is purchasing bias, in which only users with high product ratings purchase products, so those with low ratings are less likely to purchase products and thus do not leave negative product reviews. Due to these characteristics, unlike most users' actual preferences, reviews by users who purchase products are more likely to be positive.
Therefore, the actual rating data are over-learned in the many classes with high incidence due to their biased characteristics, distorting the market. Applying collaborative filtering to these imbalanced data leads to poor recommendation performance due to excessive learning of biased classes. Traditional oversampling techniques for addressing this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning and reduces recommendation performance. In addition, pre-processing methods for most existing data imbalance problems are designed and used for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class situations, such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted on converting multi-class problems into binary class problems. However, simplification of multi-class problems can cause potential classification errors when combined with the results of classifiers learned from other sub-problems, resulting in the loss of important information about relationships beyond the selected items. Therefore, it is necessary to develop more effective methods to address multi-class imbalance problems. We propose a collaborative filtering model that uses a CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies distributions for minority classes and generates data reflecting their characteristics. Collaborative filtering then maximizes the performance of the recommendation system via hyperparameter tuning. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering implementations while mitigating data imbalances arising from real data. Our model has superior recommendation performance over existing oversampling techniques and existing real-world data with data sparsity.
SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparative models, and our model achieved the highest prediction accuracy on the RMSE and MAE evaluation measures. Through this study, oversampling based on deep learning will be able to further refine the performance of recommendation systems using actual data and be used to build business recommendation systems.
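
For reference, the two evaluation measures named here, RMSE and MAE, can be computed as in the sketch below. The function names and the sample ratings are illustrative assumptions and do not come from the study's data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between true and predicted ratings."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


actual = [5, 3, 4, 1]      # hypothetical held-out ratings
predicted = [4, 3, 5, 3]   # hypothetical model predictions
rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
```

Lower values indicate more accurate predictions on both measures; RMSE penalizes large individual errors more heavily than MAE.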

Recommender System using Implicit Trust-enhanced Collaborative Filtering (내재적 신뢰가 강화된 협업필터링을 이용한 추천시스템)

  • Kim, Kyoung-Jae;Kim, Youngtae
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.1-10 / 2013
  • Personalization aims to provide customized content to each user by using the user's personal preferences. In this sense, the core of personalization is recommendation technology, which can recommend suitable content or products to each user according to his/her preferences. Prior studies have proposed novel recommendation technologies because they recognized the importance of recommender systems. Among several recommendation technologies, collaborative filtering (CF) has been actively studied and applied in real-world applications. CF, however, often suffers from sparsity or scalability problems. Prior research also recognized the importance of these two problems and therefore proposed many solutions. Many prior studies, however, required additional time and cost because they addressed these limitations by using information from sources other than the existing user-item matrix. This study proposes a novel implicit rating approach for collaborative filtering in order to mitigate the sparsity problem as well as to enhance the performance of recommender systems. We propose methods for reducing the sparsity problem by supplementing the user-item matrix based on the implicit rating approach, which measures the trust level among users via the existing user-item matrix. This study provides preliminary experimental results for testing the usefulness of the proposed model.

Location Inference of Twitter Users using Timeline Data (타임라인데이터를 이용한 트위터 사용자의 거주 지역 유추방법)

  • Kang, Ae Tti;Kang, Young Ok
    • Spatial Information Research / v.23 no.2 / pp.69-81 / 2015
  • If the residential areas of SNS users can be inferred by analyzing SNS big data, this can serve as an alternative that addresses the location sparsity and ecological error found in spatial big data research. In this study, we developed a method that uses daily life activity patterns, found in the timeline data of Twitter users, to infer the users' residential areas. We identified these patterns from users' movement patterns and from the regional cognition words that users write in tweets. The models based on user movement and on text are named the daily movement pattern model and the daily activity field model, respectively. We then selected the variables to be used in each model. We defined the dependent variable as 0 if the area from which a user mainly tweets is the user's home location (HL), and as 1 otherwise. According to discriminant analysis, the hit ratios of the two models were 67.5% and 57.5%, respectively. We tested both models using the timeline data of stress-related tweets. As a result, we inferred the residential areas of 5,301 out of 48,235 users and obtained 9,606 stress-related tweets with a residential area. This is about a 44-fold increase compared with the number of geo-tagged tweets. We think that the methodology used in this study can be used not only to secure more location data in the study of SNS big data, but also to link SNS big data with regional statistics in order to analyze regional phenomena.

Acoustic Signal Classifier Design using Dictionary Learning (딕셔너리 러닝을 이용한 음파 신호 분류기 설계)

  • Park, Sung Min;Sah, Sung Jin;Oh, Kwang Myung;Lee, Hui Sung
    • Journal of Auto-vehicle Safety Association / v.8 no.1 / pp.19-25 / 2016
  • As new car technology develops, temporal interaction is needed in automobiles. Rhythmic patterns are one practical example of temporal interaction in vehicles. Dictionary learning is an applicable algorithm for recognizing a rhythmic pattern and its input medium. In this paper, the performance and memory requirements of the learning algorithm are tested and found to be sufficiently good for use with such acoustic signals.

Sparse kernel classification using IRWLS procedure

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.20 no.4 / pp.749-755 / 2009
  • Support vector classification (SVC) provides a more complete description of the linear and nonlinear relationships between input vectors and classifiers. In this paper, we propose a sparse kernel classifier that solves the classification optimization problem with a modified hinge loss function and an absolute loss function, which provides efficient computation and sparsity. We also introduce a generalized cross validation function to select the hyper-parameters that affect the classification performance of the proposed method. Experimental results are then presented to illustrate the performance of the proposed procedure for classification.
