• Title/Summary/Keyword: data sparsity


An Improved RSR Method to Obtain the Sparse Projection Matrix (희소 투영행렬 획득을 위한 RSR 개선 방법론)

  • Ahn, Jung-Ho
    • Journal of Digital Contents Society
    • /
    • v.16 no.4
    • /
    • pp.605-613
    • /
    • 2015
  • This paper addresses the problem of making the projection matrix used in pattern recognition methods sparse. The size of a program is often restricted in embedded systems, and developed programs frequently include constant data. For example, many pattern recognition programs use a projection matrix for dimension reduction. To improve recognition performance, very high-dimensional feature vectors are often extracted, in which case the projection matrix can be very large. Recently, the RSR (rotated sparse regression) method [1] was proposed and has proven to be one of the best algorithms for obtaining a sparse matrix. We propose three methods to improve RSR: outlier removal, sampling, and elastic-net RSR (E-RSR), in which the penalty term of the RSR optimization function is replaced by that of elastic-net regression. The experimental results show that the proposed methods are very effective and improve the sparsity rate dramatically without sacrificing the recognition rate compared to the original RSR method.
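
As a rough illustration of the elastic-net idea behind E-RSR (not the authors' implementation), the following sketch re-fits each column of a dense projection matrix with an elastic-net regression so that the resulting matrix becomes sparse while its projections of the data are approximately preserved; all names and parameter values are assumptions.

```python
# Illustrative sketch only: approximate a dense projection matrix W with a
# sparse matrix S by regressing each projected dimension on the raw features
# with an elastic-net penalty. Names and hyperparameters are hypothetical.
import numpy as np
from sklearn.linear_model import ElasticNet

def sparsify_projection(X, W, alpha=0.01, l1_ratio=0.9):
    """X: (n_samples, d) data, W: (d, k) dense projection matrix.
    Returns a sparse approximation S of W fitted so that X @ S ~= X @ W."""
    targets = X @ W                      # projections we want to reproduce
    S = np.zeros(W.shape)
    for j in range(W.shape[1]):
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                           fit_intercept=False, max_iter=5000)
        model.fit(X, targets[:, j])      # elastic-net regression per column
        S[:, j] = model.coef_
    return S

# Sparsity rate of the result:
# sparsity = np.mean(sparsify_projection(X, W) == 0)
```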

Centroidal Voronoi Tessellation-Based Reduced-Order Modeling of Navier-Stokes Equations

  • 이형천
    • Proceedings of the Korean Society of Computational and Applied Mathematics Conference
    • /
    • 2003.09a
    • /
    • pp.1-1
    • /
    • 2003
  • In this talk, a reduced-order modeling methodology based on centroidal Voronoi tessellations (CVTs) is introduced. CVTs are special Voronoi tessellations for which the generators of the Voronoi diagram are also the centers of mass (means) of the corresponding Voronoi cells. For discrete data sets, CVTs are closely related to h-means clustering techniques. Even with the use of good mesh generators, discretization schemes, and solution algorithms, the computational simulation of complex, turbulent, or chaotic systems remains a formidable endeavor. For example, typical finite element codes may require many thousands of degrees of freedom for the accurate simulation of fluid flows. The situation is even worse for optimization problems, for which multiple solutions of the complex state system are usually required, or for feedback control problems, for which real-time solutions of the complex state system are needed. There have been many studies devoted to the development, testing, and use of reduced-order models for complex systems such as unsteady fluid flows. The types of reduced-order models that we study are those that attempt to determine accurate approximate solutions of a complex system using very few degrees of freedom. To do so, such models have to use basis functions that are in some way intimately connected to the problem being approximated. Once a very low-dimensional reduced basis has been determined, one can employ it to solve the complex system by applying, e.g., a Galerkin method. In general, reduced bases are globally supported, so the discrete systems are dense; however, if the reduced basis is of very low dimension, the lack of sparsity in the discrete system is not a concern. A discussion of reduced-order modeling for complex systems such as fluid flows is given to provide a context for the application of reduced-order bases. Then, detailed descriptions of CVT-based reduced-order bases and how they can be constructed for complex systems are given. Subsequently, some concrete incompressible flow examples are used to illustrate the construction and use of CVT-based reduced-order bases. The CVT-based reduced-order modeling methodology is shown to be effective for these examples and to be inexpensive to apply compared with other reduced-order methods.
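
A minimal sketch, assuming the flow snapshots are available as plain vectors, of the connection drawn above between CVTs of discrete data sets and h-means (k-means) clustering: cluster centroids play the role of CVT generators and, after orthonormalization, provide a low-dimensional reduced basis for Galerkin projection. This is illustrative only, not the talk's implementation.

```python
# Hedged illustration: for discrete snapshot data a CVT reduces to k-means
# clustering, so the CVT generators (cluster centroids) can serve as a
# low-dimensional reduced basis after orthonormalization.
import numpy as np
from sklearn.cluster import KMeans

def cvt_reduced_basis(snapshots, k):
    """snapshots: (n_snapshots, n_dof) array of flow-field snapshots.
    Returns an orthonormal basis of dimension k built from CVT generators."""
    km = KMeans(n_clusters=k, n_init=10).fit(snapshots)
    generators = km.cluster_centers_        # CVT generators = cell means
    Q, _ = np.linalg.qr(generators.T)       # orthonormalize, shape (n_dof, k)
    return Q

def galerkin_coefficients(Q, u):
    """Project a full-order state u onto the reduced basis."""
    return Q.T @ u
```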


Hyperspectral Image Classification via Joint Sparse Representation of Multi-layer Superpixels

  • Sima, Haifeng;Mi, Aizhong;Han, Xue;Du, Shouheng;Wang, Zhiheng;Wang, Jianfang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.10
    • /
    • pp.5015-5038
    • /
    • 2018
  • In this paper, a novel spectral-spatial joint sparse representation algorithm for hyperspectral image classification is proposed based on multi-layer superpixels at various scales. Superpixels at various scales can provide complete yet redundant correlated information about the class attribute of test pixels. Therefore, we design a joint sparse model for a test pixel by sampling similar pixels from its corresponding superpixel combinations. Firstly, multi-layer superpixels are extracted on the false-color image of the HSI data by a principal component analysis model. Secondly, a group of discriminative sampled pixels is exploited as the reconstruction matrix of the test pixel, which can be jointly represented by the structured dictionary and the recovered sparse coefficients. Thirdly, the orthogonal matching pursuit strategy is employed to estimate the sparse vector for the test pixel. In each iteration, the approximation can be computed from the dictionary and the corresponding sparse vector. Finally, the class label of the test pixel can be directly determined by the minimum reconstruction error between the reconstruction matrix and its approximation. The advantages of this algorithm lie in exploiting complete neighborhoods of homogeneous pixels that share a common sparsity pattern, and it is able to achieve more flexible joint sparse coding of spectral-spatial information. Experimental results on three real hyperspectral datasets show that the proposed joint sparse model can achieve better performance than a series of excellent sparse classification methods and superpixel-based classification methods.
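
The core sparse-representation classification step described above (OMP recovery followed by class-wise reconstruction error) can be sketched as follows; the multi-layer superpixel sampling that builds the dictionary is not reproduced, and names and parameters are illustrative.

```python
# Minimal sketch of sparse-representation classification: a dictionary of
# training spectra, OMP-recovered coefficients, and a class label chosen by
# minimum class-wise reconstruction error.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def src_classify(D, labels, y, n_nonzero=10):
    """D: (n_bands, n_train) dictionary of training spectra (columns),
    labels: (n_train,) class of each column, y: (n_bands,) test spectrum."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero, fit_intercept=False)
    omp.fit(D, y)
    coef = omp.coef_
    best_class, best_err = None, np.inf
    for c in np.unique(labels):
        coef_c = np.where(labels == c, coef, 0.0)   # keep class-c atoms only
        err = np.linalg.norm(y - D @ coef_c)        # class-wise residual
        if err < best_err:
            best_class, best_err = c, err
    return best_class
```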

A Modified E-LEACH Routing Protocol for Improving the Lifetime of a Wireless Sensor Network

  • Abdurohman, Maman;Supriadi, Yadi;Fahmi, Fitra Zul
    • Journal of Information Processing Systems
    • /
    • v.16 no.4
    • /
    • pp.845-858
    • /
    • 2020
  • This paper proposes a modified end-to-end secure low energy adaptive clustering hierarchy (ME-LEACH) algorithm for enhancing the lifetime of a wireless sensor network (WSN). Energy limitations are a major constraint in WSNs; hence every activity in a WSN must use energy efficiently. Several protocols have been introduced to modulate the way a WSN sends and receives information. The end-to-end secure low energy adaptive clustering hierarchy (E-LEACH) protocol is a hierarchical routing protocol proposed to solve high energy-dissipation problems. Other methods that exploit the most powerful node in each cluster as the cluster head (CH) are the sparsity-aware energy efficient clustering (SEEC) protocol and an energy efficient clustering-based routing protocol that uses an enhanced cluster formation technique accompanied by fuzzy logic (EERRCUF). However, each CH in the E-LEACH method sends data directly to the base station, causing high energy consumption. SEEC uses a lot of energy to identify the most powerful sensor nodes, while EERRCUF spends large amounts of energy to determine the super cluster head (SCH). In the proposed method, a CH searches for the nearest CH and uses it as the next hop. The resulting chain of CHs serves as a path to the base station. Experiments were conducted to determine the performance of the ME-LEACH algorithm. The results show that ME-LEACH has more stable and higher throughput than SEEC and EERRCUF and a 35.2% better network lifetime than the E-LEACH algorithm.
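
A hedged sketch of the chain-forming idea attributed to ME-LEACH above: each cluster head forwards to its nearest cluster head that lies closer to the base station, and the cluster head nearest the base station transmits to it directly. This illustrates the routing idea only, not the paper's implementation.

```python
# Illustrative next-hop selection for a chain of cluster heads (CHs).
import math

def build_ch_chain(cluster_heads, base_station):
    """cluster_heads: list of (x, y) CH positions; base_station: (x, y).
    Returns a dict mapping each CH index to its next hop ('BS' or CH index)."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    next_hop = {}
    for i, ch in enumerate(cluster_heads):
        # candidate next hops: CHs strictly closer to the base station
        candidates = [(dist(ch, other), j)
                      for j, other in enumerate(cluster_heads)
                      if j != i and dist(other, base_station) < dist(ch, base_station)]
        next_hop[i] = min(candidates)[1] if candidates else "BS"
    return next_hop
```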

Collaborative Filtering Method Using Context of P2P Mobile Agents (P2P 모바일 에이전트의 컨텍스트 정보를 이용한 협력적 필터링 기법)

  • Lee Se-Il;Lee Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.5
    • /
    • pp.643-648
    • /
    • 2005
  • To intelligently supply the services users need in ubiquitous computing, effective filtering of context information is necessary, but studies on context information filtering are still scarce. For filtering context information, we can use collaborative filtering, which is widely used in electronic commerce. To use such a collaborative filtering method in a ubiquitous computing environment, problems such as the first-rater problem, the sparsity problem, and the stored-data problem must be solved. In this study, to solve these problems, we propose a collaborative filtering method that uses the types of context information. Applying this filtering method to MAUCA, a P2P mobile agent system, we confirmed an average result of 7.7% in terms of the service support function.
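
For context, a generic user-based collaborative filtering sketch (Pearson similarity with mean-centered weighted prediction) illustrating the underlying filtering technique the paper adapts to context information; the paper's handling of context types is not shown, and the ratings-matrix layout is an assumption.

```python
# Generic user-based collaborative filtering: predict a missing rating from
# the ratings of similar users. Illustrative only.
import numpy as np

def predict(ratings, target_user, target_item):
    """ratings: (n_users, n_items) array with np.nan for missing entries."""
    mask = ~np.isnan(ratings)
    means = np.array([ratings[u][mask[u]].mean() for u in range(ratings.shape[0])])
    num, den = 0.0, 0.0
    for u in range(ratings.shape[0]):
        if u == target_user or not mask[u, target_item]:
            continue
        common = mask[target_user] & mask[u]        # items rated by both users
        if common.sum() < 2:
            continue
        a = ratings[target_user, common] - means[target_user]
        b = ratings[u, common] - means[u]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        sim = float(a @ b / denom) if denom > 0 else 0.0
        num += sim * (ratings[u, target_item] - means[u])
        den += abs(sim)
    return means[target_user] + (num / den if den > 0 else 0.0)
```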

Adaptive Hyperspectral Image Classification Method Based on Spectral Scale Optimization

  • Zhou, Bing;Bingxuan, Li;He, Xuan;Liu, Hexiong
    • Current Optics and Photonics
    • /
    • v.5 no.3
    • /
    • pp.270-277
    • /
    • 2021
  • Adaptive sparse representation (ASR) can effectively combine the structural information of a sample dictionary and the sparsity of coding coefficients. The algorithm can effectively consider the correlation between training samples and switch between the sparse representation-based classifier (SRC) and collaborative representation classification (CRC) under different training samples. Unlike SRC and CRC, which use fixed norm constraints, ASR can adaptively adjust the constraints based on the correlation between different training samples, seeking a balance between the l1 and l2 norms and greatly strengthening the robustness and adaptability of the classification algorithm. Correlation coefficients (CC) can better identify the pixels with strong correlation. Therefore, this article proposes a hyperspectral image classification method called correlation coefficients and adaptive sparse representation (CCASR), based on ASR and CC. The method is divided into three steps. First, we determine the pixel to be measured and calculate the CC value between the test pixel and the various training samples. Then we represent the pixel using ASR and calculate the reconstruction error corresponding to each category. Finally, the target pixel is classified according to the reconstruction error and the CC value. In this article, a new hyperspectral image classification method is thus proposed by fusing CC and ASR. The method is verified on two sets of experimental data. On the Indian Pines hyperspectral image, the overall accuracy of CCASR reaches 0.9596. On the hyperspectral images taken by HIS-300, the classification results show that the proposed method achieves an accuracy of 0.9354, which is better than other commonly used methods.
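
As an illustration of the correlation-coefficient step described above, the sketch below computes the Pearson correlation between a test pixel's spectrum and each class's training samples; the ASR solver and the paper's exact rule for fusing CC with the reconstruction error are not reproduced here.

```python
# Illustrative CC step: mean Pearson correlation between a test pixel and
# each class's training spectra.
import numpy as np

def class_correlations(train, labels, pixel):
    """train: (n_train, n_bands) training spectra, labels: (n_train,),
    pixel: (n_bands,) test spectrum. Returns mean CC per class."""
    def cc(a, b):
        a, b = a - a.mean(), b - b.mean()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    scores = {}
    for c in np.unique(labels):
        members = train[labels == c]
        scores[c] = np.mean([cc(pixel, m) for m in members])
    return scores   # e.g. favor the class with the highest mean correlation
```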

User Bias Drift Social Recommendation Algorithm based on Metric Learning

  • Zhao, Jianli;Li, Tingting;Yang, Shangcheng;Li, Hao;Chai, Baobao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.12
    • /
    • pp.3798-3814
    • /
    • 2022
  • Social recommendation algorithms can alleviate the data sparsity and cold-start problems in recommender systems by integrating social information. Among them, matrix factorization-based algorithms are the most widely used and studied. Such algorithms use dot-product operations to calculate the similarity between users and items, which ignores users' potential preferences and reduces recommendation accuracy. This deficiency can be avoided by a metric learning-based social recommendation algorithm, which learns the distance between user embedding vectors and item embedding vectors instead of using vector dot-product operations. However, previous works provide no theoretical explanation for its plausibility. Moreover, most works focus on the indirect impact of social friends on a user's preferences, ignoring their direct impact on the user's rating preferences. To solve these problems, this study proposes a user bias drift social recommendation algorithm based on metric learning (BDML). The main work of this paper is as follows: (1) the process of introducing metric learning into the social recommendation scenario is presented in the form of equations, explaining why metric learning can replace the dot-product operation; (2) a new user bias is constructed to simultaneously model the impact of social relationships on the user's rating preferences and on the user's preferences. Experimental results on two datasets show that the BDML algorithm proposed in this study has better recommendation accuracy than other comparison algorithms and can guarantee the recommendation effect on sparser datasets.
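
The substitution at the heart of this abstract, replacing dot-product scoring with a distance in the embedding space, can be sketched as follows; the bias-drift modelling of BDML is not reproduced, and the bias terms shown are hypothetical.

```python
# Contrast between dot-product scoring and metric-learning scoring
# (negative Euclidean distance between user and item embeddings).
import numpy as np

def dot_score(user_vec, item_vec):
    """Classic matrix-factorization score: inner product of embeddings."""
    return float(user_vec @ item_vec)

def metric_score(user_vec, item_vec, user_bias=0.0, item_bias=0.0):
    """Metric-learning score: closer in the embedding space means a higher
    predicted preference; bias terms here are illustrative placeholders."""
    return -float(np.linalg.norm(user_vec - item_vec)) + user_bias + item_bias

# Example: rank items for one user by metric score
# scores = [metric_score(P[u], Q[i]) for i in range(Q.shape[0])]
```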

A New Similarity Measure for Categorical Attribute-Based Clustering (범주형 속성 기반 군집화를 위한 새로운 유사 측도)

  • Kim, Min;Jeon, Joo-Hyuk;Woo, Kyung-Gu;Kim, Myoung-Ho
    • Journal of KIISE:Databases
    • /
    • v.37 no.2
    • /
    • pp.71-81
    • /
    • 2010
  • The problem of finding clusters arises in numerous applications such as pattern recognition, image analysis, and market analysis. The important factors that decide cluster quality are the similarity measure and the number of attributes. Similarity measures should be defined with respect to the data types. Existing similarity measures are well suited to numerical attribute values. However, those measures do not work well when the data is described by categorical attributes, that is, when there is no inherent similarity measure between values. In high-dimensional spaces, conventional clustering algorithms tend to break down because of the sparsity of data points. To overcome this difficulty, subspace clustering approaches have been proposed, based on the observation that different clusters may exist in different subspaces. In this paper, we propose a new similarity measure for clustering high-dimensional categorical data. The measure is defined on the idea that in a good clustering each cluster should have certain information that distinguishes it from the other clusters. We also try to capture attribute dependencies. This study is meaningful because no previous method has used both of these. Experimental results on real datasets show that the clusters obtained with our proposed similarity measure are good in terms of clustering accuracy.
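
For contrast, a baseline overlap (matching-attributes) similarity for categorical records is sketched below; this is the kind of measure the paper improves on by weighting attributes with cluster-distinguishing information and attribute dependencies, and the paper's actual measure is not reproduced.

```python
# Baseline illustration only: simple overlap similarity between two
# categorical records (fraction of attributes with equal values).
def overlap_similarity(record_a, record_b):
    """record_a, record_b: equal-length tuples of categorical attribute values."""
    matches = sum(1 for a, b in zip(record_a, record_b) if a == b)
    return matches / len(record_a)

# overlap_similarity(("red", "small", "metal"), ("red", "large", "metal"))  # -> 2/3
```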

Collaborative Filtering using Co-Occurrence and Similarity information (상품 동시 발생 정보와 유사도 정보를 이용한 협업적 필터링)

  • Na, Kwang Tek;Lee, Ju Hong
    • Journal of Internet Computing and Services
    • /
    • v.18 no.3
    • /
    • pp.19-28
    • /
    • 2017
  • Collaborative filtering (CF) is a system that interprets the relationship between users and products and recommends products to a specific user. The CF model is advantageous in that it can recommend products using only rating data, without any additional information such as content. However, there are many cases in which a user does not give a rating even after consuming a product, and users consume only a small portion of all products. This means that the number of observed ratings is very small and the user rating matrix is very sparse. The sparsity of the rating data makes it difficult to raise CF performance. In this paper, we concentrate on raising the performance of the latent factor model (especially SVD). We propose a new model that incorporates product similarity information and co-occurrence information into SVD. The similarity and co-occurrence information obtained from the rating data increases the expressiveness of the latent space in terms of latent factors. As a result, Recall increased by 16%, and Precision and NDCG increased by 8% and 7%, respectively. The proposed method is expected to show better performance than existing methods when combined with other recommender systems in the future.
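
One plausible (assumed, not taken from the paper) way to inject item similarity and co-occurrence information into an SVD-style latent factor model is an extra regularization term that pulls the factors of related items together, as sketched below; the update rule and the `related_items` lookup are illustrative.

```python
# Hedged sketch: SGD update for a latent factor model with an extra term that
# pulls an item's factors toward those of similar / co-occurring items.
import numpy as np

def sgd_step(p_u, q_i, r_ui, related_items, Q, lr=0.01, reg=0.02, reg_sim=0.05):
    """One SGD update for user factors p_u and item factors q_i on rating r_ui.
    related_items: indices j with high similarity/co-occurrence to item i;
    Q: (n_items, k) matrix of current item factors."""
    err = r_ui - p_u @ q_i
    p_new = p_u + lr * (err * q_i - reg * p_u)
    # pull q_i toward the factors of co-occurring / similar items
    pull = sum(Q[j] - q_i for j in related_items)
    q_new = q_i + lr * (err * p_u - reg * q_i + reg_sim * pull)
    return p_new, q_new
```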

The Adaptive Personalization Method According to Users' Purchasing Index: Application to Beverage Purchasing Predictions (고객별 구매빈도에 동적으로 적응하는 개인화 시스템 : 음료수 구매 예측에의 적용)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.95-108
    • /
    • 2011
  • This is a study of a personalization method that intelligently adapts the level of clustering to the purchasing index of a customer. In the e-business era, many companies gather customers' demographic and transactional information such as age, gender, purchase date, and product category. They use this information to predict customers' preferences or purchasing patterns so that they can provide more customized services to their customers. The conventional Customer-Segmentation method provides customized services for each customer group. This method clusters the whole customer set into different groups based on their similarity and builds predictive models for the resulting groups. Thus, it can manage the number of predictive models and also provide more data for customers who do not have enough data to build a good predictive model, by using the data of other similar customers. However, this method often fails to provide highly personalized services to each customer, which is especially important for VIP customers. Furthermore, it clusters the customers who already have a considerable amount of data together with the customers who have only a small amount of data, which increases computational cost unnecessarily without significant performance improvement. The other conventional method, the 1-to-1 method, provides more customized services than the Customer-Segmentation method for each individual customer, since the predictive model is built using only that customer's data. This method not only provides highly personalized services but also builds a relatively simple and less costly model for each customer. However, the 1-to-1 method has a limitation: it does not produce a good predictive model when a customer has only a small amount of data. In other words, if a customer has an insufficient number of transactional records, the performance of this method deteriorates. To overcome the limitations of these two conventional methods, we suggest a new method, called the Intelligent Customer Segmentation method, that provides adaptively personalized services according to the customer's purchasing index. The suggested method clusters customers according to their purchasing index, so that predictions for customers who purchase less are based on the data of more intensively clustered groups, while VIP customers, who already have a considerable amount of data, are clustered to a much lesser extent or not clustered at all. The main idea of this method is to apply the clustering technique when the number of transactional records of the target customer is less than a predefined criterion data size. To find this criterion, we suggest an algorithm called sliding window correlation analysis, which aims to find the transactional data size at which the performance of the 1-to-1 method decreases sharply due to data sparsity. After finding this criterion data size, we apply the conventional 1-to-1 method to customers who have more data than the criterion, and apply the clustering technique to those who have less, until at least the criterion amount of data is available for model building (a sketch of this decision rule follows the abstract). We apply the two conventional methods and the newly suggested method to Nielsen's beverage purchasing data to predict the purchasing amounts and purchasing categories of the customers.
We use two data mining techniques (Support Vector Machine and Linear Regression) and two types of performance measures (MAE and RMSE) to predict the two dependent variables mentioned above. The results show that the suggested Intelligent Customer Segmentation method can outperform the conventional 1-to-1 method in many cases and produces the same level of performance as the Customer-Segmentation method at a much lower computational cost.
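
A hedged sketch of the adaptive decision rule described in the abstract: customers with at least the criterion number of transactions get an individual (1-to-1) model, while customers below the criterion are pooled with similar customers until enough data is available; function names and the similarity lookup are illustrative, not the paper's implementation.

```python
# Illustrative selection of training data per customer, based on a criterion
# data size found beforehand (e.g. via sliding window correlation analysis).
def choose_training_data(customer_id, transactions_by_customer, criterion, find_similar):
    """transactions_by_customer: dict id -> list of transactions.
    find_similar: callable returning other customer ids ordered by similarity."""
    own = list(transactions_by_customer[customer_id])
    if len(own) >= criterion:
        return own                              # 1-to-1: enough personal data
    pooled = list(own)
    for other in find_similar(customer_id):     # add similar customers' data
        pooled.extend(transactions_by_customer[other])
        if len(pooled) >= criterion:
            break
    return pooled
```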