• Title/Abstract/Keywords: sparsity problem (희박성 문제)

Filtered Coupling Measures for Variable Selection in Sparse Vector Autoregressive Modeling (필터링된 잔차를 이용한 희박벡터자기회귀모형에서의 변수 선택 측도)

  • Lee, Seungkyu; Baek, Changryong
    • The Korean Journal of Applied Statistics / v.28 no.5 / pp.871-883 / 2015
  • Vector autoregressive (VAR) models in high dimensions suffer from noisy estimates, unstable predictions, and difficult interpretation. Consequently, the sparse vector autoregressive (sVAR) model, which forces many small VAR coefficients to be exactly zero, has been suggested and proven effective for modeling high-dimensional time series data. This paper studies coupling measures for selecting the non-zero coefficients in sVAR. A simulation study reveals that removing the effect of other variables greatly improves the performance of coupling measures. Because sVAR model coefficients are asymmetric, asymmetric coupling measures such as Granger causality reduce computational costs. We propose two asymmetric coupling measures, filtered-cross-correlation and filtered-Granger-causality, based on the filtered residual series. The simulation study shows that the proposed coupling measures are adequate for heavy-tailed and high-order sVAR models.

A Movie Recommendation System using Individual Review and Meta Data (개인 리뷰를 이용한 영화추천 시스템)

  • Kim, Min-Jeong; Park, Doo-Soon
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.1611-1614 / 2015
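  • Many recommender systems have been studied in recent years, and the importance of systems that support users' decision-making is growing rapidly. Existing movie recommendation systems suffer from the sparsity problem. To mitigate this problem, this paper analyzes movie keywords from the reviews that users leave about movies and applies weights derived from those keywords. That is, reviews are collected from users, the keywords of each movie are extracted from the reviews, and movies are recommended on the basis of per-keyword weights. As a result, the system provides users with satisfactory information and improves efficiency; we design and implement a movie recommendation system that reflects individual reviews and recommends suitable movies to users.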

TeT: Distributed Tera-Scale Tensor Generator (분산 테라스케일 텐서 생성기)

  • Jeon, ByungSoo; Lee, JungWoo; Kang, U
    • Journal of KIISE / v.43 no.8 / pp.910-918 / 2016
  • A tensor is a multi-dimensional array that can represent many kinds of data, such as (user, user, time) triples in a social network system. A tensor generator is an important tool for multi-dimensional data mining research, with applications including simulation, multi-dimensional data modeling and understanding, and sampling and extrapolation. However, existing tensor generators cannot generate sparse tensors that obey a power law, as real-world tensors do. In addition, they are limited in the tensor sizes they can process and require additional time to upload a generated tensor to a distributed system for further analysis. In this study, we propose TeT, a distributed tera-scale tensor generator that solves these problems. TeT generates sparse random tensors as well as sparse R-MAT and Kronecker tensors without any limitation on tensor size. In addition, a TeT-generated tensor is immediately ready for further tensor analysis on the same distributed system. The careful design of TeT yields nearly linear scalability with respect to the number of machines.
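
To make the notion of a sparse random tensor concrete, the following is a minimal single-machine sketch in Python. It is not TeT's distributed algorithm and does not produce R-MAT or Kronecker structure; the function name `sparse_random_tensor`, the coordinate-to-value dictionary layout, and the uniform sampling are assumptions chosen only for illustration.

```python
import random

def sparse_random_tensor(shape, nnz, seed=0):
    """Return {coordinate tuple: value} with exactly `nnz` non-zero entries.

    Hypothetical helper: coordinates are drawn uniformly at random, so the
    result does not follow a power law as real-world tensors do.
    """
    total = 1
    for dim in shape:
        total *= dim
    if nnz > total:
        raise ValueError("nnz exceeds the number of cells in the tensor")

    rng = random.Random(seed)
    entries = {}
    # Re-drawing an existing coordinate overwrites it, so loop until
    # the requested number of distinct coordinates has been reached.
    while len(entries) < nnz:
        coord = tuple(rng.randrange(dim) for dim in shape)
        entries[coord] = rng.random()
    return entries

# A (user, user, time) tensor with 100 million cells but only 50 stored entries.
shape = (1000, 1000, 100)
tensor = sparse_random_tensor(shape, nnz=50)
```

Storing only the non-zero coordinates keeps memory proportional to the number of entries rather than to the full size of the tensor.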

Performance Improvement of a Movie Recommendation System using Genre-wise Collaborative Filtering (장르별 협업필터링을 이용한 영화 추천 시스템의 성능 향상)

  • Lee, Jae-Sik; Park, Seog-Du
    • Journal of Intelligence and Information Systems / v.13 no.4 / pp.65-78 / 2007
  • This paper proposes a new weighted template matching method for machine-printed numeral recognition. Whereas conventional template matching processes an input image as one global feature, the proposed method emphasizes the features of a pattern by applying an adaptive Hamming distance to local feature areas, which improves the recognition rate. Template matching is vulnerable to random noise that produces ragged outlines when a pattern is binarized. This paper therefore offers a chain code trimming method to remove ragged outlines; it corrects specific chain codes within the chain codes of the inner and outer contours of a pattern. The experiment compares the confusion matrices of plain template matching and of the proposed weighted template matching with chain code trimming. The results show that the proposed method improves the recognition rate for machine-printed numerals.

A Personalized Movie Recommendation System Based On Personal Sentiment and Collaborative Filtering (개인의 감정과 협업필터링을 이용한 개인화 영화 추천 시스템)

  • Kim, Sun-Ho; Park, Doo-Soon
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1176-1178 / 2013
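  • Collaborative filtering is a technique that automatically predicts users' interests from taste information gathered from many users, compares the target user's preferences for items with those of other users, and recommends items that the target user is likely to enjoy. However, collaborative filtering produces accurate recommendations only when sufficient customer and rating information is available. This paper introduces one approach to recommending movies to users who have never rated a movie, that is, to solving the sparsity problem of collaborative filtering, by using individuals' sentiment information.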

Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System (추천 시스템의 성능 안정성을 위한 예측적 군집화 기반 협업 필터링 기법)

  • Lee, O-Joun; You, Eun-Soon
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.119-142 / 2015
  • With the explosive growth in the volume of information, Internet users are experiencing considerable difficulty in obtaining the information they need online. Against this backdrop, ever-greater importance is being placed on recommender systems that provide information tailored to user preferences and tastes in order to address information overload. To this end, a number of techniques have been proposed, including content-based filtering (CBF), demographic filtering (DF), and collaborative filtering (CF). Among them, CBF and DF require external information and thus cannot be applied to a variety of domains. CF, on the other hand, is widely used since it is relatively free from this domain constraint. CF techniques are broadly classified into memory-based CF, model-based CF, and hybrid CF. Model-based CF addresses the drawbacks of CF by using a Bayesian model, a clustering model, or a dependency network model. This approach not only mitigates the sparsity and scalability issues but also boosts predictive performance. However, it involves expensive model building and results in a tradeoff between performance and scalability. This tradeoff is attributed to reduced coverage, which is a type of sparsity issue. In addition, expensive model building may lead to performance instability, since changes in the domain environment cannot be incorporated into the model immediately because of the high costs involved. Cumulative changes in the domain environment that are not reflected in the model eventually undermine system performance. This study combines a Markov model of transition probabilities and the concept of fuzzy clustering with CBCF to propose predictive clustering-based CF (PCCF), which addresses the issues of reduced coverage and unstable performance. The method reduces performance instability by tracking changes in user preferences and bridging the gap between the static model and dynamic users.
Furthermore, the issue of reduced coverage is mitigated by expanding coverage on the basis of transition probabilities and clustering probabilities. The proposed method consists of four processes. First, user preferences are normalized in preference clustering. Second, changes in user preferences are detected from review score entries during preference transition detection. Third, user propensities are normalized using patterns of change (propensities) in user preferences in propensity clustering. Lastly, the preference prediction model is developed to predict user preferences for items during preference prediction. The proposed method was validated by testing its robustness against performance instability and the scalability-performance tradeoff. The initial test compared and analyzed the performance of individual recommender systems built on IBCF, CBCF, ICFEC, and PCCF in an environment where data sparsity had been minimized. The following test adjusted the optimal number of clusters in CBCF, ICFEC, and PCCF for a comparative analysis of the subsequent changes in system performance. The test results revealed that the suggested method produced an insignificant improvement in performance in comparison with the existing techniques. In addition, it failed to achieve a significant improvement in the standard deviation, which indicates the degree of data fluctuation. Nevertheless, it resulted in a marked improvement over the existing techniques in terms of range, which indicates the level of performance fluctuation. The level of performance fluctuation before and after model generation improved by 51.31% in the initial test. In the following test, there was a 36.05% improvement in the level of performance fluctuation driven by changes in the number of clusters. This signifies that the proposed method, despite the slight performance improvement, clearly offers better performance stability than the existing techniques.
Further research will be directed toward enhancing the recommendation performance, which failed to show a significant improvement over the existing techniques. Future research will consider introducing a high-dimensional parameter-free clustering algorithm or a deep learning-based model in order to improve recommendation performance.

Performance Improvement of Collaborative Filtering System Using Associative User′s Clustering Analysis for the Recalculation of Preference and Representative Attribute-Neighborhood (선호도 재계산을 위한 연관 사용자 군집 분석과 Representative Attribute -Neighborhood를 이용한 협력적 필터링 시스템의 성능향상)

  • Jung, Kyung-Yong; Kim, Jin-Su; Kim, Tae-Yong; Lee, Jung-Hyun
    • The KIPS Transactions:PartB / v.10B no.3 / pp.287-296 / 2003
  • Much research has focused on collaborative filtering techniques in recommender systems. However, these studies have exhibited the First-Rater Problem and the Sparsity Problem. The main purpose of this paper is to solve these problems. We suggest a method for predicting user preferences that uses Bayesian estimated values and associative user clustering for the recalculation of preference. In addition, to compensate for the fact that this method does not consider item attributes, we use the Representative Attribute-Neighborhood method, which finds similar neighbors for prediction by extracting the representative attributes that most affect preference. We improved efficiency by applying associative user clustering analysis to the collaborative filtering algorithm in order to calculate the preference for a specific item within the cluster item vector. Moreover, to address the Sparsity and First-Rater problems, associative users are clustered by genre using the Association Rule Hypergraph Partitioning algorithm. New users are classified into one of these genres by a Naive Bayes classifier. In addition, to compute the similarity between new users and the users belonging to the classified genre, this paper assigns different estimated values, obtained through Naive Bayes learning, to the items each user has rated. Applying these estimate-weighted preferences to the Pearson correlation coefficient yields higher accuracy because errors caused by missing values are reduced. We evaluate our method on a large collaborative filtering database of user ratings, and it significantly outperforms previously proposed methods.
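
As a rough illustration of why missing ratings hurt similarity computation, the sketch below computes the standard Pearson correlation coefficient over the items two users have both rated. It is plain user-based collaborative filtering, not the paper's Bayesian-weighted variant; the function name `pearson_similarity`, the dictionary layout, and the choice to return 0.0 when the correlation is undefined are assumptions made for illustration.

```python
from math import sqrt

def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        # Sparsity: too few co-rated items to measure any correlation.
        return 0.0

    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)

    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        # A user who gave identical ratings has no variance to correlate.
        return 0.0
    return cov / sqrt(var_a * var_b)

# Hypothetical ratings: item id -> score on a 1-5 scale.
alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3, "m4": 5}
carol = {"m4": 1}  # a near first-time rater with no overlap with alice

sim_alice_bob = pearson_similarity(alice, bob)
sim_alice_carol = pearson_similarity(alice, carol)
```

Here `alice` and `bob` share three items and their ratings move together, while `alice` and `carol` share none, so no neighbor relationship can be established from ratings alone.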

SOM Clustering Method based on RFM Analysis for Predicting Customer Purchase Pattern in u-Commerce (RFM 분석 기반 고객 구매 패턴을 예측을 위한 SOM 클러스터링 방법)

  • Cho, Young Sung; Moon, Song Chul; Ryu, Keun Ho
    • Proceedings of the Korean Society of Computer Information Conference / 2013.07a / pp.185-187 / 2013
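  • As ubiquitous computing becomes part of everyday life, the volume of information is growing rapidly, and techniques for finding information within large amounts of data are gaining prominence. Customer preference prediction methods based on customer-based collaborative filtering select neighbors according to users' preferences for items, so they neither reflect the content of the items nor solve the sparsity problem. Moreover, because they rely on information about a subset of items with similar preferences, they tend to ignore item attributes. This paper proposes a clustering method that uses SOM based on RFM (Recency, Frequency, Monetary) analysis in ubiquitous commerce. By clustering data with similar attributes drawn from customers' purchase data, the proposed method can extract purchase patterns that enable recommendations suited to customer propensities in less time.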
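
The RFM features mentioned in this abstract can be sketched as follows. This covers only the feature computation that would precede clustering, not the SOM step; the function name `rfm_features`, the tuple layout of the purchase records, and the sample data are assumptions for illustration.

```python
from datetime import date

def rfm_features(purchases, today):
    """Map customer id -> (recency in days, frequency, monetary total).

    `purchases` is an iterable of (customer_id, purchase_date, amount).
    """
    features = {}
    for customer, day, amount in purchases:
        days_ago = (today - day).days
        if customer not in features:
            features[customer] = (days_ago, 1, amount)
        else:
            recency, frequency, monetary = features[customer]
            features[customer] = (
                min(recency, days_ago),  # most recent purchase wins
                frequency + 1,           # number of purchases
                monetary + amount,       # total amount spent
            )
    return features

# Hypothetical purchase log.
purchases = [
    ("c1", date(2013, 7, 1), 30.0),
    ("c1", date(2013, 7, 10), 20.0),
    ("c2", date(2013, 6, 1), 5.0),
]
features = rfm_features(purchases, today=date(2013, 7, 15))
```

Each customer is reduced to a three-value vector, which is the kind of fixed-length input a clustering method such as SOM can group.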

A Comprehensive Performance Evaluation in Collaborative Filtering (협업필터링에서 포괄적 성능평가 모델)

  • Yu, Seok-Jong
    • Journal of the Korea Society of Computer and Information / v.17 no.4 / pp.83-90 / 2012
  • In e-commerce systems that deal with a large number of items, personalized recommendation is an essential function. Collaborative filtering, a successful recommendation algorithm, suffers from sparsity, cold-start, and scalability limitations. This work additionally identifies a further flaw of the algorithm: inconsistent recommendation performance. This flaw cannot be measured by the current MAE-based evaluation, which does not consider the deviation of prediction error and, furthermore, is performed independently of precision and recall measurement. To evaluate collaborative filtering comprehensively, this work proposes an extended evaluation model that combines MAE, precision, recall, and deviation, and applies it to cluster-based combined collaborative filtering.
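
The criteria named in this abstract can be computed side by side, as in the minimal sketch below. It is not the paper's evaluation model: treating "deviation" as the population standard deviation of the absolute errors, and counting a rating of 4 or more as a relevant or recommended item, are assumptions made for illustration, as is the function name `evaluate`.

```python
from statistics import mean, pstdev

def evaluate(actual, predicted, like_threshold=4.0):
    """Return MAE, error deviation, precision, and recall for one test set."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]

    # Items the user really liked vs. items the system would recommend.
    relevant = [a >= like_threshold for a in actual]
    recommended = [p >= like_threshold for p in predicted]
    hits = sum(r and c for r, c in zip(relevant, recommended))

    return {
        "mae": mean(errors),
        "deviation": pstdev(errors),  # assumed definition of "deviation"
        "precision": hits / sum(recommended) if any(recommended) else 0.0,
        "recall": hits / sum(relevant) if any(relevant) else 0.0,
    }

# Hypothetical ratings on a 1-5 scale.
actual = [5, 4, 2, 1]
predicted = [4, 5, 4, 1]
report = evaluate(actual, predicted)
```

Two systems with the same MAE can differ in the deviation of their errors, which is the kind of inconsistency a single MAE figure hides.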

Development of a Personalized Recommendation Procedure Based on Data Mining Techniques for Internet Shopping Malls (인터넷 쇼핑몰을 위한 데이터마이닝 기반 개인별 상품추천방법론의 개발)

  • Kim, Jae-Kyeong; Ahn, Do-Hyun; Cho, Yoon-Ho
    • Journal of Intelligence and Information Systems / v.9 no.3 / pp.177-191 / 2003
  • Recommender systems are a personalized information filtering technology that helps customers find the products they would like to purchase. Collaborative filtering is the most successful recommendation technology. Web usage mining and cluster analysis are widely used in the recommendation field. In this paper, we propose several hybrid collaborative filtering-based recommendation procedures to examine the effects of web usage mining and cluster analysis. Experiments with real e-commerce data show that collaborative filtering using web log data performs recommendation tasks effectively, whereas using cluster analysis makes it perform them efficiently.
