• Title/Summary/Keyword: 희박성 (sparsity)

Web Log Data Sparsity Analysis for OLAP (웹 로그 데이터의 OLAP 연산을 위한 희박성 분석)

  • 김지현;용환승
    • Proceedings of the Korean Information Science Society Conference / 2001.10a / pp.58-60 / 2001
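  • Web log data can grow by tens to hundreds of megabytes per day, and applying OLAP is necessary to enable real-time multidimensional analysis of it. However, when precomputation is performed to obtain fast response times, severe data sparsity causes data explosion. Using real web log data, this paper identifies the causes of sparsity when OLAP is applied and analyzes the forms that sparsity takes in two and three dimensions, providing a basis for sparsity-handling methods and performance evaluation for web log data.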

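As a rough illustration of the sparsity discussed in the abstract above, the sketch below computes the density of a cube (occupied cells divided by all cells in the cross-product of the dimensions) for two and three dimensions. The record fields (`page`, `hour`, `referrer`), the sample data, and the function name `cube_density` are hypothetical and are not taken from the paper; the paper's own sparsity measure may differ.

```python
def cube_density(records, dims):
    """Return the fraction of cube cells that hold at least one record.

    The cube is the full cross-product of the distinct members observed
    along each dimension in `dims`; sparsity is 1 minus this density.
    """
    # Distinct members observed along each dimension
    members = [{r[d] for r in records} for d in dims]
    total_cells = 1
    for m in members:
        total_cells *= len(m)
    # Distinct coordinate tuples that actually occur in the data
    occupied = {tuple(r[d] for d in dims) for r in records}
    return len(occupied) / total_cells if total_cells else 0.0


# Hypothetical web log records (illustrative only)
log = [
    {"page": "/home", "hour": 9, "referrer": "search"},
    {"page": "/home", "hour": 10, "referrer": "direct"},
    {"page": "/cart", "hour": 9, "referrer": "search"},
    {"page": "/help", "hour": 23, "referrer": "direct"},
]

density_2d = cube_density(log, ["page", "hour"])              # 4 of 9 cells
density_3d = cube_density(log, ["page", "hour", "referrer"])  # 4 of 18 cells
```

In this toy data, adding a third dimension doubles the number of cells while the number of occupied cells stays the same, which is the kind of growth that makes precomputed aggregates expensive for sparse data.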

OLAP System and Performance Evaluation for Analyzing Web Log Data (웹 로그 분석을 위한 OLAP 시스템 및 성능 평가)

  • 김지현;용환승
    • Journal of Korea Multimedia Society / v.6 no.5 / pp.909-920 / 2003
  • IT for CRM is growing and developing rapidly. Typical techniques include statistical analysis tools, on-line multidimensional analytical processing (OLAP) tools, and data mining algorithms (such as neural networks, decision trees, and association rules). Among customer data, web log data is particularly important, and OLAP technology can be applied to analyze it multidimensionally and use it efficiently. Building an OLAP cube requires precalculating multidimensional summary results to obtain fast responses. However, as the number of dimensions and sparse cells increases, data explosion becomes severe and OLAP performance degrades. In this paper, we explain why sparsity occurs in web log data and which sparsity patterns arise in two and three dimensions for OLAP. Based on this analysis, we set up multidimensional data models and query models for a benchmark covering each sparsity pattern. Finally, we evaluate the performance of three OLAP systems (MS SQL 2000 Analysis Service, Oracle Express, and C-MOLAP).

Adaptive lasso in sparse vector autoregressive models (Adaptive lasso를 이용한 희박벡터자기회귀모형에서의 변수 선택)

  • Lee, Sl Gi;Baek, Changryong
    • The Korean Journal of Applied Statistics / v.29 no.1 / pp.27-39 / 2016
  • This paper considers variable selection in the sparse vector autoregressive (sVAR) model, where sparsity comes from setting small coefficients to exact zeros. From the estimation perspective, Davis et al. (2015) showed that lasso-type regularization is successful because it provides simultaneous variable selection and parameter estimation even for time series data. However, their simulation study reports that the regular lasso overestimates the number of non-zero coefficients, so its finite-sample performance needs improvement. In this article, we show that the adaptive lasso significantly improves performance, identifying sparsity patterns better than the regular lasso. Tuning parameter selection for the adaptive lasso is also discussed based on the simulation study.
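For reference, the difference between the regular and the adaptive lasso mentioned in the abstract above can be written in the standard textbook form below. The notation is an assumption for illustration (a VAR of order $p$ stacked into a regression with response $Y$, design matrix $X$ and coefficient vector $\beta$); the paper's own formulation and choice of initial estimator may differ.

```latex
% VAR(p) model, with coefficient matrices A_1, ..., A_p
y_t = \sum_{k=1}^{p} A_k \, y_{t-k} + \varepsilon_t

% Regular lasso: the same penalty on every coefficient
\hat{\beta}^{\text{lasso}}
  = \arg\min_{\beta} \; \lVert Y - X\beta \rVert_2^2
    + \lambda \sum_{j} \lvert \beta_j \rvert

% Adaptive lasso: coefficient-specific weights from an initial estimate \tilde{\beta}
\hat{\beta}^{\text{alasso}}
  = \arg\min_{\beta} \; \lVert Y - X\beta \rVert_2^2
    + \lambda \sum_{j} \hat{w}_j \lvert \beta_j \rvert,
\qquad
\hat{w}_j = \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}}, \quad \gamma > 0
```

Coefficients whose initial estimates are small receive large weights and are pushed to zero more strongly, which is the usual explanation for why the adaptive lasso selects fewer spurious non-zero coefficients than the regular lasso.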

Applying Centrality Analysis to Solve the Cold-Start and Sparsity Problems in Collaborative Filtering (협업필터링의 신규고객추천 및 희박성 문제 해결을 위한 중심성분석의 활용)

  • Cho, Yoon-Ho;Bang, Joung-Hae
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.99-114 / 2011
  • Collaborative Filtering (CF) suffers from two major problems: sparsity and cold-start recommendation. This paper focuses on the cold-start problem for new customers with no purchase records and the sparsity problem for customers with very few purchase records. For this purpose, we propose a method for new-customer recommendation that uses a combined measure based on three widely used centrality measures to identify the customers who are most likely to become neighbors of the new customer. To alleviate the sparsity problem, we also propose a hybrid approach that applies our method to customers with very few purchase records and CF to the other customers with sufficient purchases. To evaluate the effectiveness of our method, we conducted several experiments using a data set from a department store in Korea. The experimental results show that the combination of two measures makes better recommendations than both a single measure and the best-seller-based method, and that performance improves when the hybrid approach is applied.

Pattern Classification and Performance Evaluation of Sparse Data in OLAP Systems (OLAP시스템에서 희박 데이터의 패턴 분류 및 성능 평가)

  • 강주영;이봉재;송재주;신진호;용환승
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.178-180 / 2004
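  • OLAP (On-Line Analytical Processing) must guarantee fast query response so that users can interact with the vast amounts of data in a data warehouse. Because this requires an OLAP system to perform large numbers of multidimensional aggregations, systems generally store precomputed results to reduce direct aggregation and thereby improve response performance. The sparsity of OLAP multidimensional data can cause data explosion during this precomputation and end up degrading performance instead. This paper examines data sparsity and the associated performance problems, and defines the sparsity patterns of multidimensional data that can arise in OLAP applications. We also implemented a data generator that produces sparse data according to the defined patterns, and used the generated data to evaluate and compare the performance of two OLAP products, MS SQL Server Analysis Services and Pilot DSS.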

Illumination Estimation Based on Nonnegative Matrix Factorization with Dominant Chromaticity Analysis (주색도 분석을 적용한 비음수 행렬 분해 기반의 광원 추정)

  • Lee, Ji-Heon;Kim, Dae-Chul;Ha, Yeong-Ho
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.8 / pp.89-96 / 2015
  • The human visual system uses chromatic adaptation to determine the color of an object regardless of illumination, whereas a digital camera records illumination and reflectance together, so the color appearance of a scene varies under different illumination. NMFsc (nonnegative matrix factorization with sparseness constraint) was recently introduced to estimate the original object color by using a sparseness constraint. In NMFsc, a low sparseness constraint is used to estimate illumination and a high sparseness constraint is used to estimate reflectance. However, NMFsc produces illumination estimation errors for images with a large uniform area, which is regarded as dominant chromaticity. To overcome this defect of NMFsc, illumination estimation via nonnegative matrix factorization with dominant chromaticity analysis is proposed. First, the image is converted to a chromaticity color space and analyzed with a chromaticity histogram. The chromaticity histogram segments the original image into regions of similar chromaticity. The segmented region with the lowest standard deviation is taken as the dominant chromaticity region. Next, the dominant chromaticity is removed from the original image. Then, illumination estimation using nonnegative matrix factorization is performed on the image without the dominant chromaticity. To evaluate the proposed method, experimental results on a real-world dataset are analyzed in terms of average angular error; the proposed method, with an average angular error of 5.5, achieves better illuminant estimation than the previous method, with an average angular error of 5.7.
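The evaluation metric named in the abstract above, average angular error, is commonly computed as the angle between the estimated and ground-truth illuminant vectors, averaged over the images. The sketch below follows that common definition; the RGB vector representation, the use of degrees, and the function names are assumptions, since the abstract does not state them.

```python
import math


def angular_error_deg(estimated, ground_truth):
    """Angle in degrees between two illuminant vectors (e.g. RGB triples)."""
    dot = sum(e * g for e, g in zip(estimated, ground_truth))
    norm_e = math.sqrt(sum(e * e for e in estimated))
    norm_g = math.sqrt(sum(g * g for g in ground_truth))
    # Clamp to [-1, 1] to guard against floating-point rounding
    cos_angle = max(-1.0, min(1.0, dot / (norm_e * norm_g)))
    return math.degrees(math.acos(cos_angle))


def average_angular_error(pairs):
    """Mean angular error over (estimated, ground_truth) pairs."""
    errors = [angular_error_deg(est, gt) for est, gt in pairs]
    return sum(errors) / len(errors)
```

Because only the direction of each vector matters, the metric ignores overall brightness: `(1, 1, 1)` and `(2, 2, 2)` describe the same illuminant color and yield an error of approximately zero.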

Sparse Design Problem in Local Linear Quasi-likelihood Estimator (국소선형 준가능도 추정량의 자료 희박성 문제 해결방안)

  • Park, Dong-Ryeon
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.133-145 / 2007
  • The local linear estimator has a number of advantages over traditional kernel estimators, better performance near boundaries being one of them. However, the local linear estimator can produce erratic results in sparse regions of the realized design, and much research has been done to solve this problem. The local linear quasi-likelihood estimator shares many properties with the local linear estimator, and it turns out that a sparse design can also lead the local linear quasi-likelihood estimator to behave erratically in practice. Several methods to solve this problem are proposed, and their finite-sample properties are compared in a simulation study.

A Movie Recommendation Method Using Rating Difference Between Items (항목 간 선호도 차이를 이용한 영화 추천 방법)

  • Oh, Se-Chang;Choi, Min
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.11 / pp.2602-2608 / 2013
  • User-based and item-based methods have been developed as solutions to the movie recommendation problem. However, these methods face the sparsity problem and the problem of not reflecting the user's ratings, respectively. To solve these problems, research has combined the two methods using the concept of similarity. In practice, this combined approach is not free from the sparsity problem, since it has many parameters to calculate. In this study, we propose a recommendation method that uses the rating difference between items to address this problem. This method is relatively free from the sparsity problem, since it has fewer parameters to calculate, and it can produce more accurate results by reflecting the user's ratings when calculating the parameters. In experiments with the proposed method, the initial error was large, but performance stabilized quickly. In addition, it showed an average error 0.0538 lower than the existing similarity-based method.
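The abstract above does not spell out how rating differences between items are used. One well-known technique built on the same idea is weighted Slope One, sketched below purely as an illustration; it is not claimed to be the authors' method, and the function names and sample ratings are hypothetical.

```python
from collections import defaultdict


def item_differences(ratings):
    """Compute mean rating differences between item pairs.

    ratings: {user: {item: rating}}
    Returns (dev, counts), where dev[i][j] is the average of (r_i - r_j)
    over users who rated both items and counts[i][j] is the number of
    such users.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    sums[i][j] += r_i - r_j
                    counts[i][j] += 1
    dev = {i: {j: sums[i][j] / counts[i][j] for j in sums[i]} for i in sums}
    return dev, counts


def predict(ratings, dev, counts, user, target):
    """Weighted Slope One prediction of `user`'s rating for `target`."""
    numerator, denominator = 0.0, 0
    for j, r_j in ratings[user].items():
        if j != target and j in dev.get(target, {}):
            weight = counts[target][j]
            numerator += (dev[target][j] + r_j) * weight
            denominator += weight
    return numerator / denominator if denominator else None


# Hypothetical ratings (illustrative only)
ratings = {
    "alice": {"A": 5, "B": 3, "C": 2},
    "bob": {"A": 3, "B": 4},
    "carol": {"B": 2, "C": 5},
}
dev, counts = item_differences(ratings)
prediction = predict(ratings, dev, counts, "carol", "A")  # 13/3, about 4.33
```

The only parameters are one average difference per item pair, which is why difference-based schemes of this kind need comparatively little data per parameter.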

Improved Movie Recommendation System based-on Personal Propensity and Collaborative Filtering (개인성향과 협업 필터링을 이용한 개선된 영화 추천 시스템)

  • Park, Doo-Soon
    • KIPS Transactions on Computer and Communication Systems / v.2 no.11 / pp.475-482 / 2013
  • Several approaches to recommendation systems have been studied. One of the most successful technologies for building personalization and recommendation systems is collaborative filtering, a technique that filters customer information based on information profiles. Collaborative filtering systems, however, suffer from sparsity when there is not enough data on which to base recommendations. In this paper, we propose a movie recommendation system based on weighted personal propensity and collaborative filtering to address this sparsity. Furthermore, we assess the system's applicability using the open MovieLens database and present a weighted personal propensity framework for improving the performance of recommender systems. We obtain a movie recommendation system by using the optimal personalization factors.

Two-step Clustering Method Using Time Schema for Performance Improvement in Recommender System (시간스키마 기법 2단계 클러스터링 적용 추천시스템의 성능 향상)

  • Kim Ryong;Bu Jong-Su;Hong Jong-Kyu;Park Won-Ik;Kim Young-Kuk
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.205-207 / 2005
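  • Existing recommender systems suffer from a scalability problem, in which recommendation time grows as the number of users increases, and a sparsity problem, in which recommendation accuracy drops for new customers because preference information is lacking. This paper addresses the sparsity problem for new customers with a first-stage clustering method that forms and clusters groups by gender and age, the most discriminating attributes in the basic customer profile, and recommends each group's preferred products first. It then proposes a second-stage clustering method that applies a time schema technique, clustering preference trends over time based on feedback on the recommendation results, which addresses the scalability problem and also improves prediction accuracy.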
