• Title/Summary/Keyword: PARAFAC decomposition

Search results: 5

S-PARAFAC: Distributed Tensor Decomposition using Apache Spark

  • Yang, Hye-Kyung;Yong, Hwan-Seung
    • Journal of KIISE / v.45 no.3 / pp.280-287 / 2018
  • The use of recommendation systems and of tensor data analysis, which deals with high-dimensional data, has been increasing recently, because analyzing a tensor makes it possible to extract latent factors and patterns. However, because tensors are large and complex, they must be decomposed before the data can be analyzed. Tools such as rTensor, pyTensor, and MATLAB are used for tensor decomposition, but they run on a single machine and therefore cannot handle large data. Distributed tensor decomposition tools based on Hadoop can scale to large tensors, but their computing speed is too slow. In this paper, we propose S-PARAFAC, a tensor decomposition tool based on Apache Spark that runs in a distributed in-memory environment. We converted the PARAFAC algorithm into an Apache Spark version that enables rapid processing of tensor data, and we compared the performance of S-PARAFAC with that of a Hadoop-based tensor tool. The results showed that S-PARAFAC is approximately 4 to 25 times faster than the Hadoop-based tensor tool.
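
For readers unfamiliar with the PARAFAC algorithm named in this abstract, the following is a minimal single-machine sketch of PARAFAC (CP) fitted by alternating least squares for a third-order tensor. It uses NumPy only and is not the Spark implementation of S-PARAFAC; the function names (`khatri_rao`, `unfold`, `cp_als`, `reconstruct`) and the defaults for `n_iter` and `seed` are assumptions made for illustration.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (I x R) and (J x R) -> (I*J x R)."""
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def unfold(X, mode):
    """Mode-n unfolding; the remaining axes keep their original (C) order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to a third-order tensor X by ALS."""
    assert X.ndim == 3, "this sketch handles third-order tensors only"
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Fix the other two factor matrices and solve a linear
            # least-squares problem for the factor of the current mode.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(X, mode) @ kr @ np.linalg.pinv(gram)
    return factors

def reconstruct(factors):
    """Rebuild the tensor as a sum of R rank-one terms."""
    A, B, C = factors
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

Calling `cp_als(X, rank=R)` returns three factor matrices, and `reconstruct` gives the low-rank approximation whose distance from `X` measures the fit. A distributed version would have to partition the unfolding and the Khatri-Rao product across workers, which this sketch does not attempt.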

An Analysis of a Blogosphere using PARAFAC Decomposition

  • Kim, Ki-Nam;Kim, Sang-Wook;Kim, Jin-Woo
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.1253-1254 / 2011
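  • In this paper, we represent a blogosphere as a tensor and analyze it. The analysis shows that PARAFAC decomposition correctly identifies communities that represent specific topics, and that within each community it identifies the influential blogs and keywords as well as the authoritative posts.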

Nonnegative Tucker Decomposition

  • Kim, Yong-Deok;Choi, Seung-Jin
    • Journal of KIISE: Computing Practices and Letters / v.14 no.3 / pp.296-300 / 2008
  • Nonnegative tensor factorization (NTF) is a recent multiway (multilinear) extension of nonnegative matrix factorization (NMF), in which nonnegativity constraints are imposed on the CANDECOMP/PARAFAC model. In this paper, we consider the Tucker model with nonnegativity constraints and develop a new tensor factorization method, referred to as nonnegative Tucker decomposition (NTD). We derive multiplicative updating algorithms for various discrepancy measures: the least squares error function, the I-divergence, and the $\alpha$-divergence.
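
The two models contrasted in this abstract can be written out for a third-order tensor as follows. This is a sketch in standard notation, not taken from the paper: the symbols $\mathcal{X}$, $\mathcal{G}$, $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ and the 1/2 scaling of the least squares error are assumptions, and the $\alpha$-divergence and the multiplicative update rules themselves are omitted.

```latex
% CANDECOMP/PARAFAC model with nonnegativity constraints (NTF), R components
\mathcal{X} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r ,
\qquad \mathbf{A}, \mathbf{B}, \mathbf{C} \ge 0

% Tucker model with nonnegativity constraints (NTD), core tensor \mathcal{G}
\hat{\mathcal{X}} = \mathcal{G} \times_1 \mathbf{A} \times_2 \mathbf{B} \times_3 \mathbf{C} ,
\qquad \mathcal{G} \ge 0, \quad \mathbf{A}, \mathbf{B}, \mathbf{C} \ge 0

% Least squares error between the data and the model
D_{\mathrm{LS}} = \tfrac{1}{2} \, \lVert \mathcal{X} - \hat{\mathcal{X}} \rVert_F^2

% I-divergence (generalized Kullback-Leibler divergence)
D_{I} = \sum_{i,j,k} \left( x_{ijk} \log \frac{x_{ijk}}{\hat{x}_{ijk}} - x_{ijk} + \hat{x}_{ijk} \right)
```

The PARAFAC model is the special case of the Tucker model in which the core tensor $\mathcal{G}$ is superdiagonal, which is why NTD can be read as a generalization of NTF.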

PARAFAC Tensor Reconstruction for Recommender System based on Apache Spark

  • Im, Eo-Jin;Yong, Hwan-Seung
    • Journal of Korea Multimedia Society / v.22 no.4 / pp.443-454 / 2019
  • In recent years, there has been active research on recommender systems that consider three or more inputs beyond users and items, so that the data form a multi-dimensional array, also known as a tensor. The main issue with using a tensor is that it has many missing values, which makes it sparse. To address this, a tensor decomposition algorithm can reduce the tensor to lower-dimensional arrays called factor matrices. The tensor is then reconstructed by computing with the factor matrices, which fills the originally empty cells with predicted values. This is called tensor reconstruction. In this paper, we propose a user-based Top-K recommender system that uses normalized PARAFAC tensor reconstruction. The method factorizes a tensor into factor matrices and then reconstructs the tensor from them. Before decomposition, the original tensor is normalized along each dimension to reduce overfitting. Using a real-world dataset, this paper demonstrates the processing of a large amount of data and implements the recommender system on Apache Spark. The study also confirms that normalizing the tensor improves recommendation performance.
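
The reconstruction step described in this abstract can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the paper's method: the tensor is taken to be (users x items x contexts), the factor matrices are assumed to be already fitted, the names `reconstruct` and `top_k_items` are hypothetical, and the normalization step is not shown because the abstract does not specify it.

```python
import numpy as np

def reconstruct(A, B, C):
    """Rebuild a (users x items x contexts) tensor from PARAFAC factor matrices."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def top_k_items(X_hat, observed, user, context, k):
    """Return the k unseen items with the highest predicted score.

    X_hat    : reconstructed tensor holding a predicted value in every cell
    observed : boolean tensor of the same shape, True where the original
               tensor had a value (those items are not recommended again)
    """
    scores = X_hat[user, :, context].astype(float)
    scores[observed[user, :, context]] = -np.inf   # mask already-seen items
    order = np.argsort(-scores, kind="stable")[:k]
    return [int(i) for i in order if np.isfinite(scores[i])]
```

Cells that were empty in the original tensor receive predicted values in `X_hat`, and ranking those values for one user yields the Top-K list. Recommending per (user, context) pair is an assumption of this sketch; the abstract only states that the system is user-based.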

A Study on the Dynamics of Dissolved Organic Matter Associated with Ambient Biophysicochemical Factors in the Sediment Control Dam (Lake Youngju)

  • Oh, Hye-Ji;Kim, Dokyun;Choi, Jisoo;Chae, Yeon-Ji;Oh, Jong Min;Shin, Kyung-Hoon;Choi, Kwangsoon;Kim, Dong-Kyun;Chang, Kwang-Hyeon
    • Korean Journal of Ecology and Environment / v.54 no.4 / pp.346-362 / 2021
  • A sediment control dam is an artificial structure built to delay sedimentation in the main dam by reducing the inflow of suspended solids. Such dams can alter dissolved organic matter (DOM) in the water body by changing the river flow regime. The main DOM components at the Yeongju Dam sediment control dam on the Naeseongcheon River were analyzed through 3D excitation-emission matrix (EEM) and parallel factor (PARAFAC) analyses. As a result, four humic-like components (C1~C3, C5) and three protein-like (tryptophan-like) components (C2, C6~C7) were detected. Among the DOM components, humic-like components (autochthonous: C1, allochthonous: C2~C3) were dominant during the sampling period. Neither the total amount of DOM components nor the composition ratio of each component differed among depths defined by the amount of available light (100%, 12%, and 1%). Throughout the study period, allochthonous organic matter was continuously decomposed and converted into autochthonous organic matter; the DOM indices (fluorescence index, humification index, and freshness index) indicated the dominance of autochthonous organic matter in the river. Considering the relative abundance of cyanobacteria, and that the numbers of bacterial cells and rotifers increased as autochthonous organic matter increased, it was suggested that the algal bloom and the consequent activation of the microbial food web were affected by the composition of DOM in the water body. Research on DOM characteristics is important not only for water quality management but also for understanding the cycling of matter through microbial food web activity.