• Title/Summary/Keyword: Non-negative matrix factorization (비음수 행렬 분해)


Document Summarization using Term Weighting (용어 가중치에 의한 문서요약)

  • Park, Sun;Kim, Chul Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.704-706
    • /
    • 2012
  • In this paper, we propose a document summarization method using term weighting. The proposed method minimizes user intervention by using pseudo relevance feedback. It can also improve the quality of document summaries because the inherent semantics of the sentence set are well reflected by term weights derived from semantic features.


Topic-Based Multi-Document Summarization using Semantic Features of Documents (문서의 의미특징을 이용한 주제 기반의 다중문서 요약)

  • Park, Sun;An, Dong Un;Kim, Chul-Won
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.11a
    • /
    • pp.715-716
    • /
    • 2009
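  • The growth of the Internet has produced vast amounts of information, and within such large collections similar information is reused or repeated, which creates an information redundancy problem. The need for summarization that lets users quickly find the information they want among redundant material is steadily increasing. This paper proposes a new method for topic-based multi-document summarization that uses the semantic features of documents obtained by non-negative matrix factorization (NMF). The method can improve summary quality by exploiting the inherent structure among the documents in a multi-document set, and it can easily remove redundant information when summarizing by considering both the similarity and the diversity between the topic and the sentences.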

Enhancing Document Clustering Method using Synonym of Cluster Topic and Similarity (군집 주제의 유의어와 유사도를 이용한 문서군집 향상 방법)

  • Park, Sun;Kim, Chul-Won
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2011.04a
    • /
    • pp.1538-1541
    • /
    • 2011
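  • This paper proposes a method that improves document clustering performance by using synonyms of cluster topics and similarity. The proposed method selects the terms of each cluster topic using the semantic features of non-negative matrix factorization, so it can represent the internal structure of the document cluster set well. It also expands the cluster-topic terms with WordNet synonyms, which addresses the problem of representing documents as a bag of words. In addition, it applies cosine similarity between the expanded cluster-topic terms and the document set, so that documents are grouped with the cluster topic they fit, which improves performance. Experimental results show that the document clustering method using the proposed approach performs better than other document clustering methods.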

A NMF-Based Speech Enhancement Method Using a Prior Time Varying Information and Gain Function (시간 변화에 따른 사전 정보와 이득 함수를 적용한 NMF 기반 음성 향상 기법)

  • Kwon, Kisoo;Jin, Yu Gwang;Bae, Soo Hyun;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38C no.6
    • /
    • pp.503-511
    • /
    • 2013
  • This paper presents a speech enhancement method using non-negative matrix factorization. In the training phase, a basis matrix is obtained from each of the speech and specific-noise databases. In the enhancement phase, the noisy signal is separated into speech and noise estimates using these basis matrices. To improve performance, we model the change of the encoding matrix from the training phase to the enhancement phase with independent Gaussian distributions, and we add to the objective function a constraint that closely matches these Gaussian models. We also smooth the encoding matrix by taking its previous values into account. Finally, we apply a Log-Spectral Amplitude type algorithm as the gain function.
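The enhancement phase described in this abstract (fixed bases from training, encoding matrix estimated on the noisy input) can be illustrated with a minimal sketch. It assumes the standard multiplicative update for the Frobenius objective and swaps in a simple Wiener-style ratio mask in place of the Log-Spectral Amplitude gain; the Gaussian constraint and the smoothing step are omitted. The function `fit_activations`, the toy bases, and the toy spectrogram are hypothetical and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def fit_activations(v, w, iters=300, eps=1e-9):
    """Estimate H >= 0 such that V ~= W @ H, keeping the basis matrix W fixed."""
    rank, n_cols = len(w[0]), len(v[0])
    h = [[1.0] * n_cols for _ in range(rank)]
    wt = transpose(w)
    wtv = matmul(wt, v)   # constant numerator W^T V
    wtw = matmul(wt, w)   # constant Gram matrix W^T W
    for _ in range(iters):
        den = matmul(wtw, h)
        h = [[h[i][j] * wtv[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
    return h


# Toy bases (frequency bins x basis vectors); assumed to come from a training phase.
speech_basis = [[1.0], [0.5], [0.0]]
noise_basis = [[0.0], [0.5], [1.0]]
w = [s + n for s, n in zip(speech_basis, noise_basis)]  # columns side by side

# Toy noisy magnitude spectrogram (frequency bins x frames).
noisy = [[2.0, 1.0], [1.5, 2.0], [1.0, 3.0]]

h = fit_activations(noisy, w)
n_speech = len(speech_basis[0])
speech_part = matmul(speech_basis, h[:n_speech])
noise_part = matmul(noise_basis, h[n_speech:])

# Wiener-style ratio mask (an assumption; the paper uses an LSA-type gain).
speech_est = [[v * s / (s + n + 1e-9) for v, s, n in zip(vr, sr, nr)]
              for vr, sr, nr in zip(noisy, speech_part, noise_part)]
```

Because the bases stay fixed, only the encoding matrix `h` is updated at enhancement time, which is what keeps this step cheap compared with a full factorization.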

Target Speaker Speech Restoration via Spectral bases Learning (주파수 특성 기저벡터 학습을 통한 특정화자 음성 복원)

  • Park, Sun-Ho;Yoo, Ji-Ho;Choi, Seung-Jin
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.3
    • /
    • pp.179-186
    • /
    • 2009
  • This paper proposes a target speech extraction method that restores the speech signal of a target speaker from a noisy convolutive mixture of speech and an interference source. We assume that the target speaker is known and that his or her utterances are available at training time. To incorporate the additional information extracted from the training utterances into the separation, we combine convolutive blind source separation (CBSS) with non-negative decomposition techniques, e.g., a probabilistic latent variable model. The non-negative decomposition is used to learn a set of bases from the spectrogram of the training utterances, where the bases represent the spectral information of the target speaker. Based on the learned spectral bases, our method provides two postprocessing steps for CBSS. The channel selection step finds the CBSS output channel that predominantly contains the target speech. The reconstruction step recovers the original spectrogram of the target speech from the selected output channel, so that the remaining interference and background noise are suppressed. Experimental results show that our method substantially improves the separation results of CBSS and, as a result, successfully recovers the target speech.

Query-Based Text Summarization Using Cosine Similarity and NMF (NMF 와 코사인유사도를 이용한 질의 기반 문서요약)

  • Park Sun;Lee Ju-Hong;Ahn Chan-Min;Park Tae-Su;Song Jae-Won;Kim Deok-Hwan
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2006.05a
    • /
    • pp.473-476
    • /
    • 2006
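  • With the development of the Internet, the amount of information is growing explosively over time. Faced with this mass of information, retrieval systems present users with too many search results, so users spend too much time finding what they want; this is the information overload problem. Query-based document summarization is becoming increasingly important as a way to address information overload by reducing the time users need to find the information they want. This paper proposes a new method for query-based document summarization using non-negative matrix factorization (NMF) and cosine similarity. The proposed method requires no prior training between queries and documents. In addition, without the complex processing of converting documents into graphs, it reflects the inherent structure of the document through the semantic features and semantic variables obtained by NMF, which can improve summarization accuracy. Finally, it can summarize sentences easily with a simple procedure.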
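As a rough illustration of how NMF and cosine similarity can be combined for query-based sentence ranking, the sketch below factors a toy term-by-sentence matrix with the standard multiplicative updates and scores each sentence by weighting its semantic variables with the query's cosine similarity to each semantic feature. The scoring rule, the names `nmf` and `cosine`, and the toy data are assumptions made for illustration; the abstract does not specify the paper's exact procedure.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iters=300, seed=0, eps=1e-9):
    """Factor non-negative V (terms x sentences) as W @ H with multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0


# Toy term-by-sentence counts; rows follow `terms`, columns are sentences 0..2.
terms = ["nmf", "matrix", "speech", "noise"]
v = [
    [2.0, 0.0, 1.0],
    [1.0, 0.0, 2.0],
    [0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0],
]
query = [0.0, 1.0, 0.0, 0.0]  # the query contains only the term "matrix"

w, h = nmf(v, rank=2)  # columns of w: semantic features; rows of h: semantic variables
feature_sims = [cosine(query, feature) for feature in transpose(w)]
scores = [sum(sim * h[k][j] for k, sim in enumerate(feature_sims))
          for j in range(len(v[0]))]
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

Here `ranked` lists sentence indices from most to least relevant to the query, and a summary would take the top few. No training on query and document pairs is involved, which matches the claim in the abstract.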


Document Summarization using Pseudo Relevance Feedback and Term Weighting (의사연관피드백과 용어 가중치에 의한 문서요약)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.3
    • /
    • pp.533-540
    • /
    • 2012
  • In this paper, we propose a document summarization method using pseudo relevance feedback and term weighting based on semantic features. The proposed method minimizes user intervention by using pseudo relevance feedback. It can also improve the quality of document summaries because the inherent semantics of the sentence set are well reflected by term weights derived from semantic features. In addition, it uses the semantic features of term weighting and the expanded query to reduce the semantic gap between the user's requirement and the result of the proposed method. The experimental results demonstrate that the proposed method achieves better performance than other methods without term weighting.

Enhancing Snippet Extraction Method using Fuzzy and Semantic Features (퍼지와 의미특징을 이용한 스니핏 추출 향상 방법)

  • Park, Sun;Lee, Yeonwoo;Cho, Kwangmoon;Yang, Huyeol;Lee, Seong Ro
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.11
    • /
    • pp.2374-2381
    • /
    • 2012
  • This paper proposes a new method for enhancing snippet extraction using fuzzy and semantic features. The proposed method creates a representative (delegate) sentence by using semantic features. It extracts a snippet using the fuzzy association between the delegate sentence and the sentence set that best represents the query. In addition, the method uses pseudo relevance feedback to expand the query, so that the extracted snippet better reflects the user's semantic intention. The experimental results demonstrate that the proposed method achieves better snippet extraction performance than previous methods.

A Hybrid Collaborative Filtering Using a Low-dimensional Linear Model (저차원 선형 모델을 이용한 하이브리드 협력적 여과)

  • Ko, Su-Jeong
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.10
    • /
    • pp.777-785
    • /
    • 2009
  • Collaborative filtering is a technique used to predict whether a particular user will like a particular item. User-based and item-based collaborative techniques have been used extensively in many commercial recommender systems. In this paper, a hybrid collaborative filtering method that combines user-based and item-based methods using a low-dimensional linear model is proposed. The proposed method addresses the problems of sparsity and large databases by using NMF, one of the low-dimensional linear models. In collaborative filtering systems, NMF-based methods are useful for expressing users in terms of semantic relations. However, they are model-based methods with a complex computation process, so they cannot recommend items dynamically. To address these shortcomings, the proposed method clusters users into groups by using NMF and selects the features of each group by using TF-IDF. Mutual information is then used to compute similarities between items. The proposed method clusters users into groups and extracts the group features offline, and it determines the most suitable group for an active user online by using those group features. As a result, the proposed method reduces the time required to classify an active user into a group and outperforms previous methods by combining user-based and item-based collaborative filtering.

Audio signal clustering and separation using a stacked autoencoder (복층 자기부호화기를 이용한 음향 신호 군집화 및 분리)

  • Jang, Gil-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.35 no.4
    • /
    • pp.303-309
    • /
    • 2016
  • This paper proposes a novel approach to the problem of audio signal clustering using a stacked autoencoder. The proposed stacked autoencoder learns an efficient representation of the input signal and enables constituent signals with similar characteristics to be clustered, so the original sources can be separated based on the clustering results. The STFT (Short-Time Fourier Transform) is performed to extract the time-frequency spectrum, and rectangular windows at all possible locations are used as input values to the autoencoder. The outputs of the middle (encoding) layer are used to cluster the rectangular windows, and the original sources are separated by Wiener filters derived from the clustering results. Source separation experiments were carried out in comparison with conventional NMF (Non-negative Matrix Factorization), and the sources estimated by the proposed method represent the characteristics of the original sources well, as shown in the time-frequency representation.