• Title/Summary/Keyword: Feature statistics


Feature analysis and ranking prediction of music suspected of being abused (사재기 의혹 음원 특징 분석과 순위 예측)

  • Cheong, Hae Rin;Kim, Do Young;Jeong, Hyeon Jeong;Kim, Seong Gyeong;Kim, Hyeon Hee
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.388-391 / 2022
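  • As online music streaming services expand, chart manipulation through bulk buying (사재기) of music is becoming more frequent. This paper analyzes the features of tracks that can be suspected of bulk buying and predicts the chart rank those tracks would have had without it. Using a random forest, we found that tracks with low album ratings, tracks in the indie or ballad genres, and tracks from particular distributors could be suspected of bulk buying. In addition, rank prediction experiments with deep learning showed a large gap between the actual and predicted ranks, attributable to the effect of bulk buying.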

Noisy Speech Recognition Based on Noise-Adapted HMMs Using Speech Feature Compensation

  • Chung, Yong-Joo
    • Journal of the Institute of Convergence Signal Processing / v.15 no.2 / pp.37-41 / 2014
  • The vector Taylor series (VTS) based method usually employs clean speech Hidden Markov Models (HMMs) when compensating speech feature vectors or adapting the parameters of trained HMMs. It is well known that noisy speech HMMs trained by Multi-condition TRaining (MTR) and the Multi-Model-based Speech Recognition framework (MMSR) perform better than clean speech HMMs in noisy speech recognition. In this paper, we propose a method that uses noise-adapted HMMs in VTS-based speech feature compensation. We derive a novel mathematical relation between the training and test noisy speech feature vectors in the log-spectrum domain, and VTS is used to estimate the statistics of the test noisy speech. An iterative EM algorithm estimates the training noisy speech from the test noisy speech along with the noise parameters. The proposed method was applied to noise-adapted HMMs trained by MTR and MMSR and achieved a significant relative reduction in word error rate in noisy speech recognition experiments on the Aurora 2 database.
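
For orientation, the block below gives the standard clean-to-noisy VTS relation in the log-spectrum domain for a single channel with additive noise only. It is a textbook form stated here as an assumption; the abstract does not give the paper's own relation between training and test noisy speech, and this is not that relation. The symbols x, n, y, their means, and G are illustrative.

```latex
% Standard VTS mismatch function (assumed; not quoted from the paper).
% x: clean speech, n: additive noise, y: noisy speech (all log-spectra).
\begin{align}
y &= x + \log\bigl(1 + e^{\,n - x}\bigr) \\
y &\approx \mu_x + \log\bigl(1 + e^{\,\mu_n - \mu_x}\bigr)
      + G\,(x - \mu_x) + (1 - G)\,(n - \mu_n),
\qquad G = \frac{1}{1 + e^{\,\mu_n - \mu_x}}
\end{align}
```

The second line is the first-order Taylor expansion around the means, which is what makes the statistics of y tractable when x and n are modeled as Gaussians.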

Category Factor Based Feature Selection for Document Classification

  • Kang Yun-Hee
    • International Journal of Contents / v.1 no.2 / pp.26-30 / 2005
  • With the rapid growth of information on the Internet, it is becoming increasingly difficult to find and organize useful information. To reduce information overload, automatic text classification is needed to handle very large numbers of documents. A Support Vector Machine (SVM) is a model computed as a weighted sum of kernel function outputs. This paper describes a classifier for web documents in the field of Information Technology that uses an SVM to learn a model constructed from the training sets and their representative terms. The basic idea is to exploit, with simple statistical methods, the distribution of the representative terms' meanings in the coherent thematic texts of each category. A vector-space model represents the documents in the categories, using a feature selection scheme based on TF-IDF. We apply to the feature selection a category factor that represents the effect of a term within a category. Experiments show the results of categorization and the correlation of vector length.
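
The TF-IDF weighting mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the abstract does not define the category factor, so it is left out, and the function names and toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency and idf = log(N / df); other TF-IDF
    variants exist, and the abstract does not say which one was used.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def top_terms(vector, k):
    """Select the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, one token list per document.
docs = [
    "svm kernel margin svm".split(),
    "kernel network router".split(),
    "network router packet router".split(),
]
vecs = tfidf_vectors(docs)
print(top_terms(vecs[0], 1))
```

A term that is frequent in one document but rare across the corpus gets the largest weight, which is why such terms are kept as features.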


Long Memory Characteristics in the Korean Stock Market Volatility

  • Cho, Sinsup;Choe, Hyuk;Park, Joon Y
    • Communications for Statistical Applications and Methods / v.9 no.3 / pp.577-594 / 2002
  • The semiparametric approach of Geweke and Porter-Hudak (1983) is employed to estimate and test the long-memory feature in the volatilities of stock indices and individual companies. The empirical study provides strong evidence of volatility persistence in the Korean stock market. Most indices and individual companies exhibit long-term dependence in volatility. Hence short-memory models are unable to explain the volatilities in the Korean stock market.
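
The Geweke and Porter-Hudak estimator regresses the log periodogram at the lowest Fourier frequencies on log(4 sin²(λ/2)); the negated slope estimates the memory parameter d. The sketch below is a minimal pure-Python version under stated assumptions: the bandwidth m = n^0.5 is a common default rather than a setting taken from the paper, and the input here is simulated white noise, whereas the paper works with volatility series.

```python
import cmath
import math
import random


def gph_estimate(x, power=0.5):
    """Log-periodogram (GPH) estimate of the memory parameter d."""
    n = len(x)
    m = int(n ** power)  # number of low frequencies used (assumed default)
    mean = sum(x) / n
    regressors, log_periodogram = [], []
    for j in range(1, m + 1):
        lam = 2 * math.pi * j / n
        # Discrete Fourier transform of the demeaned series at frequency lam.
        dft = sum((v - mean) * cmath.exp(-1j * lam * t) for t, v in enumerate(x))
        regressors.append(math.log(4 * math.sin(lam / 2) ** 2))
        log_periodogram.append(math.log(abs(dft) ** 2 / (2 * math.pi * n)))
    # Ordinary least squares slope of log periodogram on the regressor.
    rbar = sum(regressors) / m
    pbar = sum(log_periodogram) / m
    sxy = sum((r - rbar) * (p - pbar) for r, p in zip(regressors, log_periodogram))
    sxx = sum((r - rbar) ** 2 for r in regressors)
    return -sxy / sxx


random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(1024)]  # short-memory benchmark
print(round(gph_estimate(white), 3))
```

For a short-memory series the estimate should sit near zero; values clearly above zero, as the paper reports for most volatility series, indicate long memory.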

Arrow Diagrams for Kernel Principal Component Analysis

  • Huh, Myung-Hoe
    • Communications for Statistical Applications and Methods / v.20 no.3 / pp.175-184 / 2013
  • Kernel principal component analysis (PCA) maps observations in a nonlinear feature space to a reduced-dimensional plane of principal components. We do not need to specify the feature space explicitly because the procedure uses the kernel trick. In this paper, we propose a graphical scheme to represent variables in kernel principal component analysis. In addition, we propose an index that measures the importance of individual variables in the principal component plane.

Visualizing SVM Classification in Reduced Dimensions

  • Huh, Myung-Hoe;Park, Hee-Man
    • Communications for Statistical Applications and Methods / v.16 no.5 / pp.881-889 / 2009
  • Support vector machines (SVMs) are known as flexible and efficient classifiers of multivariate observations, producing a hyperplane or hyperdimensional curved surface in a multidimensional feature space that best separates training samples by known groups. As various methodological extensions have been made to SVM classifiers in recent years, it has become more difficult to understand the constructed model intuitively. The aim of this paper is to visualize, in reduced dimensions, various SVM classifications tuned by several parameters, so that data analysts can form a tangible image of what the machine has produced.

A Feature Selection Method Based on Fuzzy Cluster Analysis (퍼지 클러스터 분석 기반 특징 선택 방법)

  • Rhee, Hyun-Sook
    • The KIPS Transactions:PartB / v.14B no.2 / pp.135-140 / 2007
  • Feature selection is a preprocessing technique commonly used on high-dimensional data. It studies how to select a subset or list of attributes that are used to construct models describing the data. Feature selection methods attempt to explore the intrinsic properties of data by employing statistics or information theory. Recent developments have involved approaches such as correlation methods, dimensionality reduction, and mutual information techniques. Feature selection has become the focus of much research in application areas with massive and complex data sets. In this paper, we provide a feature selection method that considers data characteristics and generalization capability. It provides a computational approach to feature selection based on fuzzy cluster analysis of attribute values and on performance measures. We apply it to a system for classifying computer viruses and compare it with a heuristic method that uses the contrast concept. Experimental results show that the proposed approach can give a feature ranking, select the features, and improve system performance.

Protein Motif Extraction via Feature Interval Selection

  • Sohn, In-Suk;Hwang, Chang-Ha;Ko, Jun-Su;Chiu, David;Hong, Dug-Hun
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1279-1287 / 2006
  • The purpose of this paper is to present a new algorithm for extracting the consensus pattern, or motif, from sequences belonging to the same family. Two methods of feature interval partitioning are considered: equal probability and equal width interval partitioning. C2H2 zinc finger protein and epidermal growth factor protein sequences are used to demonstrate the effectiveness of the proposed algorithm for motif extraction. For both protein families, the equal width interval partitioning method performs better than the equal probability interval partitioning method.
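
The two partitioning schemes named above differ only in where the interval edges fall. The sketch below contrasts them on a hypothetical numeric feature; how the paper derives feature values from protein sequences is not stated in the abstract, so the data and function names are assumptions for illustration.

```python
def equal_width_edges(values, k):
    """k intervals of identical width spanning [min, max]."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + i * step for i in range(k + 1)]


def equal_probability_edges(values, k):
    """k intervals holding roughly the same number of observations."""
    s = sorted(values)
    n = len(s)
    # Interior edges are empirical quantiles at i/k.
    return [s[0]] + [s[(i * n) // k] for i in range(1, k)] + [s[-1]]


def assign_bin(x, edges):
    """Index of the interval containing x; the last interval is closed."""
    for i in range(len(edges) - 2):
        if x < edges[i + 1]:
            return i
    return len(edges) - 2


def bin_counts(values, edges):
    """Number of observations falling into each interval."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        counts[assign_bin(v, edges)] += 1
    return counts


# Hypothetical feature values with one outlier.
values = [1, 2, 3, 4, 5, 6, 7, 100]
print(bin_counts(values, equal_width_edges(values, 4)))        # [7, 0, 0, 1]
print(bin_counts(values, equal_probability_edges(values, 4)))  # [2, 2, 2, 2]
```

Equal width keeps the interval geometry fixed and lets an outlier empty the middle bins, while equal probability balances the counts at the cost of uneven widths. Which behaves better depends on the data; the paper reports that equal width did better on its two protein families.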


An Intelligent Iris Recognition System (지능형 홍채 인식 시스템)

  • Kim, Jae-Min;Cho, Seong-Won;Kim, Soo-Lin
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.4 / pp.468-472 / 2004
  • This paper presents an intelligent iris recognition system that consists of a quality check, iris localization, feature extraction, and verification. For the quality check, local statistics on the pupil boundary are used. A Gaussian mixture model is used to segment and localize the iris region. The feature extraction method is based on an optimal waveform simplification. For verification, we use an intelligent variable threshold.

Fault Feature Clarification in the Residual for Fault Detection and Diagnosis of Control Systems

  • Lee, Jonghyo;Joon Lyou
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2002.10a / pp.96.3-96 / 2002
  • A scheme for clarifying the fault feature in the residual is given for model-based fault detection and diagnosis of control systems. It is based on residual generation using a robust filter and on noise suppression in the test statistics of the residual by a multi-scale discrete wavelet transform. By clarifying the fault feature in the residual, the difficulties that existing model-based approaches face in adopting a threshold can be overcome, and the scheme has the advantage of taking false alarms and missed detections into account at the same time, which can make fault detection and diagnosis easy and correct. To show the effectiveness of our approach, the simulation results are illustrated for a linear syste...
