• Title/Summary/Keyword: Mel

Comparison & Analysis of Speech/Music Discrimination Features through Experiments (실험에 의한 음성·음악 분류 특징의 비교 분석)

  • Lee, Kyung-Rok;Ryu, Shi-Woo;Gwark, Jae-Young
    • Proceedings of the Korea Contents Association Conference / 2004.11a / pp.308-313 / 2004
  • In this paper, we compare and analyze speech/music discrimination performance for combinations of feature parameters. Audio signals are classified into three classes (speech, music, and speech with music). Three types of features are used in the experiments: Mel-cepstrum, energy, and zero-crossings. We then compare the feature combinations to find which gives the best speech/music discrimination performance. The best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%).

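As a rough illustration of two of the three feature types named in this abstract, the sketch below computes short-time energy and zero-crossing rate per frame in plain Python. The frame length, hop size, and all function names are assumptions made for the example. The paper's own settings and its Mel-cepstrum computation are not stated in the abstract and are not reproduced here.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def energy_zcr_features(samples, frame_len=256, hop=128):
    """Return one (energy, zero-crossing rate) pair per frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop)]
```

In a single-feature-vector setup like the one the abstract describes, each pair would be concatenated with that frame's Mel-cepstrum coefficients before classification.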

Parts-based Feature Extraction of Speech Spectrum Using Non-Negative Matrix Factorization (Non-Negative Matrix Factorization을 이용한 음성 스펙트럼의 부분 특징 추출)

  • 박정원;김창근;허강인
    • Proceedings of the IEEK Conference / 2003.11a / pp.49-52 / 2003
  • In this paper, we propose a new speech feature parameter based on NMF (Non-Negative Matrix Factorization). NMF represents multi-dimensional data through effective dimensionality reduction, factorizing a matrix under a non-negativity constraint, and the reduced data express parts-based features of the input. We verify the usefulness of NMF for speech feature extraction by applying it to Mel-scaled filter bank outputs to obtain the feature parameter. Recognition experiments confirm that the proposed feature parameter outperforms the commonly used MFCC (mel-frequency cepstral coefficients) in recognition performance.

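The factorization this abstract relies on can be sketched with the widely used multiplicative update rules for the squared-error objective. This is a minimal plain-Python sketch, not the authors' implementation: the update rule choice, the random initialization, the rank, and the iteration count are all assumptions, and the input matrix V stands in for a matrix of non-negative Mel filter bank outputs (one row per channel, one column per frame).

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h
```

Under this layout the columns of H would serve as the reduced per-frame feature vectors, while the columns of W hold the non-negative spectral "parts".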

Synthesis and in vitro Cytotoxicity Monoterpenoid as New Antitumor Agents (Monoterpenoid계의 새로운 항암제 합성 및 In vitro 세포독성 평가)

  • 이민정;김대근;백형근;이강노;정규혁
    • Biomolecules & Therapeutics / v.9 no.3 / pp.143-155 / 2001
  • Much attention has been focused on developing new chemotherapeutic agents for cancer treatment from natural products. Various monoterpenoid compounds isolated from Carpesium divaricatum S. et Z. (Compositae) exhibited mild antitumor activity against human tumor cell lines. These findings prompted us to explore the structure-activity relationship of these compounds. The monoterpenoid compounds were synthesized using Fries rearrangement, Grignard reaction, elimination, allylic oxidation, esterification, and epoxidation as key steps. The in vitro cytotoxicity results (A549, SK-OV-3, SK-MEL-2, XF498, HCT15) for the synthesized compounds are as follows. First, the epoxide moiety is a prerequisite for cytotoxic activity in the diester compounds. Compounds with an olefin or diol moiety in place of the epoxide ring exhibited poor or mild cytotoxic activity, respectively. Among the o-acetoxy and isobutoxy epoxy esters, the p-substituted phenylacetate compounds exhibited high cytotoxic activity against SK-MEL-2 and HCT15.

Classification of pathological and normal voice based on dimension reduction of feature vectors (피처벡터 축소방법에 기반한 장애음성 분류)

  • Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
    • Proceedings of the KSPS conference / 2007.05a / pp.123-126 / 2007
  • This paper suggests a method to improve pathological/normal voice classification performance. The effectiveness of mel-frequency filter bank energies is analyzed using the Fisher discriminant ratio (FDR). Mel-frequency cepstral coefficients (MFCCs) and feature vectors obtained by a linear discriminant analysis (LDA) transformation of the filter bank energies (FBE) are then implemented. The results show that the FBE-LDA-based GMM discriminates pathological from normal voices better than the MFCC-based GMM.

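The Fisher discriminant ratio mentioned in this abstract is commonly defined, for a single feature and two classes, as the squared difference of the class means divided by the sum of the class variances. The sketch below uses that common two-class form as an assumption; the abstract does not state the exact variant the authors used, and the function names are illustrative.

```python
def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Population variance of a sequence."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def fisher_discriminant_ratio(a, b):
    """FDR of one feature: (mu_a - mu_b)^2 / (var_a + var_b).

    Assumes at least one class has non-zero variance.
    """
    return (mean(a) - mean(b)) ** 2 / (variance(a) + variance(b))


def fdr_per_feature(class_a, class_b):
    """Score every feature dimension (for example each filter bank channel).

    class_a and class_b are lists of feature vectors, one vector per utterance.
    """
    return [fisher_discriminant_ratio(col_a, col_b)
            for col_a, col_b in zip(zip(*class_a), zip(*class_b))]
```

A higher score marks a channel whose values separate the two classes more cleanly, which is how such a ratio can be used to judge the usefulness of individual filter bank energies.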

A study on the algorithm for speech recognition (음성인식을 위한 알고리즘에 관한 연구)

  • Kim, Sun-Chul;Lee, Jung-Woo;Cho, Kyu-Ok;Park, Jae-Gyun;Oh, Yong Taek
    • Proceedings of the KIEE Conference / 2008.07a / pp.2255-2256 / 2008
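  • Speech recognition systems are typically designed around one of two representative approaches: LPC (Linear Predictive Coding), which models the characteristics of the human vocal tract, and MFCC (Mel-Frequency Cepstral Coefficients), which takes auditory characteristics into account. In this paper, we extract feature parameters using MFCC and present the processing performed at each stage as graphs produced with MATLAB algorithms. In the MFCC extraction process, the original speech signal first passes through preprocessing, which converts the analog signal to a digital signal, minimizes the noise portion, and emphasizes the speech portion. The signal is then windowed to remove discontinuities in the speech, and an FFT converts it from the time domain to the frequency domain. The transformed signal passes through a filter bank, which reduces the many complex signals to a few simple ones, and finally the feature parameters are obtained through the Mel-cepstrum.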

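The stages listed in this abstract (preprocessing, windowing, FFT, filter bank, Mel-cepstrum) can be sketched for a single frame as follows. This is a plain-Python sketch under stated assumptions rather than the authors' MATLAB code: a pre-emphasis filter stands in for the "emphasize the speech portion" step, a naive DFT replaces the FFT for brevity, and the Hamming window, triangular filters, filter count, and coefficient count are conventional choices that the abstract does not specify.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def pre_emphasis(x, coeff=0.97):
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1] (assumed preprocessing)."""
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]


def hamming(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (an FFT would be used in practice)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]


def mel_filter_bank(num_filters, n_fft, sample_rate):
    """Triangular filters with centers equally spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + (high - low) * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising slope
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def dct2(values, num_coeffs):
    """Unnormalized DCT-II, keeping the first num_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
            for c in range(num_coeffs)]


def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Pre-emphasis, window, spectrum, mel filter bank, log, DCT for one frame."""
    window = hamming(len(frame))
    windowed = [s * w for s, w in zip(pre_emphasis(frame), window)]
    spectrum = power_spectrum(windowed)
    bank = mel_filter_bank(num_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, spectrum)) for row in bank]
    log_energies = [math.log(e + 1e-10) for e in energies]  # small floor avoids log(0)
    return dct2(log_energies, num_coeffs)
```

Calling `mfcc_frame(frame, 8000)` on a 128-sample frame returns 12 coefficients. The naive DFT keeps the sketch dependency-free but is only practical for short frames.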

A Study on the Human Auditory Scaling (인간의 청각 척도에 관한 고찰)

  • Yang, Byung-Gon
    • Speech Sciences / v.2 / pp.125-134 / 1997
  • Human beings can perceive various aspects of sound, including loudness, pitch, length, and timbre. Recently, many studies have been conducted to clarify the complex auditory scales of the human ear. This study critically reviews some of these scales (decibel, sone, and phon for loudness perception; mel and bark for pitch) and proposes applying them to normalize the acoustic correlates of human speech. One of the most important aspects of human auditory perception is its nonlinearity, which should be incorporated into linear speech analysis and synthesis systems. Further studies using more sophisticated equipment are desirable to refine these scales through the analysis of human auditory perception of complex tones or speech. This will help scientists develop better speech recognition and synthesis devices.

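For orientation, the scales named in this abstract are often approximated with the closed-form conversions below. These are commonly cited textbook approximations chosen for the example, not necessarily the formulas reviewed in the paper: the 2595 log10 mel formula, the Zwicker and Terhardt arctangent approximation for bark, the 20 micropascal reference for dB SPL, and the rule of thumb that loudness in sones doubles for every 10 phon above 40 phon.

```python
import math


def hz_to_mel(f_hz):
    """Pitch scale: roughly linear below 1 kHz, logarithmic above."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_bark(f_hz):
    """Critical-band rate, arctangent approximation."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def pressure_to_db_spl(p_pascal, p_ref=20e-6):
    """Sound pressure level in decibels relative to 20 micropascals."""
    return 20.0 * math.log10(p_pascal / p_ref)


def phon_to_sone(phon):
    """Loudness in sones; this rule of thumb is intended for levels of 40 phon and above."""
    return 2.0 ** ((phon - 40.0) / 10.0)
```

The nonlinearity the abstract stresses is visible directly: equal steps in hertz map to progressively smaller steps in mel or bark as frequency rises.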

New Cytotoxic Sulfated Saponins from the Starfish Certonardoa semiregularis

  • Wang Wei Hong;Jang Hyo Jin;Hong Jong Ki;Lee Chong Ok;Bae Song Ja;Shin Sook;Jung Jee H.
    • Archives of Pharmacal Research / v.28 no.3 / pp.285-289 / 2005
  • Two new sulfated saponins designated as certonardosides P$_{2}$ and I$_{3}$ (1 and 2) were isolated from the brine shrimp active fraction of the MeOH extract of the starfish Certonardoa semiregularis. The structures were determined on the basis of spectral analysis. Compounds 1 and 2 were tested for cytotoxicity against five human tumor cell lines (A549, SK-OV-3, SK-MEL-2, XF498, and HCT15), and compound 1 displayed significant cytotoxicity against the SK-MEL-2 skin cancer cell.

Classification of Phornographic Videos Based on the Audio Information (오디오 신호에 기반한 음란 동영상 판별)

  • Kim, Bong-Wan;Choi, Dae-Lim;Lee, Yong-Ju
    • MALSORI / no.63 / pp.139-151 / 2007
  • As the Internet becomes prevalent in our lives, harmful content such as pornographic videos has been increasing online, which has become a very serious problem. To address this, many filtering systems exist, mainly based on keyword- or image-based methods. The main purpose of this paper is to devise a system that classifies pornographic videos based on audio information. As the feature vector, we use the MFCC together with the mel-cepstrum modulation energy (MCME), a modulation energy calculated on the time trajectory of the mel-frequency cepstral coefficients (MFCC). For the classifier, we use the well-known Gaussian mixture model (GMM). The experimental results show that the proposed system correctly classified 98.3% of pornographic data and 99.8% of non-pornographic data. We expect the proposed method can be applied to a more accurate classification system that uses both video and audio information.

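The abstract describes MCME only as a modulation energy computed on the time trajectory of the cepstral coefficients. One generic way to compute such a quantity is sketched below: take the sequence of values of one coefficient across frames, transform it along time, and sum the power inside a modulation-frequency band. This is an illustrative assumption, not the paper's MCME definition; the band limits, mean removal, and function name are hypothetical.

```python
import cmath
import math


def modulation_energy(trajectory, frame_rate, low_hz, high_hz):
    """Energy of one coefficient's time trajectory within a modulation band.

    trajectory: values of a single cepstral coefficient, one per frame.
    frame_rate: frames per second, which sets the modulation-frequency axis.
    """
    n = len(trajectory)
    mean = sum(trajectory) / n
    centered = [x - mean for x in trajectory]  # drop the constant (0 Hz) component
    energy = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * frame_rate / n              # modulation frequency of bin k
        if low_hz <= freq <= high_hz:
            coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                        for t, c in enumerate(centered))
            energy += abs(coeff) ** 2 / n
    return energy
```

Computing this for each coefficient over a sliding block of frames would give one modulation-energy value per coefficient, which could then be appended to the MFCC vector before GMM scoring.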

Two-step a priori SNR Estimation in the Log-mel Domain Considering Phase Information (위상 정보를 고려한 로그멜 영역에서의 2단계 선험 SNR 추정)

  • Lee, Yun-Kyung;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.3 no.1 / pp.87-94 / 2011
  • The decision directed (DD) approach is widely used to determine a priori SNR from noisy speech signals. In conventional speech enhancement systems with a DD approach, a priori SNR is estimated by using only the magnitude components and consequently follows a posteriori SNR with one frame delay. We propose a phase-dependent two-step a priori SNR estimator based on the minimum mean square error (MMSE) in the log-mel spectral domain so that we can consider both magnitude and phase information, and it can overcome the performance degradation caused by one frame delay. From the experimental results, the proposed estimator is shown to improve the output SNR of enhanced speech signals by 2.3 dB compared to the conventional DD approach-based system.

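The one-frame delay attributed to the conventional decision-directed approach can be seen in a small sketch of the standard recursion for a single frequency bin, where the a priori SNR estimate mixes the previous frame's clean-speech power estimate with the current a posteriori SNR minus one. The smoothing factor, the SNR floor, the Wiener gain, and the first-frame initialization are assumptions for the example. The paper's phase-dependent two-step estimator in the log-mel domain is not reproduced here.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Track the a priori SNR of one frequency bin across frames.

    noisy_power: |Y|^2 per frame for this bin.
    noise_power: noise power estimate, assumed constant for simplicity.
    """
    xi_track = []
    prev_clean_power = None
    for y2 in noisy_power:
        gamma = y2 / noise_power                  # a posteriori SNR
        instantaneous = max(gamma - 1.0, 0.0)
        if prev_clean_power is None:
            xi = instantaneous                    # first frame: no history yet
        else:
            xi = (alpha * prev_clean_power / noise_power
                  + (1.0 - alpha) * instantaneous)
        xi = max(xi, xi_min)
        gain = xi / (1.0 + xi)                    # Wiener gain (assumed)
        prev_clean_power = gain * gain * y2       # clean-speech power estimate
        xi_track.append(xi)
    return xi_track
```

With alpha close to 1, a sudden speech onset raises the estimate only slightly in the onset frame itself, because most of the weight sits on the previous frame's estimate. This is the lag that two-step estimators aim to reduce.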

Correcting 3D camera tracking data for video composition (정교한 매치무비를 위한 3D 카메라 트래킹 기법에 관한 연구)

  • Lee, Jun-Sang;Lee, Imgeun
    • Proceedings of the Korean Society of Computer Information Conference / 2012.07a / pp.105-106 / 2012
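  • In CG compositing, a result is generally considered good when it looks "natural". The footage being composited, however, is not always a static shot. When the camera moves, the CG elements must be matched precisely to the movement of the live-action camera for the result to look natural. This requires 3D camera tracking during the compositing stage. Camera tracking includes reconstructing the 3D space at the time of shooting, such as the camera's 3D motion and optical parameters, from the live-action footage alone. Errors that arise in camera tracking during this process cause serious productivity problems when compositing live action and CG. To address this problem, this paper proposes a method for correcting the tracking data in software.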
