• Title/Summary/Keyword: MEL

A Comparison of Speech/Music Discrimination Features for Audio Indexing (오디오 인덱싱을 위한 음성/음악 분류 특징 비교)

  • 이경록;서봉수;김진영
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.10-15 / 2001
  • This paper compares combinations of features for speech/music discrimination, that is, classifying audio signals as speech or music. Audio signals are classified into three classes (speech, music, speech and music) and into two classes (speech, music). Experiments were carried out on three types of features (Mel-cepstrum, energy, and zero-crossings) to find the best feature combination for speech/music discrimination. A Gaussian Mixture Model (GMM) is used as the discrimination algorithm, and the different features are combined into a single vector before the data are modeled with the GMM. In the three-class task, the best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%). In the two-class task, the best result is achieved both by the Mel-cepstrum plus energy vector and by the Mel-cepstrum, energy, and zero-crossings vector (speech: 98.9%, music: 100%).
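
The step of combining different features into a single vector can be illustrated with a minimal sketch. It assumes mean squared amplitude for energy and a sign-change ratio for zero-crossings; the function names, the sample frame, and the three cepstral values are hypothetical and are not taken from the paper. The GMM stage is not shown.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame (one common definition of energy)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def combine_features(cepstrum, frame):
    """Concatenate cepstral coefficients, energy, and ZCR into one vector."""
    return list(cepstrum) + [frame_energy(frame), zero_crossing_rate(frame)]


# Hypothetical 8-sample frame and a made-up 3-coefficient cepstrum.
frame = [0.5, -0.5, 0.5, -0.5, 0.5, 0.5, 0.5, 0.5]
vector = combine_features([1.0, 0.2, -0.1], frame)
```

Each frame then yields one vector of this form, which is what a classifier such as a GMM would be trained on.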

Robust Speech Parameters for the Emotional Speech Recognition (감정 음성 인식을 위한 강인한 음성 파라메터)

  • Lee, Guehyun;Kim, Weon-Goo
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.681-686 / 2012
  • This paper studies speech parameters that are less affected by human emotion, with the aim of developing a robust emotional speech recognition system. To this end, the effect of emotion on a speech recognition system and the robustness of its speech parameters were studied using a speech database containing various emotions. The feature parameters were the mel-cepstral coefficient, delta-cepstral coefficient, RASTA mel-cepstral coefficient, root-cepstral coefficient, PLP coefficient, and the frequency-warped mel-cepstral coefficient from the vocal tract length normalization method. CMS (Cepstral Mean Subtraction) and SBR (Signal Bias Removal) were used as signal bias removal techniques. Experimental results showed that an HMM-based speaker-independent word recognizer performed best when using the frequency-warped RASTA mel-cepstral coefficient from the vocal tract length normalization method, its derivatives, and CMS for signal bias removal.

A Study on Emotion Recognition of Chunk-Based Time Series Speech (청크 기반 시계열 음성의 감정 인식 연구)

  • Hyun-Sam Shin;Jun-Ki Hong;Sung-Chan Hong
    • Journal of Internet Computing and Services / v.24 no.2 / pp.11-18 / 2023
  • Recently, in the field of Speech Emotion Recognition (SER), many studies have sought to improve accuracy through voice features and modeling. In addition to modeling studies that improve the accuracy of existing voice emotion recognition, various studies using voice features are being conducted. In this paper, voice files are split by time interval in a time-series manner, on the premise that voice emotions are related to the flow of time. After splitting the voice files, we propose a model that classifies the emotions of speech data by extracting the speech features Mel, Chroma, zero-crossing rate (ZCR), root mean square (RMS), and mel-frequency cepstral coefficients (MFCC) and applying them to a recurrent neural network model used for sequential data processing. In the proposed method, voice features were extracted from all files using the 'librosa' library and applied to neural network models. The experiments compared and analyzed the performance of recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU) models on the English-language Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.
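
The idea of splitting a voice file by time interval and computing a feature per chunk can be sketched as follows. This is only an illustration in plain Python: it does not use librosa, the chunk length and sample values are hypothetical, and RMS stands in for the full feature set listed in the abstract.

```python
import math


def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Split samples into consecutive fixed-duration chunks.

    The final chunk may be shorter than the others.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk duration too short for this sample rate")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


def rms(chunk):
    """Root mean square amplitude of one chunk."""
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))


# Hypothetical signal: 10 samples at 4 samples per second, 1-second chunks.
samples = [3.0, -3.0, 3.0, -3.0, 1.0, -1.0, 1.0, -1.0, 0.0, 0.0]
chunks = split_into_chunks(samples, sample_rate=4, chunk_seconds=1.0)
levels = [rms(c) for c in chunks]  # one feature value per time step
```

The resulting per-chunk sequence is the kind of ordered input a recurrent model (RNN, LSTM, or GRU) consumes one step at a time.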

Ginsenosides from the fruits of Panax ginseng and their cytotoxic effects on human cancer cell lines (인삼(Panax ginseng) 열매로부터 분리한 ginsenoside의 동정 및 암세포독성 효과)

  • Gwag, Jung Eun;Lee, Yeong-Geun;Hwang-Bo, Jeon;Kim, Hyoung-Geun;Oh, Seon Min;Lee, Dae Young;Baek, Nam-In
    • Journal of Applied Biological Chemistry / v.61 no.4 / pp.371-377 / 2018
  • The fruits of Panax ginseng were extracted with 80% aqueous MeOH, and the concentrates were partitioned into EtOAc, n-BuOH, and $H_2O$ fractions. Repeated $SiO_2$ and octadecyl $SiO_2$ column chromatography of the EtOAc fraction led to the isolation of five ginsenosides. The chemical structures of these compounds were determined as ginsenoside F1 (1), ginsenoside F2 (2), ginsenoside F3 (3), ginsenoside Ia (4), and notoginsenoside Fe (5) based on spectroscopic analyses including nuclear magnetic resonance, MS, and infrared. Compounds 2-5 were isolated for the first time from the fruits of P. ginseng in this study. All isolated compounds were evaluated for cytotoxic activity against the human cancer cell lines HCT-116, SK-OV-3, human cervix adenocarcinoma (HeLa), HepG2, and SK-MEL-5. Among them, compounds 2, 4, and 5 showed significant cytotoxicity on cancer cells. Compound 2 exhibited cytotoxicity on SK-MEL-5, HepG2, and HeLa cells with $IC_{50}$ values of 82.8, 86.8, and $78.3{\mu}M$, respectively. Compound 4 showed cytotoxicity on HCT-116, SK-MEL-5, SK-OV-3, HepG2, and HeLa cells with $IC_{50}$ values of 24.5, 25.4, 26.3, 22.0, and $24.9{\mu}M$, respectively. Compound 5 was cytotoxic to SK-MEL-5 cells with an $IC_{50}$ value of $81.7{\mu}M$. Ginsenosides 2, 4, and 5 isolated from the fruits of Panax ginseng, all of which have a glucopyranosyl moiety on C-3, showed strong inhibitory effects against cancer cells.

A New Feature for Speech Segments Extraction with Hidden Markov Models (숨은마코프모형을 이용하는 음성구간 추출을 위한 특징벡터)

  • Hong, Jeong-Woo;Oh, Chang-Hyuck
    • Communications for Statistical Applications and Methods / v.15 no.2 / pp.293-302 / 2008
  • In this paper we propose a new feature, average power, for speech segment extraction with hidden Markov models; it is based on the mel frequencies of speech signals. The average power is compared with the mel frequency cepstral coefficients (MFCC) and the power coefficient. To compare the performance of the three types of features, speech data were collected for words containing plosives, which are generally known to be hard to detect. Experiments show that the average power is more accurate and efficient than MFCC and the power coefficient for speech segment extraction in environments with various levels of noise.

A Study on Connected Digits Recognition Using the K-L Expansion (K-L 전개를 이용한 연속 숫자음 인식에 관한 연구)

  • 김주곤;오세진;황철준;김범국;정현열
    • Journal of the Institute of Convergence Signal Processing / v.2 no.3 / pp.24-31 / 2001
  • The K-L expansion is a method for compressing the dimensionality of features, which reduces the computational cost of the recognition process. It is also well known in statistical pattern recognition that it extracts features without much loss of information. In this paper, a method that effectively applies the K-L (Karhunen-Loeve) expansion to speech feature parameters is proposed to improve the recognition accuracy of a Korean speech recognition system. The recognition performance of the novel feature parameters obtained by the proposed method (K-L coefficients) is compared with that of conventional Mel-cepstrum and regression coefficients through speaker-independent connected digit recognition experiments. Experimental results showed that the K-L coefficients with regression coefficients achieved a higher average recognition rate than the conventional Mel-cepstrum with its regression coefficients.
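
The dimension compression performed by a K-L expansion can be sketched as projecting mean-centered feature vectors onto the leading eigenvector of their covariance matrix. The sketch below assumes power iteration for the eigenvector and uses made-up two-dimensional data; it is not the procedure of the paper, which does not state these details in the abstract.

```python
import math


def mean_center(vectors):
    """Subtract the per-dimension mean from every vector."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    return [[v[j] - mean[j] for j in range(dim)] for v in vectors]


def covariance(centered):
    """Sample covariance matrix of already-centered vectors."""
    n, dim = len(centered), len(centered[0])
    return [
        [sum(v[i] * v[j] for v in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def leading_eigenvector(matrix, iterations=200):
    """Power iteration; assumes the start vector is not orthogonal to the answer."""
    dim = len(matrix)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


# Hypothetical 2-D features lying on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = mean_center(data)
axis = leading_eigenvector(covariance(centered))
# One K-L coefficient per vector: 2 dimensions compressed to 1.
coeffs = [sum(a * b for a, b in zip(v, axis)) for v in centered]
```

Because the sample data lie on a single line, one coefficient per vector keeps all of the variance here; real speech features would need several axes.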

A Mobile Emission Laboratory for Car Chasing Experiment (차량 추적을 위한 이동형 자동차 배출가스 측정시스템(MEL) 구축)

  • Lee, Seok-Hwan;Kim, Hong-Seok;Lee, Seung-Jae;Bae, Gwi-Nam
    • Transactions of the Korean Society of Automotive Engineers / v.19 no.1 / pp.109-116 / 2011
  • To measure traffic pollutants with high temporal and spatial resolution under real conditions, a mobile emission laboratory (MEL) was designed and built at KIST in close cooperation with KIMM and Yonsei University. The equipment in the minivan provides gas-phase measurements of CO, NOx, $CO_2$, and THC (total hydrocarbons), as well as number density and size distribution measurements of fine and ultrafine particles using a fast mobility particle sizer (FMPS) and a condensation particle counter (CPC). The inlet sampling port above the bumper enables the chasing of different types of vehicles. This paper introduces the construction and technical details of the MEL and presents data from car chasing experiments on diesel and CNG city buses. The dilution ratio increased rapidly with chasing distance. Most particles from the diesel city bus were smaller than 300 nm, and the peak particle concentration was located between 40 and 60 nm. In contrast, most particles from the CNG city bus were nanoparticles smaller than 50 nm.

A Study on Stable Motion Control of Humanoid Robot with 24 Joints Based on Voice Command

  • Lee, Woo-Song;Kim, Min-Seong;Bae, Ho-Young;Jung, Yang-Keun;Jung, Young-Hwa;Shin, Gi-Soo;Park, In-Man;Han, Sung-Hyun
    • Journal of the Korean Society of Industry Convergence / v.21 no.1 / pp.17-27 / 2018
  • We propose a new approach to controlling biped robot motion based on iterative learning of voice commands for smart factory implementation. Real-time processing of the speech signal is very important for high-speed and precise automatic voice recognition technology. Recently, voice recognition has been used for intelligent robot control, artificial life, wireless communication, and IoT applications. To extract valuable information from the speech signal, make decisions on the process, and obtain results, the data need to be manipulated and analyzed. The basic method used for extracting the features of the voice signal is to find the Mel-frequency cepstral coefficients. Mel-frequency cepstral coefficients are the coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. The reliability of voice commands for controlling the biped robot's motion is illustrated by computer simulation and experiments on a biped walking robot with 24 joints.
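
The definition of Mel-frequency cepstral coefficients above (a cosine transform of a log power spectrum on a mel scale) can be sketched in a few lines. The sketch assumes the widely used conversion mel = 2595 log10(1 + f/700) and an unnormalized DCT-II, neither of which is stated in the abstract; the mel filterbank that would produce the band energies is omitted, and the input energies are hypothetical.

```python
import math


def hz_to_mel(hz):
    """Map frequency in Hz to mels (one common convention, assumed here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def dct_ii(values):
    """Unnormalized DCT-II, the 'linear cosine transform' step."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def cepstrum_from_mel_energies(mel_energies):
    """Take the log of each mel-band energy, then apply the DCT."""
    return dct_ii([math.log(e) for e in mel_energies])


# Hypothetical flat spectrum: four mel bands with equal energy.
coefficients = cepstrum_from_mel_energies([math.e] * 4)
```

With a flat spectrum, all of the energy lands in the zeroth coefficient and the higher ones vanish, which is why the higher coefficients are read as describing spectral shape.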

Characterization of Recombinant Derivatives of pJY711 of Multicopy Streptomyces Plasmid (Multicopy Streptomyces 플라스미드 pJY711의 재조합 유도체의 특성)

  • 염도영;공인수;유주현
    • Korean Journal of Microbiology / v.28 no.1 / pp.35-40 / 1990
  • The restriction cleavage map of the multicopy recombinant plasmid pJY712 (8.1 kb), carrying the thiostrepton resistance gene (tsr), was determined. pJY712 had a broad host range in Streptomyces and contained a single BglII site for cloning purposes. The plasmid showed the phenomenon of lethal zygosis ($Ltz^{+}$). The transformation frequency of pJY712 was $5.0\times 10^{4}$ transformants per μg of plasmid DNA (TFU) in S. lividans. Plasmid pJY713 was constructed by inserting the tyrosinase gene (mel) into the BclI site of pJY712. Recombinant plasmid pJY714, carrying the mel gene, was constructed by in vitro deletion of a segment (1.9 kb BglII-BclI fragment) from pJY713.

Cytotoxic Activity of Leguminous Seed Extracts against Human Tumor Cell Lines

  • Lee, Hoi-Seon;Lee, Jeong-Ock;Lee, Hee-Kwon;Oh, Jong-Hwan;Ahn, Young-Joon
    • Applied Biological Chemistry / v.41 no.4 / pp.246-250 / 1998
  • The in vitro cytotoxic activity of methanol extracts of 25 leguminous seeds was evaluated by sulforhodamine B assay using five human solid tumor cell lines: A549 lung, SK-OV-2 ovarian, SK-MEL-2 melanoma, XF-498 CNS, and HCT-15 colon. The responses varied with both the cell line and the leguminous seed used. Extracts of Canavalia lineata and Glycine soja revealed potent cytotoxic activity against A549 and SK-MEL-2 cell lines. Moderate activity was observed in the extracts of Cassia obtusifolia and Glycine max var. chungtae, and of C. lineata and Vigna angularis, against SK-MEL-2 and HCT-15 cell lines, respectively. The other seed extracts were ineffective against the model tumor cell lines. Because of their potent cytotoxic activities, the activity of each solvent fraction from C. lineata and G. soja was determined, and the potent activity was found in their chloroform fractions. As naturally occurring therapeutic agents, the leguminous seeds described could be useful for developing new types of anti-tumor agents.