Search | Korea Science

Feature Parameter Extraction and Speech Recognition Using Matrix Factorization (Matrix Factorization을 이용한 음성 특징 파라미터 추출 및 인식)

Lee Kwang-Seok;Hur Kang-In
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.10 no.7
- /
- pp.1307-1311
- /
- 2006
In this paper, we propose a new speech feature parameter that uses matrix factorization to capture part-based features of the speech spectrum. The proposed parameter is a dimensionality-reduced representation of multi-dimensional feature data, obtained by matrix factorization under the constraint that all matrix elements are non-negative. The reduced feature data represents part-based features of the input data. We verify the usefulness of the NMF (Non-Negative Matrix Factorization) algorithm for speech feature extraction by applying NMF to the Mel-scaled filter bank output. Recognition experiments confirm that the proposed feature parameter outperforms the commonly used MFCC (Mel-Frequency Cepstral Coefficient) in recognition performance.
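As a rough illustration of the factorization step described in this abstract, the sketch below factors a non-negative matrix of filter bank energies into two non-negative factors. The abstract does not state the cost function or update rule, so the Frobenius-norm multiplicative updates, the function name `nmf`, the rank of 8, and the random stand-in for Mel filter bank output are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (n_bands x n_frames) as W @ H with W, H >= 0.

    Uses multiplicative updates for the Frobenius-norm cost (an assumed choice;
    the abstract does not specify the update rule).
    """
    rng = np.random.default_rng(seed)
    n_bands, n_frames = V.shape
    W = rng.random((n_bands, rank)) + eps   # basis vectors (spectral "parts")
    H = rng.random((rank, n_frames)) + eps  # per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical stand-in for Mel-scaled filter bank energies: 24 bands x 100 frames.
V = np.random.default_rng(1).random((24, 100))
W, H = nmf(V, rank=8)
features = H.T  # one 8-dimensional reduced feature vector per frame
```

Each column of `W` acts as a non-negative spectral part, and each row of `features` is the reduced representation of one frame that a recognizer could consume in place of the full filter bank vector.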

The Implementation of VoIP Terminal using PPTP for Voice Security (PPTP를 이용한 VoIP 음성보안 단말기 구현)

Kim, Sam-Taek
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.9 no.2
- /
- pp.73-80
- /
- 2009
Although the commonly used PSTN is relatively difficult to eavesdrop on because it uses direct circuit connections, call confidentiality is hard to ensure on the Internet because many users are connected at the same time. In some situations, however, the confidentiality of a voice call must be guaranteed. Because many users can connect to the Internet simultaneously, VoIP is always exposed to attack by hackers. Therefore, in this paper, we developed an Internet telephone terminal with enhanced voice security and measured its conversation quality. The terminal adopts a SIP-based VPN using PPTP and transmits voice data through a tunnel to prevent eavesdropping on Internet telephone calls.

Development of the IP-PBX with VPN function for voice security (VPN 기능을 가진 음성 보안용 IP-PBX 개발)

Kim, Sam-Taek
- The Journal of the Institute of Internet, Broadcasting and Communication
- /
- v.10 no.6
- /
- pp.63-69
- /
- 2010
Today, Internet telephony services based on VoIP are gaining tremendous popularity among general users. As user demands keep increasing, one of the most important requirements is voice security in the telephony system. In some situations, the confidentiality of a voice call must be guaranteed. Because many users can connect to the Internet simultaneously, VoIP is always exposed to attack by hackers. Therefore, in this paper, we developed a VPN IP-PBX for voice security and measured its conversation quality. It adopts a SIP-based VPN using IPsec and transmits voice data through a tunnel to prevent eavesdropping on voice data. The VPN IP-PBX, connected to a softphone, provides various optional services.

A Study on the Redundancy Reduction in Speech Recognition (음성인식에서 중복성의 저감에 대한 연구)

Lee, Chang-Young
- The Journal of the Korea institute of electronic communication sciences
- /
- v.7 no.3
- /
- pp.475-483
- /
- 2012
The characteristic features of a speech signal do not vary significantly from frame to frame. Therefore, it is advisable to reduce the redundancy among similar feature vectors. The objective of this paper is to search for the optimal condition of minimum redundancy and maximum relevancy of the speech feature vectors in speech recognition. For this purpose, we realize redundancy reduction by way of a vigilance parameter and investigate the resulting effect on speaker-independent recognition of isolated words using FVQ/HMM. Experimental results showed that the number of feature vectors could be reduced by 30% without degrading the speech recognition accuracy.
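One simple way to picture redundancy reduction with a vigilance parameter is to drop frames that lie too close to the most recently kept frame. The abstract does not give the exact criterion, so the Euclidean distance test, the function name `reduce_redundancy`, and the toy feature values below are assumptions for illustration only.

```python
import numpy as np

def reduce_redundancy(frames, vigilance):
    """Keep a frame only if it differs enough from the last kept frame.

    frames: (n_frames, n_dims) array of feature vectors (assumed non-empty).
    vigilance: distance threshold; larger values discard more frames.
    """
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if np.linalg.norm(frames[i] - frames[kept[-1]]) > vigilance:
            kept.append(i)
    return frames[kept], kept

# Toy 2-D feature vectors; neighbouring frames are nearly identical.
frames = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.05, 0.0], [2.0, 0.0]])
reduced, kept = reduce_redundancy(frames, vigilance=0.5)
```

Sweeping `vigilance` and measuring recognition accuracy at each setting would mirror the kind of trade-off the abstract reports between the number of feature vectors and accuracy.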
https://doi.org/10.13067/JKIECS.2012.7.3.475

Comparative Studies on the Self Voice Assessment of Voice Disorder Patients and the Hearer Voice Assessment of a Comparative Group of normal subjects (음성장애인의 자가음성평가와 정상음성인의 청자음성평가 특성 비교)

Lee, Yu-Jin;Hwang, Young-Jin
- Phonetics and Speech Sciences
- /
- v.4 no.2
- /
- pp.105-114
- /
- 2012
This paper will discuss the difference between self assessment of voice disorders and the hearer voice assessment of a comparative group of normal subjects. The study was conducted on 25 voice disorder subjects and 32 hearers of a comparative group of normal subjects. The results are as follows. Firstly, in K-VHI and VHI-H, the hearers of the comparative group of normal subjects perceived more serious voice disorders than the voice disorder group in all sub-domains. Likewise, in K-VQOL and VRQOL-H, the hearers of the comparative group of normal subjects perceived more serious voice disorders than the voice disorder group in all sub-domains. Secondly, the hearer voice assessment of the comparative group of normal subjects showed no difference in gender regarding the perception of the severity of voice disorder issues. Thirdly, the hearer voice assessment of the comparative group of normal subjects states that in the emotional aspects of VHI-H, professional voice users perceive more serious voice disorders than others. Accordingly, in VRQOL-H, there was no difference in use of the voice between professionals and others.
https://doi.org/10.13064/KSSS.2012.4.2.105

Analysis of Eigenvalues of Covariance Matrices of Speech Signals in Frequency Domain for Various Bands (음성 신호의 주파수 영역에서의 주파수 대역별 공분산 행렬의 고유값 분석)

Kim, Seonil
- Proceedings of the Korean Institute of Information and Communication Sciences Conference
- /
- 2016.05a
- /
- pp.293-296
- /
- 2016
Speech signals consist of consonants and vowels, but vowels last much longer than consonants. It can therefore be assumed that the correlation between signal blocks in a speech signal is very high. However, the correlations between signal blocks can differ considerably across frequency bands. Each speech signal is divided into blocks of 128 samples, and an FFT is applied to each block. Various frequency ranges of the FFT results are selected, the covariance matrix between blocks of a speech signal is computed, and finally the eigenvalues of those matrices are obtained. We examine the eigenvalues of the various frequency bands to determine which band gives more reliable results.
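The processing chain in this abstract (128-sample blocks, FFT per block, a band selection, a covariance matrix between blocks, then eigenvalues) can be sketched as follows. The use of magnitude spectra, the function name `band_eigenvalues`, the bin range chosen as the band, and the random test signal are assumptions; the abstract does not specify them.

```python
import numpy as np

def band_eigenvalues(signal, band, block_size=128):
    """Eigenvalues of the block-to-block covariance matrix within one FFT band.

    band: (lo, hi) range of FFT bin indices, hi exclusive (hypothetical choice).
    Returns eigenvalues sorted in descending order.
    """
    signal = np.asarray(signal, dtype=float)
    n_blocks = len(signal) // block_size
    blocks = signal[:n_blocks * block_size].reshape(n_blocks, block_size)
    spectra = np.abs(np.fft.rfft(blocks, axis=1))  # magnitude spectrum per block
    lo, hi = band
    sub = spectra[:, lo:hi]
    # np.cov treats each row as a variable, so this is an
    # (n_blocks x n_blocks) covariance matrix between blocks.
    cov = np.cov(sub)
    return np.linalg.eigvalsh(cov)[::-1]

# Random stand-in for a speech signal: 10 blocks of 128 samples.
sig = np.random.default_rng(0).standard_normal(128 * 10)
eig = band_eigenvalues(sig, band=(0, 32))
```

Calling `band_eigenvalues` with different `band` ranges on the same signal gives one eigenvalue spectrum per band, which is the quantity the abstract compares across bands.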

Design and Implementation of the Mobile Lecture Support System (모바일 기반 강의 지원 시스템 설계 및 구현)

Kim, JunSik;Choi, YoungGil;Park, Suhyun
- Proceedings of the Korean Institute of Information and Communication Sciences Conference
- /
- 2016.05a
- /
- pp.457-459
- /
- 2016
This lecture support system can be used to deliver sound clearly in an auditorium or a particular outdoor space. Commercial lecture support systems are useful but very expensive. Therefore, in this paper, we designed and developed a lecture support system that uses a mobile phone. With this system, a lecture can be heard clearly even in difficult listening conditions. The system can also save a lecture to storage so that it can be listened to repeatedly.

An Endpoint Detection Algorithm for Noise Speech using Band Energy (대역에너지를 이용한 잡음음성의 끝점검출 알고리즘)

Park Ki-Sang;Suk Su-Young;Jung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference
- /
- spring
- /
- pp.91-94
- /
- 2002
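One of the first problems to be solved for the practical use of speech recognition systems is endpoint detection in noisy environments. In noise-free environments, the conventional energy parameter alone can detect endpoints with reasonable reliability, but in real noisy environments such as urban noise it mostly gives poor results. In this paper, to remove urban background noise, we use Bark-scale-based spectral subtraction, a preprocessing technique that removes only the magnitude component of the speech spectrum corrupted by ambient noise. Taking human auditory characteristics into account, we then split the speech frequency range into three bands and propose a method that searches for speech endpoints by setting fine-grained energy thresholds for each band. To verify the proposed method, we performed endpoint detection on a database recorded in real noisy environments such as offices and subway stations. The method reduced the error rate by an average of 46% compared with the conventional method using energy and zero-crossing rate, and by an average of 17% compared with using band energy alone, confirming its effectiveness.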
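The three-band energy thresholding idea in this abstract can be sketched as below. This is only a minimal illustration: the Bark-scale spectral subtraction preprocessing is omitted, and the band edges, frame length, the rule for deriving thresholds from leading noise-only frames, the function name `detect_endpoints`, and the synthetic test signal are all assumptions rather than the authors' settings.

```python
import numpy as np

def detect_endpoints(signal, sr, frame_len=256, noise_frames=10, k=3.0,
                     band_edges_hz=(0, 1000, 2500, 4000)):
    """Return (first, last) speech frame indices, or None if nothing is found.

    Thresholds are set per band from the first `noise_frames` frames, which
    are assumed to contain background noise only.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    # Energy of every frame in each of the three bands: shape (n_frames, 3).
    bands = np.stack(
        [power[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
         for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:])],
        axis=1,
    )
    noise = bands[:noise_frames]
    thresholds = noise.mean(axis=0) + k * noise.std(axis=0)
    # A frame counts as speech if any band exceeds its own threshold.
    is_speech = (bands > thresholds).any(axis=1)
    idx = np.flatnonzero(is_speech)
    if idx.size == 0:
        return None
    return int(idx[0]), int(idx[-1])

# Synthetic example: weak noise with a 500 Hz tone in frames 10 to 19.
sr = 8000
x = 0.01 * np.random.default_rng(0).standard_normal(sr)
x[2560:5120] += np.sin(2 * np.pi * 500 * np.arange(2560) / sr)
endpoints = detect_endpoints(x, sr, k=20.0)
```

Using a separate threshold per band means that speech energy concentrated in one band can be detected even when broadband noise masks the total frame energy.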

Implementation of Speaker Independent Speech Recognition System Using Independent Component Analysis based on DSP (독립성분분석을 이용한 DSP 기반의 화자 독립 음성 인식 시스템의 구현)

김창근;박진영;박정원;이광석;허강인
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.8 no.2
- /
- pp.359-364
- /
- 2004
In this paper, we implemented a real-time speaker-independent speech recognizer that is robust in noisy environments, using a DSP (Digital Signal Processor). The implemented system consists of a TMS320C32, a floating-point DSP from Texas Instruments Inc., and a CODEC for real-time speech input. Instead of MFCC (mel-frequency cepstral coefficient), the recognizer uses a noise-robust feature parameter obtained by transforming the MFCC feature space with ICA (Independent Component Analysis). Recognition results in noisy environments show that the ICA feature parameter outperforms MFCC.

Glottal Spectrum Analysis According to Speaking volume (발성크기에 따른 Glottal Spectrum 성분 분석)

Lee Yoonjoo;Cho Namsu;Bae Myungjin
- Proceedings of the Acoustical Society of Korea Conference
- /
- autumn
- /
- pp.53-56
- /
- 2001
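People differ in their vocal organs, such as the vocal tract, the vocal cords (or vocal folds), and the nasal tract, depending on age, gender, and other factors. These differences affect acoustic characteristics of speech such as timbre and pitch, and they change over time. For example, men and women, whose vocal organs differ greatly, show very large acoustic differences even when uttering the same word. Because these characteristics have a consistent acoustic effect when other sentences are uttered as well, they are called static characteristics. In this paper, we compare and analyze the Glottal Spectrum according to speaking volume, one of these static characteristics.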
