• Title/Summary/Keyword: speech compression (음성압축)


CMSBS Extraction Using Periodicity-based Mel Sub-band Spectral Subtraction (신호의 주기성에 따라 변형되는 스펙트럼 차감을 이용한 CMSBS)

  • Lee, Woo-Young;Lee, Sang-Ho;Hong, Jae-Keun
    • Proceedings of the KAIS Fall Conference / 2009.05a / pp.768-771 / 2009
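  • MFCC (Mel-Frequency Cepstral Coefficients) is currently the most widely used feature vector in speech recognition. However, recognition performance with MFCC also degrades in noisy environments. To address this weakness, the CMSBS (Compression and Mel Sub-Band Spectral subtraction) method is used, which combines mel sub-band spectral subtraction with energy compression based on the signal-to-noise ratio. In CMSBS, mel sub-band spectral subtraction is applied under identical conditions in speech intervals and silence intervals, which causes a loss of important speech information. To compensate for this loss, this paper proposes modifying the spectral flooring parameter according to the periodicity of the signal. In experiments, the proposed method achieved recognition rates similar to the existing method on nearly noise-free speech, and the modified mel sub-band spectral subtraction yielded larger gains in recognition rate as the noise level increased.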

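
The spectral subtraction step described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' CMSBS implementation: the function names, the autocorrelation-based periodicity measure, the over-subtraction factor `alpha`, and the floor range `beta_min`/`beta_max` are all hypothetical. The abstract states only that the spectral flooring parameter is modified according to signal periodicity; raising the floor for more periodic (voiced) frames is an assumed direction.

```python
import numpy as np

def periodicity(frame, min_lag=20, max_lag=160):
    """Peak of the normalized autocorrelation over a pitch-lag range, in [0, 1].

    The lag range is an assumed pitch search range in samples.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    energy = float(np.dot(frame, frame))
    if energy == 0.0:
        return 0.0
    # Keep the non-negative lags: index 0 of `ac` is lag 0.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    peak = np.max(ac[min_lag:max_lag]) / energy
    return float(np.clip(peak, 0.0, 1.0))

def subtract_with_floor(band_energy, noise_energy, p,
                        alpha=1.0, beta_min=0.01, beta_max=0.3):
    """Spectral subtraction per sub-band with a periodicity-dependent floor.

    band_energy and noise_energy are per-band energies of one frame;
    p is the frame periodicity in [0, 1]. All defaults are hypothetical.
    """
    band_energy = np.asarray(band_energy, dtype=float)
    noise_energy = np.asarray(noise_energy, dtype=float)
    # Assumption: more periodic frames get a higher floor, so less is removed.
    beta = beta_min + (beta_max - beta_min) * p
    cleaned = band_energy - alpha * noise_energy
    return np.maximum(cleaned, beta * band_energy)
```

In this sketch, a frame with low periodicity is subtracted almost down to zero, while a strongly periodic frame keeps at least `beta_max` times its original energy in every band.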

Design and Manufacture of a Device for the Recognition of Long Vowels (장모음 인식장치 설계 제작)

  • 구용회
    • Journal of the Korean Institute of Telematics and Electronics T / v.35T no.3 / pp.9-14 / 1998
  • Speech recognition of long vowels is carried out by electric circuits. A level compressor transforms the voice waveform into serial pulses, which carry the information needed to distinguish the vowels. The pulses are sampled by a register that picks up the series of serial signals within one pitch period of a vowel as a unit. Timing control pulses, such as the sampling pulses, are generated from peak pulses in the speech wave. A decision-making circuit that implements IF-THEN rules assigns a phonetic symbol to the parallel data in the register.


A Study on a Analysis and Comparison of Preprocessing Technique for the Speech Compression (음성압축을 위한 전처리기법의 비교 분석에 관한 연구)

  • Jang, Kyung-A;Min, So-Yeon;Bae, Myung-Jin
    • Speech Sciences / v.10 no.4 / pp.125-136 / 2003
  • Speech coding techniques have been studied to reduce complexity and bit rate while also improving sound quality. The CELP-type vocoder, which is used as one of the standards, provides high sound quality even at low bit rates. In this paper, the input speech is preprocessed to reduce the bit rate, which differs from the conventional vocoder. Several kinds of parameters can be used for the preprocessing, so this paper compares them to find the parameter most appropriate for the vocoder. The parameters are used to synthesize the speech rather than in the encoding or decoding of the coding technique, so we propose a simple algorithm that does not affect the processing time or the computation time. The parameters used in the preprocessing step are speaking rate, duration, and the PSOLA technique.


Development of Voice precription Using Fingerprint Authentification (지문인증을 이용한 음성처방전 개발)

  • 김재옥;조철환;장영건
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.424-426 / 2001
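  • With the start of the separation of prescribing and dispensing, electronic prescriptions are expected to become widespread as a way to shorten patients' waiting time at pharmacies. However, no method has yet emerged for reducing the effort physicians spend writing electronic prescriptions, or for establishing the reliability of an electronic prescription and proving the patient's consent to its transmission. This study proposes solving these problems using fingerprint authentication, speech recognition, and compression technology, and demonstrates feasibility through an actual implementation. In laboratory-level tests, prescription writing time was reduced by about 20% compared with ordinary electronic prescriptions, and the approach is expected to be even more effective for complex medication instructions. The time required for online fingerprint authentication was excluded from these tests.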


A Study on the TCM Transmission of Voice/Nonvoice Signals Modulated by DPSK through the 2-Wire Subscriber Loop (2-선식가입자 선로를 통해 DPSK로 변조된 음성 및 비음성 신호의 시간압축다중화 전송에 관한 연구)

  • 장청룡;강창언
    • Proceedings of the Korean Institute of Communication Sciences Conference / 1986.04a / pp.107-112 / 1986
  • This paper presents a method for achieving end-to-end digital connectivity through the 2-wire subscriber loop. The system, which consists of the subscriber's device and the line termination device, makes use of the advantages of time-compressed multiplexing and modified DPSK. Experimental results show a transmission range of 2 km in the lab test and 1.5 km in the field test.


Syllable Reconition by HMM Using Segmental Statistics (세그멘트 통계량을 이용한 HMM 의 한국어 음절 인식)

  • 박창호
    • Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.175-178 / 1995
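  • Conventional continuous-output-distribution HMMs have the drawback of limited ability to represent transient changes in a time series. To compensate for this, this paper combines several frames into a segment and builds one vector per segment as a feature parameter that reflects the dynamic changes of speech. Used as is, the parameter dimension grows with the number of frames in the segment, so the model parameters cannot be estimated well when the training data are insufficient; the dimension is therefore compressed by K-L expansion to reduce the number of parameters. Recognition experiments on Korean monosyllables compared the vector obtained by compressing the mel-cepstrum with K-L expansion against the use of the mel-cepstrum and mel-cepstrum linear regression coefficients as parameters. The K-L-compressed vector alone gave a lower recognition rate than the mel-cepstrum plus linear regression coefficients, but results nearly equivalent to the combination of the mel-cepstrum and the K-L-compressed vector were obtained.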

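
The dimension reduction described in the abstract above (stacking several frames into a segment vector, then compressing it by K-L expansion) can be sketched as below. This is an illustrative sketch only, not the paper's procedure: the function names, the sliding one-frame segment shift, and the choice of `n_keep` are assumptions. The K-L expansion is implemented here as a projection onto the leading eigenvectors of the sample covariance matrix.

```python
import numpy as np

def make_segments(frames, seg_len):
    """Stack each run of seg_len consecutive frames into one long vector.

    frames has shape (n_frames, dim); the result has shape
    (n_frames - seg_len + 1, seg_len * dim). A one-frame shift is assumed.
    """
    frames = np.asarray(frames, dtype=float)
    n_seg = frames.shape[0] - seg_len + 1
    return np.stack([frames[i:i + seg_len].reshape(-1) for i in range(n_seg)])

def kl_compress(segments, n_keep):
    """Project segment vectors onto the n_keep leading K-L basis vectors."""
    mean = segments.mean(axis=0)
    centered = segments - mean
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(cov)
    basis = eigvecs[:, ::-1][:, :n_keep]
    return centered @ basis, basis, mean
```

With 12-dimensional frames and 5-frame segments, for example, each segment vector has 60 dimensions, and `kl_compress(segments, 10)` would reduce it to 10.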

Real-time implementation of the 2.4kbps EHSX Speech Coder Using a TMS320C6701™ DSPCore (TMS320C6701™을 이용한 2.4kbps EHSX 음성 부호화기의 실시간 구현)

  • 양용호;이인성;권오주
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7C / pp.962-970 / 2004
  • This paper presents an efficient implementation of the 2.4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder on a TMS320C6701™ floating-point digital signal processor. The EHSX speech codec models the excitation signal with harmonic modeling and CELP (Code Excited Linear Prediction) modeling, respectively, according to frame characteristics such as voiced and unvoiced speech. We present optimization methods that reduce complexity for real-time implementation. The filtering in the CELP algorithm, which accounts for most of the EHSX algorithm's complexity, can be made less costly by converting the program from floating-point variables to fixed-point variables. We also present efficient optimization methods, including code allocation that takes the DSP architecture into account and a low-complexity harmonic/pitch search algorithm in the encoder. Finally, we obtained a subjective quality of MOS 3.28 in a speech quality test using PESQ (Perceptual Evaluation of Speech Quality, ITU-T Recommendation P.862) and achieved the goal of real-time operation of the EHSX codec.
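
The floating-point to fixed-point conversion mentioned in the abstract above can be illustrated with a small filter in Q15 arithmetic. This is a generic sketch, not the EHSX code: the Q15 format, the FIR structure, and the function names are assumptions chosen for illustration (a CELP synthesis filter is recursive, which this sketch does not model).

```python
import numpy as np

Q = 15  # Q15: 1 sign bit, 15 fractional bits (assumed format)

def to_q15(x):
    """Quantize values in [-1, 1) to Q15 integers, saturating at the limits."""
    scaled = np.round(np.asarray(x, dtype=float) * (1 << Q))
    return np.clip(scaled, -32768, 32767).astype(np.int64)

def fir_q15(coeffs_q15, samples_q15):
    """FIR filter using integer multiply-accumulate only.

    Products of two Q15 values are Q30; they are summed in a wide
    accumulator, then rounded and shifted back to Q15.
    """
    acc = np.convolve(samples_q15, coeffs_q15)[:len(samples_q15)]
    return (acc + (1 << (Q - 1))) >> Q
```

Dividing the output by 32768 recovers an approximation of the floating-point result, with an error set by the 15-bit quantization.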

Physical Characteristics of Red Pepper Powder by Cultivation Area and Variety (품종과 재배지역에 따른 고춧가루의 물리적 특성)

  • Oh, Seung-Hee;Kim, Hyun-Young;Hwang, Cho-Rong;Hwang, In-Guk;Hwang, Young;Yoo, Seon-Mi;Kim, Haeng-Ran;Kim, Hae-Young;Lee, Jun-Soo;Jeong, Heon-Sang
    • Journal of the Korean Society of Food Science and Nutrition / v.40 no.4 / pp.599-605 / 2011
  • This study investigated the physical properties of red pepper powders according to cultivation area and variety. Density, compressive characteristics, dynamic angle, irrecoverable work, and stress relaxation were analyzed. Loose bulk density ranged between 0.40 and 0.50 g/cm³, and tapped bulk density ranged between 0.49 and 0.67 g/cm³. The highest Hausner ratio was 1.369 for PRmanitta cultivated in Eumseong, and the lowest value was 0.194 for Buchon cultivated in Yeongyang. Compressibility ranged between 0.0046 and 0.0092. The highest compression ratio was 1.040 for Myeongjak cultivated in Suwon, and the lowest value was 1.007 for Buchon cultivated in Yeongyang. Dynamic angles ranged between 35.14° and 41.70°. The highest irrecoverable work value was 79.9% for PRmanitta cultivated in Eumseong, and the lowest value was 67.9% for Nokgwang cultivated in Suwon. The greatest k₂ and relaxation values of the stress relaxation characteristics were 1.56 and 42.03%, respectively, for Cheongyang cultivated in Yeongyang.

Design of RTP/UDP/IP Header Compression Protocol in Wired Networks (유선망에서의 RTP/UDP/IP 헤더 압축 설계)

  • Kim Min-Yeong;Khongorzul D.;Shinn Byung-Cheol;Lee Insung
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.8 / pp.1696-1702 / 2005
  • Real-time Transport Protocol (RTP) is the Internet standard protocol for transporting real-time data such as audio and video in IP telephony and multimedia services. With an 8 kbps voice codec, the data payload of each packet is 20 bytes, and adding the headers of each layer in RTP/UDP/IP increases the packet size by at least 40 bytes. To solve this problem, various header compression schemes have been proposed for point-to-point networks. However, they compress even the IP header and are therefore not suitable for end-to-end networks. We therefore revise the header compression protocol so that it can be applied to wired router-based networks.
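
The overhead figures in the abstract above follow from simple arithmetic, sketched below. The header sizes are the minimum fixed sizes (12-byte RTP header without CSRC entries, 8-byte UDP header, 20-byte IPv4 header without options). The 20 ms packet interval is an assumption inferred from the 20-byte payload at 8 kbps, and the function names are hypothetical.

```python
# Minimum fixed header sizes in bytes.
RTP_HEADER = 12   # no CSRC entries or header extension
UDP_HEADER = 8
IPV4_HEADER = 20  # no IP options
FULL_HEADER = RTP_HEADER + UDP_HEADER + IPV4_HEADER  # 40 bytes

def payload_bytes(bitrate_bps, frame_ms):
    """Codec payload carried by one packet covering frame_ms of speech."""
    return bitrate_bps * frame_ms // 8000

def header_share(payload, header=FULL_HEADER):
    """Fraction of each packet taken up by headers."""
    return header / (header + payload)

def wire_bitrate_bps(payload, frame_ms, header=FULL_HEADER):
    """Bit rate on the link including headers, one packet per frame_ms."""
    return (payload + header) * 8 * 1000 // frame_ms
```

For an 8 kbps codec with 20 ms packets, this gives a 20-byte payload, headers making up two thirds of each packet, and 24 kbps on the wire. Passing a smaller `header` value shows the effect of a given compressed header size.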

A Temporal Decomposition Method Based on a Rate-distortion Criterion (비트율-왜곡 기반 음성 신호 시간축 분할)

  • 이기승
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.315-322 / 2002
  • In this paper, a new temporal decomposition method is proposed that takes into consideration not only spectral distortion but also bit rate. The interpolation functions, which are among the parameters required for temporal decomposition, are obtained from a training speech corpus. Since the interval between two targets uniquely defines the interpolation function, the interpolation can be represented without additional information. The target locations are determined by minimizing the bit rate while keeping the maximum spectral distortion below a given threshold. The proposed method has been applied to compressing LSP coefficients, which are widely used as spectral parameters. Simulation results show that an average spectral distortion of about 1.4 dB can be achieved at an average bit rate of about 8 bits/frame.