• Title/Summary/Keyword: Speech transmission


On a Study of Measurement Method of Utterance Velocity for the Reduction of Transmission Rate in CELP Vocoder. (LSP 파라미터를 이용한 발성측정법)

  • 장경아;배명진
    • Proceedings of the IEEK Conference / 2000.11d / pp.199-202 / 2000
  • Speaking rate varies with the situation and the habits of the speaker. It has been studied extensively in speaker recognition. In speech recognition, speaking rate is an important factor when recognizing speakers, but measuring it accurately requires large speech databases and complicated estimation. Conventional vocoders encode and transmit the speech signal without regard to speaking rate, so applying speaking rate to a vocoder calls for a simpler algorithm with less computation than the methods used in speech recognition. We propose a speaking rate measurement algorithm that uses a simple parameter, the Line Spectrum Pair (LSP). The proposed method measures speaking rate from the LSP information in the speech signal. We measured the variation rate for utterances spoken at different speeds. The results show a distinct difference in variation rate between fast and slow utterances: the rate is 42.8% higher for fast utterances than for slow ones.


Adaptive Speech Streaming Based on Packet Loss Prediction Using Support Vector Machine for Software-Based Multipoint Control Unit over IP Networks

  • Kang, Jin Ah;Han, Mikyong;Jang, Jong-Hyun;Kim, Hong Kook
    • ETRI Journal / v.38 no.6 / pp.1064-1073 / 2016
  • An adaptive speech streaming method to improve the perceived speech quality of a software-based multipoint control unit (SW-based MCU) over IP networks is proposed. First, the proposed method predicts whether the speech packet to be transmitted is lost. To this end, the proposed method learns the pattern of packet losses in the IP network, and then predicts the loss of the packet to be transmitted over that IP network. The proposed method classifies the speech signal into different classes of silence, unvoiced, speech onset, or voiced frame. Based on the results of packet loss prediction and speech classification, the proposed method determines the proper amount and bitrate of redundant speech data (RSD) that are sent with primary speech data (PSD) in order to assist the speech decoder to restore the speech signals of lost packets. Specifically, when a packet is predicted to be lost, the amount and bitrate of the RSD must be increased through a reduction in the bitrate of the PSD. The effectiveness of the proposed method for learning the packet loss pattern and assigning a different speech coding rate is then demonstrated using a support vector machine and adaptive multirate-narrowband, respectively. The results show that as compared with conventional methods that restore lost speech signals, the proposed method remarkably improves the perceived speech quality of an SW-based MCU under various packet loss conditions in an IP network.
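
The allocation rule described in this abstract (raise the redundant data rate by lowering the primary rate when a loss is predicted) can be illustrated with a minimal sketch. The AMR-NB mode set is the standard one; the fixed bit budget, the per-class `RSD_SHARE` values, and the function names are assumptions made for illustration and are not taken from the paper.

```python
# Standard AMR-NB codec modes in kbit/s.
AMR_NB_MODES = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

# ASSUMPTION: share of the bit budget given to redundant speech data (RSD)
# for each speech class when a packet loss is predicted. Illustrative only.
RSD_SHARE = {"silence": 0.0, "unvoiced": 0.3, "onset": 0.5, "voiced": 0.4}


def highest_mode_within(limit_kbps):
    """Return the highest AMR-NB mode not exceeding the limit, or 0.0 if none fits."""
    fitting = [mode for mode in AMR_NB_MODES if mode <= limit_kbps]
    return max(fitting) if fitting else 0.0


def allocate(budget_kbps, loss_predicted, frame_class):
    """Split a bit budget into (primary, redundant) coding rates in kbit/s."""
    share = RSD_SHARE[frame_class]
    if not loss_predicted or share == 0.0:
        # No predicted loss (or nothing worth protecting): spend everything on PSD.
        return highest_mode_within(budget_kbps), 0.0
    # Predicted loss: reserve part of the budget for RSD, then fit PSD in the rest.
    rsd = highest_mode_within(budget_kbps * share)
    psd = highest_mode_within(budget_kbps - rsd)
    return psd, rsd
```

Calling `allocate(12.2, True, "onset")` gives the primary stream a lower mode than `allocate(12.2, False, "onset")`, which is the trade-off the abstract describes; the actual decision logic in the paper may differ.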

Robust Speech Reinforcement Based on Gain-Modification incorporating Speech Absence Probability (음성 부재 확률을 이용한 음성 강화 이득 수정 기법)

  • Choi, Jae-Hun;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.1 / pp.175-182 / 2010
  • In this paper, we propose a robust speech reinforcement technique that enhances the intelligibility of speech degraded by ambient noise, based on a soft decision scheme that incorporates a speech absence probability (SAP) into the speech reinforcement gains. Because ambient noise significantly decreases the intelligibility of the speech signal, speech reinforcement approaches have been proposed that amplify the estimated clean speech signal to improve the intelligibility and clarity of the corrupted speech. To obtain a reinforcement gain that is more robust than that of the conventional method across speech-active periods, nonspeech periods, and transient intervals, we propose a speech reinforcement algorithm based on soft decision that applies the SAP to the estimation of the reinforcement gains. The proposed algorithm is evaluated with the Comparison Category Rating (CCR), a subjective method for determining transmission quality specified in ITU-T P.800, under various ambient noise environments, and it performs better than the conventional method.

Characteristics of Acoustic Indicators Evaluating Speech Intelligibility in Korean Elementary School Classrooms (초등학교 일반교실의 음향성능 실태측정 및 평가지표 특성 고찰)

  • Lee, Seong-Bok;Kim, Myung-Jun;Yang, Hong-Seok
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.25 no.7 / pp.462-469 / 2015
  • This study examined the characteristics of acoustic indicators used to evaluate speech intelligibility, namely reverberation time (T30), D50, C50, and speech transmission index (STI), in Korean elementary school classrooms. The mean T30 at middle frequencies (500 Hz to 2000 Hz) measured in 9 classrooms was about 0.75 s, which exceeds the 0.60 s limit specified by the American National Standards Institute (ANSI). Mean D50, C50, and STI were 60 % to 66 %, +2 dB to +3 dB, and 0.65, respectively. The maximum difference between receiver points within a classroom was 13 % in D50 and 2.5 dB in C50, while the maximum difference in T30 was 0.03 s. STI measured in the classrooms correlated relatively weakly with the other indicators, whereas the correlation between D50 and C50 was high ($R^2$ = 0.9964). In addition, T30 and C50 fitted a logarithmic regression curve well, with $R^2$ = 0.9610. On this curve, a T30 of 0.60 s corresponds to +3.73 dB in C50 and 68 % in D50.
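
A strong D50 to C50 correlation is expected from their standard definitions: both are built from the same early (0 to 50 ms) and late energies of the impulse response, so C50 = 10·log10(D50 / (1 − D50)). The sketch below evaluates that textbook relation; the function names are illustrative, and the values reported in the abstract come from measurements and a fitted regression, so they need not fall exactly on this curve.

```python
import math


def c50_from_d50(d50):
    """Clarity C50 in dB from definition D50 given as a fraction in (0, 1)."""
    if not 0.0 < d50 < 1.0:
        raise ValueError("D50 must be strictly between 0 and 1")
    # D50 = early / (early + late) and C50 = 10*log10(early / late)
    return 10.0 * math.log10(d50 / (1.0 - d50))


def d50_from_c50(c50_db):
    """Inverse relation: definition D50 (fraction) from clarity C50 in dB."""
    return 1.0 / (1.0 + 10.0 ** (-c50_db / 10.0))


if __name__ == "__main__":
    for d50 in (0.60, 0.66):
        print(f"D50 = {d50:.0%} -> C50 = {c50_from_d50(d50):+.2f} dB")
```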

Speech Recognition based Message Transmission System for the Hearing Impaired Persons (청각장애인을 위한 음성인식 기반 메시지 전송 시스템)

  • Kim, Sung-jin;Cho, Kyoung-woo;Oh, Chang-heon
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.12 / pp.1604-1610 / 2018
  • Speech recognition services serve as an auxiliary means of communication for hearing-impaired persons by converting the speaker's voice into text and displaying it. However, in open environments such as classrooms and conference rooms, it is difficult to provide a speech recognition service to many hearing-impaired persons at once. A method is therefore needed to provide the service efficiently according to the surrounding environment. In this paper, we propose a system that recognizes the speaker's voice and transmits the converted text as messages to many hearing-impaired persons. The proposed system uses the MQTT protocol to deliver messages to many users simultaneously. The end-to-end delay was measured to determine the service delay of the proposed system for each QoS level of the MQTT protocol. The measurements show that the difference in delay between QoS level 2, the most reliable level, and QoS level 0 is 111 ms, confirming that the QoS level does not greatly affect conversation recognition.
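
One reason delay grows with the QoS level is the number of control packets each level exchanges per message on each hop (publisher to broker, broker to subscriber). The sketch below lists the packet sequences defined by the MQTT specification; the dictionary and helper names are illustrative, not part of the proposed system, and the sketch does not model the measured 111 ms figure.

```python
# Control packets exchanged on one hop for a single message,
# per MQTT QoS level (packet names from the MQTT specification).
QOS_HANDSHAKE = {
    0: ["PUBLISH"],                                 # at most once, no acknowledgement
    1: ["PUBLISH", "PUBACK"],                       # at least once
    2: ["PUBLISH", "PUBREC", "PUBREL", "PUBCOMP"],  # exactly once
}


def packets_per_hop(qos):
    """Number of control packets needed to deliver one message on one hop."""
    return len(QOS_HANDSHAKE[qos])


def extra_packets(qos_high, qos_low):
    """Additional packets per hop when moving from qos_low to qos_high."""
    return packets_per_hop(qos_high) - packets_per_hop(qos_low)


if __name__ == "__main__":
    for qos, sequence in QOS_HANDSHAKE.items():
        print(f"QoS {qos}: {' -> '.join(sequence)}")
```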

The V/UV Decision Algorithm for a Reduction of the Transmission Bit Rate in the CELP Vocoder (CELP 음성부호화기 전송률 감소를 위한 음성신호의 V/UV 결정 알고리즘)

  • Min, So-Yeon;Kim, Hyun-Chul
    • Journal of Advanced Navigation Technology / v.11 no.1 / pp.87-92 / 2007
  • The conventional CELP (code excited linear prediction) vocoder has no V/UV (voiced/unvoiced) classifier, so unvoiced speech is processed in the same way as voiced speech. In this paper, to reduce the bit rate, we propose a new V/UV decision algorithm that minimizes the error rate and the preprocessing computation. The V/UV classifier uses the LSP (line spectrum pair) parameters obtained during the spectrum analysis process in CELP vocoders. Applying this method to the 5.3 kbps ACELP (algebraic code excited linear prediction) coder of G.723.1 reduces the transmission bit rate by approximately 6% without degrading speech quality.


An Experimental Research on the Room Acoustical Environment of the Elementary School Classrooms (초등학교 교실의 음환경 평가에 관한 실험적 연구)

  • Haan, Chan-Hoon;Moon, Kyu-Chun
    • Journal of the Korean Institute of Educational Facilities / v.11 no.1 / pp.5-14 / 2004
  • Since the 1990s, elementary school classrooms in Korea have been designed for an open education system in pursuit of a variety of educational purposes. In addition, schools have been designed individually rather than according to a standard design code. This paper investigates the acoustic environment of existing classrooms and compares the sound insulation of ordinary classrooms with that of newly built classrooms for open education. The current acoustical condition of elementary classrooms was analyzed using field measurements and a questionnaire survey. For this purpose, three elementary schools built in 1978, 1996, and 2000 were selected. Room acoustical parameters including reverberation time (RT), definition (D50), speech intelligibility (RASTI), transmission loss (TL), and STC were measured in a classroom in each school. Each measurement was made with the windows and doors open and again with them closed. The transmission loss between rooms in open classrooms was found to be $5{\sim}6$ dB lower on average than in ordinary classrooms. A RASTI of 0.70 was measured in newly built classrooms, which is better than in old classrooms (0.70) and open classrooms (0.73). The same tendency appeared in the speech definition measurements. This results from the sealing and airtightness of the classrooms and from the floor materials. The results indicate that open classrooms have poor acoustic conditions in terms of sound insulation and speech intelligibility.

Measurement and evaluation of speech privacy in university office rooms (대학 내 사무실의 스피치 프라이버시 측정 및 평가)

  • Lim, Jae-Seop;Choi, Young-Ji
    • The Journal of the Acoustical Society of Korea / v.38 no.4 / pp.396-405 / 2019
  • The speech privacy of closed office rooms on a university campus was measured and assessed in terms of SPC (Speech Privacy Class) values. Two quantities, the LD (Level Difference) between a source room and a receiving room and the background noise level ($L_b$) in the receiving room, were measured in 5 rooms located in 3 different buildings on the campus. Each of the 5 rooms adjoined both offices and corridors through walls. The TL (Transmission Loss) between the source and the receiving room was also measured to compare two standard methods, ASTM E2638-10 and KS F 2809. The results show that the speech privacy of the 5 office rooms does not meet the requirement of a minimum SPC value of 70. A minimum LD value of 41 dB between the source and the receiving room is needed for an SPC value of 70 when the mean measured $L_b$ in the receiving room is 29.2 dB. That is, the TL(avg) value averaged over the octave bands from 160 Hz to 5000 Hz between the source and the receiving room should be 40 dB or greater. The most important architectural factor influencing the LD value is the presence of openings, such as doors and windows, in the adjacent walls between the source and receiving rooms. Therefore, if the openings in the adjacent walls are replaced with ones of high sound insulation, an appropriate SPC value for research and office rooms can be achieved.
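
The figures in this abstract follow from the way the ASTM closed-room speech privacy method defines SPC: the sum of the average level difference LD and the average background noise level $L_b$ at the receiving position. The sketch below applies that sum to the quoted numbers; the function names are illustrative, and rounding the required LD up to a whole decibel is an assumption about how the 41 dB figure was obtained.

```python
import math


def speech_privacy_class(ld_db, lb_db):
    """SPC as the sum of average level difference and average background noise level."""
    return ld_db + lb_db


def required_level_difference(target_spc, lb_db):
    """Smallest LD (dB) that reaches the target SPC for a given background level."""
    return target_spc - lb_db


if __name__ == "__main__":
    lb = 29.2  # mean measured background noise level from the abstract, in dB
    needed = required_level_difference(70, lb)
    print(f"LD needed for SPC 70: {needed:.1f} dB (rounded up: {math.ceil(needed)} dB)")
    print(f"SPC with LD = 41 dB: {speech_privacy_class(41, lb):.1f}")
```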

Perceptual Characteristics of Korean Vowels Distorted by the Frequency Band Limitation (주파수 대역 제한에 의한 한국어 모음의 지각 특성 분석)

  • Kim, YeonWhoa;Choi, DaeLim;Lee, Sook-Hyang;Lee, YongJu
    • Phonetics and Speech Sciences / v.6 no.1 / pp.85-93 / 2014
  • This paper investigated the effects of frequency band limitation on the perceptual characteristics of Korean vowels. Monosyllabic speech (144 syllables of CV type, 56 syllables of VC type, 8 syllables of V type) produced by two announcers was low- and high-pass filtered with cutoff frequencies ranging from 300 to 5000 Hz. Six listeners with normal hearing performed perception tests for each filter type and cutoff frequency. We report phoneme recognition rates and types of perception error for band-limited Korean vowels to examine how frequency distortion in the process of speech transmission affects listeners' perception.

The relevancy between Physical index and subjective appraisal of classrooms (강의실 내의 물리지표와 주관적 평가와의 상관관계)

  • Lee, Chai-Bong;Kim, Yong-Man
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2002.11b / pp.743-748 / 2002
  • The ultimate purpose of this research is to establish optimum standards for the acoustic environment using not only physical characteristics but also subjective appraisals. The basic physical data needed to establish standards for the acoustic environment in campus buildings were measured: TSP was used to measure sound levels, reverberation times, clarity indexes, and the speech transmission index. In addition to the physical characteristics, questionnaires were given to university students to obtain subjective appraisals, for instance with questions about the volume or clarity of lectures. The relationship between the physical characteristics and the subjective appraisals was studied.
