Search | Korea Science

Comparison of Speech Intelligibility & Performance of Speech Recognition in Real Driving Environments (자동차 주행 환경에서의 음성 전달 명료도와 음성 인식 성능 비교)

Lee Kwang-Hyun;Choi Dae-Lim;Kim Young-Il;Kim Bong-Wan;Lee Yong-Ju
- MALSORI / no.50 / pp.99-110 / 2004
In a moving car, various noises and structural factors make it difficult to obtain the normal transmission characteristics of sound. The resulting channel distortion of the source sound recorded by microphones seriously degrades speech recognition performance in real driving environments. In this paper we analyze intelligibility, measured by the speech transmission index (STI), under the sound distortion of each channel at different driving speeds, and compare the STI with speech recognition rates. We examine the correlation between intelligibility measures, which depend on the sound pick-up pattern, and speech recognition performance, and from this consider the optimal location of a microphone in a single-channel environment. The experiments show a high correlation between STI and speech recognition rates.
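
The abstract does not say which correlation measure was used. As a minimal sketch, assuming a Pearson coefficient computed over per-condition pairs of STI and recognition rate, the comparison could look like the following. The function name and the sample values are hypothetical placeholders, not data from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical placeholder values (one pair per microphone/speed condition);
# these are NOT results reported in the paper.
sti_by_condition = [0.45, 0.55, 0.65, 0.75]
recognition_rate = [60.0, 70.0, 80.0, 90.0]
r = pearson_r(sti_by_condition, recognition_rate)
```

A value of `r` close to 1 would correspond to the "high correlation" the abstract describes.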

A Study of Speech Coding for the Transmission on Network by the Wavelet Packets (Wavelet Packet을 이용한 Network 상의 음성 코드에 관한 연구)

Baek, Han-Wook;Chung, Chin-Hyun
- Proceedings of the KIEE Conference / 2000.07d / pp.3028-3030 / 2000
In general, speech coding is designed for compression performance or speech quality. The speech coding in this paper instead focuses on transmission that adapts flexibly to the network speed. This requires subband coding, for which the wavelet packet concept from signal analysis is used. With general signal analysis methods it is difficult to extract each frequency band, and reconstructing the bands after coding each one is also a difficult problem. With the wavelet packet concept (perfect reconstruction) and its fast computation algorithm, however, band extraction and reconstruction become more natural. This paper also describes a direct solution for voice transmission over a network and implements the algorithm in a PC TCP/IP network environment.
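
The abstract does not name the wavelet filter or the tree depth used. The sketch below assumes the Haar filter (the simplest perfect-reconstruction pair) and a full two-level tree, and all function names are illustrative. It shows the two properties the abstract relies on: the bands reconstruct the signal exactly, and a sender on a slow link can drop bands and still get a coarser version of the signal.

```python
import math

_S = 1 / math.sqrt(2)

def haar_split(signal):
    """One Haar analysis step: (low band, high band), each half the length."""
    low = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split (perfect reconstruction)."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out

def wp_decompose(signal, levels):
    """Full wavelet packet tree: every band is split again at every level.
    len(signal) must be divisible by 2**levels. Bands come out in natural
    (tree) order; band 0 is the lowest-frequency band."""
    bands = [list(signal)]
    for _ in range(levels):
        split = []
        for band in bands:
            low, high = haar_split(band)
            split.extend([low, high])
        bands = split
    return bands

def wp_reconstruct(bands):
    """Merge sibling bands pairwise until one signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]

def keep_bands(bands, keep):
    """Model a slow link: send only the first `keep` bands, zero the rest."""
    return [b if i < keep else [0.0] * len(b) for i, b in enumerate(bands)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
bands = wp_decompose(x, levels=2)
full = wp_reconstruct(bands)                   # matches x up to rounding
coarse = wp_reconstruct(keep_bands(bands, 1))  # lowest band only
```

With only the lowest band kept, the Haar reconstruction reduces to block averages of the input, which is the coarsest version a receiver would get under this assumed scheme.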

Detection and Synthesis of Transition Parts of The Speech Signal

Kim, Moo-Young
- The Journal of Korean Institute of Communications and Information Sciences / v.33 no.3C / pp.234-239 / 2008
For efficient coding and transmission, the speech signal can be classified into three distinct classes: voiced, unvoiced, and transition. In low bit rate coding below 4 kbit/s, conventional sinusoidal transform coders synthesize high-quality speech for the purely voiced and unvoiced classes, but not for the transition class. The transition class, which includes plosive sounds and abrupt voiced onsets, lacks periodicity and is therefore often classified and synthesized as unvoiced. This paper proposes an efficient algorithm for transition class detection that demonstrates superior detection performance for both clean and noisy speech. For each detected transition frame, phase information is transmitted instead of magnitude information for speech synthesis. A listening test showed that the proposed algorithm produces better speech quality than the conventional one.

Performance Estimation of a Window Shaker (유리창 도청방지 장치의 성능평가)

Kim, Seock-Hyun;Kim, Hee-Dong;Heo, Wook
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.05a / pp.649-654 / 2007
The eavesdropping prevention performance of a commercial window shaker, a device used to protect a glass window from eavesdropping, is evaluated. The speech transmission index (STI) is introduced to quantitatively estimate the speech intelligibility of the sound detected on the glass window. An objective test following the IEC standard, based on the modulation transfer function (MTF), is performed to determine the STI. Using a maximum length sequence (MLS) signal as the sound source, the MTF is measured with accelerometers and a laser Doppler vibrometer. STI values under different levels of disturbing wave are compared to confirm the disturbing effect on speech intelligibility.
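
The relation between MTF and STI can be illustrated with a simplified calculation. Each modulation transfer value m is converted to an apparent signal-to-noise ratio, clipped to ±15 dB, and mapped to a transmission index between 0 and 1, and the indices are then averaged. This sketch assumes equal band weights and omits the band weighting, redundancy, and masking corrections of the full IEC procedure, so it is not the paper's exact computation. The function names are illustrative.

```python
import math

def transmission_index(m):
    """Map one modulation transfer value m (0..1) to a transmission index."""
    m = min(max(m, 1e-6), 1 - 1e-6)           # avoid log of 0 or division by 0
    snr_db = 10 * math.log10(m / (1 - m))     # apparent signal-to-noise ratio
    snr_db = min(max(snr_db, -15.0), 15.0)    # clip to the +/-15 dB range
    return (snr_db + 15.0) / 30.0

def simplified_sti(mtf, band_weights=None):
    """mtf: one list per octave band, holding m values at each modulation
    frequency. band_weights defaults to equal weights (an assumption)."""
    band_indices = [sum(transmission_index(m) for m in band) / len(band)
                    for band in mtf]
    if band_weights is None:
        band_weights = [1 / len(band_indices)] * len(band_indices)
    return sum(w * v for w, v in zip(band_weights, band_indices))
```

Under this mapping, a disturbing wave that lowers m across the bands lowers the STI, which is the effect the abstract compares across disturbance levels.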

A Study on the Objective Evaluation Model of Telephone Transmission Quality (통화품질 객관평가 모델링에 관한 연구)

조재철;박순영;방만원
- The Journal of Korean Institute of Communications and Information Sciences / v.16 no.6 / pp.509-516 / 1991
In this paper, we propose an objective evaluation model of telephone transmission quality in order to estimate a satisfaction score for speech quality in a telephone network. As degradation factors of telephone transmission quality, the model takes into account transmission loss, noise, distortion, talker echo, and sidetone. A performance index (PI) is introduced for each of the five psychological factors affecting telephone speech quality, and a Mean Opinion Score (MOS) is estimated from the sum of all PIs. The simulation results indicate that the MOS obtained from the objective evaluation model is in good agreement with that of subjective evaluation.

A Study on Measuring the Speaking Rate of Speaking Signal by Using Line Spectrum Pair Coefficients

Jang, Kyung-A;Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / v.20 no.3E / pp.18-24 / 2001
Speaking rate represents how many phonemes a speech signal contains in a limited time. It varies with the speaker and with the characteristics of each phoneme. Current speech recognition systems need preprocessing to remove the effect of speaking rate variation before recognition, so if the speaking rate can be estimated in advance, recognition performance can be improved. Conventional speech vocoders, however, decide the transmission rate by analyzing a fixed period regardless of the phoneme rate. If the speaking rate can be estimated in advance, it is also very important information for the speech coding part, since it improves the sound quality of the vocoder and makes a variable transmission rate applicable. In this paper, we propose a method for representing the speaking rate as a parameter in a speech vocoder. To estimate the speaking rate, the variation of phonemes is estimated using Line Spectrum Pairs (LSP). Comparing the speaking rate obtained by the proposed algorithm with that obtained manually by visual inspection, the error between the two methods is 5.38% for fast utterances and 1.78% for slow utterances, and the accuracy between the two methods is 98% for slow utterances and 94% for fast utterances at 30 dB SNR and 10 dB SNR, respectively.

Speech Encryption Scheme Using Frequency Band Scrambling (대역 스크램블을 이용한 음성 보호방식)

Ji, Hyung-Kun;Lee, Dong-Wook
- Proceedings of the KIEE Conference / 1999.11c / pp.700-702 / 1999
Protecting data that must be kept secret from unauthorized users has become a major topic. This paper introduces an encryption scheme for protecting speech signals from eavesdropping. The proposed scheme adopts a secure voice cryptographic algorithm based on frequency band scrambling. To improve on the conventional speech signal encryption scheme, the DCT coefficients of the speech signal are randomly permuted. Simulation results are included to show the performance of the proposed algorithm for secure transmission of speech signals.
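
The idea of permuting DCT coefficients can be sketched as follows. The frame length, the key handling, and the permutation generator are all assumptions made for illustration, since the abstract does not specify them. Python's `random` module is used only to derive a repeatable permutation from a key and is not cryptographically secure.

```python
import math
import random

def dct(x):
    """Orthonormal DCT-II of a frame."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        total = c[0] * math.sqrt(1 / n)
        total += sum(math.sqrt(2 / n) * c[k]
                     * math.cos(math.pi * (i + 0.5) * k / n)
                     for k in range(1, n))
        out.append(total)
    return out

def key_permutation(n, key):
    """Repeatable permutation derived from a shared key (illustrative only)."""
    perm = list(range(n))
    random.Random(key).shuffle(perm)
    return perm

def scramble(frame, key):
    """Permute the frame's DCT coefficients and return to the time domain."""
    coeffs = dct(frame)
    perm = key_permutation(len(frame), key)
    return idct([coeffs[p] for p in perm])

def descramble(frame, key):
    """Undo scramble() using the same key."""
    coeffs = dct(frame)
    perm = key_permutation(len(frame), key)
    restored = [0.0] * len(frame)
    for position, source in enumerate(perm):
        restored[source] = coeffs[position]
    return idct(restored)

frame = [0.10, -0.30, 0.25, 0.40, -0.15, 0.05, -0.20, 0.35]
recovered = descramble(scramble(frame, key=42), key=42)
```

Because the transform is orthonormal and a permutation only reorders coefficients, the scrambled frame keeps the energy of the original while its spectral content is rearranged.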

Eavesdropping of the Glass Window Using a Laser Sensor and Performance Estimation of a Window Shaker (레이저센서를 이용한 유리창 도청 및 도청방지기의 성능 평가)

Kim, Seock-Hyun;Heo, Wook;Kim, Hee-Dong
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2008.04a / pp.551-556 / 2008
The possibility of remote eavesdropping through window glass is investigated using a laser sensor. Glass windows of various thicknesses and types are excited by a maximum length sequence (MLS) signal, and the vibration sound is detected by a laser Doppler vibrometer. The intelligibility of the detected sound is evaluated using the speech transmission index (STI), which is based on the modulation transfer function (MTF). To identify the disturbing effect, different levels of disturbing wave are generated by an outside speaker and by a window shaker attached to the glass window. The reduction in speech intelligibility is analyzed for glass windows of different thicknesses.

Performance Improvement of Connected Digit Recognition with Channel Compensation Method for Telephone speech (채널보상기법을 사용한 전화 음성 연속숫자음의 인식 성능향상)

Kim Min Sung;Jung Sung Yun;Son Jong Mok;Bae Keun Sung
- MALSORI / no.44 / pp.73-82 / 2002
Channel distortion degrades the performance of a speech recognizer in a telephone environment. It results mainly from the bandwidth limitation and variation of the transmission channel. Variation in channel characteristics is usually represented as a baseline shift in the cepstrum domain, so the undesirable effect of channel variation can be removed by subtracting the mean from the cepstrum. In this paper, to improve the recognition performance for Korean connected digit telephone speech, channel compensation methods such as CMN (Cepstral Mean Normalization), RTCN (Real Time Cepstral Normalization), MCMN (Modified CMN), and MRTCN (Modified RTCN) are applied to the static MFCC. MCMN and MRTCN are obtained from CMN and RTCN, respectively, using variance normalization in the cepstrum domain. Using the HTK v3.1 system, recognition experiments are performed on the Korean connected digit telephone speech database released by SITEC (Speech Information Technology & Industry Promotion Center). The experiments show that MRTCN gives the best result, with a recognition rate of 90.11% for connected digits. This corresponds to a performance improvement of 1.72% over MFCC alone, i.e., an error reduction rate of 14.82%.
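
The mean-subtraction step can be sketched for a whole utterance as below. The optional variance normalization corresponds only loosely to the "modified" variants, whose exact definitions the abstract does not give, and the real-time (RTCN) versions are not covered. The function name and the sample frames are illustrative.

```python
import math

def cepstral_mean_normalize(frames, normalize_variance=False):
    """frames: list of frames, each a list of cepstral coefficients.
    Subtracts the per-coefficient mean over the utterance; optionally also
    divides by the per-coefficient standard deviation."""
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[j] for f in frames) / n_frames for j in range(n_coeffs)]
    centered = [[f[j] - means[j] for j in range(n_coeffs)] for f in frames]
    if not normalize_variance:
        return centered
    stds = [math.sqrt(sum(row[j] ** 2 for row in centered) / n_frames)
            for j in range(n_coeffs)]
    return [[row[j] / stds[j] if stds[j] > 0 else 0.0
             for j in range(n_coeffs)]
            for row in centered]

# A fixed channel adds the same offset to every frame in the cepstrum domain;
# after mean subtraction the two versions coincide.
clean = [[1.0, 2.0], [3.0, 6.0]]
channel_shifted = [[c0 + 5.0, c1 - 1.0] for c0, c1 in clean]
same = cepstral_mean_normalize(clean) == cepstral_mean_normalize(channel_shifted)
```

This illustrates why a baseline shift in the cepstrum, as described in the abstract, disappears once the mean is removed.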

Comparison of acoustics performance measurement and evaluation standard of office space and office acoustics criteria of European countries (사무공간의 음향성능 측정, 평가 방법의 표준화와 유럽 국가들의 음향성능 기준 비교)

Jeong-Ho Jeong
- The Journal of the Acoustical Society of Korea / v.42 no.2 / pp.133-142 / 2023
The office environment is changing with work types, advances in Information Technology (IT), and the Coronavirus disease (COVID-19) situation. For office space users to perform their tasks comfortably and efficiently, individual privacy must be secured along with easy communication among members. In Korea, demand for better acoustic performance in office spaces is also increasing, but the related performance criteria and guidelines have not been established. This study compares and reviews the standardization of methods for measuring and evaluating office space acoustic performance and the acoustic performance criteria of European countries. It proposes comprehensively reviewing international standardization trends and the acoustic performance standards of each country, and establishing and using criteria for evaluating the acoustic performance of, and satisfaction with, office spaces in Korea through a survey. Considering the direction of international standardization and compatibility with communication and Public Address (PA) systems, it is appropriate to establish criteria using the Speech Transmission Index (STI) or indices derived from the STI. Such criteria would be highly usable and compatible. In addition, since the office furniture industry is interested in improving the acoustic performance of office space, a labelling system for the speech level reduction of office furniture needs to be established.
https://doi.org/10.7776/ASK.2023.42.2.133

Search Result 56, Processing Time 0.027 seconds
