• Title/Abstract/Keywords: Speech transmission performance

56 search results, processing time 0.022 s

Comparison of Speech Intelligibility & Performance of Speech Recognition in Real Driving Environments

  • 이광현;최대림;김영일;김봉완;이용주
    • 대한음성학회지:말소리 / No. 50 / pp.99-110 / 2004
  • Various noises and structural factors in a moving car make it difficult to obtain the normal transmission characteristics of sound. The resulting channel distortion of the source sound recorded by the microphones seriously degrades speech recognition performance in real driving environments. In this paper we analyze the degree of intelligibility under the sound distortion of each channel at different driving speeds using the speech transmission index (STI), and compare the STI with speech recognition rates. We examine the correlation between intelligibility measures, which depend on the sound pick-up pattern, and speech recognition performance, and thereby consider the optimal microphone location in a single-channel environment. The experiments show a high correlation between STI and speech recognition rates.

A Study of Speech Coding for the Transmission on Network by the Wavelet Packets

  • 백한욱;정진현
    • 대한전기학회:학술대회논문집 / 대한전기학회 2000 Summer Conference Proceedings, D / pp.3028-3030 / 2000
  • In general, speech coding is dedicated to compression performance or speech quality. The speech coding in this paper, however, focuses on transmission that adapts flexibly to the network speed. This requires subband coding, which here uses the wavelet packet concept from signal analysis. Extracting each frequency band is difficult with general signal analysis methods, and reconstructing the bands after coding each one is also a difficult problem. With the wavelet packet concept (perfect reconstruction) and its fast computation algorithm, however, the extraction and reconstruction of each band become more natural. This paper also describes a direct solution for voice transmission over a network and implements the algorithm in a PC TCP/IP network environment.
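
The perfect-reconstruction property mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Haar filter pair, which the abstract does not specify; the function names (`wp_decompose`, `wp_reconstruct`, `keep_first_bands`) are hypothetical and are not taken from the paper. Bands come out in natural (filter-bank) order, where index 0 is the all-low-pass band.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """Split a signal of even length into low-pass and high-pass halves."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high

def haar_merge(low, high):
    """Exact inverse of haar_split."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def wp_decompose(x, depth):
    """Wavelet packet tree: split BOTH halves at every level.

    len(x) must be divisible by 2**depth. Returns 2**depth subbands.
    """
    bands = [list(x)]
    for _ in range(depth):
        bands = [half for band in bands for half in haar_split(band)]
    return bands

def wp_reconstruct(bands):
    """Merge neighbouring subbands pairwise until one signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]

def keep_first_bands(bands, count):
    """Assumed rate-adaptation rule: send `count` bands, zero the rest."""
    return [b if i < count else [0.0] * len(b) for i, b in enumerate(bands)]

signal = [0.0, 0.4, 0.9, 0.3, -0.2, -0.7, -0.1, 0.5]
bands = wp_decompose(signal, depth=2)
restored = wp_reconstruct(bands)                       # all bands sent
coarse = wp_reconstruct(keep_first_bands(bands, 2))    # slower network
```

With all bands present, `restored` matches `signal` up to floating-point rounding. Zeroing bands trades quality for bit rate, which is one plausible reading of "flexible transmission to the network speed".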

Detection and Synthesis of Transition Parts of The Speech Signal

  • Kim, Moo-Young
    • 한국통신학회논문지 / Vol. 33, No. 3C / pp.234-239 / 2008
  • For efficient coding and transmission, the speech signal can be classified into three distinct classes: voiced, unvoiced, and transition. In low bit rate coding below 4 kbit/s, conventional sinusoidal transform coders synthesize high-quality speech for the purely voiced and unvoiced classes, but not for the transition class. The transition class, which includes plosive sounds and abrupt voiced onsets, lacks periodicity and is therefore often classified and synthesized as unvoiced. In this paper, an efficient algorithm for transition class detection is proposed, which demonstrates superior detection performance for both clean and noisy speech. For each detected transition frame, phase information is transmitted instead of magnitude information for speech synthesis. A listening test showed that the proposed algorithm produces better speech quality than the conventional one.

Performance Estimation of a Window Shaker

  • 김석현;김희동;허욱
    • 한국소음진동공학회:학술대회논문집 / 한국소음진동공학회 2007 Spring Conference Proceedings / pp.649-654 / 2007
  • The eavesdropping prevention performance of a commercial window shaker, a device used to protect a glass window against eavesdropping, is evaluated. The speech transmission index (STI) is introduced to quantitatively estimate the speech intelligibility of the sound detected on the glass window. An objective test following the IEC standard, based on the modulation transfer function (MTF), is performed to determine the STI. Using a maximum length sequence (MLS) signal as the sound source, the MTF is measured with accelerometers and a laser Doppler vibrometer. STI values under different levels of the disturbing wave are compared to confirm the disturbing effect on speech intelligibility.
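
As a rough illustration of how an STI value follows from MTF values, the sketch below uses the textbook core of the method: each modulation index m is converted to an apparent signal-to-noise ratio, clipped to ±15 dB, and mapped to a 0..1 transmission index. It is a deliberate simplification and an assumption on my part: the full IEC procedure also applies octave-band weighting and masking corrections, which are omitted here, and the function names are hypothetical.

```python
import math

def transmission_index(m):
    """Map one modulation index m (0..1) to a transmission index (0..1)."""
    m = min(max(m, 1e-12), 1.0 - 1e-12)          # keep the log finite
    snr_db = 10.0 * math.log10(m / (1.0 - m))    # apparent SNR in dB
    snr_db = min(max(snr_db, -15.0), 15.0)       # clip to +/- 15 dB
    return (snr_db + 15.0) / 30.0

def simplified_sti(mtf_values):
    """Unweighted mean of transmission indices (band weights omitted)."""
    return sum(transmission_index(m) for m in mtf_values) / len(mtf_values)

def mtf_from_noise(snr_db):
    """Modulation index when stationary noise is the only degradation."""
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))

# Stronger disturbing noise (lower SNR) should lower the simplified STI.
for snr in (15.0, 5.0, 0.0, -5.0, -15.0):
    print(snr, round(simplified_sti([mtf_from_noise(snr)] * 14), 3))
```

When noise is the only degradation, the apparent SNR equals the actual SNR, so 0 dB maps to 0.5 and anything at or below -15 dB maps to 0. This is the sense in which a stronger disturbing wave pushes the index toward unintelligible.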

A Study on the Objective Evaluation Model of Telephone Transmission Quality

  • 조재철;박순영;방만원
    • 한국통신학회논문지 / Vol. 16, No. 6 / pp.509-516 / 1991
  • In this paper, we propose an objective evaluation model of telephone transmission quality in order to estimate a satisfaction score for speech quality in a telephone network. As the degradation factors of telephone transmission quality, the model takes into account transmission loss, noise, distortion, talker echo, and sidetone. A performance index (PI) is introduced for each of the five psychological factors affecting telephone speech quality, and a Mean Opinion Score (MOS) is estimated from the sum of all PIs. The simulation results indicate that the MOS obtained from the objective evaluation model is in good agreement with that of subjective evaluation.

A Study on Measuring the Speaking Rate of Speaking Signal by Using Line Spectrum Pair Coefficients

  • Jang, Kyung-A;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / Vol. 20, No. 3E / pp.18-24 / 2001
  • Speaking rate represents how many phonemes a speech signal contains in a limited time. It varies with the speaker and with the characteristics of each phoneme. Current speech recognition systems need preprocessing to remove the effect of speaking rate variation before recognition, so estimating the speaking rate in advance can improve recognition performance. Conventional speech vocoders, however, decide the transmission rate by analyzing a fixed period regardless of how the phoneme rate varies. If the speaking rate can be estimated in advance, it is also important information for the speech coding part: it improves the sound quality of the vocoder and makes a variable transmission rate possible. In this paper, we propose a method for presenting the speaking rate as a parameter in a speech vocoder. To estimate the speaking rate, the variation of phonemes is estimated using Line Spectrum Pairs. Comparing the proposed algorithm with manual measurement by eye, the error between the two methods is 5.38% for fast utterances and 1.78% for slow utterances, and the accuracy between the two methods is 98% for slow utterances and 94% for fast utterances at 30 dB SNR and 10 dB SNR, respectively.

Speech Encryption Scheme Using Frequency Band Scrambling

  • 지형근;이동욱
    • 대한전기학회:학술대회논문집 / 대한전기학회 1999 Autumn Conference Proceedings, Society Headquarters B / pp.700-702 / 1999
  • Protecting data that must be kept secret from unauthorized users has become a major topic. This paper introduces an encryption scheme for protecting speech signals from eavesdropping. The proposed scheme adopts a secure voice cryptographic algorithm based on scrambling in the frequency band. To improve on the conventional speech signal encryption scheme, the DCT coefficients of the speech signal are randomly permuted. Simulation results are included to show the performance of the proposed algorithm for secure transmission of speech signals.
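
The idea of permuting DCT coefficients can be sketched per frame as follows. This is an illustration under stated assumptions, not the authors' scheme: the frame length, the key-to-permutation rule, and the function names are all hypothetical, and Python's `random.Random` is used only to get a repeatable permutation; it is not cryptographically secure.

```python
import math
import random

def dct(x):
    """Orthonormal DCT-II of one frame."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n)
                 * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out

def permutation_from_key(key, n):
    """Assumed key schedule: a seeded shuffle (NOT cryptographically secure)."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order

def scramble(frame, key):
    """Permute the frame's DCT coefficients and return a time-domain frame."""
    coeffs = dct(frame)
    order = permutation_from_key(key, len(frame))
    return idct([coeffs[j] for j in order])

def descramble(frame, key):
    """Undo scramble() by putting every coefficient back in its own slot."""
    coeffs = dct(frame)
    order = permutation_from_key(key, len(frame))
    restored = [0.0] * len(frame)
    for pos, j in enumerate(order):
        restored[j] = coeffs[pos]
    return idct(restored)

frame = [0.10, 0.35, 0.50, 0.20, -0.15, -0.40, -0.30, 0.05]
sent = scramble(frame, key=2024)
received = descramble(sent, key=2024)
```

Because the transform is orthonormal and a permutation only reorders coefficients, the scrambled frame keeps the same energy as the original, and descrambling with the same key recovers it up to rounding.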

Eavesdropping of the Glass Window Using a Laser Sensor and Performance Estimation of a Window Shaker

  • 김석현;허욱;김희동
    • 한국소음진동공학회:학술대회논문집 / 한국소음진동공학회 2008 Spring Conference Proceedings / pp.551-556 / 2008
  • The possibility of remote eavesdropping through window glass is investigated using a laser sensor. Glass windows of various thicknesses and types are excited by a maximum length sequence (MLS) signal, and the vibration sound is detected by a laser Doppler vibrometer. The intelligibility of the detected sound is evaluated using the speech transmission index (STI), which is based on the modulation transfer function (MTF). To identify the disturbing effect, disturbing waves of different levels are generated by an outside speaker and by a window shaker attached to the glass window. The reduction in speech intelligibility is analyzed for glass windows of different thicknesses.

Performance Improvement of Connected Digit Recognition with Channel Compensation Method for Telephone Speech

  • 김민성;정성윤;손종목;배건성
    • 대한음성학회지:말소리 / No. 44 / pp.73-82 / 2002
  • Channel distortion degrades the performance of a speech recognizer in a telephone environment. It mainly results from the bandwidth limitation and variation of the transmission channel. Variation of the channel characteristics is usually represented as a baseline shift in the cepstrum domain, so the undesirable effect of channel variation can be removed by subtracting the mean from the cepstrum. In this paper, to improve the recognition performance for Korean connected-digit telephone speech, channel compensation methods such as CMN (Cepstral Mean Normalization), RTCN (Real Time Cepstral Normalization), MCMN (Modified CMN), and MRTCN (Modified RTCN) are applied to the static MFCC. MCMN and MRTCN are obtained from CMN and RTCN, respectively, by using variance normalization in the cepstrum domain. Using the HTK v3.1 system, recognition experiments are performed on the Korean connected-digit telephone speech database released by SITEC (Speech Information Technology & Industry Promotion Center). The experiments show that MRTCN gives the best result, with a recognition rate of 90.11% for connected digits. This corresponds to a performance improvement of 1.72% over MFCC alone, i.e., an error reduction rate of 14.82%.
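
The mean-subtraction step described in this abstract can be sketched in a few lines. The code below is a generic per-utterance cepstral mean normalization with an optional variance normalization; it is not the paper's RTCN, MCMN, or MRTCN, whose exact definitions the abstract does not give, and the function names and toy values are assumptions for illustration only.

```python
from statistics import fmean, pstdev

def cepstral_normalize(frames, normalize_variance=False):
    """Subtract the per-coefficient mean over the utterance.

    frames: list of cepstral vectors (one list of floats per frame).
    With normalize_variance=True each coefficient is also divided by its
    standard deviation (generic CMVN, assumed here for illustration).
    """
    n_coeffs = len(frames[0])
    columns = [[frame[k] for frame in frames] for k in range(n_coeffs)]
    means = [fmean(col) for col in columns]
    if normalize_variance:
        # A constant coefficient has zero deviation; divide by 1 instead.
        stds = [pstdev(col) or 1.0 for col in columns]
    else:
        stds = [1.0] * n_coeffs
    return [[(frame[k] - means[k]) / stds[k] for k in range(n_coeffs)]
            for frame in frames]

def add_channel_offset(frames, offset):
    """Model a fixed channel as a constant shift of every cepstral vector."""
    return [[c + o for c, o in zip(frame, offset)] for frame in frames]

clean = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]           # toy cepstra
telephone = add_channel_offset(clean, [0.5, -1.0])     # baseline shift

cmn_clean = cepstral_normalize(clean)
cmn_telephone = cepstral_normalize(telephone)
cmvn_telephone = cepstral_normalize(telephone, normalize_variance=True)
```

After normalization, `cmn_clean` and `cmn_telephone` coincide up to rounding, which is the sense in which a constant baseline shift in the cepstrum is removed. A real-time variant would need a running estimate of the mean instead of the whole-utterance mean used here.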

Comparison of acoustics performance measurement and evaluation standard of office space and office acoustics criteria of European countries

  • 정정호
    • 한국음향학회지 / Vol. 42, No. 2 / pp.133-142 / 2023
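  • Office environments are changing with shifts in work patterns, advances in information technology (IT), and the Coronavirus disease (COVID)-19 situation. For office users to work comfortably and efficiently, both interaction among members and individual privacy must be secured. In Korea, demand for better acoustic performance in office spaces is increasing, but the relevant performance criteria and guidelines have not yet been established. This study compares and reviews the standardization of measurement and evaluation methods for office acoustic performance and the acoustic performance criteria of European countries. It proposes establishing and applying criteria for evaluating office acoustic performance and satisfaction, based on a comprehensive review of international standardization trends and national acoustic criteria together with surveys of actual acoustic conditions in Korean offices. Considering the direction of international standardization and compatibility with telecommunication and electroacoustic systems, criteria based on the speech transmission index or on indicators derived from it are judged to be appropriate and to offer high usability and compatibility. In addition, because the office furniture industry is also showing interest in improving office acoustics, performance criteria for the speech level reduction of office furniture need to be established, along with a way to label it.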