• Title/Summary/Keyword: Simulated speech


VR-simulated Sailor Training Platform for Emergency (긴급상황에 대한 가상현실 선원 훈련 플랫폼)

  • Park, Chur-Woong;Jung, Jinki;Yang, Hyun-Seung
    • Proceedings of the Korean Institute of Navigation and Port Research Conference
    • /
    • 2015.10a
    • /
    • pp.175-178
    • /
    • 2015
  • This paper presents a VR-simulated sailor training platform for emergencies, intended to prevent the human errors that cause 60~80% of domestic and international marine accidents. Using virtual reality technology, the proposed platform provides an interaction method for building proficiency in emergency procedures and a crowd control method for controlling crowd agents in a virtual ship environment. The interaction method uses speech recognition and gesture recognition to enhance the immersiveness and efficiency of the training. The crowd control method produces natural simulations of crowd agents by applying a behavior model that reflects human social behavior. To examine the efficiency of the proposed platform, a prototype whose virtual training scenario depicts the outbreak of a fire on a ship was implemented as a standalone system.


Formant-broadened CMS Using the Log-spectrum Transformed from the Cepstrum (켑스트럼으로부터 변환된 로그 스펙트럼을 이용한 포먼트 평활화 켑스트럴 평균 차감법)

  • 김유진;정혜경;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.4
    • /
    • pp.361-373
    • /
    • 2002
  • In this paper, we propose a channel normalization method to improve the performance of cepstral mean subtraction (CMS), which is widely adopted to normalize channel variation in speech and speaker recognition. CMS estimates the channel effect by averaging the long-term cepstrum; its weak point is that the estimated channel is biased by the formants of voiced speech, which carry useful speech information. The proposed Formant-broadened Cepstral Mean Subtraction (FBCMS) is based on two facts: the formants can be found easily in the log spectrum obtained from the cepstrum by a Fourier transform, and the formants correspond to the dominant poles of the all-pole model usually used to model the vocal tract. FBCMS evaluates only the poles to be broadened from the log spectrum, without polynomial factorization, and produces a formant-broadened cepstrum by broadening the bandwidths of the formant poles. The channel cepstrum can then be estimated effectively by averaging the formant-broadened cepstral coefficients. We performed experiments comparing FBCMS with CMS and PFCMS using 4 simulated telephone channels. In the channel estimation experiment, we evaluated the cepstral distance between the real channel and the estimated channel and found that the mean cepstrum came closer to the channel cepstrum because the bias of the mean cepstrum toward speech was softened. In the text-independent speaker identification experiment, the proposed method was superior to conventional CMS and comparable to pole-filtered CMS (PFCMS). Consequently, we showed that the proposed method can efficiently normalize channel variation while building on conventional CMS.
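
The conventional CMS baseline that this abstract builds on can be sketched in a few lines. This is only an illustration of plain CMS, not the authors' FBCMS; the function name and the toy data are hypothetical. A constant convolutional channel appears as a constant additive offset in the cepstral domain, so subtracting the long-term mean removes it, along with the mean of the speech itself, which is the bias the abstract describes.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the long-term mean cepstrum from every frame (plain CMS)."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Long-term average per coefficient: the conventional channel estimate.
    mean = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - mean[k] for k in range(n_coeffs)] for f in frames]


# Hypothetical toy data: three frames of two cepstral coefficients,
# each shifted by a constant "channel" offset (an assumption for illustration).
channel = [0.5, -0.2]
clean = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
observed = [[c + h for c, h in zip(frame, channel)] for frame in clean]

normalized = cepstral_mean_subtraction(observed)
baseline = cepstral_mean_subtraction(clean)
```

In this toy case `normalized` and `baseline` agree up to rounding, because the constant offset is absorbed entirely by the mean.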

Feasibility of hearing aid gain self-adjustment using speech recognition (말소리 인지를 이용한 보청기 이득 자가 조절의 실현)

  • Yun, Donghyeon;Shen, Yi;Zhang, Zhuohuang
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.1
    • /
    • pp.76-86
    • /
    • 2022
  • Personal hearing devices, such as hearing aids, may be fine-tuned by allowing the users to conduct self-adjustment. Two self-adjustment procedures were developed to collect the listener preferred gains in six octave-frequency bands from 0.25 kHz to 8 kHz. These procedures were designed to allow rapid exploration of a multi-dimensional parameter space using a simple, one-dimensional user control interface (i.e., a programmable knob). The two procedures differ in whether the user interface controls the gains in all frequency bands simultaneously (Procedure A) or only the gain in one frequency band (Procedure B) on a given trial. Monte-Carlo simulations suggested that for both procedures the gain preference identified by simulated listeners rapidly converged to the ground-truth preferred gain profile over the first 20 trials. Initial behavioral evaluations of the self-adjustment procedures, in terms of test-retest reliability, were conducted using 20 young, normal-hearing listeners. Each estimate of the preferred gain profile took less than 20 minutes. The deviation between two separate estimates of the preferred gain profile, conducted at least a week apart, was about 10 dB ~ 15 dB.

A Comparison Study on the Speech Signal Parameters for Chinese Leaners' Korean Pronunciation Errors - Focused on Korean /ㄹ/ Sound (중국인 학습자의 한국어 발음 오류에 대한 음성 신호 파라미터들의 비교 연구 - 한국어의 /ㄹ/ 발음을 중심으로)

  • Lee, Kang-Hee;You, Kwang-Bock;Lim, Ha-Young
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.6
    • /
    • pp.239-246
    • /
    • 2017
  • This paper compares speech signal parameters between Korean and Chinese speakers for the Korean /ㄹ/ sound, which Chinese learners frequently mispronounce. The allophones of /ㄹ/ in Korean are divided into a lateral group and a tap group. The reasons for these errors are investigated by studying the similarities and differences between the Korean /ㄹ/ pronunciation and its corresponding Chinese pronunciation. From a phonological perspective, speech signal parameters such as signal energy, the waveform in the time domain, the spectrogram in the frequency domain, the pitch (F0) based on the autocorrelation function (ACF), and formant frequencies (f1, f2, f3, and f4) are measured and compared. The data, composed of a group of Korean words selected through a philological investigation, are used in the simulations. According to the simulation results for energy and the spectrogram, there are meaningful differences between native Korean speakers and Chinese learners in the pronunciation of Korean /ㄹ/. The simulation results also show some differences in the other parameters. Chinese learners can be expected to reduce their errors considerably by exploiting the parameters used in this paper.
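
One of the parameters listed above, pitch based on the ACF, can be illustrated with a minimal sketch. It picks the lag that maximizes the raw autocorrelation within a plausible pitch range. The function name, the 75 to 400 Hz search range, and the synthetic test tone are assumptions for illustration and are not taken from the paper.

```python
import math


def estimate_pitch_acf(signal, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate F0 in Hz as the lag that maximizes the autocorrelation."""
    lag_min = int(sample_rate / f_max)   # shortest period considered
    lag_max = int(sample_rate / f_min)   # longest period considered
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Raw (unnormalized) autocorrelation at this lag.
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Hypothetical test signal: a 200 Hz sine sampled at 8 kHz for 0.1 s.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(800)]
f0 = estimate_pitch_acf(tone, sample_rate)
```

Real speech usually needs framing, windowing, and a voicing decision before this step; the sketch omits all of that.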

Channel Coding Design Combined with Source Coder for Mobile Communication Systems (이동통신시스템을 위한 소스 코더와 결합된 채널코딩 방법 연구)

  • 김종현;이인성;강석봉;이정구
    • Proceedings of the IEEK Conference
    • /
    • 1998.06a
    • /
    • pp.19-22
    • /
    • 1998
  • In this study, an efficient channel coding method combined with CS-ACELP is proposed. The same convolutional coder and Viterbi decoder as in the CDMA mobile communication system are used as the channel coder. To make the best use of the limited channel coding redundancy, unequal error protection with a punctured convolutional coder is used for variable rate allocation. However, the overall code rate is given by 2. The performance of the proposed coder is analyzed and simulated in a Rayleigh fading channel. Experimental results show that the objective and subjective speech quality of the variable rate channel coding method is superior to that of the non-variable channel coding method.


An Acoustic Feedback Canceller for Hearing Aids Using Improved Orthogonal Projection Algorithm (개선된 직교투사 알고리즘을 이용한 음향궤환제거기)

  • Lee, Haeng Woo
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.8 no.2
    • /
    • pp.49-58
    • /
    • 2012
  • This paper presents an improved orthogonal projection method that cancels acoustic feedback signals in digital hearing aids. Compared with the NLMS algorithm, which is widely used for its simplicity and stability, the method improves convergence performance and requires little computation for signals with large autocorrelation, such as speech. It does so by reducing the correlation of the signals. To verify the convergence characteristics of the proposed algorithm, we ran simulations with various input signals. The acoustic feedback canceller has 12-bit resolution and a 64-tap adaptive FIR filter. We compared the simulation results for this algorithm with those for the NLMS algorithm. The results show that, for colored input signals, the feedback canceller adopting the proposed algorithm achieves an SNR about 3.5 dB higher than the NLMS algorithm.
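
The NLMS baseline mentioned in this abstract can be sketched as a small system-identification loop. This is the standard NLMS update, not the improved orthogonal projection algorithm the paper proposes. The 64-tap length matches the abstract; the step size, the three-tap `true_path`, and the white-noise input are hypothetical choices for illustration.

```python
import random


def nlms_identify(x, d, num_taps=64, mu=0.5, eps=1e-8):
    """Adapt an FIR filter so its output tracks d; returns (weights, errors)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps               # most recent input sample first
    errors = []
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        y_n = sum(wk * xk for wk, xk in zip(w, buf))    # filter output
        e_n = d_n - y_n                                 # estimation error
        norm = sum(xk * xk for xk in buf) + eps         # input power normalization
        step = mu * e_n / norm
        w = [wk + step * xk for wk, xk in zip(w, buf)]  # NLMS weight update
        errors.append(e_n)
    return w, errors


# Hypothetical feedback path and white-noise input (assumptions for illustration).
random.seed(0)
true_path = [0.6, -0.3, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(3000)]
d = [sum(h * x[n - k] for k, h in enumerate(true_path) if n - k >= 0)
     for n in range(len(x))]

w, errors = nlms_identify(x, d)
```

With white input the weights settle close to `true_path`. Convergence is slower for strongly correlated inputs such as speech, which is the case the abstract targets.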

Quantization of Line Spectrum Pair Frequencies using Lattice Vector Quantizers (격자벡터양자화기를 이용한 음성신호의 LSP 주파수 양자화)

  • 강정원;정재호;정대권
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.10
    • /
    • pp.2634-2644
    • /
    • 1996
  • Two different low-rate speech coders, each using one of four types of lattice vector quantizers (LVQs) with fairly low complexity, were investigated for application to mobile communications. More specifically, two-stage vector quantizer-lattice vector quantizer (VQ-LVQ) systems and vector differential pulse code modulation (VDPCM) systems with lattice vector quantizers were simulated to encode the line spectrum frequencies of various sentences at rates of 22 to 39 bits per 20 msec frame. The simulation results showed that the VDPCM system with the lattice VQ can save up to 10 bits/frame compared to the quantization scheme used in the QCELP system. For the VQ-LVQ system, the spherical quasi-uniform LVQ outperformed the other 3 types of LVQs below 36 bits/frame, and the pyramidal quasi-uniform LVQ outperformed the other 3 types at 37 bits/frame with a spectral distortion of 0.97.


A study on the design of new floating resistor and it′s application (새로운 CMOS Floating저항의 설계와 그 응용에 대한연구)

  • 이영훈
    • Journal of the Korea Society of Computer and Information
    • /
    • v.5 no.3
    • /
    • pp.76-83
    • /
    • 2000
  • Continuous-time signal systems have been receiving considerable attention with the development of CMOS technology. In this paper, a low-pass filter using a new CMOS floating resistor is designed with a cutoff frequency suited to speech signal processing. In particular, a new floating resistor consisting entirely of CMOS devices in saturation has been developed. Linearity within $\pm$0.04% is achieved through nonlinearity cancellation via current mirrors over an applied range of $\pm$1 V. The frequency response exceeds 10 MHz, and the resistors are expected to be useful in implementing integrated-circuit active RC filters. The low-pass filter designed using this method has a simpler structure than a switched-capacitor filter and therefore reduces the chip area. The characteristics of the designed low-pass filter are simulated with the PSpice program.


Simulation of the Loudness Recruitment using Sensorineural Hearing Impairment Modeling (감음신경성 난청의 모델링을 통한 라우드니스 누가현상의 시뮬레이션)

  • Kim, D.W.;Park, Y.C.;Kim, W.K.;Doh, W.;Park, S.J.
    • Proceedings of the KOSOMBE Conference
    • /
    • v.1997 no.11
    • /
    • pp.63-66
    • /
    • 1997
  • With the advent of high-speed digital signal processing chips, new digital techniques have been introduced to hearing instruments. This advanced hearing instrument circuitry has led to the need for, and the development of, new fitting approaches. A number of different fitting approaches have been developed over the past few years, yet there has been little agreement on which approach is the "best" or most appropriate to use. Moreover, developing a new hearing aid or a new fitting method necessarily involves intensive subject-based clinical tests. In this paper, we present an objective method to evaluate and predict the performance of hearing aids without the help of such subject-based tests. In the hearing impairment simulation (HIS) algorithm, a sensorineural hearing impairment model is established from auditory test data of the impaired subject being simulated. In the hearing impairment simulation system, the abnormal loudness relationships created by recruitment are transposed to the normal dynamic span of hearing. The nonlinear behavior of loudness recruitment is defined using hearing loss functions generated from the measurements. The recruitment simulation is validated by an experiment with two impaired listeners, who compared processed speech in the normal ear with unprocessed speech in the impaired ear. To assess its performance, the HIS algorithm was implemented in real time on a floating-point DSP.


Normalization of Spectral Magnitude and Cepstral Transformation for Compensation of Lombard Effect (롬바드 효과의 보정을 위한 스펙트럼 크기의 정규화와 켑스트럼 변환)

  • Chi, Sang-Mun;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.4
    • /
    • pp.83-92
    • /
    • 1996
  • This paper describes Lombard effect compensation and noise suppression to reduce speech recognition errors in noisy environments. The Lombard effect is represented by the variation in the spectral envelope of energy-normalized words and the variation in overall vocal intensity. The variation in the spectral envelope can be compensated by a linear transformation in the cepstral domain. The variation in vocal intensity is canceled by spectral magnitude normalization. Spectral subtraction is used to suppress noise contamination, and band-pass filtering is used to emphasize dynamic features. To understand the Lombard effect and verify the effectiveness of the proposed method, speech data were collected in simulated noisy environments. Recognition experiments were conducted with contamination by noise from automobile cabins, an exhibition hall, downtown telephone booths, crowded streets, and computer rooms. The experiments confirmed the effectiveness of the proposed method.

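
The spectral subtraction step named in the abstract above can be sketched for a single frame of magnitude spectra. This is the generic magnitude-domain form with a spectral floor, not necessarily the variant the authors used. The function name, the floor value, and the example spectra are hypothetical.

```python
def spectral_subtract(noisy_mag, noise_mag, floor=0.01):
    """Subtract a noise magnitude estimate per bin, keeping a small spectral floor."""
    # The floor prevents negative magnitudes when the noise estimate exceeds the bin.
    return [max(s - n, floor * s) for s, n in zip(noisy_mag, noise_mag)]


# Hypothetical single-frame magnitude spectra (assumed values for illustration).
noisy = [1.0, 0.5, 0.2, 0.05]
noise = [0.1, 0.1, 0.1, 0.1]
cleaned = spectral_subtract(noisy, noise)
```

In practice the noise estimate is typically averaged over non-speech frames, and the cleaned magnitudes are recombined with the noisy phase before resynthesis or feature extraction.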