• Title/Summary/Keyword: 원거리 음성 (distant-talking speech)

Search Result 56, Processing Time 0.022 seconds

An Implementation of Automobile Information System using VoiceXML (VoiceXML을 이용한 자동차 정보 안내 시스템 구현)

  • Yang, Jung-Su;Kim, Dong-Gyu;Kim, Jung-Hyun;Roh, Yong-Wan;Hong, Kwang-Seok
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2005.11a / pp.290-293 / 2005
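  • As speech recognition technology advances, developing applications that use it has become an important issue. VoiceXML is an XML language for voice interfaces over the telephone, designed so that voice interfaces can be designed and implemented easily. In this paper, we use VoiceXML to implement a user interface that lets users access an automobile information system by voice over the telephone. The implemented system and service exploit the advantages of VoiceXML and focus on the design and implementation of the voice interface, rather than on the interface itself through which a user can conveniently receive information about the automobile and control it from a distance. In an experiment in which 10 subjects each performed 10 trials, for a total of 100 trials, the recognition rate was 99.3%. We expect the implemented system to become more useful once it is linked with next-generation automobile telematics services.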

Acoustic Characteristics of Vowels in Korean Distant-Talking Speech (한국어 원거리 음성의 모음의 음향적 특성)

  • Lee Sook-hyang;Kim Sunhee
    • MALSORI / v.55 / pp.61-76 / 2005
  • This paper analyzes the acoustic characteristics of vowels produced in a distant-talking environment. The analysis was performed using a statistical method, and the influence of gender and of individual speakers on the variation was also examined. The speech data consist of 500 distant-talking words and 500 normal words from 10 speakers (5 males and 5 females). The acoustic features selected for the analysis were duration, the formants (F1 and F2), the fundamental frequency (F0), and total energy. The results showed that duration, F0, F1, and total energy increased in distant-talking speech compared to normal speech; female speakers showed a larger increase in all features except total energy and F0. In addition, speaker differences were observed.

MLLR-Based Environment Adaptation for Distant-Talking Speech Recognition (원거리 음성인식을 위한 MLLR적응기법 적용)

  • Kwon, Suk-Bong;Ji, Mi-Kyong;Kim, Hoi-Rin;Lee, Yong-Ju
    • MALSORI / no.53 / pp.119-127 / 2005
  • Speech recognition is one of the user interface technologies for commanding and controlling terminals such as a TV, PC, or cellular phone in a ubiquitous environment. When controlling a terminal, a mismatch between training and testing conditions causes rapid performance degradation. That is, the mismatch reduces not only the performance of the recognition system but also its reliability, so the degradation caused by a change of environment must be compensated. Whenever the environment changes, environment adaptation is performed using the user's speech and the background noise of the new environment, and performance improves by using models that have been appropriately transformed to the new environment. Research on environment compensation has been active so far. However, a compensation method for the effect of distant-talking speech has not yet been developed. In this paper, we therefore apply MLLR-based environment adaptation to compensate for the effect of distant-talking speech, and performance is improved.
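
The abstract does not give the transform itself, so the following is only a minimal sketch of the standard MLLR mean update, in which each Gaussian mean is mapped through an affine transform, adapted = W · [1, mean]. The function name `mllr_adapt_mean` and the example values are hypothetical, and estimating W from adaptation data (the maximum-likelihood step) is deliberately left out.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mllr_adapt_mean(W: Matrix, mean: Vector) -> Vector:
    """Apply an MLLR mean transform: adapted = W @ [1, mean].

    W has shape n x (n + 1): the first column is the bias term and the
    remaining n columns form the linear part of the affine transform.
    """
    extended = [1.0] + list(mean)  # extended mean vector [1, mu_1, ..., mu_n]
    if any(len(row) != len(extended) for row in W):
        raise ValueError("each row of W must have len(mean) + 1 entries")
    return [sum(w * x for w, x in zip(row, extended)) for row in W]


# Hypothetical 2-dimensional example (values chosen only for illustration).
W = [[0.5, 1.0, 0.0],
     [-1.0, 0.0, 2.0]]
adapted = mllr_adapt_mean(W, [2.0, 3.0])
```

In a full system the same W would typically be shared by many Gaussians in a regression class, which is what lets a small amount of distant-talking adaptation speech update the whole model set.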

A Study on the Durational Characteristics of Korean Distant-Talking Speech (한국어 원거리 음성의 지속시간 연구)

  • Kim, Sun-Hee
    • MALSORI / no.54 / pp.1-14 / 2005
  • This paper presents durational characteristics of Korean distant-talking speech using speech data, which consist of 500 distant-talking utterances and 500 normal utterances of 10 speakers (5 males and 5 females). Each file was segmented and labeled manually and the duration of each segment and each word was extracted. Using a statistical method, the durational change of distant-talking speech in comparison with normal speech was analyzed. The results show that the duration of words with distant-talking speech is increased in comparison with normal style, and that the average unvoiced consonantal duration is reduced while the average vocalic duration is increased. Female speakers show a stronger tendency towards lengthening the duration in distant-talking speech. Finally, this study also shows that the speakers of distant-talking speech could be classified according to their different duration rate.

Prosodic Characteristics of Korean Distance Speech (한국어 원거리 음성의 운율적 특성)

  • Lee, Sook-hyang;Kim, Sun-Hee;Kim, Jong-Jin
    • Proceedings of the KSPS conference / 2005.11a / pp.87-90 / 2005
  • The aim of this paper is to investigate the prosodic characteristics of Korean distant-talking speech. Thirty-six two-syllable words produced by 4 speakers (2 males and 2 females) in both distant-talking and normal environments were used. The results showed that the ratios of the second syllable to the first syllable in vowel duration and vowel energy were significantly larger in the distant-talking environment than in the normal environment, and that the F0 range was also larger in the distant-talking environment. In addition, an 'HL%' contour boundary tone in the second syllable and/or an 'L+H' contour tone in the first syllable were used in the distant-talking environment.

Remote speech recognition preprocessing system for intelligent robot in noisy environment (지능로봇에 적합한 잡음 환경에서의 원거리 음성인식 전처리 시스템)

  • Gwon, Se-Do;Jeong, Hong
    • Proceedings of the IEEK Conference / 2006.06a / pp.365-366 / 2006
  • This paper describes a pre-processing method that can be applied to the remote speech recognition system of a service robot in a noisy environment. By combining beamforming and blind source separation, we can overcome the weaknesses of beamforming (reverberation) and of blind source separation (distributed noise, permutation ambiguity). Because the method is designed for hardware implementation, real-time execution can be achieved on an FPGA by using a systolic array architecture.
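
As a rough illustration of the beamforming half of this pipeline only, here is a minimal delay-and-sum sketch. It assumes integer sample delays that are already known for each microphone; the function name `delay_and_sum` and the toy signals are hypothetical, and neither the blind source separation stage nor the FPGA systolic array design from the paper is represented.

```python
from typing import List


def delay_and_sum(channels: List[List[float]], delays: List[int]) -> List[float]:
    """Align each channel by its known integer delay and average them.

    channels: equal-length sample lists, one per microphone.
    delays:   arrival delay of the target source at each microphone, in samples.
    Samples that would fall outside a channel are treated as zero.
    """
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay  # read ahead by the delay to undo it
            if 0 <= j < n:
                out[i] += signal[j]
    return [value / len(channels) for value in out]


# Toy example: the second microphone hears the same pulse one sample later.
mic0 = [1.0, 2.0, 3.0, 0.0, 0.0]
mic1 = [0.0, 1.0, 2.0, 3.0, 0.0]
aligned = delay_and_sum([mic0, mic1], [0, 1])
```

Signals arriving from the steered direction add coherently after alignment, while noise from other directions adds with mismatched delays and is attenuated by the averaging.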

Sound Source Localization Technique at a Long Distance for Intelligent Service Robot (지능형 서비스 로봇을 위한 원거리 음원 추적 기술)

  • Lee Ji-Yeoun;Hahn Min-Soo
    • MALSORI / no.57 / pp.85-97 / 2006
  • This paper proposes an algorithm that estimates the direction of a sound source in real time. The algorithm uses the time differences and sound intensity differences among the signals recorded by four microphones. To deal with the noise of the robot itself, a Kalman filter is also implemented. The proposed method has a shorter execution time than an existing algorithm, which makes it suitable for a real-time service robot. With the Kalman filter, the signal-to-noise ratio (SNR) relative to the background noise is improved to approximately 8 dB, and the azimuth estimate shows a relatively small error, within ±7 degrees.
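
The abstract does not state how the time differences are converted into a direction. A common far-field relation for a single microphone pair is sin(θ) = c·Δt / d, sketched below under that assumption. The speed of sound value, the function name `azimuth_from_tdoa`, and the example spacing are hypothetical; the paper's four-microphone geometry, its use of intensity, and the Kalman filter are not modeled.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def azimuth_from_tdoa(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate the arrival angle in degrees from one microphone pair.

    Uses the far-field relation sin(theta) = c * delay / spacing, where
    theta is measured from the broadside direction of the pair.
    """
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


# Hypothetical pair 0.2 m apart; a delay of half the maximum gives about 30 degrees.
spacing = 0.2
angle = azimuth_from_tdoa(0.5 * spacing / SPEED_OF_SOUND, spacing)
```

A single pair cannot distinguish front from back, which is one reason arrays with more microphones, such as the four used in the paper, are preferred.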

Decision Rule using Confidence Based Anti-phone Model and Interrupt-Polling Method for Distributed Speech Recognition DSP Networking System (분산형 음성인식 DSP 네트워킹 시스템을 위한 반음소 모델기반의 신뢰도를 사용한 결정규칙과 인터럽트-폴링)

  • Song, Ki-Chang;Kang, Chul-Ho
    • Journal of Korea Multimedia Society / v.13 no.7 / pp.1016-1022 / 2010
  • Far-talking recognition and distributed speech recognition networking techniques are essential for controlling varied and complex home services conveniently by voice. They make it possible to control devices from anywhere in the home using voice alone. In this paper, we developed a server-client DSP module for a distributed speech recognition network system and proposed a new decision rule that decides whether to accept the recognition results based on the transferred confidence rate. Simulation results show that the proposed decision rule performs better than the conventional decision by majority rule or decision by first arrival. We also proposed a new interrupt-polling technique to remedy the defect of the existing delay technique, which always has to wait a few seconds for the results of several clients. The proposed technique queries the status of all clients after the first arrival and decides whether to wait. It removes unnecessary delay without any performance degradation.
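
The abstract gives no formula for the decision rule or the polling logic, so the sketch below only illustrates the general idea under simple assumptions: each client reports a recognized word with a confidence in [0, 1], the server keeps the most confident result if it clears a threshold, and after the first arrival the server waits only if some client reports that it is still recognizing. The names `decide` and `should_wait` and the threshold of 0.6 are hypothetical, and the paper's anti-phone model confidence measure is not reproduced.

```python
from typing import List, Optional, Tuple

Result = Tuple[str, float]  # (recognized word, confidence in [0, 1])


def decide(results: List[Result], threshold: float = 0.6) -> Optional[str]:
    """Return the most confident word, or None if it is below the threshold."""
    if not results:
        return None
    word, confidence = max(results, key=lambda r: r[1])
    return word if confidence >= threshold else None


def should_wait(client_busy: List[bool]) -> bool:
    """After the first arrival, wait only if some client is still recognizing."""
    return any(client_busy)


# Two low-confidence clients agree, but one high-confidence client disagrees.
results = [("light on", 0.4), ("light off", 0.9), ("light on", 0.5)]
accepted = decide(results)
```

In this toy case a majority vote would pick "light on", while the confidence-based rule picks "light off", which is the kind of difference the comparison in the abstract is about.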

A Study on portable voice recording prevention device (휴대용 음성 녹음 방지 장치 연구)

  • Kim, Hee-Chul
    • Journal of Digital Convergence / v.19 no.7 / pp.209-215 / 2021
  • This study develops voice information protection equipment for important meetings and for places that require security. Security performance and stability were secured with information leakage prevention technology based on generating false noise and ultrasonic waves. The recording prevention module, whose emitted waves are highly directional, blocks the wide frequency band of 20 to 20,000 Hz to stop voice information from leaking, and deception jamming is applied as well, greatly improving security. We developed a battery-powered system that blocks recording by portable smartphones, and made the separately installed device small and light enough that customers do not notice it. Further study is needed on measures for using the output of the anti-recording speaker efficiently for long-distance recording prevention.

An Adaptive Microphone Array with Linear Phase Response (선형 위상 특성을 갖는 적응 마이크로폰 어레이)

  • Kang, Hong-Gu;Youn, Dae-Hui;Cha, Il-Hwan
    • The Journal of the Acoustical Society of Korea / v.11 no.3 / pp.53-60 / 1992
  • Many adaptive beamforming methods have been studied for interference cancellation and speech signal enhancement in teleconferencing and auditoriums. Adaptive beamforming for speech signal processing differs from radar, sonar, and seismic signal processing in that the desired output signal must be suited to the human ear. Considering that the human ear is quite insensitive to the phase of speech, Sondhi proposed a nonlinear constrained optimization technique whose constraint was on the magnitude transfer function from the source to the output. In a real environment, however, the phase response of the speech signal affects the human auditory system, so it is desirable to design a linear phase system. In this paper, a linear phase beamformer is proposed, together with a sample-by-sample processing algorithm for real-time use. Simulation results show that the proposed algorithm yields more consistent beam patterns and deeper nulls in the noise direction than Sondhi's.
