• Title/Summary/Keyword: Speech sound

Search results: 628

Investigating the Effects of Hearing Loss and Hearing Aid Digital Delay on Sound-Induced Flash Illusion

  • Moradi, Vahid; Kheirkhah, Kiana; Farahani, Saeid; Kavianpour, Iman
    • Journal of Audiology & Otology / v.24 no.4 / pp.174-179 / 2020
  • Background and Objectives: The integration of auditory and visual speech information improves speech perception; however, if auditory input is disrupted by hearing loss, auditory and visual inputs cannot be fully integrated. In addition, temporal coincidence of auditory and visual input is an important factor in integrating the two senses, and the acoustic pathway is delayed when the signal passes through digital signal processing. Therefore, this study investigated the effects of hearing loss and the hearing aid digital delay circuit on the sound-induced flash illusion. Subjects and Methods: A total of 13 adults with normal hearing, 13 with mild to moderate hearing loss, and 13 with moderate to severe hearing loss were enrolled in this study. The sound-induced flash illusion test was then conducted and the results were analyzed. Results: Hearing aid digital delay and hearing loss had no detrimental effect on the sound-induced flash illusion. Conclusions: In patients with hearing loss, the transmission velocity and neural transduction rate of auditory inputs decrease, so auditory and visual sensory inputs cannot be integrated completely. When a hearing aid was prescribed, however, the transmission rate of auditory input was approximately normal. It can therefore be concluded that the processing delay in the hearing aid circuit is insufficient to disrupt the integration of auditory and visual information.

Reference Channel Input-Based Speech Enhancement for Noise-Robust Recognition in Intelligent TV Applications (지능형 TV의 음성인식을 위한 참조 잡음 기반 음성개선)

  • Jeong, Sangbae
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.2 / pp.280-286 / 2013
  • In this paper, a noise reduction system is proposed for the speech interface of intelligent TV applications. To suppress the TV's own speaker sound, a serious noise source that degrades recognition performance, a noise reduction algorithm is implemented that uses the direct TV sound as a reference noise input. In the proposed algorithm, transfer functions are estimated to compensate for the difference between the direct TV sound and the sound recorded by the microphone installed on the TV frame. The noise power spectrum in the received signal is then calculated to perform Wiener filter-based noise cancellation, and a postprocessing step is applied to reduce residual noise. Experimental results show that the proposed algorithm achieves an 88% recognition rate for isolated Korean words at 5 dB input SNR.
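As a rough illustration of the Wiener filtering step described in this abstract, the sketch below estimates the noise power spectrum from a reference channel and applies a per-bin Wiener gain frame by frame. It omits the paper's transfer-function estimation and assumes the reference already matches the noise as captured at the microphone; the function name and parameter values are illustrative, not the paper's implementation.

```python
import numpy as np

def wiener_enhance(mic, ref, frame=512, hop=256, floor=0.1):
    """Reference-based Wiener filtering sketch.

    mic: microphone signal (speech + noise), ref: reference noise signal.
    Assumes ref already matches the noise path; the paper additionally
    estimates a transfer function, which is omitted here.
    """
    win = np.hanning(frame)
    out = np.zeros(len(mic))
    norm = np.zeros(len(mic))
    for start in range(0, len(mic) - frame + 1, hop):
        M = np.fft.rfft(mic[start:start + frame] * win)
        R = np.fft.rfft(ref[start:start + frame] * win)
        noise_psd = np.abs(R) ** 2
        # Spectral subtraction estimate of the speech power spectrum.
        sig_psd = np.maximum(np.abs(M) ** 2 - noise_psd, 0.0)
        # Wiener gain per frequency bin, with a spectral floor.
        gain = np.maximum(sig_psd / (sig_psd + noise_psd + 1e-12), floor)
        out[start:start + frame] += np.fft.irfft(gain * M, n=frame) * win
        norm[start:start + frame] += win ** 2
    return out / np.maximum(norm, 1e-8)
```

The gain floor keeps residual noise audible but avoids musical-noise artifacts; the paper's separate postprocessing step plays a similar role.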

Computer Codes for Korean Sounds: K-SAMPA

  • Kim, Jong-mi
    • The Journal of the Acoustical Society of Korea / v.20 no.4E / pp.3-16 / 2001
  • An ASCII encoding of Korean has been developed as an extended phonetic transcription within the Speech Assessment Methods Phonetic Alphabet (SAMPA). SAMPA is a machine-readable phonetic alphabet used for multilingual computing; it has been developed since 1987 and extended to more than twenty languages. The motivation for creating Korean SAMPA (K-SAMPA) is to label Korean speech for a multilingual corpus, or to transcribe the native-language (L1)-interfered pronunciation of second-language learners for bilingual education. Korean SAMPA represents each Korean allophone with a particular SAMPA symbol, and closely resembling sounds are represented by the same symbol regardless of the language in which they are uttered. Each symbol represents a speech sound that is spectrally and temporally distinct enough to be perceptually different when heard in isolation, and each type of sound has a separate IPA-like designation. Korean SAMPA is superior to other transcription systems with similar objectives: it describes the cross-linguistic sound quality of Korean better than the official Romanization system proclaimed by the Korean government in July 2000, because it uses an internationally shared phonetic alphabet, and it is phonetically more accurate than the official Romanization in that it dispenses with orthographic adjustments. It is also more convenient for computing than the International Phonetic Alphabet (IPA) because it consists of symbols on a standard keyboard. This paper demonstrates how Korean SAMPA can express allophonic details and prosodic features by adopting the transcription conventions of the extended SAMPA (X-SAMPA) and the prosodic SAMPA (SAMPROSA).
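To make the idea concrete, here are a few X-SAMPA-style ASCII correspondences for Korean sounds. These follow general X-SAMPA conventions (e.g. `_h` marks aspiration); the exact symbol assignments in the paper's K-SAMPA table may differ.

```python
# Illustrative X-SAMPA-style ASCII codes for a few Korean sounds.
# General X-SAMPA conventions, NOT the paper's exact K-SAMPA table.
xsampa_examples = {
    "ㅏ": "a",    # open vowel, IPA [a]
    "ㅓ": "V",    # IPA [ʌ]
    "ㅡ": "M",    # close back unrounded vowel, IPA [ɯ]
    "ㅍ": "p_h",  # aspirated stop, IPA [pʰ]; "_h" marks aspiration
}

# Every code uses only characters on a standard keyboard, which is the
# point of SAMPA over the IPA for computing.
assert all(sym.isascii() for sym in xsampa_examples.values())
```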


Voice Rehabilitation Other than the Tracheo-Esophageal Shunt Method (후두적출자의 음성재활 - 기관식도천자법 이외의 방법 -)

  • Kim, Young-Ho
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.19 no.1 / pp.28-30 / 2008
  • The problem of voice restoration after total laryngectomy has existed ever since Billroth's first total laryngectomy in 1873. Since then, efforts to restore the voice have aimed to divert the tracheal air to the pharynx to produce voice, which became the tracheo-esophageal shunt voice used today. With an intact pharyngoesophagus, however, there are two basic options for speech rehabilitation: the artificial larynx and esophageal voice. The artificial larynx is an electrically driven buzzer or sound transducer; its most common type is placed against a supple point on the patient's neck and introduces a mechanical sound into the tissues and air spaces of the neck. This sound, emanating from the mouth, is articulated by the intact structures of the remaining vocal tract as understandable speech. Esophageal voice is a commonly recommended method of alaryngeal speech rehabilitation, produced by regurgitating air stored in the esophagus. Successful esophageal voice is preferable to the artificial larynx, but most patients adopt only one of these methods according to their needs and their feasibility of learning it.


The Analysis of Electroglottographic Measures of Vowel and Sentence in Korean Healthy Adults (한국 정상 성인의 모음과 문단 산출 시 전기성문파형 측정)

  • Kim, Jae-Ock
    • Phonetics and Speech Sciences / v.2 no.4 / pp.223-228 / 2010
  • This study investigated the closed quotient and other voice quality parameters using electroglottography (EGG) while sustaining the vowel /a/ and reading a sentence at comfortable pitch and loudness in healthy Korean adults. Seventy-two healthy adults (36 men, 36 women) aged 20-40 years were included. The tasks were recorded and analyzed using Lx Speech Studio. In the vowel-sustaining task, closed quotient (Qx), fundamental frequency (Fx), sound pressure level (SPL), Jitter, and Shimmer were measured. In the sentence-reading task, closed quotient (DQx), fundamental frequency (DFx), and sound pressure level (DAx) were measured. Sex effects were observed on Qx, Fx, Shimmer, DQx, and DFx: men had significantly higher Qx and DQx than women, but significantly lower Shimmer, and there was no sex effect on Jitter. Task effects on Qx and SPL as well as DQx and DAx were also assessed; Qx and SPL were significantly higher than DQx and DAx in both genders. This study showed that the closed quotients in both the vowel-sustaining and sentence-reading tasks were significantly related to other voice quality parameters. Therefore, clinicians and researchers should report voice quality parameters such as fundamental frequency, sound pressure level, Jitter, and Shimmer when reporting closed quotients measured with EGG.
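A minimal sketch of a criterion-level closed-quotient estimate from an EGG trace: the fraction of time the peak-normalized signal exceeds a threshold, taken as the closed phase. Lx Speech Studio's actual Qx/DQx algorithm works cycle by cycle and may differ; the 0.5 threshold here is an assumption.

```python
import numpy as np

def closed_quotient(egg, threshold=0.5):
    """Fraction of samples where the peak-normalized EGG exceeds `threshold`,
    a crude stand-in for a per-cycle closed-quotient (Qx) measurement."""
    x = egg - np.min(egg)          # shift so the minimum is 0
    x = x / np.max(x)              # peak-normalize to [0, 1]
    return float(np.mean(x > threshold))
```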


The research on the MEMS device improvement which is necessary for the noise environment in the speech recognition rate improvement (잡음 환경에서 음성 인식률 향상에 필요한 MEMS 장치 개발에 관한 연구)

  • Yang, Ki-Woong; Lee, Hyung-keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.12 / pp.1659-1666 / 2018
  • When the input is a mixture of voice and other sound, the speech recognition rate drops because of the noise. To overcome the limits of S/W processing, we improve the MEMS device, the H/W input device, and thereby improve the speech recognition rate. The MEMS microphone is a device for voice input and is implemented and used in various shapes. Conventional MEMS microphones generally show excellent performance, but in a special environment such as noise, their processing performance deteriorates because voice and other sound are mixed. To overcome these problems, we developed a newly designed MEMS device that can detect the voice characteristics at the initial input stage.

Intelligent Speech Recognition System based on Situation Awareness for u-Green City (u-Green City 구현을 위한 상황인지기반 지능형 음성인식 시스템)

  • Cho, Young-Im; Jang, Sung-Soon
    • Journal of Institute of Control, Robotics and Systems / v.15 no.12 / pp.1203-1208 / 2009
  • A Green-IT-based u-City is a u-City that incorporates the Green IT concept. Adopting situation awareness can reduce the processing required for Green IT: recognizing every speech sound on CCTV in a u-City environment takes considerable processing time and cost, whereas recognizing only emergency sounds on CCTV costs much less. To detect emergency states dynamically through CCTV, we therefore propose an advanced speech recognition system. For this purpose, we adopt an HMM (Hidden Markov Model) for feature extraction, and we adopt a Wiener filter technique to eliminate noise in the signals coming from CCTV in the u-City environment.

Improvement of Bit Rate applying the Speaking Rate and PSOLA Technique of Speech in CELP Vocoder (음성신호의 발성율과 PSOLA기법을 적용한 음성 보코더 전송률 개선에 관한 연구)

  • 장경아; 서지호; 배명진
    • Proceedings of the IEEK Conference / 2003.11a / pp.45-48 / 2003
  • In general, speech coding methods are classified into three categories: waveform coding, source coding, and hybrid coding. Fast speech can be encoded with less information than slow speech, and in listening the low-frequency band is more important than the high-frequency band. Speech vocoding techniques are developing toward low bit rate, low complexity, and high sound quality. CELP-type vocoders provide very good sound quality at a low bit rate, but they do not take the speaking rate into account. If the speaking rate is considered and each frame is encoded according to it, the bit rate can be reduced compared with the conventional vocoder. We propose a technique that estimates the speaking rate and applies the PSOLA technique to frames of slow speech. Simulation results show that the bit rate can be reduced by about 300 bps.
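As a sketch of the time-scale side of this idea, the following overlap-add routine lengthens or shortens a signal by rescaling the synthesis hop. True PSOLA places analysis windows pitch-synchronously on epoch marks; this fixed-hop version (all parameter values are illustrative) only shows the overlap-add mechanics.

```python
import numpy as np

def ola_stretch(x, factor, frame=400, hop_in=100):
    """Overlap-add time-scale modification: factor > 1 lengthens the signal.
    A crude, non-pitch-synchronous stand-in for PSOLA."""
    win = np.hanning(frame)
    hop_out = int(round(hop_in * factor))      # rescaled synthesis hop
    n = (len(x) - frame) // hop_in + 1         # number of analysis frames
    y = np.zeros(hop_out * (n - 1) + frame)
    norm = np.zeros_like(y)
    for i in range(n):
        seg = x[i * hop_in : i * hop_in + frame] * win
        y[i * hop_out : i * hop_out + frame] += seg
        norm[i * hop_out : i * hop_out + frame] += win
    return y / np.maximum(norm, 1e-8)          # compensate window overlap
```

Without pitch-synchronous placement this smears periodicity, which is exactly why PSOLA anchors windows on pitch marks.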


Voice Expression using a Cochlear Filter Model

  • Jarng, Soon-Suck
    • The Journal of the Acoustical Society of Korea / v.15 no.1E / pp.20-28 / 1996
  • Speech sounds were applied to a cochlear filter simulated by an electrical transmission line. The amplitude of the basilar membrane displacement was calculated along the length of the cochlea as a temporal response, and the envelope of the amplitude along that length was arranged for each discrete time interval. The resulting time response of the speech sound was then displayed as a color image. Five vowels (a, e, i, o, u) were applied and their results compared. The whole procedure of this visualization method using the cochlear filter is described in detail: the filter model's response to voice is visualized by passing the voice through the cochlear filter model.
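The paper's cochlear filter is an electrical transmission-line model; a common functional substitute is a gammatone filterbank, sketched below. Per-channel envelopes over time play the role of the basilar-membrane displacement image; the channel count, frequency range, and other parameter values are illustrative, not the paper's.

```python
import numpy as np

def gammatone_ir(fc, fs, dur=0.025, order=4):
    """4th-order gammatone impulse response at center frequency fc (Hz),
    a standard cochlear-filter approximation."""
    t = np.arange(int(dur * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000 + 1)   # equivalent rectangular bandwidth
    b = 1.019 * erb
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.max(np.abs(ir))

def cochleagram(x, fs, n_channels=32, fmin=100, fmax=4000):
    """Filter x through a gammatone bank and return (center_freqs, envelopes),
    analogous to displaying basilar-membrane displacement along the cochlea."""
    fcs = np.geomspace(fmin, fmax, n_channels)  # log-spaced, cochlea-like
    env = np.empty((n_channels, len(x)))
    for i, fc in enumerate(fcs):
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        env[i] = np.abs(y)                      # crude envelope via rectification
    return fcs, env
```

Plotting `env` as an image over (time, channel) gives the kind of color display the paper describes for vowels.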


Speech and Aerodynamics Measurement System for Laryngeal Function Assessment (후두기능 진단을 위한 음성 및 유체역학적 측정장치 개발)

  • Lee, J.S.; Park, K.S.; Sung, M.H.; Kim, K.H.
    • Proceedings of the KOSOMBE Conference / v.1996 no.11 / pp.37-39 / 1996
  • For laryngeal function assessment, we developed a system that measures speech sound and air flow rate. The speech sound was transduced with a standard omnidirectional condenser microphone, and the orally emitted air flow was measured with a pneumotachometer using a differential pressure transducer.
