• Title/Summary/Keyword: Speech signal processing

Denoising Algorithm using Wavelet and Element Deviation-based Median Filter (웨이브렛과 원소 편차 기반의 중간값 필터를 이용한 잡음제거 알고리즘)

  • Bae, Sang-Bum; Kim, Nam-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.12 / pp.2798-2804 / 2010
  • Audio and image signals are corrupted by various kinds of noise during signal processing, and many studies have been carried out to restore such signals. In this paper, an algorithm is proposed to remove additive Gaussian noise and impulse noise from one-dimensional signals such as speech. The algorithm removes the impulse noise first and then the Gaussian noise: a median filter based on element deviation is applied to remove the impulse noise, and a method using wavelet coefficient accumulation is used to remove the Gaussian noise. The proposed algorithm is also compared with existing methods using the SNR (signal-to-noise ratio) as the criterion for judging the improvement.
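
As a rough illustration of the two-stage idea described above (impulse removal first, then Gaussian-noise removal), the following Python sketch applies a deviation-tested median filter followed by wavelet soft-thresholding. The deviation test, the universal threshold, and the db4 wavelet are generic choices, not the paper's exact element-deviation or coefficient-accumulation formulas.

```python
import numpy as np
import pywt

def impulse_then_wavelet(x, win=5, k=3.0, wavelet="db4", level=4):
    # stage 1: replace a sample by the window median only if it deviates strongly
    y = x.copy()
    half = win // 2
    for i in range(half, len(x) - half):
        w = x[i - half:i + half + 1]
        med = np.median(w)
        if abs(x[i] - med) > k * (np.median(np.abs(w - med)) + 1e-12):
            y[i] = med
    # stage 2: wavelet soft-thresholding of the remaining (Gaussian-like) noise
    coeffs = pywt.wavedec(y, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # noise scale estimate
    thr = sigma * np.sqrt(2 * np.log(len(y)))              # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(y)]
```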

Future Trends of AI-Based Smart Systems and Services: Challenges, Opportunities, and Solutions

  • Lee, Daewon; Park, Jong Hyuk
    • Journal of Information Processing Systems / v.15 no.4 / pp.717-723 / 2019
  • Smart systems and services aim to serve growing urban populations and their prospects of virtual-real social behaviors, gig economies, factory automation, a knowledge-based workforce, integrated societies, and modern living, among many others. To satisfy these objectives, smart systems and services must comprise a complex set of features such as security, ease of use and user friendliness, manageability, scalability, adaptivity, intelligent behavior, and personalization. Recently, artificial intelligence (AI) has been recognized as a data-driven technology that provides efficient knowledge representation and semantic modeling and can support the cognitive-behavior aspects of a system. In this paper, an integration of AI with smart systems and services is presented to mitigate the existing challenges. Several novel research works, in terms of frameworks, architectures, paradigms, and algorithms, are discussed as possible solutions to the existing challenges in AI-based smart systems and services. These works involve efficient shape-based image retrieval, speech signal processing, dynamic thermal rating, advanced persistent threat tactics, user authentication, and so on.

A Study on LMS-MPC Method Considering Low Bit Rate (Low Bit Rate을 고려한 LMS-MPC 방식에 관한 연구)

  • Lee, See-Woo
    • Journal of Digital Convergence / v.10 no.5 / pp.233-238 / 2012
  • In a speech coding system using voiced and unvoiced excitation sources, the speech waveform can be distorted when voiced and unvoiced consonants both occur within one frame. To solve this problem, this paper presents an LMS-MPC method that uses individual pitch and LMS (Least Mean Square). MPC and LMS-MPC are evaluated using LMS. As a result, the SNRseg of LMS-MPC improved by 1.5 dB for a female voice and 1.3 dB for a male voice, respectively. Since the SNRseg improved compared with MPC, the distortion of the speech waveform could finally be controlled. This method is therefore expected to be applicable to cellular phones and smartphones that use a low-bit-rate excitation source.
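
The abstract reports its results in SNRseg; the following is a minimal sketch of how a segmental SNR could be computed, assuming a simple frame-wise average of per-frame SNRs in dB. The frame length and clamping bounds are illustrative assumptions, not values from the paper.

```python
import numpy as np

def snr_seg(clean, coded, frame_len=160, eps=1e-12, lo=-10.0, hi=35.0):
    """Segmental SNR in dB: mean of per-frame SNRs (frame length and
    clamping bounds here are illustrative assumptions)."""
    n_frames = min(len(clean), len(coded)) // frame_len
    snrs = []
    for i in range(n_frames):
        s = clean[i * frame_len:(i + 1) * frame_len]
        e = s - coded[i * frame_len:(i + 1) * frame_len]
        snr = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(e ** 2) + eps))
        snrs.append(np.clip(snr, lo, hi))   # clamp silent/degenerate frames
    return float(np.mean(snrs))
```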

The Vocabulary Recognition Optimize using Acoustic and Lexical Search (음향학적 및 언어적 탐색을 이용한 어휘 인식 최적화)

  • Ahn, Chan-Shik; Oh, Sang-Yeob
    • Journal of Korea Multimedia Society / v.13 no.4 / pp.496-503 / 2010
  • Speech recognition systems have been developed for standalone use; on mobile terminals they show low recognition rates because of limited memory size and audio compression. This study proposes a system that improves vocabulary recognition performance by separating the acoustic search from the lexical search: the acoustic search is carried out on the mobile terminal, and the lexical search is carried out on a server. Feature vectors of the speech signal are extracted and phoneme recognition is performed using GMMs; the recognized phoneme list is transmitted to the server, where the lexical search is carried out with a lexical tree search algorithm. As a result, the system shows a vocabulary-dependent recognition rate of 98.01%, a vocabulary-independent recognition rate of 97.71%, and a recognition speed of 1.58 seconds.
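
The abstract mentions a lexical tree search over the recognized phoneme list; a minimal sketch of such a lexical tree (a phoneme trie) is shown below. The phoneme symbols and vocabulary entries are made up for illustration and are not from the paper.

```python
class LexicalTree:
    """Words stored by phoneme sequence; a recognized phoneme list is
    matched by walking the tree."""
    def __init__(self):
        self.children = {}
        self.word = None                    # set at nodes that complete a word

    def add(self, phonemes, word):
        node = self
        for p in phonemes:
            node = node.children.setdefault(p, LexicalTree())
        node.word = word

    def search(self, phonemes):
        node = self
        for p in phonemes:
            node = node.children.get(p)
            if node is None:
                return None                 # no vocabulary entry along this path
        return node.word

tree = LexicalTree()
tree.add(["k", "a", "m", "s", "a"], "감사")    # hypothetical entries
tree.add(["k", "a", "b", "a", "ng"], "가방")
print(tree.search(["k", "a", "m", "s", "a"]))  # -> 감사
```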

Pitch Detection Using Variable Bandwidth LPF (가변 대역폭 LPF를 이용한 피치 검출)

  • Keum, Hong; Baek, Guem-Ran; Bae, Myung-Jin; Jang, Ho-Sung
    • The Journal of the Acoustical Society of Korea / v.13 no.5 / pp.77-82 / 1994
  • In speech signal processing it is very important to detect the pitch exactly. Although various methods for detecting the pitch of speech signals have been developed, it is difficult to extract the pitch exactly over a wide range of speakers and utterances. We therefore propose a new pitch detection algorithm that takes advantage of G-peak extraction: the pitch period of voiced signals is detected by finding the MZCI (maximum zero-crossing interval) of the G-peak, which is defined by the cut-off bandwidth of a variable-bandwidth LPF (low-pass filter). The algorithm performs robustly, with a gross error rate of 3.63% even in a 0 dB SNR environment, and the gross error rate for clean speech is only 0.18%. It is also able to perform all processing at high speed.
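
The paper's exact G-peak and MZCI procedures are not reproduced here, but the general idea of estimating a pitch period from zero-crossing intervals of a low-pass-filtered frame can be sketched as follows; the filter order, cut-off frequency, and sampling rate are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def pitch_from_zero_crossings(frame, fs=8000, cutoff_hz=900.0):
    # low-pass the frame with a (here fixed) cut-off, then look at crossings
    b, a = butter(4, cutoff_hz / (fs / 2.0), btype="low")
    x = filtfilt(b, a, frame)
    # indices where the filtered signal crosses zero going upward
    zc = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    if len(zc) < 2:
        return 0.0                      # unvoiced / no reliable crossing found
    period = np.max(np.diff(zc)) / fs   # longest crossing interval (MZCI-like)
    return 1.0 / period                 # pitch estimate in Hz
```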

Speech Parameter Estimation and Enhancement Using the EM Algorithm (EM 알고리즘을 이용한 음성 파라미터 추정 및 향상)

  • Lee, Ki-Yong; Kang, Young-Tae; Lee, Byung-Gook
    • The Journal of the Acoustical Society of Korea / v.13 no.2E / pp.68-75 / 1994
  • In many applications of signal processing, we have to deal with densities that are highly non-Gaussian, or that may have a Gaussian shape in the middle but show pronounced deviations in the tails. To account for these deviations, we consider a finite mixture distribution for the speech excitation. We utilize the EM algorithm for the estimation of speech parameters and their enhancement: robust Kalman filtering is used in the enhancement process, and a detection/estimation technique is used for parameter estimation. Experimental results show that the proposed algorithm performs better under adverse input SNR conditions.
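
As a rough illustration of the finite-mixture idea, the following sketch runs EM on a two-component scalar Gaussian mixture; the paper's actual excitation model, robust Kalman filtering, and detection/estimation steps are not reproduced here.

```python
import numpy as np

def em_two_gaussians(x, iters=50):
    w = np.array([0.5, 0.5])                        # mixture weights
    mu = np.array([x.mean() - x.std(), x.mean() + x.std()])
    var = np.array([x.var(), x.var()])
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        pdf = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = w * pdf + 1e-12
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        n_k = r.sum(axis=0)
        w = n_k / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n_k
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k
    return w, mu, var
```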

Sasang Constitution Classification by Speech Signal Processing (음성 신호 분석에 의한 사상 체질 분류)

  • Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.5C / pp.548-555 / 2006
  • This paper proposes a Sasang constitution classification method, which is one of the most important issues in Sasang constitutional medicine. Pre-existing classification methods rely on body shape, countenance and other morphological aspects, and temperament. Many diagnostic methods have been developed and used, including questionnaires on personal lifestyle and propensities (QSCC, QSCC II) and tonal analysis of a person's voice; more recently, constitutional acupuncture and herbal-medicine response analyses have been developed and used as well. However, these methods depend on the doctor's intuition. In this article, I propose a methodology to classify the Sasang constitution in which pitch, intensity, and formants are used, comparing the similarities and differences found by tonal analysis. Finally, the validity of the method is demonstrated through experiments.
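
A minimal sketch of extracting the three cues the abstract mentions (pitch, intensity, and formants) from a voiced frame is shown below; the sampling rate, LPC order, and pitch search range are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def frame_features(frame, fs=16000, lpc_order=12):
    intensity = 20 * np.log10(np.sqrt(np.mean(frame ** 2)) + 1e-12)   # RMS in dB
    # pitch: autocorrelation peak within a plausible F0 range (60-400 Hz)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / 400), int(fs / 60)
    pitch = fs / (lo + np.argmax(ac[lo:hi]))
    # formants: angles of LPC polynomial roots (Yule-Walker, direct solve)
    r = ac[:lpc_order + 1]
    R = np.array([[r[abs(i - j)] for j in range(lpc_order)] for i in range(lpc_order)])
    a = np.linalg.solve(R, r[1:lpc_order + 1])
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[np.imag(roots) > 0]
    freqs = np.angle(roots) * fs / (2 * np.pi)
    formants = sorted(f for f in freqs if f > 90)     # drop near-DC roots
    return pitch, intensity, formants[:3]             # first three candidates
```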

The Study on the Speaker Adaptation Using Speaker Characteristics of Phoneme (음소에 따른 화자특성을 이용한 화자적응방법에 관한 연구)

  • 채나영; 황영수
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2003.06a / pp.6-9 / 2003
  • In this paper, we studied how speaker adaptation differs according to the phoneme classification in Korean speech recognition. To study speaker adaptation with weights that differ by phoneme as the recognition unit, we used an SCHMM as the recognition system. The speaker adaptation methods used in this paper were MAPE (Maximum A Posteriori Probability Estimation) and linear spectral estimation. To evaluate the performance of these methods, we used ten Korean isolated digits as the experimental data. Both methods can be carried out with unsupervised learning and used in an on-line system. The first method showed a performance improvement over the second, and hybrid adaptation showed better recognition results than either method performed alone. Furthermore, speaker adaptation using a variable weight per phoneme gave better results than using a fixed weight.
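
As an illustration of the MAPE idea, the following sketch shows the usual MAP update of a Gaussian mean: the adapted mean is a count-weighted interpolation between the speaker-independent prior mean and the speaker's adaptation data. The relevance factor tau is an assumed value, not one from the paper.

```python
import numpy as np

def map_adapt_mean(prior_mean, frames, posteriors, tau=16.0):
    """prior_mean: (D,) speaker-independent mean; frames: (N, D) feature
    vectors; posteriors: (N,) occupancy probabilities for this Gaussian."""
    n = posteriors.sum()                              # soft data count
    data_mean = (posteriors[:, None] * frames).sum(axis=0) / max(n, 1e-12)
    alpha = n / (n + tau)                             # more data -> trust data more
    return alpha * data_mean + (1.0 - alpha) * prior_mean
```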

Investigating the Effects of Hearing Loss and Hearing Aid Digital Delay on Sound-Induced Flash Illusion

  • Moradi, Vahid; Kheirkhah, Kiana; Farahani, Saeid; Kavianpour, Iman
    • Korean Journal of Audiology / v.24 no.4 / pp.174-179 / 2020
  • Background and Objectives: The integration of auditory and visual speech information improves speech perception; however, if the auditory input is disrupted by hearing loss, auditory and visual inputs cannot be fully integrated. Additionally, the temporal coincidence of auditory and visual input is a significantly important factor in integrating these two senses, and the acoustic pathway is time-delayed because the signal passes through digital signal processing. Therefore, this study aimed to investigate the effects of hearing loss and the hearing aid's digital delay circuit on the sound-induced flash illusion. Subjects and Methods: A total of 13 adults with normal hearing, 13 with mild to moderate hearing loss, and 13 with moderate to severe hearing loss were enrolled in this study. Subsequently, the sound-induced flash illusion test was conducted, and the results were analyzed. Results: The results showed that hearing aid digital delay and hearing loss had no detrimental effect on the sound-induced flash illusion. Conclusions: The transmission velocity and neural transduction rate of auditory inputs decrease in patients with hearing loss; hence, auditory and visual sensory inputs cannot be integrated completely, even though the transmission rate of the auditory input was approximately normal when a hearing aid was prescribed. It can thus be concluded that the processing delay in the hearing aid circuit is insufficient to disrupt the integration of auditory and visual information.

Phonetic Acoustic Knowledge and Divide And Conquer Based Segmentation Algorithm (음성학적 지식과 DAC 기반 분할 알고리즘)

  • Koo, Chan-Mo; Wang, Gi-Nam
    • The KIPS Transactions: Part B / v.9B no.2 / pp.215-222 / 2002
  • This paper presents a reliable, fully automatic labeling system that fits well with languages having well-developed syllable structure, such as Korean. The system utilizes a segmentation algorithm based on DAC (divide and conquer), a control mechanism, to use phonetic and acoustic information with greater efficiency. The segmentation algorithm divides the speech signal into speechlets, which are localized pieces of the speech signal, and then segments each speechlet to find speech boundaries. While HMM methods offer uniform and well-defined performance, the suggested method provides a framework in which specific acoustic-knowledge components can be steadily developed and improved. Without using a statistical method such as an HMM, the new method uses only phonetic-acoustic information; it is therefore fast, remains consistent as the specific acoustic-knowledge components are extended, and can be applied efficiently. Experimental results verifying the suggested method are shown at the end.
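
The paper's speechlet criteria are not reproduced here, but a divide-and-conquer split can be sketched as follows: the signal is recursively cut at its lowest short-time-energy point until each piece falls below a maximum length, producing candidate boundaries that a finer, knowledge-based segmenter could refine. All numeric settings below are assumptions.

```python
import numpy as np

def dac_boundaries(x, fs=16000, max_len_s=0.5, win=400):
    """Recursively split x at its lowest-energy point into 'speechlet'-like
    pieces no longer than max_len_s seconds; returns boundary sample indices."""
    max_len = int(max_len_s * fs)
    if len(x) <= max_len:
        return []
    energy = np.convolve(x ** 2, np.ones(win) / win, mode="same")
    # cut somewhere in the middle half so splits are never degenerate
    lo, hi = len(x) // 4, 3 * len(x) // 4
    cut = lo + int(np.argmin(energy[lo:hi]))
    return (dac_boundaries(x[:cut], fs, max_len_s, win)
            + [cut]
            + [cut + b for b in dac_boundaries(x[cut:], fs, max_len_s, win)])
```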