• Title/Summary/Keyword: digital speech signal

Search Result 136, Processing Time 0.022 seconds

Design of digital DBNN for pattern recoginition (패턴인식을 위한 디지탈 DBNN의 설계)

  • 송창영;문성룡;김환용
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.11
    • /
    • pp.3001-3011
    • /
    • 1996
  • In this paper, using DBNN algorithm which is used in the binary pattern classification or speech signal processing the digital DBNN circuit is designed having the variable expansion depending the size of input data and pattern type. The processing elemen(PE) of the proposed network consists of the synapse and MAXNET circuits for the similarity measurement between reference and input pattern. Global MAXNET selects the global winner among the local winners which is selected in each PE. Through the several simultions, and thus each PE and global MAXNET search the reference pattern that was the most simlar to input pattern for the discord of the pattern.

  • PDF

A Study on the Signal Processing for Content-Based Audio Genre Classification (내용기반 오디오 장르 분류를 위한 신호 처리 연구)

  • 윤원중;이강규;박규식
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.41 no.6
    • /
    • pp.271-278
    • /
    • 2004
  • In this paper, we propose a content-based audio genre classification algorithm that automatically classifies the query audio into five genres such as Classic, Hiphop, Jazz, Rock, Speech using digital sign processing approach. From the 20 seconds query audio file, the audio signal is segmented into 23ms frame with non-overlapped hamming window and 54 dimensional feature vectors, including Spectral Centroid, Rolloff, Flux, LPC, MFCC, is extracted from each query audio. For the classification algorithm, k-NN, Gaussian, GMM classifier is used. In order to choose optimum features from the 54 dimension feature vectors, SFS(Sequential Forward Selection) method is applied to draw 10 dimension optimum features and these are used for the genre classification algorithm. From the experimental result, we can verify the superior performance of the proposed method that provides near 90% success rate for the genre classification which means 10%∼20% improvements over the previous methods. For the case of actual user system environment, feature vector is extracted from the random interval of the query audio and it shows overall 80% success rate except extreme cases of beginning and ending portion of the query audio file.

Gender Analysis in Elderly Speech Signal Processing (노인음성신호처리에서의 젠더 분석)

  • Lee, JiYeoun
    • Journal of Digital Convergence
    • /
    • v.16 no.10
    • /
    • pp.351-356
    • /
    • 2018
  • Changes in vocal cords due to aging can change the frequency of speech, and the speech signals of the elderly can be automatically distinguished from normal speech signals through various analyzes. The purpose of this study is to provide a tool that can be easily accessed by the elderly and disabled people who can be excluded from the rapidly changing technological society and to improve the voice recognition performance. In the study, the gender of the subjects was reported as sex analysis, and the number of female and male voice samples was used equally. In addition, the gender analysis was applied to set the voices of the elderly without using voices of all ages. Finally, we applied a review methodology of standards and reference models to reduce gender difference. 10 Korean women and 10 men aged 70 to 80 years old are used in this study. Comparing the F0 value extracted directly with the waveform and the F0 extracted with TF32 and the Wavesufer speech analysis program, Wavesufer analyzed the F0 of the elderly voice better than TF32. However, there is a need for a voice analysis program for elderly people. In conclusions, analyzing the voice of the elderly will improve speech recognition and synthesis capabilities of existing smart medical systems.

Gendered innovation for algorithm through case studies (음성·영상 신호 처리 알고리즘 사례를 통해 본 젠더혁신의 필요성)

  • Lee, JiYeoun;Lee, Heisook
    • Journal of Digital Convergence
    • /
    • v.16 no.12
    • /
    • pp.459-466
    • /
    • 2018
  • Gendered innovations is a term used by policy makers and academics to refer the process of creating better research and development (R&D) for both men and women. In this paper, we analyze the literatures in image and speech signal processing that can be used in ICT, examine the importance of gendered innovations through case study. Therefore the latest domestic and foreign literature related to image and speech signal processing based on gender research is searched and a total of 9 papers are selected. In terms of gender analysis, research subjects, research environment, and research design are examined separately. Especially, through the case analysis of algorithms of the elderly voice signal processing, machine learning, machine translation technology, and facial gender recognition technology, we found that there is gender bias in existing algorithms, and which leads to gender analysis is required. We also propose a gendered innovations method integrating sex and gender analysis in algorithm development. Gendered innovations in ICT can contribute to the creation of new markets by developing products and services that reflect the needs of both men and women.

Speech Quality Measure for VoIP Using Wavelet Based Bark Coherence Function (웨이블렛 기반 바크 코히어런스 함수를 이용한 VoIP 음질평가)

  • 박상욱;박영철;윤대희
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.4A
    • /
    • pp.310-315
    • /
    • 2002
  • The Bark Coherence Function (BCF) defies a coherence function within perceptual domain as a new cognition module, robust to linear distortions due to the analog interface of digital mobile system. Our previous experiments have shown the superiority of BCF over current measures. In this paper, a new BCF suitable for VoIP is developed. The unproved BCF is based on the wavelet series expansion that provides good frequency resolution while keeping good time locality. The proposed Wavelet based Bark Coherence function (WBCF) is robust to variable delay often observed in packet-based telephony such as Voice over Internet Protocol (VoIP). We also show that the refinement of time synchronization after signal decomposition can improve the performance of the WBCF. The regression analysis was performed with VoIP speech data. The correlation coefficients and the standard error of estimates computed using the WBCF showed noticeable improvement over the Perceptual Speech Quality Measure (PSQM) that is recommended by ITU-T.

Auto Frame Extraction Method for Video Cartooning System (동영상 카투닝 시스템을 위한 자동 프레임 추출 기법)

  • Kim, Dae-Jin;Koo, Ddeo-Ol-Ra
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.12
    • /
    • pp.28-39
    • /
    • 2011
  • While the broadband multimedia technologies have been developing, the commercial market of digital contents has also been widely spreading. Most of all, digital cartoon market like internet cartoon has been rapidly large so video cartooning continuously has been researched because of lack and variety of cartoon. Until now, video cartooning system has been focused in non-photorealistic rendering and word balloon. But the meaningful frame extraction must take priority for cartooning system when applying in service. In this paper, we propose new automatic frame extraction method for video cartooning system. At frist, we separate video and audio from movie and extract features parameter like MFCC and ZCR from audio data. Audio signal is classified to speech, music and speech+music comparing with already trained audio data using GMM distributor. So we can set speech area. In the video case, we extract frame using general scene change detection method like histogram method and extract meaningful frames in the cartoon using face detection among the already extracted frames. After that, first of all existent face within speech area image transition frame extract automatically. Suitable frame about movie cartooning automatically extract that extraction image transition frame at continuable period of time domain.

Study of the Noise Processing to Technique Speech Recognition System (음성인식 시스템에서의 잡음 제거 개선에 관한 연구)

  • 이창윤;이영훈
    • Journal of the Korea Society of Computer and Information
    • /
    • v.7 no.2
    • /
    • pp.73-78
    • /
    • 2002
  • Recognition system of noise processing technique. A method combining SNR normalization with RAS is considered as a noise Processing and the performance of the speech recognition system can be improved using other noise processing technique. Experiment of recognition system is the internal organs that using a general digital signal processor(TMS320C31). Recognition word set is composed of 60 command words for of Rce environment and order of computer. Simulation is considered as a colored noise of general environment. The results of experiment showed that the recognition word set gives 94.61% of efficiency of recognition at maximum in case of the combination of SNR normalization and spectral subtraction.

  • PDF

Experimental Results of SSB Modem in Shallow Sea (천해에서 SSB 모뎀의 실험결과 분석)

  • Ju, Hyng-Jun;Han, Jung-Woo;Kim, Ki-Man
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.6
    • /
    • pp.990-998
    • /
    • 2008
  • In this paper we achieve experimental data evaluation using SSB(Single-side band) modulation in the ocean. Present research in underwater communication is applying digital modulation, OFDM and MIMO system. However, Commercial modems using analog modulation techniques in oceans. So, we achieved experimental for modem appliance development of correct high quality in South Korea sea characteristics. This experimets achievd useing SSB analog modulation in Jin-hae shore of shallow water condition. Used data are tonal and LFM signal for getting underwater channel characterisitcs and female Korean speech for speech communications.

An Objective Estimation for Simulating of Asymmetrical Auditory Filter of the Hearing Impaired According to Hearing Loss Degree (난청인의 난청 정도에 따른 비대칭 청각 필터 구현의 객관적 평가)

  • Joo, S.I.;Jeon, Y.Y.;Song, Y.R.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology
    • /
    • v.3 no.1
    • /
    • pp.27-34
    • /
    • 2009
  • Hearing impaired person's hearing loss has personally various shape, so existing symmetrical auditory filter of frequency band method wasn't properly simulated the hearing impaired person's various hearing loss shape. The shapes of auditory filter are asymmetrical different with each center frequency and each input level. Hearing impaired person which has hearing loss was differently changed with that of normal hearing people and it has different value for speech of quality through auditory filter. In this study, the asymmetrical auditory filter was simulated and then some tests to estimate the filter's performance objectively were performed. The experiment as simulated auditory filter's performance evaluation method used perceptual evaluation of speech quality (PESQ) and log likelihood ratio (LLR) for speech through auditory filter. In the test, processed speech was evaluated objective speech quality and distortion using PESQ and LLR value. When hearing loss processed, PESQ and LLR value have big difference between symmetrical and asymmetrical auditory filter. It means that the difference of the shape auditory filter may affect to speech quality. Especially, when hearing loss existed, auditory filter changing according to asymmetrical shape for each center frequency affected to perceive speech quality of the hearing impaired.

  • PDF

A Contrast Enhancement Method using the Contrast Measure in the Laplacian Pyramid for Digital Mammogram (디지털 맘모그램을 위한 라플라시안 피라미드에서 대비 척도를 이용한 대비 향상 방법)

  • Jeon, Geum-Sang;Lee, Won-Chang;Kim, Sang-Hee
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.15 no.2
    • /
    • pp.24-29
    • /
    • 2014
  • Digital mammography is the most common technique for the early detection of breast cancer. To diagnose the breast cancer in early stages and treat efficiently, many image enhancement methods have been developed. This paper presents a multi-scale contrast enhancement method in the Laplacian pyramid for the digital mammogram. The proposed method decomposes the image into the contrast measures by the Gaussian and Laplacian pyramid, and the pyramid coefficients of decomposed multi-resolution image are defined as the frequency limited local contrast measures by the ratio of high frequency components and low frequency components. The decomposed pyramid coefficients are modified by the contrast measure for enhancing the contrast, and the final enhanced image is obtained by the composition process of the pyramid using the modified coefficients. The proposed method is compared with other existing methods, and demonstrated to have quantitatively good performance in the contrast measure algorithm.