• Title/Summary/Keyword: digital speech signal


Implementation of 16kbps ADPCM by DSK50 (DSK50을 이용한 16kbps ADPCM 구현)

  • Cho, Yun-Seok;Han, Kyong-Ho
    • Proceedings of the KIEE Conference / 1996.07b / pp.1295-1297 / 1996
  • The CCITT G.721/G.723 standard ADPCM algorithm is implemented using TI's fixed-point DSP Starter Kit (DSK). ADPCM can be implemented at various rates, such as 16, 24, 32, and 40 kbps. ADPCM is a sample-based compression technique whose complexity is lower than that of other speech compression techniques such as CELP, VSELP, and GSM, so it is widely used in low-cost speech compression applications such as tapeless answering machines, simultaneous voice/fax modems, and digital phones. The TMS320C50 is a low-cost fixed-point DSP chip, and the C50 DSK system consists of an AIC (analog interface chip) that operates as a single-chip A/D and D/A converter with 14-bit resolution, a C50 DSP chip with 10K words of on-chip memory, and an RS232C interface module. The ADPCM C code is compiled with the TI C50 C compiler and runs from the DSK on-chip memory. The speech input is converted into 14-bit linear PCM data, encoded into ADPCM data, and sent to a PC through RS232C. The ADPCM data on the PC is then received by the DSK through RS232C, decoded back into 14-bit linear PCM data, and converted into a speech signal. The DSK system has audio in/out jacks, so the speech signal can be both input and output. A simplified encode/decode sketch follows this entry.

  • PDF
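A minimal sketch of the sample-based encode/decode loop described above, written in Python rather than the paper's TI C50 C code. It is a simplified 2-bit adaptive-quantizer DPCM (2 bits per sample at 8 kHz gives the 16 kbps rate), not the actual CCITT G.721/G.726 algorithm; the quantizer and step-size factors are illustrative assumptions.

```python
import numpy as np

STEP_UP, STEP_DOWN = 1.5, 0.75   # assumed step-size adaptation factors (not from G.721)

def adpcm_encode(pcm, step=16.0):
    """Encode linear PCM samples into 2-bit codes: a sign bit and a magnitude bit."""
    pred, codes = 0.0, []
    for x in pcm:
        diff = x - pred
        sign = 1 if diff < 0 else 0
        mag = 1 if abs(diff) > step else 0
        codes.append((sign << 1) | mag)
        # Local decoder: keeps the encoder's predictor and step size in sync with the decoder.
        dq = step * (1.5 if mag else 0.5) * (-1 if sign else 1)
        pred += dq
        step = max(1.0, step * (STEP_UP if mag else STEP_DOWN))
    return codes

def adpcm_decode(codes, step=16.0):
    """Decode 2-bit codes back to (approximate) linear PCM samples."""
    pred, out = 0.0, []
    for c in codes:
        sign, mag = (c >> 1) & 1, c & 1
        dq = step * (1.5 if mag else 0.5) * (-1 if sign else 1)
        pred += dq
        step = max(1.0, step * (STEP_UP if mag else STEP_DOWN))
        out.append(pred)
    return np.array(out)

if __name__ == "__main__":
    t = np.arange(8000) / 8000.0
    pcm = 4000.0 * np.sin(2 * np.pi * 200 * t)        # stand-in for 14-bit linear PCM speech
    rec = adpcm_decode(adpcm_encode(pcm))
    snr = 10 * np.log10(np.sum(pcm ** 2) / np.sum((pcm - rec) ** 2))
    print(f"reconstruction SNR: {snr:.1f} dB")
```

The key point, as in any ADPCM-style codec, is that the encoder runs its own local decoder so that predictor and step size never drift apart from the decoder's.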

Digital Isolated Word Recognition System based on MFCC and DTW Algorithm (MFCC와 DTW에 알고리즘을 기반으로 한 디지털 고립단어 인식 시스템)

  • Zang, Xian;Chong, Kil-To
    • Proceedings of the KIEE Conference / 2008.10b / pp.290-291 / 2008
  • The most popular speech feature in speech recognition today is the Mel-Frequency Cepstral Coefficients (MFCC), which reflect the perceptual characteristics of the human ear more accurately than other parameters. This paper adopts MFCC and its first-order difference, which captures the dynamic character of the speech signal, as a combined parametric representation. The Dynamic Time Warping (DTW) algorithm is then applied to search for matching paths in the pattern-recognition process. English digits were recorded in a laboratory environment using the software "GoldWave", and the simulation results indicate that the algorithm achieves higher recognition accuracy than methods using LPCC and similar feature parameters in experiments on the Digital Isolated Word Recognition (DIWR) system. A DTW sketch follows this entry.

  • PDF
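A minimal DTW sketch, assuming MFCC-style feature matrices are already available (feature extraction is not shown); the random "templates" below are placeholders, not real recorded digits.

```python
import numpy as np

def dtw_distance(a, b):
    """a, b: (n_frames, n_coeffs) feature matrices; returns the warped distance."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])      # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],          # insertion
                                 cost[i, j - 1],          # deletion
                                 cost[i - 1, j - 1])      # match
    return cost[n, m]

def recognize(query, templates):
    """Pick the template word whose DTW distance to the query is smallest."""
    return min(templates, key=lambda w: dtw_distance(query, templates[w]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    templates = {"one": rng.normal(size=(40, 13)), "two": rng.normal(size=(35, 13))}
    query = templates["one"] + 0.1 * rng.normal(size=(40, 13))   # noisy copy of "one"
    print(recognize(query, templates))
```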

VHDL Implementation of an LPC Analysis Algorithm (LPC 분석 알고리즘의 VHDL 구현)

  • 선우명훈;조위덕
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.1 / pp.96-102 / 1995
  • This paper presents the VHSIC Hardware Description Language (VHDL) implementation of the Fixed Point Covariance Lattice (FLAT) algorithm for Linear Predictive Coding (LPC) analysis and its related algorithms, such as the fourth-order high-pass Infinite Impulse Response (IIR) filter, covariance matrix calculation, and the Spectral Smoothing Technique (SST), in the Vector Sum Excited Linear Predictive (VSELP) speech coder that has been selected as the standard speech coder for North American and Japanese digital cellular. Existing Digital Signal Processor (DSP) chips used in digital cellular phones are derived from general-purpose DSP chips and thus may not be optimal; effective architectures need to be designed for the above-mentioned algorithms. We first developed C language code to verify the correctness of the algorithms and to compare C results with VHDL results block by block. We then implemented the VHDL code based on the C code and, finally, verified that the VHDL results match the C results for real speech data. The implemented VHDL code can be used for logic synthesis and for designing an LPC Application Specific Integrated Circuit (ASIC) chip and DSP chips. A sketch of an LPC analysis step appears after this entry.

  • PDF
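The FLAT covariance-lattice algorithm itself is not reproduced here. As a hedged illustration of what an LPC analysis block computes, the sketch below uses the simpler autocorrelation method with the Levinson-Durbin recursion, which also yields the reflection coefficients that a lattice implementation works with.

```python
import numpy as np

def lpc(frame, order=10):
    """Return LPC coefficients a[0..order] (a[0]=1) and reflection coefficients for one frame."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    refl = []
    for i in range(1, order + 1):
        # Reflection coefficient from the current prediction error.
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        refl.append(k)
        # Levinson-Durbin update of the predictor polynomial.
        a[1:i + 1] += k * a[i - 1::-1][:i]
        err *= (1.0 - k * k)
    return a, np.array(refl)

if __name__ == "__main__":
    fs = 8000
    t = np.arange(240) / fs                                  # one 30 ms frame
    frame = np.sin(2 * np.pi * 300 * t) + 0.1 * np.random.default_rng(1).normal(size=240)
    coeffs, ks = lpc(frame * np.hamming(240), order=10)
    print("LPC coefficients:", np.round(coeffs, 3))
```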

Voice Activity Detection in Noisy Environment using Speech Energy Maximization and Silence Feature Normalization (음성 에너지 최대화와 묵음 특징 정규화를 이용한 잡음 환경에 강인한 음성 검출)

  • Ahn, Chan-Shik;Choi, Ki-Ho
    • Journal of Digital Convergence / v.11 no.6 / pp.169-174 / 2013
  • In speech recognition, performance degradation arises from the mismatch between the model-training and recognition environments. Silence feature normalization is used as a way to reduce this environmental mismatch. In existing silence feature normalization methods, however, the energy level of silence intervals rises at low signal-to-noise ratios, so the accuracy of voice/non-voice classification falls and recognition performance is degraded. This paper proposes a speech detection method that is robust in noisy environments, using speech energy maximization together with silence feature normalization. At high signal-to-noise ratios, the proposed method maximizes the speech energy so that the features are less affected by noise; at low signal-to-noise ratios, it normalizes the cepstral feature distributions of the voice and non-voice regions, which improves recognition performance. Recognition experiments show improved performance compared to the conventional method. A simple energy-based VAD sketch is given below.
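The sketch below is only a generic frame-energy voice/non-voice detector, not the paper's energy-maximization or silence-feature-normalization scheme; it illustrates the classification step that the proposed method is designed to make robust at low SNR. The frame length and threshold are illustrative assumptions.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Slice a 1-D signal into overlapping frames."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n)])

def energy_vad(x, threshold_db=-30.0):
    """Return a boolean per frame: True where the frame is judged to contain speech."""
    frames = frame_signal(x)
    energy = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    return energy > (energy.max() + threshold_db)   # threshold relative to the loudest frame

if __name__ == "__main__":
    fs = 8000
    rng = np.random.default_rng(2)
    noise = 0.005 * rng.normal(size=fs)                              # 1 s of background noise
    tone = 0.5 * np.sin(2 * np.pi * 200 * np.arange(fs // 2) / fs)   # stand-in for speech
    signal = np.concatenate([noise, tone + 0.005 * rng.normal(size=fs // 2)])
    print(energy_vad(signal).astype(int))
```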

A Study of the Pitch Estimation Algorithms of Speech Signal by Using Average Magnitude Difference Function (AMDF) (AMDF 함수를 이용한 음성 신호의 피치 추정 Algorithm들에 관한 연구)

  • So, Shinae;Lee, Kang Hee;You, Kwang-Bock;Lim, Ha-Young;Park, Jisu
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.4 / pp.235-242 / 2017
  • Peak- (or null-) finding algorithms for the Average Magnitude Difference Function (AMDF) of a speech signal are proposed in this paper. Both the AMDF and the Autocorrelation Function (ACF) are widely used to estimate the pitch of a speech signal, and it is well known that estimating the fundamental frequency (F0) of speech is as difficult as it is important. In this paper, two algorithms that exploit the characteristics of the AMDF are proposed. The first applies a threshold to the local minima of the AMDF to detect the pitch period. The second estimates the pitch period by using the relationship between the AMDF and the ACF. The data used in this paper were recorded with a general commercial device and consist of Korean emotion-expression words. The recorded speech data were applied to the two proposed algorithms and their performance was tested. An AMDF pitch-estimation sketch is given below.
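A sketch of AMDF-based pitch estimation. The AMDF itself is the standard definition; the decision rule (deepest null within a plausible lag range) is a generic baseline, not the paper's threshold-based or AMDF/ACF-combination algorithms.

```python
import numpy as np

def amdf(frame, max_lag):
    """Average magnitude difference function for lags 1..max_lag."""
    n = len(frame)
    return np.array([np.mean(np.abs(frame[:n - k] - frame[k:])) for k in range(1, max_lag + 1)])

def pitch_amdf(frame, fs, f0_min=60.0, f0_max=400.0):
    """Estimate F0 (Hz) as the lag of the deepest AMDF null within the search range."""
    lag_min, lag_max = int(fs / f0_max), int(fs / f0_min)
    d = amdf(frame, lag_max)
    best_lag = lag_min + np.argmin(d[lag_min - 1:])   # d[k-1] corresponds to lag k
    return fs / best_lag

if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.04 * fs)) / fs                 # one 40 ms analysis frame
    frame = np.sign(np.sin(2 * np.pi * 125 * t))       # 125 Hz periodic test signal
    print("Estimated F0:", round(pitch_amdf(frame, fs), 1), "Hz")
```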

Speech Spectrum Enhancement Combined with Frequency-weighted Spectrum Shaping Filter and Wiener Filter (주파수가중 스펙트럼성형필터와 위너필터를 결합한 음성 스펙트럼 강조)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.10 / pp.1867-1872 / 2016
  • In digital signal processing, it is necessary to improve the quality of the speech signal by removing the background noise that exists in various real environments. An important consideration when removing background noise acoustically is that the information relevant to the human auditory mechanism is carried mainly by the amplitude spectrum of the speech signal. The primary purpose of this paper is therefore to introduce the characteristics of a frequency-weighted spectrum shaping filter for extracting the amplitude spectrum of the speech signal. The paper proposes an algorithm that combines a Wiener filter with the frequency-weighted spectrum shaping filter according to the acoustic model, after extracting the amplitude spectral information from the noisy speech signal. In experiments, the spectral distortion (SD) of the proposed algorithm is improved by more than 5.28 dB compared to a conventional method. A basic Wiener-gain sketch is given below.
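A sketch of the plain frequency-domain Wiener gain for a single frame, assuming a separate noise-only segment is available for the noise estimate; the paper's frequency-weighted spectrum shaping filter and acoustic-model weighting are not reproduced here.

```python
import numpy as np

def wiener_enhance_frame(noisy_frame, noise_psd_est):
    """Apply a per-bin Wiener gain to one windowed frame given a noise PSD estimate."""
    spec = np.fft.rfft(noisy_frame)
    noisy_psd = np.abs(spec) ** 2
    speech_psd = np.maximum(noisy_psd - noise_psd_est, 0.0)     # crude speech PSD estimate
    gain = speech_psd / (speech_psd + noise_psd_est + 1e-12)    # Wiener gain per frequency bin
    return np.fft.irfft(gain * spec, n=len(noisy_frame))

if __name__ == "__main__":
    fs, n = 8000, 256
    rng = np.random.default_rng(4)
    clean = np.sin(2 * np.pi * 440 * np.arange(n) / fs)
    noise = 0.3 * rng.normal(size=n)
    win = np.hanning(n)
    # Noise PSD estimated from a separate noise-only frame (an assumption for this sketch).
    noise_psd = np.abs(np.fft.rfft(0.3 * rng.normal(size=n) * win)) ** 2
    enhanced = wiener_enhance_frame((clean + noise) * win, noise_psd)
    err_before = np.mean(((clean + noise) * win - clean * win) ** 2)
    err_after = np.mean((enhanced - clean * win) ** 2)
    print("MSE before/after enhancement:", round(err_before, 4), round(err_after, 4))
```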

Double Talk Processing using Blind Signal Separation in Acoustic Echo Canceller (음향반향제거기에서 암묵신호분리를 이용한 동시통화처리)

  • Lee, Haengwoo
    • Journal of Korea Society of Digital Industry and Information Management / v.12 no.1 / pp.43-50 / 2016
  • This paper describes an acoustic echo canceller that solves the double-talk problem by using blind signal separation. An acoustic echo canceller may deteriorate or diverge during double-talk periods, so blind signal separation is used to detect double-talk by separating the near-end speech signal from the mixed microphone signal. The blind signal separation extracts the near-end signal from dual microphones through iterative computations that use second-order statistics in a closed reverberant environment. With this method, the acoustic echo canceller operates correctly regardless of double-talk. The performance of the proposed acoustic echo canceller was verified in computer simulations. The results show that the echo canceller with this algorithm detects double-talk periods well and then operates stably, without divergence of the coefficients, after double-talk ends. Its merits are simplicity and stability. An echo-cancellation sketch with a simple double-talk guard is given below.
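A sketch of an NLMS echo canceller whose adaptation pauses when double-talk is suspected. The detector here is a simple Geigel-style level comparison standing in for the paper's blind-signal-separation detector; the echo path and signals are synthetic.

```python
import numpy as np

def nlms_echo_cancel(far, mic, taps=64, mu=0.5):
    """Return the residual (send) signal; adaptation pauses when double-talk is suspected."""
    w = np.zeros(taps)
    x_buf = np.zeros(taps)
    err = np.zeros(len(mic))
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far[n]
        y_hat = w @ x_buf                                  # echo estimate
        err[n] = mic[n] - y_hat
        # Geigel-style test: a loud microphone sample relative to the recent far-end
        # peak suggests near-end speech (double-talk), so the filter is not updated.
        double_talk = np.abs(mic[n]) > 0.5 * np.max(np.abs(x_buf))
        if not double_talk:
            w += mu * err[n] * x_buf / (x_buf @ x_buf + 1e-9)   # NLMS update
    return err

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    far = rng.normal(size=4000)                            # far-end (loudspeaker) signal
    echo_path = 0.3 * rng.normal(size=64) * np.exp(-np.arange(64) / 10.0)
    mic = np.convolve(far, echo_path)[:4000]               # microphone picks up the echo only
    e = nlms_echo_cancel(far, mic)
    print("residual echo power over the last 1000 samples:",
          round(float(np.mean(e[-1000:] ** 2)), 6))
```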

A Query-by-Speech Scheme for Photo Albuming (음성 질의 기반 디지털 사진 검색 기법)

  • Kim Tae-Sung;Suh Young-Joo;Lee Yong-Ju;Kim Hoi-Rin
    • MALSORI / no.57 / pp.99-112 / 2006
  • In this paper, we introduce two retrieval methods for photos annotated with speech documents. We compare the pattern of a speech query with those of the speech documents recorded in digital cameras, measure the similarities, and retrieve the photos corresponding to the speech documents with high similarity scores. In the first approach, a phoneme recognition scheme is used as the pre-processor for pattern matching; in the second, vector quantization (VQ) and dynamic time warping (DTW) are applied to match the speech query with the documents in the signal domain itself. Experimental results show that the performance of the first approach is highly dependent on that of the phoneme recognizer, although its processing time is short. The second method provides a large improvement in performance; its processing time is longer than that of the first method due to DTW, but it can be reduced by approximation methods. A VQ codebook sketch follows this entry.

  • PDF
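A sketch of the VQ step: a small k-means codebook trained on feature frames and used to quantize query frames before matching. The codebook size and random "features" are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

def train_codebook(frames, k=8, iters=20, seed=0):
    """Plain k-means: returns a (k, dim) codebook trained on (n, dim) feature frames."""
    rng = np.random.default_rng(seed)
    codebook = frames[rng.choice(len(frames), size=k, replace=False)]
    for _ in range(iters):
        # Assign each frame to its nearest codeword.
        dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each codeword to the centroid of its assigned frames.
        for j in range(k):
            if np.any(labels == j):
                codebook[j] = frames[labels == j].mean(axis=0)
    return codebook

def quantize(frames, codebook):
    """Map each frame to the index of its nearest codeword."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    feats = rng.normal(size=(500, 13))            # stand-in for MFCC-like feature frames
    cb = train_codebook(feats, k=8)
    print(quantize(feats[:10], cb))
```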

A Review of Assistive Listening Device and Digital Wireless Technology for Hearing Instruments

  • Kim, Jin Sook;Kim, Chun Hyeok
    • Korean Journal of Audiology / v.18 no.3 / pp.105-111 / 2014
  • Assistive listening devices (ALDs) refer to various types of amplification equipment designed to improve the communication of individuals who are hard of hearing by enhancing access to the speech signal when individual hearing instruments are not sufficient. There are many types of ALDs for overcoming the triangle of speech-to-noise ratio (SNR) problems: noise, distance, and reverberation. ALDs vary in their internal electronic mechanisms, ranging from simple hard-wired microphone-amplifier units to more sophisticated broadcasting systems. They usually use microphones to capture an audio source and broadcast it wirelessly over frequency modulation (FM), infra-red, induction loop, or other transmission techniques. Seven types of ALDs are introduced, including hardwired devices, FM sound systems, infra-red sound systems, induction loop systems, telephone listening devices, television devices, and alert/alarm systems. Further development of digital wireless technology in hearing instruments will make direct communication with ALDs possible without any accessories in the near future. There are two technology solutions for digital wireless hearing instruments that improve SNR and convenience: near-field magnetic induction combined with Bluetooth radio frequency (RF) transmission or proprietary RF transmission, and proprietary RF transmission alone. Recently launched digital wireless hearing aids applying this new technology can communicate from the hearing instrument to personal computers, phones, Wi-Fi, alert systems, and ALDs via iPhone, iPad, and iPod. However, they come with their own iOS application offering a range of features, and there is no option for Android users at this time.

Digital Watermarking Using Psychoacoustic Model

  • Poomdaeng, S.;Toomnark, S.;Amornraksa, T.
    • Proceedings of the IEEK Conference / 2002.07b / pp.872-875 / 2002
  • A digital watermarking technique that applies a psychoacoustic model to the audio signal is proposed in this paper. In the watermarking scheme, a pseudo-random bit stream used as the watermark signal is embedded into the audio signal, for both speech and music. The strength of the embedded signal is governed by the human auditory system such that the disturbance to the host audio signal remains imperceptible to the human ear. The experimental results show that the quality of the watermarked audio signal, in terms of signal-to-noise ratio, can be improved by up to 3.2 dB. A spread-spectrum embedding sketch follows this entry.

  • PDF
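A sketch of spread-spectrum embedding of a pseudo-random bit stream. The embedding strength here is simply tied to block energy as a crude stand-in for a psychoacoustic masking model; the paper's actual model, block size, and strength settings are not reproduced.

```python
import numpy as np

def embed(audio, bits, key=42, block=2048, alpha=0.1):
    """Add one watermark bit per block as a scaled pseudo-random +/-1 carrier."""
    rng = np.random.default_rng(key)
    out = audio.copy()
    for i, bit in enumerate(bits):
        carrier = rng.choice([-1.0, 1.0], size=block)
        seg = out[i * block:(i + 1) * block]
        strength = alpha * np.sqrt(np.mean(seg ** 2))      # louder blocks carry a stronger mark
        seg += (1.0 if bit else -1.0) * strength * carrier
    return out

def extract(watermarked, n_bits, key=42, block=2048):
    """Recover the bits by correlating each block with the same pseudo-random carrier."""
    rng = np.random.default_rng(key)
    bits = []
    for i in range(n_bits):
        carrier = rng.choice([-1.0, 1.0], size=block)
        seg = watermarked[i * block:(i + 1) * block]
        bits.append(1 if np.dot(seg, carrier) > 0 else 0)
    return bits

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    host = rng.normal(size=8 * 2048)                       # stand-in for a host audio signal
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    print(extract(embed(host, payload), len(payload)) == payload)
```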