• Title/Summary/Keyword: Speech signal


Implementation of Adaptive Multi Rate (AMR) Vocoder for the Asynchronous IMT-2000 Mobile ASIC (IMT-2000 비동기식 단말기용 ASIC을 위한 적응형 다중 비트율 (AMR) 보코더의 구현)

  • 변경진;최민석;한민수;김경수
    • The Journal of the Acoustical Society of Korea / v.20 no.1 / pp.56-61 / 2001
  • This paper presents a real-time implementation of an AMR (Adaptive Multi Rate) vocoder included in an asynchronous International Mobile Telecommunication (IMT)-2000 mobile ASIC. The implemented AMR vocoder is a multi-rate coder with 8 modes operating at bit rates from 12.2 kbps down to 4.75 kbps. In addition to the encoder and decoder, which are the basic functions of the vocoder, VAD (Voice Activity Detection), SCR (Source Controlled Rate) operation, and frame structuring blocks for the system interface are implemented. The DSP used for the AMR vocoder is a 16-bit fixed-point DSP based on the TeakLite core; it consists of a memory block, a serial interface block, register files for the parallel interface with the CPU, and interrupt control logic. By managing the memory structure efficiently, we reduce the maximum operating complexity to 24 MIPS. The AMR vocoder is verified against all the test vectors provided by 3GPP, and stable operation on a real-time test board is also demonstrated.
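
The eight modes mentioned in the abstract can be related to frame sizes with a short calculation. The sketch below assumes the 20 ms speech frame and the eight standard AMR narrowband rates from the 3GPP specification; neither is listed in the abstract itself, and the function name is hypothetical.

```python
# Assumption: the eight standard AMR narrowband rates and the 20 ms frame
# come from the 3GPP AMR specification, not from the abstract above.
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]
FRAME_MS = 20


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """Speech bits carried by one frame: kbit/s times ms gives bits."""
    return round(rate_kbps * frame_ms)


if __name__ == "__main__":
    for rate in AMR_MODES_KBPS:
        print(f"{rate:5.2f} kbps: {bits_per_frame(rate)} bits per frame")
```

Under these assumptions the highest mode carries 244 speech bits per frame and the lowest carries 95.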


Pronunciation Influence Analysis of Carbonate Drink and Eucalyptus Fragrance by Applying Speech Signal Processing Techniques (음성신호 처리 기술을 적용한 탄산음료와 유칼립투스 발향이 발음에 미치는 영향 분석)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences / v.37 no.5C / pp.420-428 / 2012
  • Communication skill is one of the most important abilities in the modern, NQ-emphasizing smart society. In particular, improving pronunciation accuracy is necessary for expressing one's ideas accurately, because voice accounts for 38% of the influence in personal relations. To this end, this paper analyzes the influence of carbonated drinks and eucalyptus fragrance on the voice. For carbonated drinks, the cumulative amount consumed is examined to analyze the effect of accumulated intake. The influence of eucalyptus fragrance on pronunciation accuracy is also reported. For this analysis, the jitter, shimmer, pitch, and intensity of the voice are measured. Finally, we provide a quantified, objective, and visualized voice analysis for carbonated drinks and eucalyptus fragrance.
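
Jitter and shimmer are cycle-to-cycle perturbation measures of pitch period and peak amplitude. The abstract does not say which definitions were used, so the following is only a minimal sketch assuming the common "local" definition (mean absolute difference between consecutive cycles divided by the mean value); the function name and the sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean.

    Applied to pitch periods this gives local jitter; applied to per-cycle
    peak amplitudes it gives local shimmer (assumed definitions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


if __name__ == "__main__":
    periods_s = [0.0100, 0.0101, 0.0099, 0.0100]   # hypothetical pitch periods
    amplitudes = [0.80, 0.82, 0.79, 0.81]          # hypothetical peak amplitudes
    print(f"jitter:  {100 * local_perturbation(periods_s):.2f} %")
    print(f"shimmer: {100 * local_perturbation(amplitudes):.2f} %")
```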

Comparison of Clinical Usefulness of Program-Assisted and Real Ear Measurement-Assisted Hearing Aids Fitting (프로그램과 실이 측정을 이용한 보청기 적합의 임상적 유용성의 비교)

  • Chang, Young-Soo;Jung, Hye Im;Cho, Yang-Sun
    • Korean Journal of Otorhinolaryngology-Head and Neck Surgery / v.61 no.12 / pp.663-668 / 2018
  • Background and Objectives: The main objective of this study was to compare the clinical usefulness of program-assisted and real ear measurement (REM)-assisted fitting of hearing aids. Subjects and Method: Fifteen participants with moderate to moderately severe hearing loss were enrolled in this study. Objective and subjective fitting results were assessed to compare the benefits of the program-assisted fitting (using a software fitting program) and the REM-assisted fitting. Real ear insertion gain (REIG), sound-field audiometry using warble tones, and the Korean Hearing in Noise Test (K-HINT) were performed as objective tests. Sound quality rating was performed as a subjective test. Results: In the program fitting, 48.89% of fitting points failed to come within $\pm 10$ dB of the REIG target. In the REM fitting, however, the percentage of failures decreased significantly to 23.33% (p=0.013). In the K-HINT, the reception threshold for speech in quiet decreased significantly from 50.1 dB HL with the program fitting to 44.7 dB HL after the REM fitting (p<0.001). In the front noise condition, the signal-to-noise ratio improved from 4.53 dB to 3.50 dB with the REM fitting, but the difference was not statistically significant (p=0.099). In the sound quality rating, the REM fitting ($4.27 \pm 0.56$) showed significantly better ratings than the program fitting ($3.69 \pm 0.74$) (p=0.017). Conclusion: The REM fitting showed better results than the program fitting in both subjective and objective measurements.

Noise Statistics Estimation Using Target-to-Noise Contribution Ratio for Parameterized Multichannel Wiener Filter (변수내장형 다채널 위너필터를 위한 목적신호대잡음 기여비를 이용한 잡음추정기법)

  • Hong, Jungpyo
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.12 / pp.1926-1933 / 2022
  • The parameterized multichannel Wiener filter (PMWF) is a linear filter that can control the trade-off between residual noise and signal distortion using an embedded parameter. To apply the PMWF to noisy inputs, accurate noise estimation is important, and multichannel minima-controlled recursive averaging (MMCRA) is widely used for this purpose. However, the accuracy of MMCRA noise estimation decreases when a directional interference is present in the array inputs, and the performance of the PMWF degrades as a consequence. Therefore, in this paper we propose a noise power spectral density (PSD) estimation method for the PMWF. The proposed method consists of three consecutive steps: eigenvalue decomposition of the noisy input PSD, estimation of the target component contribution using directional information, and exponential weighting for improved estimation of the target contribution. For evaluation, four objective measures were compared against the MMCRA, and we verify that the PMWF with the proposed noise estimation method improves performance in environments where directional interferences exist.
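
For orientation, the trade-off parameter mentioned above can be made explicit. The abstract gives no formulas, so the block below is only the commonly cited form of the PMWF, with notation assumed here: $\boldsymbol{\Phi}_{y}$ is the noisy input PSD matrix, $\boldsymbol{\Phi}_{v}$ the noise PSD matrix (the quantity the proposed method estimates), $\boldsymbol{\Phi}_{x}$ the target PSD matrix, $\mathbf{u}_{\mathrm{ref}}$ a selection vector for the reference microphone, and $\beta$ the embedded parameter.

```latex
\boldsymbol{\Phi}_{x} = \boldsymbol{\Phi}_{y} - \boldsymbol{\Phi}_{v},
\qquad
\mathbf{h}_{\beta}
  = \frac{\boldsymbol{\Phi}_{v}^{-1}\boldsymbol{\Phi}_{x}}
         {\beta + \operatorname{tr}\!\left(\boldsymbol{\Phi}_{v}^{-1}\boldsymbol{\Phi}_{x}\right)}
    \,\mathbf{u}_{\mathrm{ref}},
\qquad \beta \ge 0 .
```

In this form, $\beta = 0$ corresponds to the distortionless (MVDR) solution and $\beta = 1$ to the standard multichannel Wiener filter; larger $\beta$ removes more noise at the cost of more signal distortion. An error in $\boldsymbol{\Phi}_{v}$ enters both the inverse and the trace, which is why noise estimation accuracy matters.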

Voice Activity Detection Based on SVM Classifier Using Likelihood Ratio Feature Vector (우도비 특징 벡터를 이용한 SVM 기반의 음성 검출기)

  • Jo, Q-Haing;Kang, Sang-Ki;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.26 no.8 / pp.397-402 / 2007
  • In this paper, we apply a support vector machine (SVM) that incorporates an optimized nonlinear decision rule over different sets of feature vectors to improve the performance of statistical model-based voice activity detection (VAD). The conventional method performs VAD by setting up statistical models for the speech-absence and speech-presence hypotheses and comparing the geometric mean of the likelihood ratios (LRs) of the individual frequency bands extracted from the input signal with a given threshold. We propose a novel SVM-based VAD technique that treats the LRs computed in each frequency bin as the elements of a feature vector, so as to minimize the classification error probability, instead of using the conventional geometric-mean decision rule. In experiments, the SVM-based VAD using the proposed feature outperformed previously reported VADs in various noise environments.
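
The difference between the conventional rule and the proposed feature vector can be sketched as follows. The abstract does not give the LR expression, so this sketch assumes the Gaussian statistical model commonly used in this line of work, where `xi` is the a priori SNR and `gamma` the a posteriori SNR per frequency bin; all function names and sample values are hypothetical, and the SVM itself is not shown.

```python
import math


def log_likelihood_ratios(xi, gamma):
    """Per-bin log LR under an assumed complex Gaussian model:
    log L_k = gamma_k * xi_k / (1 + xi_k) - log(1 + xi_k)."""
    return [g * x / (1.0 + x) - math.log(1.0 + x) for x, g in zip(xi, gamma)]


def geometric_mean_decision(log_lrs, log_threshold):
    """Conventional rule: the log of the geometric mean of the LRs equals
    the arithmetic mean of the log LRs, compared against a threshold."""
    return sum(log_lrs) / len(log_lrs) > log_threshold


if __name__ == "__main__":
    xi = [0.1, 2.0, 5.0, 0.3]       # hypothetical a priori SNRs
    gamma = [1.0, 4.0, 9.0, 1.2]    # hypothetical a posteriori SNRs
    features = log_likelihood_ratios(xi, gamma)
    print("speech (conventional rule):", geometric_mean_decision(features, 0.5))
    # In the proposed scheme, `features` would instead be passed as one
    # feature vector to a trained SVM classifier.
    print("feature vector:", [round(f, 3) for f in features])
```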

Ultrasensitive Crack-based Mechanosensor Inspired by Spider's Sensory Organ (거미의 감각기관을 모사한 초민감 균열기반 진동압력센서)

  • Suyoun Oh;Tae-il Kim
    • Journal of the Microelectronics and Packaging Society / v.31 no.1 / pp.1-6 / 2024
  • Spiders detect even tiny vibrations through their vibrational sensory organs. Using this exceptional vibration sensing ability, they detect vibrations caused by prey or predators in order to plan attacks or perceive threats, which helps them survive. This paper introduces a nanoscale crack-based sensor that mimics the spider's sensory organ. Inspired by the slit sensory organ that spiders use to detect vibrations, the sensor uses cracks to detect vibrations and pressure with high sensitivity. By controlling the depth of these cracks, the authors developed a sensor capable of detecting external mechanical signals with remarkable sensitivity. The sensor achieves a gauge factor of 16,000 at 2% strain with an applied tensile stress of 10 N. With a high signal-to-noise ratio, it accurately recognizes the desired vibrations, as confirmed through various evaluations with external forces and biological signals (speech patterns, heart rate, etc.). This underscores the potential of biomimetic technology for developing new sensors and applying them across diverse industrial fields.
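
To put the reported gauge factor in perspective, a short worked calculation may help. It assumes the standard resistive definition of the gauge factor, which the abstract does not state explicitly; $R_0$ is the unstrained resistance, $\Delta R$ the resistance change, and $\varepsilon$ the strain.

```latex
\mathrm{GF} = \frac{\Delta R / R_0}{\varepsilon}
\quad\Longrightarrow\quad
\frac{\Delta R}{R_0} = \mathrm{GF}\cdot\varepsilon = 16{,}000 \times 0.02 = 320 .
```

Under that assumption, the resistance at 2% strain would be 321 times its unstrained value.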

Study on development of the remote control door lock system including speaker verification function in real time (화자 인증 기능이 포함된 실시간 원격 도어락 제어 시스템 개발에 관한 연구)

  • Kwon, Soon-Ryang
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.6 / pp.714-719 / 2005
  • This paper designs and implements a system for remotely checking a visitor's speech or image by mobile phone. The system is designed to identify a visitor through an automatic calling service to the mobile phone, rather than through a short message, even when the home owner is away. In general, door locks are controlled through the home server, but from a real-time point of view it is more effective to control them with DTMF signals. The technology proposed in this paper lets the visitor and the home owner communicate by automatically placing a call to the home owner's mobile phone when the visitor arrives, even if the home owner is away, and, if necessary, lets the home owner control the door lock remotely. With this system, the home owner is not restricted by time or place when checking the visitor's identity and controlling the door lock. In addition, considering the risks that arise when the mobile phone is lost, security is improved by replacing the existing password-only scheme with a combination of password and speaker verification for the verification procedure required to control the door lock and to configure the environment. Existing problems, such as having to reconnect to the network to control the door lock, are also solved by controlling the door lock in real time with DTMF signals during the call.
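
Since the real-time control path relies on DTMF signalling during the call, a small illustration of how a DTMF key can be generated and recognised may be useful. This is a generic sketch, not the paper's implementation: it uses the standard DTMF frequency pairs with the Goertzel algorithm, and the sampling rate, tone duration, and function names are assumptions.

```python
import math

ROW_FREQS = [697, 770, 852, 941]        # standard DTMF low-group frequencies (Hz)
COL_FREQS = [1209, 1336, 1477, 1633]    # standard DTMF high-group frequencies (Hz)
KEYPAD = ["123A", "456B", "789C", "*0#D"]


def dtmf_tone(key, fs=8000, duration=0.05):
    """Synthesize the two-tone signal for a single keypad character."""
    for r, row in enumerate(KEYPAD):
        if len(key) == 1 and key in row:
            f_low, f_high = ROW_FREQS[r], COL_FREQS[row.index(key)]
            n = int(fs * duration)
            return [0.5 * math.sin(2 * math.pi * f_low * t / fs)
                    + 0.5 * math.sin(2 * math.pi * f_high * t / fs)
                    for t in range(n)]
    raise ValueError(f"not a DTMF key: {key!r}")


def goertzel_power(samples, freq, fs=8000):
    """Signal power at one frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / fs)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, fs=8000):
    """Pick the strongest low-group and high-group frequency and map to a key."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROW_FREQS[i], fs))
    c = max(range(4), key=lambda i: goertzel_power(samples, COL_FREQS[i], fs))
    return KEYPAD[r][c]


if __name__ == "__main__":
    print(detect_key(dtmf_tone("5")))
```

A practical receiver would also check absolute power and tone duration before accepting a key, which this sketch omits.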

Performance Analysis of a Statistical Packet Voice/Data Multiplexer (통계적 패킷 음성 / 데이터 다중화기의 성능 해석)

  • 신병철;은종관
    • The Journal of Korean Institute of Communications and Information Sciences / v.11 no.3 / pp.179-196 / 1986
  • In this paper, the performance of a statistical packet voice/data multiplexer is studied. We assume that the multiplexer uses two separate finite queues for voice and data traffic, and that voice traffic has priority over data. For the performance analysis, we divide the output link of the multiplexer into a sequence of time slots. The voice signal is modeled as an (M+1)-state Markov process, M being the packet generation period in slots. The data traffic is modeled by a simple Poisson process. In our discrete-time analysis, the queueing behavior of voice traffic is little affected by the data traffic, since voice has priority over data. Therefore, we first analyze the queueing behavior of voice traffic and then use the result to study the queueing behavior of data traffic. For the packet voice multiplexer, the input state and the voice buffer occupancy are formulated as a two-dimensional Markov chain. For the integrated voice/data multiplexer, we use a three-dimensional Markov chain that represents the input voice state and the buffer occupancies of voice and data. With these models, numerical performance results have been obtained by the Gauss-Seidel iteration method. The analytical results have been verified by computer simulation. From the results, we have found that there are tradeoffs among the number of voice users, output link capacity, voice queue size, and overflow probability for the voice traffic, and among traffic load, data queue size, and overflow probability for the data traffic. There is also a tradeoff between the performance of voice and data traffic for given input traffic and link capacity. In addition, it has been found that the average queueing delay of data traffic is longer than the maximum buffer size when the gain of time assignment speech interpolation (TASI) is more than two and the number of voice users is small.
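
The Gauss-Seidel step mentioned above can be illustrated on a toy problem. The sketch below computes the stationary distribution of a small discrete-time Markov chain by Gauss-Seidel sweeps with renormalization; the 3-state transition matrix is purely hypothetical and far smaller than the two- and three-dimensional chains in the paper, and the function name is an assumption.

```python
def stationary_gauss_seidel(P, tol=1e-12, max_iter=10_000):
    """Solve pi = pi * P for a row-stochastic matrix P by Gauss-Seidel sweeps.

    Each sweep updates pi[j] in place from the balance equation
    pi[j] * (1 - P[j][j]) = sum over i != j of pi[i] * P[i][j],
    so later states already use the refreshed values of earlier ones.
    Assumes no absorbing state (P[j][j] < 1 for every j).
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        previous = pi[:]
        for j in range(n):
            inflow = sum(pi[i] * P[i][j] for i in range(n) if i != j)
            pi[j] = inflow / (1.0 - P[j][j])
        total = sum(pi)
        pi = [p / total for p in pi]          # keep pi a probability vector
        if max(abs(a - b) for a, b in zip(pi, previous)) < tol:
            break
    return pi


if __name__ == "__main__":
    # Hypothetical 3-state birth-death chain (e.g. a tiny buffer occupancy model).
    P = [[0.50, 0.50, 0.00],
         [0.25, 0.50, 0.25],
         [0.00, 0.50, 0.50]]
    print(stationary_gauss_seidel(P))
```

Quantities such as overflow probability would then be read off the stationary probabilities of the full-buffer states.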


MATERIALS AND METHODS FOR TEACHING INTONATION

  • Ashby, Michael
    • Proceedings of the KSPS conference / 1997.07a / pp.228-229 / 1997
  • 1. Intonation is important. It cannot be ignored. To convince students of the importance of intonation, we can use sentences with two very different interpretations according to intonation. Example: "I thought it would rain" with a fall on "rain" means it did not rain, but with a fall on "thought" and a rise on "rain" it means that it did rain.
    2. Although complex, intonation is structured. For both teacher and student, the big job of tackling intonation is made simpler by remembering that intonation can be analysed into systems and units. There are three main systems in English intonation: Tonality (division into phrases), Tonicity (selection of accented syllables), and Tone (the choice of pitch movements). Examples: Tonality: My brother who lives in London is a doctor. Tonicity: Hello. How ARE you. Hello. How are YOU. Tone: Ways to say "Thank you".
    3. In deciding what to teach, we must distinguish what is universal from what is specifically English. This is where contrastive studies of intonation are very valuable. Usually, for instance, division into phrases (tonality) works in broadly similar ways across languages. Some uses of pitch are also similar across languages; for example, very high pitch may signal excitement or urgency.
    4. Although most people think that intonation is mainly about pitch (the tone system), accent placement (tonicity) is probably the single most important aspect of English intonation. This is because it is connected with information focus, and the effects on interpretation are very clear-cut. Example: They asked for coffee, so I made them coffee. (The second occurrence of "coffee" must not be accented.)
    5. Ear-training is the beginning of intonation training in the UCL approach. First, students learn to identify fall vs rise vs fall-rise. To begin with, single words are used, then phrases and sentences. When learning tones, the first words used should have unstressed syllables after the stressed syllable (Saturday) to make the pitch movement clearer.
    6. In production drills, the first thing is to establish simple neutral patterns. There should be no drama or really special meanings. Simple drills can be used to teach important patterns. Example: A: Peter likes football. B: Yes, JOHN likes football TOO. A: Mary rides a bike. B: Yes, JENny rides a bike TOO.
    7. The teacher must be systematic and let learners KNOW what they are learning. It is no good using new patterns and hoping that students will "pick them up" without noticing.
    8. Visual feedback of fundamental frequency with a computer display can help students learn correct patterns. The teacher can use the display to demonstrate patterns, or students can practise by themselves, imitating recorded models.


Korean Phoneme Recognition Using Self-Organizing Feature Map (SOFM 신경회로망을 이용한 한국어 음소 인식)

  • Jeon, Yong-Koo;Yang, Jin-Woo;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.14 no.2 / pp.101-112 / 1995
  • Constructing a feature map-based phoneme classification system for speech recognition usually requires two procedures: clustering and labeling. In this paper, we present a phoneme classification system that uses Kohonen's Self-Organizing Feature Map (SOFM) as both clusterer and labeler. The SOFM is known to perform a self-organizing process that produces an optimal local topographical mapping of the signal space and yields reasonably high accuracy in recognition tasks. Consequently, the SOFM can be applied effectively to phoneme recognition. In addition, to improve the performance of the phoneme classification system, we propose a learning algorithm that is combined with the classical K-means clustering algorithm in the fine-tuning stage. To evaluate the proposed phoneme classification algorithm, we first use a total of 43 phonemes to construct six intra-class feature maps for six different phoneme classes. In speaker-dependent phoneme classification tests using these six feature maps, we obtain a recognition rate of 87.2% and confirm that the proposed algorithm is an efficient method for improving recognition performance and convergence speed.
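
The clustering stage rests on the basic Kohonen update, which can be sketched in a few lines. This is a generic illustration on a one-dimensional map with a Gaussian neighbourhood, not the paper's algorithm: the K-means fine-tuning stage and the labeling step are not shown, and the function names, map size, and parameter values are assumptions.

```python
import math


def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to input x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def sofm_update(weights, x, learning_rate, sigma):
    """One Kohonen step on a 1-D map: move every node toward x, scaled by a
    Gaussian neighbourhood centred on the best matching unit."""
    bmu = best_matching_unit(weights, x)
    updated = []
    for i, w in enumerate(weights):
        h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
        updated.append([wi + learning_rate * h * (xi - wi) for wi, xi in zip(w, x)])
    return updated


if __name__ == "__main__":
    # Hypothetical 3-node map over 2-D feature vectors.
    weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    sample = [0.0, 0.2]
    print(sofm_update(weights, sample, learning_rate=0.5, sigma=1.0))
```

In practice the learning rate and neighbourhood width are reduced over training so the map first orders itself globally and then refines locally.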
