• Title/Summary/Keyword: speech parameter


Pilot Study on the Classification for Sasangin by the Voice Analysis (음성분석에 의한 체질진단에 관한 연구)

  • Lee Eui-Ju;Song Kwang-Bin;Choi Hwan-Soo;Yoo Jung-Hee;Kwak Chang-Kyu;Sohn Eun-Hae;Koh Byung-Hee
    • The Journal of Korean Medicine
    • /
    • v.26 no.1 s.61
    • /
    • pp.93-102
    • /
    • 2005
  • Objective: This research was conducted to evaluate a method of sasangin classification by voice analysis. Two pilot tests were designed to answer the following questions: 'What conditions are required for sasangin classification by voice analysis?' and 'What are the important variations in the /a/ parameters?' Methods: 122 disease-free, healthy volunteers were diagnosed for sasangin type with the QSCC II. First, they said /a/ three times for 2 seconds in their usual voice. Second, they said /a/ for 2 seconds in a high tone, a mid tone, and a low tone. The sounds were collected with a recording program (Cool Edit 2000) through a Sony microphone (ecm-26l). We analyzed the voices with MATLAB, the simulation tool. Results: When a subject said /a/ three times for 2 seconds in the usual voice, the repetitions showed no differences and were correlated. When a subject said /a/ three times for 2 seconds in high, usual, and low speech, some parameters were correlated and the others were not. We evaluated the sasangin classification method using /a/ voice analysis alone. The average hit ratio was 66.3% (soyangin 67.9%, taeumin 68.0%, soeumin 63.9%). Conclusion: The conditions for using the sasangin classification method by voice analysis must be established. Using /a/ voice analysis alone, the method achieved a hit ratio of 66.3%.


Chaotic Speech Secure Communication Using Self-feedback Masking Techniques (자기피드백 마스킹 기법을 사용한 카오스 음성비화통신)

  • Lee, Ik-Soo;Ryeo, Ji-Hwan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.6
    • /
    • pp.698-703
    • /
    • 2003
  • This paper presents an analog secure communication system for safe speech transmission using chaotic signals. We modified chaotic synchronization and chaotic communication schemes to reflect various conditions that occur in real communication environments, and analyzed the restoration performance of the speech signal by computer simulation. In the transmitter, we generated the chaotic masking signal by adding the voice signal to a chaotic signal using the PC (Pecora & Carroll) and SFB (self-feedback) control techniques, and transmitted the encrypted signal over a noisy communication channel. To quantify restoration performance, we proposed a definition of the analog average power of the recovered error signal in the receiver chaotic system. The simulation results show quantitatively that the feedback control technique gives better restoration performance than the PC method with respect to masking degree, parameter sensitivity, and channel noise. We also computed experimentally a table relating parameter variation to restoration error rate, for applying encryption key values to chaotic secure communication.
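
The masking and error-power ideas in this abstract can be illustrated with a deliberately simplified sketch. It does not implement the PC or SFB synchronization schemes of the paper: the receiver here simply regenerates the carrier from a shared initial value of a logistic map, which stands in for a synchronized chaotic system. The function names, the logistic map, and the key values are all assumptions made for illustration.

```python
import math

def logistic_orbit(x0, n, r=4.0):
    """Return n samples of the logistic map x <- r*x*(1-x), starting from x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

def mask(message, key):
    """Transmitter: add the chaotic carrier to the message samples."""
    carrier = logistic_orbit(key, len(message))
    return [c + m for c, m in zip(carrier, message)]

def unmask(received, key):
    """Receiver: regenerate the carrier from its own key and subtract it."""
    carrier = logistic_orbit(key, len(received))
    return [s - c for s, c in zip(received, carrier)]

def error_power(original, recovered):
    """Average power (mean squared value) of the recovery error signal."""
    return sum((o - r) ** 2 for o, r in zip(original, recovered)) / len(original)

# Hypothetical low-amplitude "speech" signal hidden under the carrier.
message = [0.01 * math.sin(2 * math.pi * k / 50) for k in range(500)]
sent = mask(message, key=0.3)

matched = error_power(message, unmask(sent, key=0.3))
mismatched = error_power(message, unmask(sent, key=0.3 + 1e-7))
print(matched, mismatched)
```

With a matching key the error power is negligible, while a tiny key mismatch makes the recovered signal useless. That sensitivity is the property that lets a system parameter act as an encryption key.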

Vocabulary Recognition Performance Improvement using a convergence of Bayesian Method for Parameter Estimation and Bhattacharyya Algorithm Model (모수 추정을 위한 베이시안 기법과 바타차랴 알고리즘을 융합한 어휘 인식 성능 향상)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence
    • /
    • v.13 no.10
    • /
    • pp.353-358
    • /
    • 2015
  • A vocabulary recognition system built on a standard vocabulary shows degraded recognition for words that are outside the standard vocabulary or similar to one another. In this case, reconstructing the system to add or extend the vocabulary is one way to solve the problem. This paper proposes a speech recognition learning model that combines the Bhattacharyya algorithm with Bayesian methods, whose parameter estimation reflects the scalability of the model configuration. The standard model is corrected based on phoneme characteristics, using Bayesian methods for parameter estimation from the phoneme data and the Bhattacharyya algorithm for similar models. Recognition performance is evaluated with the recognition model configured by the Bhattacharyya algorithm. Applying the proposed method yielded a recognition rate of 97.3% and a learning curve of 1.2 seconds.
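
As a rough illustration of how a Bhattacharyya measure can rank similar models, the sketch below computes the standard Bhattacharyya distance between two univariate Gaussians and picks the closest model. The abstract does not specify the paper's model form, so the Gaussian assumption, the function names, and the example phoneme models are hypothetical.

```python
import math

def bhattacharyya_gauss(mu1, var1, mu2, var2):
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2)."""
    # Term driven by the separation of the means.
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    # Term driven by the mismatch of the variances.
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term

def closest_model(observed, models):
    """Return the name of the model with the smallest distance to `observed`.

    `observed` is a (mean, variance) pair; `models` maps names to such pairs.
    """
    mu, var = observed
    return min(models, key=lambda name: bhattacharyya_gauss(mu, var, *models[name]))

# Hypothetical one-dimensional phoneme models: name -> (mean, variance).
models = {"a": (0.0, 1.0), "i": (3.0, 1.0), "u": (6.0, 2.0)}
print(closest_model((2.6, 1.2), models))
```

The distance is zero for identical distributions and symmetric in its arguments, which makes it convenient for comparing a newly estimated model against stored reference models.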

The Proposal of the Fuzzed Lyapunov Dimension at Speech Signal (음성에 대한 퍼지-리아프노프 차원의 제안)

  • In, Joon-Hawn;Yoo, Byong-Wook;Ryu, Seok-Han;Jung, Myong-Jin;Kim, Chang-Seok
    • Journal of the Korean Institute of Telematics and Electronics T
    • /
    • v.36T no.4
    • /
    • pp.30-37
    • /
    • 1999
  • This study proposes the Fuzzy Lyapunov dimension, which evaluates the quantitative variation of an attractor. In this paper, speaker recognition is evaluated using the Fuzzy Lyapunov dimension. The proposed dimension is shown to discriminate well between standard reference pattern attractors and, with respect to the test pattern attractor, to be a speaker recognition parameter that absorbs pattern variation. To evaluate the Fuzzy Lyapunov dimension as a speaker recognition parameter, the misrecognition caused by the discrimination error for each speaker and standard reference pattern was estimated, and the validity of the parameter was examined experimentally. The speaker recognition experiment yielded a recognition rate of 97.0%, confirming that the Fuzzy Lyapunov dimension is suitable as a speaker recognition parameter.


Auto Frame Extraction Method for Video Cartooning System (동영상 카투닝 시스템을 위한 자동 프레임 추출 기법)

  • Kim, Dae-Jin;Koo, Ddeo-Ol-Ra
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.12
    • /
    • pp.28-39
    • /
    • 2011
  • As broadband multimedia technologies have developed, the commercial market for digital content has also expanded widely. In particular, the digital cartoon market, such as internet cartoons, has grown rapidly, so video cartooning has been researched continuously to address the shortage of cartoons and the demand for variety. Until now, video cartooning systems have focused on non-photorealistic rendering and word balloons. However, meaningful frame extraction must take priority when a cartooning system is applied in a service. In this paper, we propose a new automatic frame extraction method for video cartooning systems. First, we separate the video and audio of a movie and extract feature parameters such as MFCC and ZCR from the audio data. The audio signal is classified as speech, music, or speech+music by comparison with previously trained audio data using a GMM classifier, which lets us set the speech regions. For the video, we extract frames using a general scene change detection method such as the histogram method, and then extract the frames that are meaningful for the cartoon by applying face detection to the extracted frames. After that, scene-transition frames that contain a face and fall within a speech region are extracted automatically. Frames suitable for movie cartooning are then extracted automatically from these transition frames over a continuous period in the time domain.
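
Of the audio features named in this abstract, the zero-crossing rate (ZCR) is simple enough to sketch directly. The example below splits a signal into overlapping frames and computes the ZCR of each frame. The frame size, hop length, and function names are assumptions for illustration, not values taken from the paper.

```python
import math

def frame_signal(samples, size, hop):
    """Split `samples` into overlapping frames of length `size`, `hop` apart."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in `frame` whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical test tone: 400 samples of a sine with a 20-sample period.
tone = [math.sin(2 * math.pi * k / 20 + 0.1) for k in range(400)]
rates = [zero_crossing_rate(f) for f in frame_signal(tone, size=100, hop=50)]
print(rates)
```

Noisy or unvoiced segments tend to have a higher ZCR than voiced speech or tonal music, which is why this feature is often combined with MFCCs when separating audio classes.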

The Effect of Helium Gas Intake on the Characteristics Change of the Acoustic Organs for Voice Signal Analysis Parameter Application (음성신호 분석 요소의 적용으로 헬륨가스 흡입이 음성 기관의 특성 변화에 미치는 영향)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The KIPS Transactions:PartB
    • /
    • v.18B no.6
    • /
    • pp.397-404
    • /
    • 2011
  • In this paper, we carried out experiments that apply voice analysis parameters to measure how the characteristics of the articulators change after inhaling helium gas. Divers use helium gas to avoid the air embolism caused by nitrogen gas, which can be fatal to the body. However, helium gas makes a diver's voice squeaky and poorly articulated, which causes considerable trouble in interpreting the abnormal voice. Therefore, we carried out experiments measuring and analyzing pitch and spectrograms to determine the influence on the acoustic organs before and after inhaling helium gas.

A Proposition of the Fuzzy Correlation Dimension for Speaker Recognition (화자인식을 위한 퍼지상관차원 제안)

  • Yoo, Byong-Wook;Kim, Chang-Seok;Park, Hyun-Sook
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.36S no.1
    • /
    • pp.115-122
    • /
    • 1999
  • In this paper, we confirmed that a speech signal is a chaotic signal and analyzed its chaos dimension in order to use it as a speaker recognition parameter. To improve speaker identification and pattern recognition, we proposed the fuzzy correlation dimension, obtained by constructing a strange attractor that captures an individual's vocal tract characteristics well and applying a fuzzy membership function to the correlation dimension. Because the correlation between the points that make up an attractor is estimated within limits set by the space dimension value, the fuzzy correlation dimension absorbs the variation between the reference pattern attractor and the test pattern attractor. For the fuzzy correlation dimension, we investigated its validity as a speaker recognition parameter by estimating the distance according to the average discrimination error for each speaker and reference pattern.
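
For orientation, the classical (non-fuzzy) correlation dimension that this abstract builds on can be sketched with the Grassberger-Procaccia correlation sum. The code below is not the paper's fuzzy variant: it uses a hard distance threshold where the paper applies a fuzzy membership function. The function names and the embedding parameters are assumptions for illustration.

```python
import math

def delay_embed(signal, dim, delay):
    """Reconstruct attractor points by time-delay embedding of a 1-D signal."""
    n = len(signal) - (dim - 1) * delay
    return [tuple(signal[i + k * delay] for k in range(dim)) for i in range(n)]

def correlation_sum(points, radius):
    """Fraction of distinct point pairs that lie closer than `radius`."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < radius:
                close += 1
    return 2.0 * close / (n * (n - 1))

def dimension_estimate(points, r1, r2):
    """Slope of log C(r) against log r between two radii.

    Both radii must be large enough that the correlation sum is nonzero.
    """
    c1 = correlation_sum(points, r1)
    c2 = correlation_sum(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))

# Points spread evenly along a line should give a dimension close to 1.
line = [(0.01 * i,) for i in range(200)]
print(dimension_estimate(line, 0.105, 0.405))
```

In practice the slope is fitted over a range of radii and repeated for increasing embedding dimensions until the estimate stops changing.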


Transcoding Algorithm for SMV and G.729A Vocoders via Direct Parameter Transformation (G.729A와 SMV 음성부호화기를 위한 파라미터 직접 변환 방식의 상호부호화 알고리듬)

  • 장달원;서성호;이선일;유창동
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.40 no.6
    • /
    • pp.71-83
    • /
    • 2003
  • In this paper, a novel transcoding algorithm for the G.729A and Selectable Mode Vocoder (SMV) vocoders via direct parameter transformation is proposed. In contrast to the conventional tandem transcoding algorithm, the proposed algorithm converts the parameters of one coder to those of the other without going through the decoding and encoding processes. For the transcoder from SMV to G.729A, an LSP conversion algorithm, a pitch delay conversion algorithm, and a transcoding algorithm for the lower rates are proposed; for the transcoder from G.729A to SMV, an LSP conversion algorithm, a pitch delay conversion algorithm, and a rate selection algorithm are proposed. Evaluation results show that, while exhibiting better computational and delay characteristics, the proposed algorithm produces speech quality equivalent to or better than that of the tandem transcoding algorithm.

Acoustic characteristics of speech-language pathologists related to their subjective vocal fatigue (언어재활사의 주관적 음성피로도와 관련된 음향적 특성)

  • Jeon, Hyewon;Kim, Jiyoun;Seong, Cheoljae
    • Phonetics and Speech Sciences
    • /
    • v.14 no.3
    • /
    • pp.87-101
    • /
    • 2022
  • In addition to administering a questionnaire (J-survey), which questions individuals on subjective vocal fatigue, voice samples were collected before and after speech-language pathology sessions from 50 female speech-language pathologists in their 20s and 30s in the Daejeon and Chungnam areas. We identified significant differences in Korean Vocal Fatigue Index scores between the fatigue and non-fatigue groups, with the most prominent differences in sections one and two. Regarding acoustic phonetic characteristics, both groups showed a pattern in which low-frequency band energy was relatively low, and high-frequency band energy was increased after the treatment sessions. This trend was well reflected in the low-to-high ratio of vowels, slope LTAS, energy in the third formant, and energy in the 4,000-8,000 Hz range. A difference between the groups was observed only in the vowel energy of the low-frequency band (0-4,000 Hz) before treatment, with the non-fatigue group having a higher value than the fatigue group. This characteristic could be interpreted as a result of voice abuse and higher muscle tonus caused by long-term voice work. The perturbation parameter shimmer local was lowered in the non-fatigue group after treatment, and the noise-to-harmonics ratio (NHR) was lowered in both groups following treatment. The decrease in NHR and the fall of shimmer local could be attributed to vocal cord hypertension, but it could be concluded that the effective voice use of speech-language pathologists also contributed to this effect, especially in the non-fatigue group. In the case of the non-fatigue group, the rahmonics-to-noise ratio increased significantly after treatment, indicating that the harmonic structure was more stable after treatment.
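
The shimmer (local) measure mentioned in this abstract is commonly defined as the mean absolute difference between the amplitudes of consecutive glottal periods, divided by the mean amplitude. The sketch below follows that common definition and assumes the per-period peak amplitudes have already been extracted; the function name and the sample values are hypothetical and do not come from the study.

```python
def shimmer_local(amplitudes):
    """Mean absolute amplitude difference of consecutive periods / mean amplitude."""
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(amplitudes) / len(amplitudes)
    return mean_diff / mean_amp

# Hypothetical per-period peak amplitudes before and after a session.
before = [0.80, 0.86, 0.78, 0.85, 0.79]
after = [0.82, 0.83, 0.81, 0.83, 0.82]
print(shimmer_local(before), shimmer_local(after))
```

A lower value means the amplitude is steadier from one period to the next, which is the direction of change the abstract reports for the non-fatigue group.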

Feasibility of hearing aid gain self-adjustment using speech recognition (말소리 인지를 이용한 보청기 이득 자가 조절의 실현)

  • Yun, Donghyeon;Shen, Yi;Zhang, Zhuohuang
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.1
    • /
    • pp.76-86
    • /
    • 2022
  • Personal hearing devices, such as hearing aids, may be fine-tuned by allowing the users to conduct self-adjustment. Two self-adjustment procedures were developed to collect the listener-preferred gains in six octave-frequency bands from 0.25 kHz to 8 kHz. These procedures were designed to allow rapid exploration of a multi-dimensional parameter space using a simple, one-dimensional user control interface (i.e., a programmable knob). The two procedures differ in whether the user interface controls the gains in all frequency bands simultaneously (Procedure A) or only the gain in one frequency band (Procedure B) on a given trial. Monte-Carlo simulations suggested that for both procedures the gain preference identified by simulated listeners rapidly converged to the ground-truth preferred gain profile over the first 20 trials. Initial behavioral evaluations of the self-adjustment procedures, in terms of test-retest reliability, were conducted using 20 young, normal-hearing listeners. Each estimate of the preferred gain profile took less than 20 minutes. The deviation between two separate estimates of the preferred gain profile, conducted at least a week apart, was about 10 dB to 15 dB.