Search | Korea Science

Applying the Bi-level HMM for Robust Voice-activity Detection

Hwang, Yongwon;Jeong, Mun-Ho;Oh, Sang-Rok;Kim, Il-Hwan
- Journal of Electrical Engineering and Technology / v.12 no.1 / pp.373-377 / 2017
This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. For real-time VAD applications, it is inadequate to employ post-processing to remove burst clippings from the VAD output decision. To tackle this problem, we built on the bi-level hidden Markov model, in which a state layer is inserted into a typical hidden Markov model (HMM), and formulated a robust VAD method that requires no additional post-processing. In the method, a forward-inference-ratio test was devised to detect the speech endpoints, and Mel-frequency cepstral coefficients (MFCC) were used as the features. Our experimental results show that, across different SNRs, the proposed approach outperforms the conventional methods.
https://doi.org/10.5370/JEET.2017.12.1.373
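The abstract above does not define the forward-inference-ratio test, so the following is only a generic sketch of the underlying idea: running the forward recursion of a plain two-state HMM (non-speech, speech) frame by frame and comparing the two filtered state probabilities. It is not the paper's bi-level model. The function name, the transition matrix, and the per-frame likelihoods are all illustrative assumptions; in practice the likelihoods would come from a model over MFCC features.

```python
def forward_speech_ratio(likelihoods, trans, prior):
    """Filtered speech/non-speech probability ratio per frame.

    likelihoods: list of (p(obs | non-speech), p(obs | speech)) per frame
                 (hypothetical values; a real system would derive them
                 from an acoustic model over MFCC features).
    trans:       2x2 matrix, trans[i][j] = P(next state j | current state i).
    prior:       state distribution before the first frame.
    Assumes the filtered non-speech probability never reaches exactly zero.
    """
    alpha = list(prior)
    ratios = []
    for lik_nonspeech, lik_speech in likelihoods:
        # Predict step: propagate the previous belief through the transitions.
        pred_nonspeech = alpha[0] * trans[0][0] + alpha[1] * trans[1][0]
        pred_speech = alpha[0] * trans[0][1] + alpha[1] * trans[1][1]
        # Update step: weight by the frame likelihoods and renormalise.
        p0 = pred_nonspeech * lik_nonspeech
        p1 = pred_speech * lik_speech
        total = p0 + p1
        alpha = [p0 / total, p1 / total]
        ratios.append(alpha[1] / alpha[0])
    return ratios


def detect_speech(ratios, threshold=1.0):
    """Mark a frame as speech when the ratio exceeds an assumed threshold."""
    return [r > threshold for r in ratios]
```

Because the recursion uses only past and current frames, the decision for each frame is available immediately, which is the property a real-time detector needs.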

The Study for Advancing the Performance of Speaker Verification Algorithm Using Individual Voice Information (개별 음향 정보를 이용한 화자 확인 알고리즘 성능향상 연구)

Lee, Je-Young;Kang, Sun-Mee
- Speech Sciences / v.9 no.4 / pp.253-263 / 2002
In this paper, we propose a new speaker recognition algorithm that identifies the speaker using information obtained from intensive analysis of speech features such as pitch, intensity, duration, and formants, which are crucial parameters of an individual voice. The analysis is applied to candidates that the existing speaker recognition algorithm has a high probability of misrecognizing. DTW (Dynamic Time Warping) is used to test the discriminative power of each parameter. We set a new threshold range, which affects the discriminative power in speaker verification, so that candidates falling within this range are finally discriminated in the next stage of sound parameter analysis. In a speaker verification test using a voice DB consisting of secret words spoken by 25 males and 25 females, recorded at 8 kHz and 16 bits, the proposed algorithm shows about a 1% performance improvement over the existing algorithm.
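As a rough illustration of the two ideas the abstract above mentions, the sketch below computes a textbook DTW distance between two one-dimensional feature tracks (for example, pitch contours) and then applies a two-threshold decision in which borderline scores are deferred to a second stage. The function names and threshold values are assumptions made for illustration; the paper's actual features, distance measure, and thresholds are not given in the abstract.

```python
def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned with an already-used b element
                cost[i][j - 1],      # b[j-1] aligned with an already-used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def verify(distance, accept_below, reject_above):
    """Two-threshold decision (hypothetical): scores inside the band
    between the thresholds go on to a second analysis stage."""
    if distance < accept_below:
        return "accept"
    if distance > reject_above:
        return "reject"
    return "second-stage"
```

For instance, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated `2` can be absorbed by the warping path, which is why DTW suits utterances spoken at different speeds.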

A Correlation Study among Acoustic Parameters of MDVP, Praat, and Dr. Speech (MDVP와 Praat, Dr. Speech간의 음향학적 측정치에 관한 상관연구)

Yoo, Jae-Yeon;Jeong, Ok-Ran;Jang, Tae-Yeoub;Ko, Do-Heung
- Speech Sciences / v.10 no.3 / pp.29-36 / 2003
The purpose of this study was to conduct a correlational analysis among $F_0$, Jitter, Shimmer, NHR (HNR), and NNE as estimated by three speech analysis programs: MDVP, Praat, and Dr. Speech. Thirty females and 15 males with normal voices participated in the study. We used Sound Forge 6.0 to record their voices. MDVP, Praat, and Dr. Speech were used to measure the acoustic parameters. The Pearson correlation coefficient was determined through statistical analysis. The results were as follows. First, there was a strong correlation between $F_0$ and Shimmer of both instruments. However, there was no correlation between Jitter of both instruments. Second, Shimmer showed a stronger correlation with HNR, NHR, and NNE than Jitter did. Therefore, Shimmer was considered a more useful and sensitive parameter than Jitter for identifying dysphonic voice.
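The Pearson correlation coefficient named in the abstract above can be computed directly from its definition. The sketch below is a minimal, dependency-free version; the function name is an assumption, and any inputs passed to it here are placeholders rather than data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centred cross-products and squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

In a comparison like the one described, `x` and `y` would hold the same acoustic parameter (for example, $F_0$) measured on the same speakers by two different programs, so a value near 1 indicates that the two programs rank and scale the speakers consistently.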

Correlation between Vocal Indicators and Buzhongyiqi-Tang Questionnaire (음성 진단 지표와 보중익기탕 적응증과의 상관성 연구)

Cho, Shin-Woong;Park, Young-Bae;Park, Young-Jae
- The Journal of the Society of Korean Medicine Diagnostics / v.13 no.1 / pp.81-88 / 2009
Purpose: To investigate the correlation between vocal indicators and the 'Buzhongyiqi-Tang questionnaire'. Method: The 'Buzhongyiqi-Tang questionnaire' was given to 83 healthy adults, and recordings of their voices producing /a/, /e/, /i/, /o/, and /u/ were collected. The mean values of each factor of the vocal indicators and of the 'Buzhongyiqi-Tang questionnaire' were analyzed. Conclusions: The R-squared values of the /i/ sound for factor 1 and factor 3 of the 'Buzhongyiqi-Tang questionnaire' are noticeably high. The values of the vocal diagnosis index F0 (fundamental frequency) for factor 1 and factor 3 of the 'Buzhongyiqi-Tang questionnaire' are considerable. The research has shown conclusively that there is a link between the value of the vocal diagnosis index F0 and factor 3, the lung deficiency factor, of the 'Buzhongyiqi-Tang questionnaire'.

Implementation of Voice Codec using APC Algorithm for INMARSAT-B (APC(Adaptive Predictive Coder) 알고리즘을 응용한 INMARSAT-B Voice Codec구현)

Lee, Chae-Ho;Hwang, Yun-Ho;Kim, Jeong-Hun;Lim, Jong-Kun;Bae, Jung-Chul;Choi, Woo-Jin;Lee, Joon-Tark
- Proceedings of the KIEE Conference / 1999.07g / pp.3246-3248 / 1999
The APC is a coding algorithm with properties intermediate between waveform coding (e.g., ADPCM) and vocoding (e.g., CELP). It can decode sound of adequate quality at a low computational cost by using a scalar quantizer instead of a vector quantizer. The APC required for the voice codec of INMARSAT-B could therefore be successfully implemented in full duplex using a TMS320C30 DSP.

Vocal Function After Surgical Correction of the Bowing Vocal Cords (성대 Bowing의 술전.후 음성기능)

정광윤;최종욱;한동수
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.6 no.1 / pp.9-15 / 1995
Bowing of the vocal cords may be due to aging, atrophy, bilateral superior laryngeal nerve paralysis, injudicious vocal cord surgery, or an idiopathic cause. The bowing usually produces a dysphonia characterized by breathiness due to air escape; however, it can produce aphonia. This report reviews vocal function after surgical correction of bowing of the vocal cords for diagnosis and management. The vocal function of 13 patients with sulcus vocalis and 12 patients with vocal cord atrophy was evaluated using a test battery of multidimensional evaluation items. The voice was improved postoperatively in most patients. The voice improvement was reflected objectively in maximum phonation time, mean air flow rate during phonation, stroboscopic findings, sound pressure level range and fundamental frequency range of phonation, and results of acoustic analyses of tape-recorded voice. The vocal function after surgical correction of the sulcus vocalis and vocal cord atrophy was improved postoperatively in most patients, but the results were not satisfactory.

General Principles in Phonomicrosurgery (후두미세수술의 기본 원칙)

Jin, Sung-Min
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.21 no.2 / pp.101-104 / 2010
The origin and growth of laryngology are inseparably linked to the development of endoscopic surgery of the larynx. Phonomicrosurgery is a means of maximally preserving the layered microstructure of the vocal fold, that is, the epithelium and lamina propria. Phonomicrosurgery has developed from the convergence of microlaryngoscopic surgical technique theory and the mucosal wave theory of laryngeal sound production. Improvements in technology (i.e., laryngoscopes, handled instruments, and lasers), which in part arise from developments in more frequently performed minimally invasive surgical procedures, will probably facilitate the next generation of procedural innovations. The best methods of optimizing phonosurgical outcomes include making an accurate diagnosis, completing a comprehensive voice evaluation, providing sufficient preoperative therapy, carefully selecting patients to undergo phonomicrosurgical procedures, and requiring sufficient postoperative rest and therapy. Phonomicrosurgery will continue to evolve as a result of the interdependent collaboration of surgeons with voice scientists, speech pathologists, and other voice professionals.

A Study on the Self-voice Suppression Algorithm in a ZigBee CROS Hearing Aid (지그비 크로스 보청기에서의 자기음성 억제 알고리즘 연구)

Im, Won-Jin;Goh, Young-Hwan;Jeon, Yu-Yong;Kil, Se-Kee;Yoon, Kwang-Sub;Lee, Sang-Min
- Journal of IKEEE / v.13 no.3 / pp.62-71 / 2009
In this study, we developed a wireless CROS (contralateral routing of signal) hearing aid for people with unilateral hearing impairment. A CROS hearing aid takes sound from the ear with poorer hearing and transmits it to the ear with better hearing. Generally, the self-voice delivered through the receiver of a CROS hearing aid can be very loud, which makes it hard to perceive the target speech. To compensate for this, a self-voice suppression algorithm was developed. We performed an SDT (speech discrimination test) to evaluate the self-voice suppression algorithm. One-syllable words were used as test speech and recorded together with the self-voice at a distance of 1 m. As a result, the SDT score improved by about 11% when the self-voice suppression algorithm was applied. This verifies that the self-voice suppression algorithm helps speech perception when communicating with others.

Use of Pansori for Developing Actor's Aesthetic Voice (배우의 미학적 발성을 위한 판소리의 활용방안)

Lee, Ki-Ho
- The Journal of the Korea Contents Association / v.9 no.12 / pp.181-192 / 2009
The purpose of this research is to investigate how pansori's methods of breathing, sound making, and resonance can be used to develop an actor's aesthetic voice. Today's theatre no longer sees the inter-cultural approach as new or experimental, but as part of a global current. Actors are required to integrate some global-ness into their acting. It is not enough, however, for actors to acquire some cosmopolitan sensibility. More importantly, they should be able to integrate their own culture and aesthetic into their performance. Only after acquiring one's own cultural identity is it possible to step into inter-cultural work. It is fundamental, therefore, for actors to assimilate traditional movement and aesthetic voice. Traditional Korean voice traits are known to be well preserved in pansori. In this paper, based upon well-known theories and practices of western voice training, pansori's principles and practices are utilized to bring about a new aesthetic voice.
https://doi.org/10.5392/JKCA.2009.9.12.181

Application and Technology of Voice Synthesis Engine for Music Production (음악제작을 위한 음성합성엔진의 활용과 기술)

Park, Byung-Kyu
- Journal of Digital Contents Society / v.11 no.2 / pp.235-242 / 2010
Unlike earlier instruments that synthesized sounds and tones, the voice synthesis engine for music production has reached the level of creating music as if actual artists were singing. It uses samples of human voices naturally connected to the different levels of phoneme within the frequency range. The voice synthesis engine is not limited to music production; it is also changing the cultural paradigm through secondary creations of new music types, including character music concerts, media productions, albums, and mobile services. Currently, voice synthesis engine technology lets users input pitch, lyrics, and musical expression parameters through a score editor, and it mixes and connects voice samples retrieved from a database to produce singing. New music types derived from this development of computer music have had a large cultural impact. Accordingly, this paper examines specific case studies and the synthesis technologies so that users can understand the voice synthesis engine more easily and apply it to a wider variety of music production.
