• Title/Summary/Keyword: Voice, Sound


Voice Personality Transformation Using an Optimum Classification and Transformation (최적 분류 변환을 이용한 음성 개성 변환)

  • 이기승
    • The Journal of the Acoustical Society of Korea / v.23 no.5 / pp.400-409 / 2004
  • In this paper, a voice personality transformation method is proposed that makes one person's voice sound like another person's voice. To transform the voice personality, the vocal tract transfer function is used as the transformation parameter. Compared with previous methods, the proposed method brings the transformed speech closer to the target speaker's voice from both subjective and objective points of view. Conversion between vocal tract transfer functions is implemented by classifying the entire vector space and then applying a linear transformation to each cluster. The LPC cepstrum is used as the feature parameter. A joint classification and transformation method is proposed, in which the optimum clusters and transformation matrices are estimated simultaneously under a minimum mean square error criterion. To evaluate the performance of the proposed method, transformation rules are generated from 150 sentences uttered by three male speakers and one female speaker. These rules are then applied to another 150 sentences uttered by the same speakers, and objective evaluation and subjective listening tests are performed.
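
The conversion step described in the abstract above (classify a feature vector, then apply that cluster's linear transformation) might look roughly like the following sketch. It is not the paper's implementation: nearest-centroid classification, the function names, and all numeric values are assumptions made for illustration, and the joint estimation of clusters and matrices is not shown.

```python
from math import dist

def nearest_cluster(x, centroids):
    """Return the index of the centroid closest to x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: dist(x, centroids[k]))

def convert(x, centroids, matrices):
    """Classify x, then apply the linear transformation of its cluster (y = A_k x)."""
    A = matrices[nearest_cluster(x, centroids)]
    return [sum(a * v for a, v in zip(row, x)) for row in A]

# Toy 2-D example with two clusters; every value here is made up.
centroids = [[0.0, 0.0], [10.0, 10.0]]
matrices = [
    [[1.0, 1.0], [0.0, 1.0]],   # hypothetical matrix for cluster 0
    [[2.0, 0.0], [0.0, 0.5]],   # hypothetical matrix for cluster 1
]

converted_a = convert([0.5, 0.5], centroids, matrices)    # falls in cluster 0
converted_b = convert([9.0, 12.0], centroids, matrices)   # falls in cluster 1
```

In a real system the vectors would be LPC cepstra and the centroids and matrices would come from training data rather than being written by hand.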

The Perceptual and Consonant Analysis for the Voice with Hypothyroidism (갑상선 기능저하 음성에 대한 청지각적 및 파열음 분석에 대한 연구)

  • Han, Baek Hwa;Lee, Dahae;Kim, Joon Sun;Hong, Ki Hwan
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.27 no.2 / pp.95-101 / 2016
  • Background and Objectives: The main purpose of this study is to clarify the perceptual and acoustic characteristics of the voice in patients with hypothyroidism after thyroidectomy, focusing on speech articulation with special reference to consonant production. Materials and Methods: The subjects were 40 adults (5 males, 35 females). All received radioactive iodine treatment after total thyroidectomy. Voice samples were collected at three stages: after surgery, before radioisotope treatment (RIT), and after RIT. Acoustic analysis of voice onset time (VOT) was conducted using Praat (ver. 5.2.21). The subjective evaluation of the voices used the CAPE-V. Results: Overall severity on the CAPE-V decreased significantly following RIT, which may be connected to the change in voice following RIT. Loudness on the CAPE-V also decreased significantly following RIT, which may be connected to the decrease in vocal intensity following RIT. The comparative analysis of VOT showed no statistically significant differences for any plosive across the three periods. Conclusion: Perceptually, the overall severity of the voice with hypothyroidism changed significantly between before and after RIT. Even though VOT did not change significantly, it tended to decrease in patients with hypothyroidism.


Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

  • Roh, Yong-Wan;Lee, Kue-Bum;Lee, Woo-Seok;Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing / v.9 no.2 / pp.121-128 / 2008
  • Accurate voice activity detection (VAD) has a great impact on the performance of speech applications, including speech recognition, speech coding, and speech communication. In this paper, we propose voice activity detection methods that can adapt to various car noise situations during driving. Existing voice activity detection has used methods such as time energy, frequency energy, zero crossing rate, and spectral entropy, which share the weakness that performance declines rapidly in noisy environments. Building on the existing spectral entropy approach to VAD, we propose voice activity detection methods using MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. The MFB is obtained by multiplying the FFT by the Mel scale, a nonlinear scale that reflects the characteristics of human sound perception. The proposed MFB spectral entropy method clearly improves the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy in experiments. Compared with the spectral entropy method, the proposed voice activity detection improves the correct detection rate by more than 3.2% on average.

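The baseline that the abstract above builds on, spectral entropy, can be illustrated with a short sketch. It covers only plain FFT-style spectral entropy of one frame; the MFB and gradient variants proposed in the paper are not reproduced. The function names, the 0.5 threshold, and the handling of silent frames are assumptions chosen for illustration.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum of a real-valued frame (bins 0..N/2)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]

def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum, scaled to the range [0, 1]."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 1.0  # assumption: treat a silent frame as maximally flat
    probs = [p / total for p in power]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return entropy / math.log2(len(power))

def is_speech_like(frame, threshold=0.5):
    """Hypothetical decision rule: a peaky (low-entropy) spectrum counts as speech-like."""
    return spectral_entropy(frame) < threshold

# A pure tone has a peaky spectrum; a single impulse has a flat one.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
```

A practical detector would use a fast FFT routine and tune the threshold on recorded car noise; the naive DFT here only keeps the sketch free of dependencies.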

Study on the multi-functional Cradle by Voice Recognitions (다기능성을 가진 음성 인식 요람 연구)

  • Park, Kwang-Sung;Ahn, Sang-jin;Cho, Kyeong-Rok;Choi, Si-On;Park, Yong-Wook
    • The Journal of the Korea institute of electronic communication sciences / v.12 no.4 / pp.701-706 / 2017
  • In this study, a cradle that was previously driven manually or by remote control was made to operate its motor through voice recognition and through an app. In addition, a temperature and humidity sensor was mounted in the cradle so that the temperature and humidity of the cradle can be checked on the LCD. Depending on the sound level measured by the sound sensor, the result was expressed as values a, b, and c; when the sum of these values exceeded 1150, the sound was recognized as the baby crying, and a notification and alarm were sent to the app.

Analysis of Singing Technique of Mongolian Traditional Singing Called Khoomei (몽골 전통 발성 흐미의 발성 방법 분석에 대한 사례연구)

  • Nam, Do-Hyun;Paik, Jae-Yeon;Hwang, Yoen-Shin;Choi, Hong-Shik
    • Speech Sciences / v.15 no.3 / pp.145-156 / 2008
  • The goal of this study was to investigate the acoustic and physiologic characteristics of two phonation types of 'Khoomei', a traditional singing style of people who live around the Altai mountains or the Mongolia region. The style produces two pitches simultaneously: a high melody pitch is perceived along with a low drone pitch. Sygyt and Kargyraa are the most popular and identifiable styles, and they are recognized as different sounds depending on the method of voice production. Two trained Mongolian singers who had used the technique for at least 5 to 6 years participated. The characteristics of this voice production were measured using a flexible fiberscope, stroboscopy, Lx Speech Studio, Spead, and Doctor Speech. In the Sygyt style, very high vocal fold closure (71.50%), with contact of both the true and false vocal folds, and strong breathing support were observed. Tongue height and harmonics also increased (by around 10 dB) with resonance cavity movement. In contrast, the Kargyraa sound had a very low pitch with a relaxed stomach, less laryngeal tension, and lower vocal fold contact (69.50%) than the hard Sygyt sound, and the tongue was not raised during phonation. 'Khoomei' phonation can be produced by strong contact of both the true and false vocal folds and by increasing the harmonics.


Guidance to the Praat, a Software for Speech and Acoustic Analysis (음성 및 음향분석 프로그램 Praat의 임상적 활용법)

  • Seong, Cheol Jae
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.33 no.2 / pp.64-76 / 2022
  • Praat is a useful analysis tool for linguists, engineers, doctors, speech-language pathologists, music majors, and natural scientists. Basic parameters, including duration, pitch, and energy, and perturbation parameters such as jitter and shimmer can be easily measured and manipulated in the sound editor. When a more in-depth analysis is needed, it is recommended to understand the advanced menus of the object window and learn how to use them. Among the object window menus, vowel formant analysis, spectrum analysis, and cepstrum analysis are useful in the clinical field. The spectrum object can be used for voice quality measurement and for the diagnosis of patients with voice disorders because it shows the energy distribution along the frequency axis (domain). A cepstrum object is useful for speech analysis when the periodicity of the sound object is not measurable. The low-to-high ratio obtained from the spectral object and the CPPs measured from the cepstrum object have attracted the attention of many researchers, and the CPPs measured in Praat have been shown to perform relatively well.
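
To make the perturbation parameters mentioned above concrete, the sketch below computes local jitter and local shimmer in the commonly used form: the mean absolute difference between consecutive cycles divided by the mean value. It is an illustration only, not Praat's code, and it omits the period and amplitude validity checks a real analysis tool applies. The function names and sample values are assumptions.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def jitter_local(periods):
    """Cycle-to-cycle variation of glottal period durations (seconds)."""
    return local_perturbation(periods)

def shimmer_local(amplitudes):
    """Cycle-to-cycle variation of peak amplitudes."""
    return local_perturbation(amplitudes)

# Made-up measurements for four consecutive glottal cycles.
periods = [0.010, 0.011, 0.010, 0.011]
amplitudes = [0.80, 0.82, 0.79, 0.81]

jitter = jitter_local(periods)        # fraction; multiply by 100 for percent
shimmer = shimmer_local(amplitudes)
```

Both measures are ratios, so they are usually reported as percentages.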

Executive function and Korean children's stop production

  • Eun Jong Kong;Hyunjung Lee;Jeffrey J. Holliday
    • Phonetics and Speech Sciences / v.15 no.3 / pp.45-52 / 2023
  • Previous studies have established a role for cognitive differences in explaining variability in speech processing across individuals. In the case of perceptual cue weighting in the context of a sound change, studies have produced conflicting results regarding the relationship between executive function and the use of redundant cues. The current study aimed to explore this relationship in acoustic cue weighting during speech production. Forty-one Korean-speaking children read a list of stop-initial words and completed two tests that assess executive function, i.e., Dimensional Change Card Sorting (DCCS) and digit n-back. Voice onset time (VOT) and fundamental frequency (F0) were measured in each word, and analyses were carried out to determine the extent to which children's executive function predicted their use of both informative and less informative cues to the three pairs comprising the Korean three-way stop laryngeal contrast. No evidence was found for a relationship between cognitive ability and acoustic cue weighting in production, which is at odds with previous, albeit conflicting, results for speech perception. While this result may be due to the lack of task demands in the production task used here, it nevertheless expands the empirical ground upon which future work in this area may proceed.

A Study on Improvement Effect of voice information transmission using Auralization at the hydraulic turbine dynamo room in Dam (가청화를 이용한 댐 수차 발전기실의 음성정보전달 개선효과에 관한 연구)

  • Kook, Joung-Hun;Ju, Duck-Hoon;Jung, Eun-Jung;Kim, Jae-Soo
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.11a / pp.263-267 / 2007
  • Although hydroelectric power generation is pollution-free and contributes a supply of good-quality electricity, the noise produced during operation makes normal communication among the workers and technicians in the hydraulic turbine dynamo room almost impossible. Moreover, because the finishing materials used are mainly reflective, the other person's voice reverberates during maintenance work in the room, which makes it difficult to ensure working efficiency. From this viewpoint, this study conducted a psychoacoustic experiment on voice definition using an auralization technique, targeting a hydraulic turbine dynamo room whose acoustic performance had been improved by computer simulation. The results showed that the clarity of sound for voice information transmission was clearly improved in all items compared with before the improvement. These results are therefore expected to be useful for future renovation of hydraulic turbine dynamo rooms.


Acoustic Characteristics of Patients with Total Laryngectomees via Voice Rehabilitation Techniques (후두적출술 환자의 발성법에 따른 음향학적 특성)

  • Jang, Hyo-Ryung;Shim, Hee-Jeong;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.5 no.4 / pp.25-32 / 2013
  • This research aims to identify the acoustic characteristics of three voice rehabilitation techniques, electrolarynx (EL), standard esophageal (SE), and tracheoesophageal (TE) speech, used by 17 patients who had undergone laryngectomy. Voice quality was analyzed using MDVP. To compare the acoustic characteristics, patients were asked to produce the vowel /a/. The acoustic analysis included fundamental frequency (f0), jitter, shimmer, and noise-to-harmonic ratio (NHR). The main acoustic results showed no statistically significant differences between the average measurements of SE and TE speakers, a tendency also found in previous studies. There was also a significant difference between SE and EL speakers. On the other hand, there were no statistically significant differences between the average measurements of TE and EL speakers on any acoustic measurement. This research will contribute to establishing a baseline for speech characteristics in voice rehabilitation for patients who have undergone laryngectomy. In the future, the present findings and issues should be considered in the context of gender. Specifically, the number of women diagnosed with laryngeal cancer continues to rise, and their acoustic characteristics may indeed differ from those of men.

Surgery of Benign Laryngeal Mucosal Lesions (후두 양성점막 병변의 수술적 치료)

  • Jin, Sung Min
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.24 no.2 / pp.83-87 / 2013
  • The term "phonosurgery," coined in the early 1960s, refers to surgical procedures that maintain, restore, or enhance the human voice. Phonosurgery includes phonomicrosurgery (endoscopic microsurgery of the vocal folds), laryngoplastic phonosurgery (open-neck surgery that restructures the cartilaginous framework of the larynx and the soft tissues), laryngeal injection (injection of medications as well as synthetic and organic biologic substances), and reinnervation of the larynx. Phonomicrosurgery is a means of maximally preserving the layered microstructure of the vocal fold, that is, the epithelium and lamina propria. The purpose of the surgery is usually to improve the vibratory characteristics of the layered microstructure of the vocal folds. Phonomicrosurgery has developed from the convergence of microlaryngoscopic surgical technique theory and the mucosal wave theory of laryngeal sound production. Improvements in technology (i.e., laryngoscopes, hand-held instruments, and lasers), which in part arise from developments in more frequently performed minimally invasive surgical procedures, will probably facilitate the next generation of procedural innovations. The best methods of optimizing phonosurgical outcomes include making an accurate diagnosis, completing a comprehensive voice evaluation, providing sufficient preoperative therapy, carefully selecting patients to undergo phonomicrosurgical procedures, and requiring sufficient postoperative rest and therapy. Phonomicrosurgery will continue to evolve as a result of the interdependent collaboration of surgeons with voice scientists, speech pathologists, and other voice professionals.
