• Title/Summary/Keyword: speech quality


A Fast Pitch Searching Algorithm Using Correlation Characteristics in CELP Vocoder (상관관계 특성을 용한 CELP 보코더의 고속 피치검색 알고리듬)

  • Lee, Joo-Hun;Bae, Myung-Jin;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.13 no.2E / pp.20-25 / 1994
  • The major drawback of Code Excited Linear Prediction (CELP) vocoders is their large computational requirements. In this paper, a simple method is proposed to reduce the pitch search time in the pitch filter with almost no degradation in quality. Based on the observed regularity of the correlation function of speech, the search range can be restricted to the positive side of the correlation function. This is done by skipping the negative side, whose width is estimated from the previous positive envelope. In addition, the maximum number of available lags can be limited by a threshold, $L_T$, which is set empirically to 58. As a result, only a limited number of lags is considered in the pitch search, fewer than half of those in the full search method, which greatly reduces the required computation. Experimental results show a 51% time reduction with almost no loss of speech quality as measured by segmental SNR.

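
As a rough illustration of the lag-skipping idea in the abstract above, the Python sketch below searches candidate pitch lags by autocorrelation, jumps over each negative lobe by the width of the preceding positive lobe, and stops after `max_lags` evaluations. The function name, the lag range of 20 to 147, and the exact skipping rule are assumptions made for illustration. Only the threshold value 58 comes from the abstract, and this is not the authors' implementation.

```python
import math

def restricted_pitch_search(x, min_lag=20, max_lag=147, max_lags=58):
    """Return (best_lag, lags_evaluated) for one frame of samples x.

    Assumed rule: while the autocorrelation is positive, test every lag;
    when it turns negative, skip ahead by the width of the previous
    positive lobe instead of testing each negative-side lag.
    """
    def corr(lag):
        # Unnormalized autocorrelation of the frame at the given lag.
        return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

    best_lag, best_val = min_lag, float("-inf")
    evaluated = 0
    positive_width = 0  # width of the current or most recent positive lobe
    lag = min_lag
    while lag <= max_lag and evaluated < max_lags:
        c = corr(lag)
        evaluated += 1
        if c > 0:
            positive_width += 1
            if c > best_val:
                best_lag, best_val = lag, c
            lag += 1
        else:
            # Skip the negative side; always advance by at least one lag.
            lag += max(positive_width, 1)
            positive_width = 0
    return best_lag, evaluated

# Synthetic frame: a sinusoid with a 50-sample period (hypothetical test input).
frame = [math.sin(2 * math.pi * n / 50) for n in range(320)]
print(restricted_pitch_search(frame))
```

A full search over the same assumed range would evaluate all 128 lags, so the cap of 58 evaluations corresponds to the "less than half" figure quoted in the abstract.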

Preliminary Study for Comparison of Subjective Voice Evaluations among Vocal and Applied Music Major Students (성악과 실용음악 보컬 전공 대학생들의 주관적 음성평가 비교 예비연구)

  • Lee, Dahye;Hwang, Youngjin;Kim, Jaeock
    • Phonetics and Speech Sciences / v.6 no.2 / pp.37-45 / 2014
  • The purpose of this study was to determine whether the Korean Singing Voice Handicap Index (K-SVHI) is suitable for singers in genres other than vocal music to assess their vocal problems subjectively. Twenty-six college students majoring in vocal music and twenty-six students majoring in applied music were included in the study. They were divided into G0 and G1 by voice quality using the GRBAS scale during the tasks of singing. K-SVHI was divided into three sub-areas (Physical, Functional, and Emotional). In the singing task, neither group showed a significant difference in K-SVHI scores by G scale. In the reading task, the vocal music group had significantly higher K-SVHI scores in G0 than in G1, while the applied vocal music group had significantly higher K-SVHI scores in G1 than in G0. Also, the two groups did not differ significantly in G0 or G1 in the singing task, while the vocal music group showed higher K-SVHI scores than the applied vocal music group in G0 in the reading task. In addition, the vocal music group had higher K-SVHI scores than the applied vocal music group in G1 in both tasks. When the groups were compared across the three sub-areas of K-SVHI, significant differences were found in the Emotional and Functional areas. These results show that singers perceive their voice problems differently depending on musical genre, which suggests that K-SVHI may not be an appropriate tool for evaluating the voice handicap of singers across diverse vocal music genres.

Development of Advanced Personal Identification System Using Iris Image and Speech Signal (홍채와 음성을 이용한 고도의 개인확인시스템)

  • Lee, Dae-Jong;Go, Hyoun-Joo;Kwak, Keun-Chang;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.3 / pp.348-354 / 2003
  • This paper proposes a new algorithm for an advanced personal identification system using iris patterns and speech signals. Because the proposed algorithm adopts a fusion scheme that takes advantage of both iris recognition and speaker identification, it is robust in noisy environments. To evaluate the performance of the proposed scheme, we compare it with iris pattern recognition and speaker identification individually. In the experiments, the proposed method showed a 56.7% improvement over the iris recognition method and a 10% improvement over the speaker identification method at a high security level. In noisy environments, the proposed method showed a 30% improvement over the iris recognition method and a 60% improvement over the speaker identification method at a high security level.

Voice Activity Detection Using Modified Power Spectral Deviation Based on Teager Energy (Teager Energy 기반의 수정된 파워 스펙트럼 편차를 이용한 음성 검출)

  • Song, J.H.;Song, Y.R.;Shim, H.M.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.8 no.1 / pp.41-46 / 2014
  • In this paper, we propose a novel voice activity detection (VAD) algorithm using feature vectors based on Teager energy (TE). Specifically, the power spectral deviation (PSD), which is used as the VAD feature in the IS-127 noise suppression algorithm, is obtained after the input signal is transformed by the Teager energy operator. In addition, a TE-based likelihood ratio is derived in each frame to modify the PSD for further VAD. The performance of the proposed VAD algorithm is evaluated by objective testing (total error rate, receiver operating characteristics, and perceptual evaluation of speech quality) under various environments. The proposed method yields better results than conventional VAD algorithms in non-stationary noise environments under 5 dB SNR (total error rate decreased by 2.6%, PESQ score improved by 0.053).

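
For readers unfamiliar with the Teager energy operator mentioned in the abstract above, the sketch below applies its standard discrete form, `psi[n] = x[n]^2 - x[n-1] * x[n+1]`, to a sampled signal. The function name `teager_energy` and the test tone are hypothetical. The sketch covers only the operator itself, not the PSD feature or the likelihood-ratio modification the authors describe.

```python
import math

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the two end samples lack a neighbor.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure tone A*cos(w*n), the operator gives the constant A^2 * sin(w)^2,
# so its output reflects both the amplitude and the frequency of the tone.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(64)]
psi = teager_energy(tone)
```

Because the output depends on frequency as well as amplitude, the operator responds differently to speech and to broadband noise of similar power, which is one common motivation for using it as a VAD front end.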

A Study about Voice of Patients with Chronic Obstructive Pulmonary Disease/Asthma before & after ${\beta}_2$-agonist (${\beta}_2$-촉진제 사용전후에 따른 만성폐쇄성폐질환/천식 환자의 음성 연구)

  • Kang, Young-Ae;Kim, Se-Hun;Jong, Seong-Su;Lee, Tae-Yong;Seong, Cheol-Jae
    • Phonetics and Speech Sciences / v.2 no.2 / pp.101-108 / 2010
  • Inhaled salbutamol and salmeterol are used worldwide for chronic obstructive pulmonary disease (COPD) and asthma, but there have been few studies of the voice changes that follow medication. To evaluate how short-acting and long-acting ${\beta}_2$-agonists affect the voice, two experiments were carried out: experiment 1 used salbutamol with eight patients, and experiment 2 used salmeterol with six patients. Experiment 1 had two stages: premedication and postmedication. Experiment 2 had four stages: stage I was premedication, stage II was postmedication and pre-gargling, stage III was postmedication and post-gargling (with 100 ml of water), and stage IV was 30 minutes after medication. The measured parameters were F0, F0_SD, Jitter_rap, Shimmer_apq11, HNR, BW(1, 2, 3), Intensity, and H1-H2. The mean of three repetitions of each measurement was analyzed statistically using the Wilcoxon signed-rank test for experiment 1 and repeated-measures ANOVA for experiment 2. In experiment 1, a significant difference was found in Jitter_rap (Z = -2.10, p = 0.036), indicating that the postmedication voice was worse than the premedication voice. In experiment 2, there was no significant difference, but the parameters related to voice quality (Jitter_rap, Shimmer_apq11, HNR, and H1-H2) showed changes toward stage IV; that is, voice quality was worse under medication.

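
Jitter_rap, the parameter that changed significantly in experiment 1 above, is commonly defined (for example in Praat) as the mean absolute difference between each glottal period and the average of that period and its two neighbors, divided by the mean period. The sketch below computes that quantity from a list of period durations. The function name and the sample periods are illustrative assumptions, not data from the study.

```python
def jitter_rap(periods):
    """Relative average perturbation of a sequence of glottal periods (seconds)."""
    if len(periods) < 3:
        raise ValueError("need at least three periods")
    # Absolute deviation of each interior period from its 3-point local mean.
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, len(periods) - 1)
    ]
    mean_deviation = sum(deviations) / len(deviations)
    mean_period = sum(periods) / len(periods)
    return mean_deviation / mean_period

steady = [0.005] * 10  # perfectly regular 200 Hz voice (hypothetical)
wobbly = [0.0050, 0.0052, 0.0049, 0.0051, 0.0050, 0.0053, 0.0048]
print(jitter_rap(steady), jitter_rap(wobbly))
```

A higher value means more cycle-to-cycle irregularity in the period, which is why an increase after medication is read as a worse voice.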

Development of Korean-to-English and English-to-Korean Mobile Translator for Smartphone (스마트폰용 영한, 한영 모바일 번역기 개발)

  • Yuh, Sang-Hwa;Chae, Heung-Seok
    • Journal of the Korea Society of Computer and Information / v.16 no.3 / pp.229-236 / 2011
  • In this paper, we present lightweight English-to-Korean and Korean-to-English mobile translators for smartphones. For natural translation and higher translation quality, the translation engines are hybridized with Translation Memory (TM) and a rule-based translation engine. To maximize the usability of the system, we combined an Optical Character Recognition (OCR) engine and a Text-to-Speech (TTS) engine as the front end and back end of the mobile translators. Measured with the BLEU and NIST evaluation metrics, the translation quality of our E-K and K-E mobile translators reaches 72.4% and 77.7% of that of Google translators, respectively. This shows that the quality of our mobile translators approaches that of server-based machine translation, which indicates their commercial usefulness.

Acoustic analysis of wet voice among patients with swallowing disorders (삼킴장애 환자의 wet voice 관련 음향학적 분석)

  • Kang, Young Ae;Koo, Bon Seok;Kwon, In Sun;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.10 no.4 / pp.147-154 / 2018
  • Wet voice quality (WVQ) is a characteristic that appears after swallowing. Although the concept is accepted by many clinicians worldwide, it is nevertheless ambiguous. In this study, we investigated WVQ in patients with swallowing disorders using acoustic analysis. A total of 106 patients diagnosed with penetration-aspiration by the videofluoroscopic swallowing study (VFSS) were recruited. A voice recording of the vowel /a/ was made before and after the VFSS, and an acoustic analysis was then performed using PRAAT. The post-VFSS voice was judged perceptually and the patients were divided into two groups: the Wet group (48 patients) and the Non-wet group (58 patients). At the post-VFSS stage, the two groups displayed significant differences in many acoustic parameters, including F0_SD, Jitter, RAP, Shimmer, APQ, HNR, NHR, FUF, DVB, and CPP. Logistic regression identified Jitter and NHR as the parameters affecting the judgment of wetness. At the pre-VFSS stage, the two groups differed significantly in many acoustic parameters, including Intensity, Jitter, RAP, Shimmer, NHR, FUF, DVB, and CPP. Both pre- and post-VFSS, the mean values of all significant parameters, except Intensity, HNR, and CPP, were higher in the Wet group. Between pre- and post-VFSS, the two groups displayed interactions in many parameters (Intensity, F0_SD, Jitter, RAP, Shimmer, APQ, HNR, NHR, FUF, DVB, and CPP). In particular, Intensity increased in both groups after the VFSS, although the increase in the Non-wet group was greater. Based on these results, it was conjectured that the WVQ after swallowing resulted from the secretion effect of the mucous membrane due to the dry laryngeal characteristic of elderly patients, rather than from aspiration leaving food on the vocal cords.

Efficacy of laughing voice treatment (SKMVTT) in benign vocal fold lesions (양성성대질환의 웃음 음성치료(SKMVTT))

  • Jung, Dae-Yong;Wi, Joon-Yeol;Kim, Seong-Tae
    • Phonetics and Speech Sciences / v.10 no.4 / pp.155-161 / 2018
  • The purpose of this study was to evaluate the efficacy of a multiple voice therapy technique (SKMVTT®) using laughter for the treatment of various benign vocal fold lesions. To this end, 23 female patients diagnosed with vocal nodules, vocal polyps, and muscle tension dysphonia through videostroboscopy were enrolled in vocal hygiene and SKMVTT®. All patients were treated once a week for 4 to 12 sessions. The GRBAS scale was used to confirm changes in voice quality before and after the treatment. Acoustic analysis was performed to evaluate jitter, shimmer, NHR, fundamental frequency variation, amplitude variation, PFR, and dB range. Videostroboscopy was performed to confirm changes in the laryngeal features before and after the treatment. After SKMVTT®, the perceptual evaluation showed that the G, R, and B scales improved significantly. The acoustic evaluation showed that jitter, shimmer, NHR, vAm, vFo, PFR, and dB range also improved significantly. Comparison of the videostroboscopic findings showed that the vocal nodules and vocal polyps decreased in size or disappeared after the treatment. In addition, the size of the cuneiform tubercles decreased, the aryepiglottic folds lengthened, and the laryngeal findings of supraglottic compression improved after SKMVTT®. These results suggest that SKMVTT® is effective in improving the vocal quality of patients with benign vocal fold lesions. In conclusion, it seems that laughter and inspiratory phonation suppressed abnormal laryngeal elevation and lowered laryngeal height, which seems to improve hyperfunctional phonation.

The Efficacy of Visual Activity Schedule Intervention in Reducing Problem Behaviors in Children With Attention-Deficit/Hyperactivity Disorder Between the Age of 5 and 12 Years: A Systematic Review

  • Thomas, Naveena;Karuppali, Sudhin
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.33 no.1 / pp.2-15 / 2022
  • Objectives: Children with attention-deficit/hyperactivity disorder (ADHD) tend to be noisy and violate rules with their disruptive behaviors, resulting in greater difficulties with off-task behaviors and being at risk for social refusal. The visual activity schedule (VAS) intervention program is a frequently used method to teach multiple skills involving on-task behavior, use of schedules, transition behaviors, social initiation, independent play skills, classroom skills, and academic skills. The current systematic review aimed to examine the efficacy of using VAS intervention in reducing problem behaviors in children with ADHD between 5 and 12 years of age. Methods: Systematic searches were conducted using two electronic databases (PubMed and Scopus) to identify relevant studies published in English between 2010 and 2020. Four studies met the inclusion criteria: two studies examined the effect of schedule-based tasks and the use of an iPad on classroom skills, while the other two were a randomized clinical trial (RCT) of psychosocial treatment for ADHD inattentive type and a cross-sectional study that examined the impact of group size on task behavior and work productivity in children with ADHD. Results: The findings indicate that the interventions used in all four studies could lead to increased satisfaction among participants and parents, as well as a reduction in problem behavior. In terms of the research indicators, the RCT had low quality, while the others were of high quality. Conclusion: A larger number of studies and the ADHD clinical population would help to increase the generalizability of future reviews of treatments in this context.

The Medial Sural Artery Perforator Flap versus Other Free Flaps in Head and Neck Reconstruction: A Systematic Review

  • Yasser Al Omran;Ellie Evans;Chloe Jordan;Tiffanie-Marie Borg;Samar AlOmran;Sarvnaz Sepehripour;Mohammed Ali Akhavani
    • Archives of Plastic Surgery / v.50 no.3 / pp.264-273 / 2023
  • The medial sural artery perforator (MSAP) flap is a versatile fasciocutaneous flap, yet it is less commonly utilized than other free flaps in microvascular reconstructions of the head and neck. The aim is to conduct a high-quality Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA)- and Assessment of Multiple Systematic Reviews 2 (AMSTAR 2)-compliant systematic review comparing the use of the MSAP flap to other microvascular free flaps in the head and neck. Medline, Embase, and Web of Science databases were searched to identify all original comparative studies comparing patients undergoing head and neck reconstruction with an MSAP flap to the radial forearm free flap (RFFF) or anterolateral thigh (ALT) flap from inception to February 2021. The outcomes studied were recipient-site and donor-site morbidities as well as speech and swallow function. A total of 473 articles were identified from title and abstract review. Four studies met the inclusion criteria. Compared with the RFFF and the ALT flaps, the MSAP flap had more recipient-site complications (6.0 vs 10.4%) but fewer donor-site complications (20.2 vs 7.8%). The MSAP flap demonstrated better overall donor-site appearance and function than the RFFF and ALT flaps (p = 0.0006) but no statistical difference in speech and swallowing function following reconstruction (p = 0.28). Although higher-quality studies comparing the MSAP flap with other free flaps are needed, the MSAP flap provides a viable and effective reconstructive option and should be strongly considered for reconstruction of head and neck defects.