Search | Korea Science

Voice Activity Detection using Motion and Variation of Intensity in The Mouth Region (입술 영역의 움직임과 밝기 변화를 이용한 음성구간 검출 알고리즘 개발)

Kim, Gi-Bak;Ryu, Je-Woong;Cho, Nam-Ik
- Journal of Broadcast Engineering / v.17 no.3 / pp.519-528 / 2012
Voice activity detection (VAD) is generally conducted by extracting features from the acoustic signal and applying a decision rule. The performance of such VAD algorithms, which are driven by the input acoustic signal, depends heavily on the acoustic noise. When video signals are available as well, VAD performance can be enhanced by using visual information, which is not affected by acoustic noise. Previous visual VAD algorithms usually use a single visual feature to detect lip activity, such as active appearance models, optical flow, or intensity variation. Based on an analysis of the weaknesses of each feature, we propose combining an intensity change measure with the optical flow in the mouth region, so that each compensates for the other's weakness. To minimize computational complexity, we develop simple measures that avoid statistical estimation or modeling. Specifically, the optical flow is the averaged motion vector of some grid regions, and the intensity variation is detected by simple thresholding. To extract the mouth region, we propose a simple algorithm that first detects the two eyes and then uses the intensity profile to locate the center of the mouth. Experiments show that the proposed combination of two simple measures yields higher detection rates at a given false positive rate than methods that use a single feature.
https://doi.org/10.5909/JBE.2012.17.3.519
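The two measures described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the threshold values, and the OR rule used to combine the two cues are all assumptions, since the abstract does not specify them. Optical flow vectors are taken as given inputs (one per grid region) rather than computed here.

```python
from math import hypot


def mean_flow_magnitude(flow_vectors):
    """Magnitude of the motion vector averaged over the grid regions."""
    n = len(flow_vectors)
    mean_dx = sum(dx for dx, _ in flow_vectors) / n
    mean_dy = sum(dy for _, dy in flow_vectors) / n
    return hypot(mean_dx, mean_dy)


def intensity_change_ratio(prev_frame, curr_frame, pixel_threshold):
    """Fraction of mouth-region pixels whose brightness changed by more than the threshold."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for prev_px, curr_px in zip(prev_row, curr_row):
            total += 1
            if abs(curr_px - prev_px) > pixel_threshold:
                changed += 1
    return changed / total


def lip_activity(flow_vectors, prev_frame, curr_frame,
                 flow_threshold=0.5, pixel_threshold=15, ratio_threshold=0.1):
    """Flag lip activity if either cue fires (hypothetical OR combination)."""
    motion = mean_flow_magnitude(flow_vectors) > flow_threshold
    intensity = intensity_change_ratio(prev_frame, curr_frame, pixel_threshold) > ratio_threshold
    return motion or intensity
```

Calling `lip_activity` with zero flow but a brightened pixel block in the mouth region would still report activity, which is the sense in which one cue can cover for the other.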

Effects of breathing training in melodic intonation therapy on articulation intelligibility of aphasics: pilot study (멜로디 억양 치료에서 실어증 환자의 조음 명료도에 대한 호흡 훈련 효과: 초기 실험)

Kim, Seon Sik;Hong, Geum Na;Choi, Min Joo
- The Journal of the Acoustical Society of Korea / v.35 no.4 / pp.319-329 / 2016
This study tested whether breathing training in melodic intonation therapy (MIT) improves the articulation intelligibility of Broca's aphasics. The experimental group received breathing training (2 stages) preceding the MIT. To evaluate the efficacy of the MIT intervention, the voice onset time (VOT), the total delay (TD), the voice sound intensity, and the expiratory volume of the subjects, all closely associated with articulation intelligibility, were measured before and after the intervention. In the experimental group after the MIT intervention, the VOT and TD increased for the bilabial /p/, the alveolar /t/, and the soft-palatal /k/ (p < 0.05), but no significant differences were found for the affricate /c/ and the fricative /s/ (p > 0.05). In the control group, no significant increases in the VOT and TD were observed at any articulation point (p > 0.05). The voice sound intensity, which influences verbal articulation, increased in the experimental group after the intervention (p < 0.05), whereas no significant changes were observed in the control group. In conclusion, breathing training in the MIT was found to improve the articulation intelligibility of Broca's aphasics.
https://doi.org/10.7776/ASK.2016.35.4.319

A Study about Voice of Patients with Chronic Obstructive Pulmonary Disease/Asthma before & after ${\beta}_2$-agonist (${\beta}_2$-촉진제 사용전후에 따른 만성폐쇄성폐질환/천식 환자의 음성 연구)

Kang, Young-Ae;Kim, Se-Hun;Jong, Seong-Su;Lee, Tae-Yong;Seong, Cheol-Jae
- Phonetics and Speech Sciences / v.2 no.2 / pp.101-108 / 2010
Inhaled salbutamol and salmeterol have been used worldwide for chronic obstructive pulmonary disease (COPD) and asthma, but there have been few studies of the voice changes that follow medication. To evaluate how short-acting and long-acting ${\beta}_2$-agonists influence the voice, two experiments were carried out: experiment 1 used salbutamol with eight patients, and experiment 2 used salmeterol with six patients. Experiment 1 consisted of two stages: premedication and postmedication. Experiment 2 consisted of four stages: stage I was premedication, stage II was postmedication and pre-gargling, stage III was postmedication and post-gargling (with 100 ml of water), and stage IV was 30 minutes after medication. The measured parameters were F0, F0_SD, Jitter_rap, Shimmer_apq11, HNR, BW(1, 2, 3), Intensity, and H1-H2. The mean of 3 repetitions of each was analyzed statistically with the Wilcoxon signed-rank test for experiment 1 and repeated-measures ANOVA for experiment 2. In experiment 1, a significant difference was found in Jitter_rap (Z = -2.10, p = 0.036), indicating that the postmedication voice was worse than the premedication voice. In experiment 2, there was no significant difference, but the values of parameters related to voice quality (Jitter_rap, Shimmer_apq11, HNR, and H1-H2) showed changes toward stage IV; that is, voice quality was worse under medication.

The Perceptual and Consonant Analysis for the Voice with Hypothyroidism (갑상선 기능저하 음성에 대한 청지각적 및 파열음 분석에 대한 연구)

Han, Baek Hwa;Lee, Dahae;Kim, Joon Sun;Hong, Ki Hwan
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.27 no.2 / pp.95-101 / 2016
Background and Objectives : The main purpose of this study is to clarify the perceptual and acoustic characteristics of the voice in patients with hypothyroidism after thyroidectomy, focusing on speech articulation with special reference to consonant production. Materials and Methods : The subjects were 40 adults (males : 5, females : 35), all of whom received radioactive iodine treatment after total thyroidectomy. Voice samples were collected at three stages: after surgery, pre-radioisotope treatment (RIT), and post-RIT. The acoustic analysis was conducted using Praat (ver.5.2.21) after measuring voice onset time (VOT). The subjective evaluation of the voices used the CAPE-V. Results : A significant decrease in overall severity was displayed in the CAPE-V following RIT. It may be conjectured that this is connected to the change in voice following RIT. The loudness of the sound displayed a significant decrease in the CAPE-V following RIT. It is conjectured that this is connected to the decrease in vocal intensity following RIT. No statistically significant results were revealed by the comparative analysis of the voice onset time (VOT) for any plosive across the three periods. Conclusion : Perceptually, the overall severity of the voice with hypothyroidism changed significantly before and after RIT. Even though VOT did not change significantly, it tended to decrease in patients with hypothyroidism.

Intensity Characteristics of Korean Obstruents (한국어 장애음의 강도 특성)

Park Hansang
- MALSORI / no.47 / pp.73-84 / 2003
This study investigates differences in intensity across the three different Korean obstruent types in terms of the RMS amplitude of both the entire section and the first 512 samples of the immediately following vowel in two positions. The results showed that for the utterance initial position the RMS amplitude of both the entire section and the first 512 samples of the vowel was greatest for fortis obstruents, intermediate for aspirated ones, and weakest for lenis ones, with a significant difference between each pair of them. For the intervocalic position, in contrast, the intensity of the entire vowel was greatest for fortis obstruents, intermediate for lenis ones, and weakest for aspirated ones, with no significant difference between the last two groups, whereas the intensity of the first 512 samples of the vowel was greatest for fortis obstruents, intermediate for lenis ones, and weakest for aspirated ones, with a significant difference between each pair of the three groups. This means that the intensity of the earlier part of the vowel functions as a discriminator of Korean obstruents. The positional difference is due to the different behavior of the lenis obstruents in the intervocalic position, such that the intensity build-up is already on its way with voice lead.
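The two intensity measures used in this study, the RMS amplitude of the entire vowel and of its first 512 samples, can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a plain sequence of amplitude samples covering the vowel, and the abstract does not say whether the values were further converted (for example to dB).

```python
from math import sqrt


def rms_amplitude(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return sqrt(sum(s * s for s in samples) / len(samples))


def vowel_intensity_measures(vowel_samples, onset_length=512):
    """Return (RMS of the entire vowel, RMS of its first `onset_length` samples)."""
    whole = rms_amplitude(vowel_samples)
    onset = rms_amplitude(vowel_samples[:onset_length])
    return whole, onset
```

A vowel whose amplitude is still building up at onset would give a smaller second value than first, which is the kind of contrast the abstract attributes to the earlier part of the vowel.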

Effects of Semi-Occluded Vocal Tract Exercise in Patients with Functional Aphonia (반폐쇄성도훈련이 기능적 실성증 환자의 음성 개선에 미치는 효과)

Chae, Hye Rim;Kim, Ji sung;Lee, Dong Wook;Choi, Soeng Hee
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.30 no.1 / pp.48-52 / 2019
Background and Objectives : Functional aphonia is characterized by incomplete closure of the vocal folds. Semi-occluded vocal tract exercise (SOVTE) allows the vocal folds to collide smoothly, without damage to the vocal fold tissues, to produce normal vocal intensity. The purpose of this study is to report the effect of SOVTE in patients with functional aphonia. Materials and Method : Seven patients diagnosed with functional aphonia were treated with 1-3 voice therapy sessions using the SOVTE techniques of voiced lip trill, humming, and Lax Vox. To assess the effectiveness of SOVTE, cepstral analysis and auditory perceptual assessment were performed before and after voice therapy. Results : F0 (fundamental frequency), CPP (cepstral peak prominence), and L/H ratio (low/high spectral ratio) increased significantly, while the CPP standard deviation and the L/H ratio standard deviation decreased. In addition, 'Grade', 'Breathiness', and 'Asthenia' on the GRBAS scale decreased significantly after SOVTE (p<0.05). Conclusion : In our study, SOVTE seemed to be effective in eliciting voice quickly and promoting vocal fold vibration without muscular effort in patients with functional aphonia.

Effect of Air Flow Change on Voice Parameters : In Vivo Canine Laryngeal Model (생체 발성모형에서 발성시 공기양의 변화가 음성 지표에 미치는 영향)

최홍식
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.5 no.1 / pp.5-10 / 1994
An in vivo canine model was made in two mongrel dogs under general intravenous (IV) anesthesia. A vertical skin incision was made on the neck, and the larynx and the trachea were dissected. Two tracheal openings were made: a lower one for the insertion of the anesthesia tube and an upper one for the delivery of air to the larynx to induce phonation. The external branches of the superior laryngeal nerves and the recurrent laryngeal nerves were identified bilaterally and given constant electrical stimulation. Subglottic pressure, fundamental frequency, intensity, and open quotient were measured while the air flow rate was varied among low, medium, and high. Glottic resistance was calculated. As the air flow rate increased, the subglottic pressure and the sound intensity increased. However, glottic resistance decreased as the air flow increased. In falsetto register, fundamental frequency increased with increasing air flow, but in modal register the increase in fundamental frequency was not statistically significant. The open quotient measured by electroglottography increased with increasing air flow.
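The abstract does not state how glottic resistance was calculated. A common definition, assumed here, is the ratio of subglottic pressure to air flow rate (the symbols $R_g$, $P_{sub}$, and $U$ are illustrative and not taken from the paper):

$$R_g = \frac{P_{sub}}{U}$$

Under this assumed definition, resistance can fall while pressure rises, provided pressure grows proportionally less than flow: for two flow levels $U_1 < U_2$ with pressures $P_1 < P_2$, we have $\frac{P_2}{U_2} < \frac{P_1}{U_1}$ exactly when $\frac{P_2}{P_1} < \frac{U_2}{U_1}$. That is consistent with the reported trend of rising subglottic pressure together with falling glottic resistance.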

Comparison of voice range profiles of modal and falsetto register in dysphonic and non-dysphonic adult women (음성장애 성인 여성과 정상음성 성인 여성 간 진성구와 가성구의 음성범위프로파일 비교)

Jaeock Kim;Seung Jin Lee
- Phonetics and Speech Sciences / v.14 no.4 / pp.67-75 / 2022
This study compared voice range profiles (VRPs) of the modal and falsetto registers in 53 dysphonic and 53 non-dysphonic adult women producing a gliding vowel /a/. The results show that maximum fundamental frequency (F0_MAX), maximum intensity (I_MAX), F0 range (F0_RANGE), and intensity range (I_RANGE) are lower in the dysphonic group than in the non-dysphonic group. F0_MAX and F0_RANGE are significantly higher in the falsetto register than in the modal register in both groups. I_MAX and I_RANGE are significantly higher in the falsetto register in the non-dysphonic group, but do not differ between the two registers in the dysphonic group. There was no statistically significant difference in minimum F0 (F0_MIN) or minimum intensity (I_MIN) between the two groups. The modal-falsetto register transition occurred at 378.86 Hz (F#4) in the dysphonic group and at 557.79 Hz (C#5) in the non-dysphonic group; it was significantly lower in the dysphonic group. Both the modal and falsetto registers of dysphonic adult women are thus reduced compared with those of non-dysphonic adult women, indicating that the vocal folds of dysphonic adult women do not vibrate easily at high pitches. The results of this study can serve as basic data for understanding the acoustic features of voice disorders.
https://doi.org/10.13064/KSSS.2022.14.4.067
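The note names attached to the two register-transition frequencies can be checked with a short conversion. The sketch below assumes twelve-tone equal temperament with A4 = 440 Hz and scientific pitch notation; neither is stated in the abstract, and the function name is hypothetical.

```python
from math import log2

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(frequency_hz, a4_hz=440.0):
    """Name of the equal-tempered note closest to the given frequency."""
    # MIDI note 69 is A4; each semitone is a factor of 2**(1/12) in frequency.
    midi = round(69 + 12 * log2(frequency_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


print(nearest_note(378.86))  # F#4
print(nearest_note(557.79))  # C#5
```

Under these assumptions the two transition points lie seven semitones apart, a perfect fifth.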

The Analysis of Tracheoesophageal Voice after Near-Total Laryngectomy and Implantation of Provox Prosthesis (후두근전적출술과 Provox 삽입술 후 기관식도발성에 관한 연구)

Choi, In-Ja;Choi, Young-Soo;Kim, Jin-Hwan;Ahn, Hwoe-Young
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.2 / pp.141-144 / 2004
Background and Objectives : To compare acoustic and aerodynamic analyses of voice and intelligibility scores in patients with near-total laryngectomy and patients with implantation of a Provox prosthesis. Material and Methods : To evaluate the voice characteristics, acoustic and aerodynamic parameters and speech intelligibility were measured in 5 patients after near-total laryngectomy, 5 patients after implantation of a Provox prosthesis with total laryngectomy, and 10 normal adult speakers. Acoustic analysis was carried out using CSL and aerodynamic analysis was carried out using Aerophone II. Speech samples were recorded, and 10 listeners scored speech intelligibility as the percentage of words correctly identified. Results : Fundamental frequency ($F_0$), intensity, jitter, shimmer, maximal phonation time (MPT), and subglottic air pressure were used as parameters for voice analysis. There was no significant difference between the two groups except in fundamental frequency and shimmer. The fundamental frequency was higher in patients with near-total laryngectomy, and shimmer was higher in patients after implantation of a Provox prosthesis with total laryngectomy. In addition, there was no significant difference in speech intelligibility between the two groups. Conclusion : These results confirm that near-total laryngectomy and implantation of a Provox prosthesis provide good voice rehabilitation.

Voice Analysis before and after Radioactive Iodine Ablation in Patients with Total Thyroidectomy (적갑상선 전절제술 환자의 방사성 동위원소치료 전.후 음성의 변화에 대한 연구)

Hong, Ki Hwan;Seo, Eun Ji;Lee, Hyun Doo;Yoon, Yun Sub;Lim, Seok Tae
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.24 no.1 / pp.33-40 / 2013
Background and Objectives : This study objectively compares and analyzes the acoustic changes in patients with total thyroidectomy before and after RI therapy. Subjects and Methods : A total of 50 patients with total thyroidectomy participated as subjects. Voice samples were obtained post-operation (Post-OP), before high-dose radioactive iodine therapy (Pre-RIT), and after high-dose radioactive iodine therapy (Post-RIT). Acoustic analysis, the maximum phonation time, and the K-VHI (Korea-Voice handicap index) were used for subjective evaluation. Results : In the comparison of the three periods, mFo (Hz) was significantly reduced for both vowels /a/ and /i/ as the hormone was discontinued. This may be related to the reduction in vocal range. As thyroid hormone was discontinued, the Shim (%) and APQ (%) values, which are the parameters related to the degree of aggressiveness, showed a significant increase in the middle vowel /a/. As thyroid hormone was discontinued, the emotional index of the VHI (voice handicap index) decreased significantly. Conclusion : These results suggest that thyroid hormone suspension is related to increased changes in vocal intensity, an increase in noise, and a reduction in vocal range. Emotionally, these data suggest that the responsive factors of one's own voice disorders were significantly decreased in the patients with vocal handicap.
