• Title/Abstract/Keywords: voice quality

767 search results (processing time: 0.023 s)

Spectral and Cepstral Analyses of Esophageal Speakers

  • 심희정;장효령;신희백;고도흥
    • Phonetics and Speech Sciences / Vol. 6, No. 2 / pp.47-54 / 2014
  • The purpose of this study was to compare spectral and cepstral measurements in esophageal speakers. Measurements from thirteen male esophageal speakers were compared with those of a control group of thirteen normal speakers using the sustained vowel /a/. The main results can be summarized as follows: (a) the CPP and L/H ratio of the esophageal group were significantly lower than those of the control group; (b) the CPP was significantly correlated with the spectral parameters such as jitter, shimmer, NHR, and VTI; and (c) the ROC analysis showed that a CPP threshold of 10.25 dB classified esophageal speakers well, with 100% sensitivity and specificity. These results suggest that cepstral-based acoustic measures such as CPP may be more reliable predictors than other spectral-based acoustic measures such as jitter and shimmer, and that cepstral-based acoustic measures are effective in distinguishing esophageal voice quality from normal voice quality. This research will contribute to establishing a baseline for speech characteristics in voice rehabilitation with laryngectomees.
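The threshold-based classification in result (c) can be illustrated with a minimal sketch. The CPP values below are invented for illustration and are not the study's data; the decision rule (a voice is labeled esophageal when its CPP is below the threshold) is an assumption inferred from the finding that the esophageal group had lower CPP.

```python
def sensitivity_specificity(patient_cpp, control_cpp, threshold_db):
    """Label a voice as esophageal when its CPP is below threshold_db.

    Returns (sensitivity, specificity) for the two groups.
    """
    true_pos = sum(1 for v in patient_cpp if v < threshold_db)
    true_neg = sum(1 for v in control_cpp if v >= threshold_db)
    return true_pos / len(patient_cpp), true_neg / len(control_cpp)


# Hypothetical CPP values in dB (illustration only, not from the paper).
esophageal = [6.1, 7.4, 8.0, 9.2, 9.9]
control = [10.6, 11.8, 12.5, 13.1, 14.0]

sens, spec = sensitivity_specificity(esophageal, control, threshold_db=10.25)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

An ROC analysis repeats this computation over many candidate thresholds and picks the one with the best trade-off; when the two groups do not overlap, as in this toy data, some threshold separates them perfectly.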

Transmission of Channel Error Information over Voice Packet

  • 박호종;차성호
    • The Journal of the Acoustical Society of Korea / Vol. 21, No. 4 / pp.394-400 / 2002
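  • In digital voice communication, if the transmission error rate of the voice packets being sent is known, overall communication quality can be improved by adapting the compression operation to the condition of the transmit channel. However, current mobile and Internet communication systems do not support a protocol that reports the transmission error information of voice packets. To solve this problem, this paper proposes a method that embeds the channel's transmission error information into voice packets and delivers it in real time. The proposed embedding method exploits the correlation among the pulse positions of the ACELP (algebraic code-excited linear prediction) codevector, which prevents the quality degradation caused by embedding additional information and reduces the false detection rate. The performance of the proposed method was measured with a variety of speech data, and it was confirmed that the method causes almost no degradation in speech quality and achieves satisfactory detection capability and false detection rate.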

Deep Learning based Singing Voice Synthesis Modeling

  • 김민애;김소민;박지현;허가빈;최윤정
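    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022 Fall Conference of the Korean Institute of Information and Communication Sciences / pp.127-130 / 2022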
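  • This paper studies singing voice synthesis modeling using a generator loss function. It analyzes the factors that can arise when BEGAN, a deep learning algorithm originally optimized for image generation, is applied to an audio generation model (an SVS model), and carries out experiments to obtain the best quality. In particular, to address the problem that the L1 loss proposed in the BEGAN-based model causes the gamma (γ) parameter to lose its role at some point, an alpha (α) parameter was added, and the optimal values were found through interval-by-interval experiments on each parameter, confirming that this can contribute to improving the quality of the synthesized singing voice.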
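For context, the role of γ in the original BEGAN formulation can be written as below. This is the standard image-domain objective, not the modified loss studied in the paper; the abstract does not say where the added α parameter enters, so it is not shown. Here $\mathcal{L}(v) = |v - D(v)|$ is the autoencoder reconstruction loss of the discriminator $D$, $k_t$ is the balancing variable, and $\lambda_k$ is its learning rate.

```latex
\begin{align}
\mathcal{L}_D &= \mathcal{L}(x) - k_t\,\mathcal{L}(G(z)) \\
\mathcal{L}_G &= \mathcal{L}(G(z)) \\
k_{t+1} &= k_t + \lambda_k\bigl(\gamma\,\mathcal{L}(x) - \mathcal{L}(G(z))\bigr)
\end{align}
```

In this formulation γ sets the target ratio between the reconstruction loss on generated samples and on real samples, so it only matters while the $k_t$ update keeps the two losses in balance.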


Speech Intelligibility of Alaryngeal Voices and Pre/Post Operative Evaluation of Voice Quality using the Speech Recognition Program (HUVOIS)

  • 김한수;최성희;김재인;임재열;최홍식
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / Vol. 15, No. 2 / pp.92-97 / 2004
  • Background and Objectives: The purpose of this study was to objectively examine pre- and postoperative voice quality and the intelligibility of alaryngeal voices using the speech recognition program HUVOIS. Materials and Methods: Two laryngologists and one speech pathologist rated 'G', 'R', and 'B' on the GRBAS scale and rated speech intelligibility on the NTID rating scale from a standard paragraph. Acoustic estimates such as jitter, shimmer, and HNR were also obtained with Lx Speech Studio. Results: The speech recognition rate for pathological voice samples did not differ significantly between pre- and postoperative recordings, although voice quality (G, B) and acoustic values (jitter, HNR) improved significantly after the operation. Among alaryngeal voices, the reed-type electrolarynx 'Moksori' scored highest in both speech intelligibility and speech recognition rate, whereas esophageal speech scored lowest. A correlation between speech intelligibility and speech recognition rate was found in alaryngeal voices, but not in pathological voices. Conclusion: This study did not demonstrate that the speech recognition program HUVOIS, used in a telephone program, is an objective and efficient method for supplementing the subjective GRBAS scale.


The Effect of Voice Therapy in Vocal Polyp Patients

  • 김성태;정고은;김상윤;최승호;임길채;한주희;남순열
    • Phonetics and Speech Sciences / Vol. 1, No. 2 / pp.43-49 / 2009
  • Vocal polyps are benign phonotraumatic lesions that are traditionally treated using phonomicrosurgical techniques. In the case of hyperfunctional voice use, voice therapy is effective and results in voice improvement. However, the utility of voice therapy for vocal polyps is in great demand. The purpose of this study was to evaluate the effects of voice therapy in patients with vocal polyps. The authors reviewed the medical records of 193 patients with vocal nodules or vocal polyps, and 64 patients (31 nodules and 33 polyps) were enrolled. All of the subjects had received an explanation of their problems and vocal hygiene education, and had been treated with the $SKMVTT^{(R)}$ (Seong-Tae Kim's multiple voice therapy technique) for 4 to 16 sessions (mean: 8.6 sessions). All subjects were examined by perceptual assessment, acoustic and aerodynamic measures, and VRP (voice range profile). In perceptual assessment, patients with vocal nodules had more breathy and strained voices than the vocal polyp group. Both groups showed significantly reduced rough and breathy voice quality after voice therapy. Patients with vocal polyps had worse voice quality than patients with nodules in acoustic measures. Both groups showed reduced jitter and shimmer after voice therapy. In aerodynamic measures, MPT and Psub increased, and MFR decreased (p<.05). Participants' frequency range and intensity range increased after voice therapy, but only the change in frequency range was significant (p<.05). In conclusion, the therapeutic effect of voice therapy in patients with vocal nodules and polyps was demonstrated perceptually and acoustically. We suggest that voice therapy, including advice, vocal hygiene, and $SKMVTT^{(R)}$, is useful as an initial choice of treatment for patients with vocal polyps before considering a surgical approach.


Development of Cannula-typed Silicone Voice Prosthesis (So-Mang$\circledR$)

  • 최홍식;정은주;전희선;문인석;김영호;김광문
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / Vol. 12, No. 2 / pp.152-157 / 2001
  • Background: The electrolarynx, esophageal voice, and silicone voice prosthesis with tracheoesophageal (T-E) fistula have been used as vocal rehabilitation methods for post-laryngectomized patients. Prosthetic rehabilitation of voice after total laryngectomy has gained wide acceptance and has become common practice in many clinics since the pioneering work of Singer and Blom in 1979. Since the introduction of tracheo-esophageal puncture and the application of the Blom Singer$\circledR$ voice prosthesis in 1980, several reliable voice prostheses have been developed and are being used successfully. Objectives: Even though the quality of voice produced by a silicone voice prosthesis with T-E fistula is superior to that of other modalities, it still has some disadvantages. We devised a new cannula-typed silicone voice prosthesis. Methods: 1) Devising a new prototype of the cannula-typed silicone voice prosthesis. 2) Application of the prototype in a canine animal model (laryngectomized dog) and a fitting trial on a human patient whose previously inserted silicone voice prosthesis was not functioning due to presumed fungal infection. Discussion: The final form of the prototype was made after several rounds of major and minor modifications. Insertion of the newly developed cannula-typed silicone voice prosthesis in the canine animal model and the human trial were done without any difficulty. There was no serious leakage of saliva or food during swallowing. Conclusion: The newly developed cannula-typed silicone voice prosthesis (So-Mang$\circledR$) and the modified replacement method will further improve the results of post-laryngectomy prosthetic voice rehabilitation. A long-term animal study and human trial are planned in the near future.


Preliminary Study for Comparison of Subjective Voice Evaluations among Vocal and Applied Music Major Students

  • 이다혜;황영진;김재옥
    • Phonetics and Speech Sciences / Vol. 6, No. 2 / pp.37-45 / 2014
  • The purpose of this study was to determine whether the Korean Singing Voice Handicap Index (K-SVHI) is suitable for singers in genres other than vocal music to assess their vocal problems subjectively. Twenty-six college students majoring in vocal music and twenty-six students majoring in applied music were included in the study. They were divided into G0 and G1 voice-quality groups using the GRBAS scale during singing tasks. K-SVHI was divided into three sub-areas (Physical, Functional, and Emotional). In the singing task, neither group showed a significant difference in K-SVHI scores by G scale. In the reading task, the vocal music group had significantly higher K-SVHI scores in G0 than in G1, while the applied music group had significantly higher K-SVHI scores in G1 than in G0. Also, the two groups did not differ significantly in G0 or G1 in the singing task, while the vocal music group showed higher K-SVHI scores than the applied music group in G0 in the reading task. In addition, the vocal music group had higher K-SVHI scores than the applied music group in G1 in both tasks. When the groups were compared across the three sub-areas of K-SVHI, significant differences were found in the Emotional and Functional areas. These results show that singers perceive their voice problems differently depending on musical genre, which means that K-SVHI may not be a suitable tool for evaluating the voice handicap of singers across diverse vocal music genres.

A Study about Voice of Patients with Chronic Obstructive Pulmonary Disease/Asthma before & after ${\beta}_2$-agonist

  • 강영애;김세훈;정성수;이태용;성철재
    • Phonetics and Speech Sciences / Vol. 2, No. 2 / pp.101-108 / 2010
  • Inhaled salbutamol and salmeterol are used worldwide for chronic obstructive pulmonary disease (COPD) and asthma, but there have been few studies of the voice changes that follow medication. To evaluate the influence of short-acting and long-acting ${\beta}_2$-agonists on the voice, two experiments were carried out: experiment 1 used salbutamol with eight patients, and experiment 2 used salmeterol with six patients. Experiment 1 consisted of two stages: premedication and postmedication. Experiment 2 consisted of four stages: stage I was premedication, stage II was postmedication and pregargling, stage III was postmedication and postgargling (with 100 ml of water), and stage IV was 30 minutes after medication. The measured parameters were F0, F0_SD, Jitter_rap, Shimmer_apq11, HNR, BW (1, 2, 3), Intensity, and H1-H2. The mean of three repetitions of each measurement was analyzed statistically with the Wilcoxon signed-rank test for experiment 1 and repeated-measures ANOVA for experiment 2. In experiment 1, a significant difference was found in Jitter_rap (Z = -2.10, p = 0.036), indicating that the postmedication voice was worse than the premedication voice. In experiment 2, there was no significant difference, but the values of the parameters related to voice quality (Jitter_rap, Shimmer_apq11, HNR, and H1-H2) changed toward stage IV; that is, voice quality was worse under medication.


Automatic severity classification of dysarthria using voice quality, prosody, and pronunciation features

  • 여은정;김선희;정민화
    • Phonetics and Speech Sciences / Vol. 13, No. 2 / pp.57-66 / 2021
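  • This paper focuses on automatic severity classification of dysarthria based on speech intelligibility. Speech intelligibility is affected by a variety of speech function features, including respiration, phonation, resonance, articulation, and prosody. However, most previous studies used only one speech function feature for automatic severity classification. To capture the impaired characteristics of speech effectively, this paper incorporates diverse speech function features of voice quality, prosody, and pronunciation into automatic severity classification of dysarthria. Voice quality consists of jitter, shimmer, HNR, number of voice breaks, and degree of voice breaks. Prosody includes speech rate (total duration, speech duration, speaking rate, articulation rate), pitch (F0 mean, standard deviation, minimum, maximum, median, 25th percentile, 75th percentile), and rhythm (%V, deltas, Varcos, rPVIs, nPVIs). Pronunciation comprises phoneme accuracy (consonant accuracy, vowel accuracy, total phoneme accuracy) and vowel distortion [VSA (vowel space area), FCR (formant centralized ratio), VAI (vowel articulatory index), F2 ratio]. Automatic severity classification was carried out with various feature combinations. The experiments showed that using all three feature types (voice quality, prosody, and pronunciation) gave the best performance, with an F1-score of 80.15%. This suggests that voice quality, prosody, and pronunciation features should all be considered together in automatic severity classification of dysarthria.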
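As an illustration of the rhythm features named above, the sketch below computes the raw and normalized pairwise variability indices (rPVI, nPVI) using their commonly cited definitions. The abstract does not describe the paper's own extraction pipeline, so the function names and the sample durations here are hypothetical.

```python
def rpvi(durations):
    """Raw PVI: mean absolute difference between successive interval durations."""
    diffs = [abs(a - b) for a, b in zip(durations, durations[1:])]
    return sum(diffs) / len(diffs)


def npvi(durations):
    """Normalized PVI: each difference is divided by the pair's mean duration,
    and the average is scaled by 100."""
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])]
    return 100 * sum(terms) / len(terms)


# Hypothetical vocalic interval durations in milliseconds (illustration only).
vowels_ms = [100, 50, 100, 50]
print(rpvi(vowels_ms))
print(npvi(vowels_ms))
```

The normalization in nPVI makes the index less sensitive to overall speaking rate, which is why it is usually applied to vocalic intervals while rPVI is applied to consonantal intervals.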

Analysis of Correlation between Sleep Interval Length and Jitter Buffer Size for QoS of IPTV and VoIP Audio Service over Mobile WiMAX

  • 김형석;김태현;황호영
    • The KIPS Transactions: Part C / Vol. 17C, No. 3 / pp.299-306 / 2010
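  • IPTV and VoIP are useful application services that can be provided over Mobile WiMAX networks, which guarantee high mobility and transmission rates. Among the factors that affect IPTV audio transmission and VoIP call quality, packet loss caused by jitter, which arises from frequent changes of the transmission path or from differences in transmission time between paths, can be mitigated with a jitter buffer. This paper studies and analyzes the relationship between the quality of service (QoS) of audio and voice services and the jitter buffer size when the PSC-II mode, which is used to reduce the power consumption of mobile terminals, is in use on a Mobile WiMAX network. To this end, it presents an end-to-end delay model for the service that includes the additional delay introduced by the power-saving mode, together with a service quality criterion based on end-to-end delay. Simulation results over various parameters of the proposed model show that, when the power-saving mode is used, packet loss due to the delay added by a larger jitter buffer can actually have an adverse effect on audio and VoIP service quality.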
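The trade-off described in this abstract can be sketched with a toy model. This is not the paper's delay model: the function name, the fixed sleep-mode delay, the delay budget, and the sample delays are all assumptions made for illustration. A packet counts as late when it arrives after its playout time, and every packet counts as unusable once the playout delay itself exceeds the end-to-end budget.

```python
def effective_loss(network_delays_ms, sleep_delay_ms, buffer_ms, budget_ms):
    """Fraction of unusable packets under a fixed-size jitter buffer (toy model).

    sleep_delay_ms is a hypothetical constant delay added by power-saving mode.
    """
    arrival = [d + sleep_delay_ms for d in network_delays_ms]
    # Playout is scheduled buffer_ms after the fastest packet's arrival delay.
    playout_delay = min(arrival) + buffer_ms
    if playout_delay > budget_ms:
        # The buffer alone pushes end-to-end delay past the quality budget.
        return 1.0
    late = sum(1 for a in arrival if a > playout_delay)
    return late / len(arrival)


# Hypothetical one-way network delays in milliseconds (illustration only).
delays = [20, 25, 30, 60, 22]
for buffer_ms in (10, 40, 100):
    loss = effective_loss(delays, sleep_delay_ms=40, buffer_ms=buffer_ms, budget_ms=150)
    print(buffer_ms, loss)
```

In this toy setting a small buffer drops the one slow packet, a medium buffer absorbs it, and a large buffer combined with the sleep-mode delay exceeds the budget, which mirrors the qualitative conclusion of the abstract.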