• Title/Summary/Keyword: 발성율


The Modeling of Pause Duration For Text-To-Speech Synthesis System (TTS 시스템을 위한 휴지기간 모델링)

  • Chung Jihye;Lee Yanhee
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.83-86
    • /
    • 2000
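  • This paper proposes a pause-location extraction and pause-duration prediction model to improve the naturalness of synthesized speech in a speech synthesis system that uses non-uniform units. The proposed pause-duration prediction model uses CART (Classification And Regression Trees), one of the tree-based modeling techniques. For this purpose, a speech database of 400 sentences in total, containing 6,220 eojeol (word-phrase) boundaries and uttered by a single male speaker, was built, and the optimal tree was determined from this database by V-fold cross-validation. In the evaluation, the overall pause extraction accuracy was $81\%$, with $83\%$ accuracy for detecting the presence of a pause and $80\%$ for detecting its absence. The multiple correlation coefficient between the actual and predicted pause durations was 0.84, and the accuracy within an error range of 20 ms was $88\%$. In addition, a listening test on synthesized speech with the predicted pause durations applied showed that it was broadly similar to natural speech.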


Large Vocabulary Continuous Speech Recognition using Stochastic Pronunciation Lexicon Modeling (확률 발음사전을 이용한 대어휘 연속음성인식)

  • 윤성진
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.315-319
    • /
    • 1998
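  • This paper proposes a stochastic pronunciation lexicon model for large-vocabulary continuous speech recognition. The proposed lexicon models the word variations that frequently occur in natural utterances such as continuous speech as an HMM composed of probabilistic subword states, so that it can represent the pronunciation variation of words effectively and raise the performance of the recognition system. The stochastic pronunciation lexicon is generated automatically through word-level segmentation and training using speech data and phone models, and it is applicable to continuous speech recognizers based not only on linguistic units such as phonemes but also on PLUs or non-linguistic recognition models. In continuous speech recognition experiments, using the stochastic pronunciation lexicon reduced the word error rate by 39.8% and the sentence error rate by 24.4% compared with a recognition system that uses standard pronunciation transcriptions.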


Recurrence and Extraneural Metastasis in 31 Meningeal Hemangiopericytomas (31예 수막 혈관외피세포종에 있어서의 재발 및 신경계외 전이)

  • Kim, Jeong Hoon;Kim, Joon Soo;Kim, Chang Jin;Hwang, Sung Kyun;Jung, Hee Won;Kwun, Byung Duk
    • Journal of Korean Neurosurgical Society
    • /
    • v.30 no.3
    • /
    • pp.349-357
    • /
    • 2001
  • Purpose : Meningeal hemangiopericytoma (M-HPC), characterized by a high rate of local recurrence and metastatic potential, is a rare neoplasm arising from perivascular pericytes. A retrospective study was performed to characterize recurrence and extraneural metastasis in M-HPC. Materials and Methods : We reviewed the records of 31 M-HPC patients treated from 1982 through 1999 at our institution. The time to recurrence and the various parameters affecting recurrence were determined. Extraneural metastasis was also analyzed. Results : The rate of local recurrence was 38.7% (12/31). The overall average recurrence-free period (RFP) before the first recurrence was 104 months, with overall recurrence-free rates (RFRs) at 5 and 10 years after the first surgery of 59.2% and 33.6%, respectively. Of the 12 patients who experienced local recurrence, 4 had recurrences later than 5 years after the first surgery. Complete excision at the first operation significantly extended the average time before first recurrence from 43 to 111 months. The 5-year RFRs for the complete excision and incomplete excision groups were 72.7% and 20.8%, respectively (p=0.0060). Although the difference was not statistically significant, complete excision followed by adjuvant radiotherapy of more than 50 Gy extended the RFP. The 5-year RFRs for the groups with complete excision alone and complete excision with adjuvant radiotherapy were 70.3% and 100%, respectively (p=0.3359). Four patients (12.9%) presented with one or more extraneural metastases, which developed at an average of 107 months after the first operation, with 5- and 10-year metastasis rates of 4.4% and 24.9%, respectively. Conclusions : M-HPC has a propensity to recur either locally or at distant sites after surgical resection. Complete excision is the most important factor in reducing recurrence. However, even with complete excision, adjuvant radiotherapy of more than 50 Gy significantly reduces the risk of recurrence. Local and distant recurrences may occur after a prolonged disease-free interval, emphasizing the need for long-term follow-up.


Performance Improvement of Connected Digit Recognition by Considering Phonemic Variations in Korean Digit and Speaking Styles (한국어 숫자음의 음운변화 및 화자 발성특성을 고려한 연결숫자 인식의 성능향상)

  • 송명규;김형순
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.4
    • /
    • pp.401-406
    • /
    • 2002
  • Each Korean digit consists of a single syllable, so recognizers, as well as Korean speakers, often have difficulty recognizing it. When digit strings are pronounced, the original pronunciation of each digit changes considerably because of co-articulation. In addition, distortion caused by various channels and noises degrades the recognition performance for Korean connected digit strings. This paper presents techniques to improve that performance, which include defining a set of PLUs that reflects phonemic variations in Korean digits and constructing a recognizer that handles speakers' various speaking styles. In speaker-independent connected digit recognition experiments using telephone speech, the proposed techniques with 1 Gaussian/state gave a string accuracy of 83.2%, i.e., a 7.2% error rate reduction relative to the baseline system. With 11 Gaussians/state, we achieved the highest string accuracy of 91.8%, i.e., a 4.7% error rate reduction.
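
The abstract above reports string accuracies together with relative error rate reductions but does not state the baseline accuracies. The following minimal sketch shows how the two quantities relate, assuming "error rate reduction" means the relative reduction in string error rate (100% minus string accuracy). The function names are illustrative and do not come from the paper.

```python
def relative_error_reduction(baseline_acc, improved_acc):
    """Relative error rate reduction (in percent) between two accuracies given in percent."""
    baseline_err = 100.0 - baseline_acc
    improved_err = 100.0 - improved_acc
    return 100.0 * (baseline_err - improved_err) / baseline_err


def implied_baseline_accuracy(improved_acc, reduction_pct):
    """Back-calculate the baseline accuracy implied by an improved accuracy
    and a relative error rate reduction (both in percent)."""
    improved_err = 100.0 - improved_acc
    baseline_err = improved_err / (1.0 - reduction_pct / 100.0)
    return 100.0 - baseline_err


if __name__ == "__main__":
    # Figures reported in the abstract; the baselines are inferred, not reported.
    for acc, red in [(83.2, 7.2), (91.8, 4.7)]:
        base = implied_baseline_accuracy(acc, red)
        print(f"accuracy {acc}% with {red}% reduction implies a baseline of about {base:.1f}%")
```

Under this reading, the implied baselines are only estimates; the paper itself should be consulted for the measured values.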

Improvements on Speech Recognition for Fast Speech (고속 발화음에 대한 음성 인식 향상)

  • Lee Ki-Seung
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.2
    • /
    • pp.88-95
    • /
    • 2006
  • In this paper, a method for improving the performance of an automatic speech recognition (ASR) system for conversational speech is proposed, which mainly focuses on increasing robustness against rapidly spoken utterances. The proposed method does not require an additional speech recognition task to represent the speaking rate quantitatively. The energy distribution in specific bands is used to detect vowel regions, and the number of vowels per second is then computed as the speaking rate. To improve performance for fast speech, previous methods expand the sequence of feature vectors by a given scaling factor, which is computed as the ratio between the standard phoneme duration and the measured one. In the method proposed herein, however, utterances are classified by their speaking rates, and the scaling factor is determined individually for each class using a maximum likelihood criterion. ASR experiments on 10-digit mobile phone numbers confirmed that the overall error rate was reduced by $17.8\%$ when the proposed method was employed.
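
The class-based expansion described above can be sketched as follows. This is a rough illustration under several assumptions that are not in the abstract: the vowel count is taken as given (the band-energy vowel detector is not reproduced), the expansion is done by linear interpolation between neighbouring frames, and the class boundaries and scaling factors in `CLASS_SCALE` are hypothetical placeholders rather than maximum-likelihood estimates.

```python
def speaking_rate(num_vowels, duration_sec):
    """Speaking rate as the number of detected vowels per second."""
    if duration_sec <= 0:
        raise ValueError("duration must be positive")
    return num_vowels / duration_sec


# Hypothetical classes: (upper bound of rate in vowels/s, scaling factor).
# In the paper the factor for each class is chosen by a maximum likelihood criterion.
CLASS_SCALE = [(4.0, 1.0), (6.0, 1.2), (float("inf"), 1.5)]


def scale_for_rate(rate):
    """Return the scaling factor of the first class whose upper bound exceeds the rate."""
    for upper, scale in CLASS_SCALE:
        if rate < upper:
            return scale
    return CLASS_SCALE[-1][1]


def expand_features(frames, scale):
    """Stretch a sequence of feature vectors to round(len * scale) frames
    by linear interpolation between neighbouring frames."""
    if not frames:
        return []
    n_in = len(frames)
    n_out = max(1, round(n_in * scale))
    if n_in == 1 or n_out == 1:
        return [list(frames[0]) for _ in range(n_out)]
    out = []
    for j in range(n_out):
        pos = j * (n_in - 1) / (n_out - 1)  # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


if __name__ == "__main__":
    rate = speaking_rate(num_vowels=14, duration_sec=2.0)  # 7 vowels per second
    frames = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]]
    stretched = expand_features(frames, scale_for_rate(rate))
    print(len(frames), "->", len(stretched), "frames")
```

Choosing the factor per class rather than per utterance avoids estimating phoneme durations at recognition time, which matches the abstract's point that no additional recognition pass is needed.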

A Study on BTEX Removal Efficiency for Variation of Moistures by Microwave Process (유류오염토의 마이크로파 처리 시 토양의 함수율 변화에 따른 BTEX 제거특성에 관한 연구)

  • Ha, Sang-An;Yeom, Hae-Kyong;Yu, Mi-Yong
    • Journal of Soil and Groundwater Environment
    • /
    • v.12 no.2
    • /
    • pp.65-71
    • /
    • 2007
  • This study focused on the application of microwave pre-treatment to soil contaminated with volatile organic compounds, specifically BTEX (benzene, toluene, ethylbenzene, xylene). Microwave experiments were carried out under different power conditions (2 kW, 4 kW) using different moisture contents and BTEX concentrations. According to the results, the highest BTEX removal efficiency was obtained at a moisture content of 20% regardless of the electric power. The results show that 2 kW was the optimum electric power at moisture contents of $10{\sim}30\%$, whereas the optimum power was 4 kW at a moisture content of 50%.

Effects of Fractionated Stereotactic Radiotherapy for Primary Hepatocellular Carcinoma (원발성 간암의 분할 정위방사선치료 효과)

  • Choi Byeong Ock;Kang Ki Mun;Jang Hong Seok;Lee Snag-wook;Kang Young Nam;Chai Gyu Young;Choi Ihl Bhong
    • Radiation Oncology Journal
    • /
    • v.23 no.2
    • /
    • pp.92-97
    • /
    • 2005
  • Purpose : Reports on the outcome of curative radiotherapy for primary hepatocellular carcinoma (HCC) are rarely encountered in the literature. In this study, we report our experience of a clinical trial in which fractionated stereotactic radiotherapy (SRT) was used to treat primary HCC. Materials and Methods : A retrospective analysis was performed on 20 patients who had been histologically diagnosed with HCC and treated with fractionated SRT. The long diameter of the tumor measured by CT was $2{\sim}6.5$ cm (average: 3.8 cm). A single dose of radiation used in fractionated SRT was 5 or 10 Gy; each dose was prescribed based on the planning target volume and normalized to $85{\sim}99\%$ of the isocenter dose. Patients were treated $3{\sim}5$ times per week for 2 weeks, with each receiving a total dose of 50 Gy (median dose: 50 Gy). The follow-up period was ${\sim}55$ months (median follow-up period: 23 months). Results : The response rate was $60\%$ (12 patients), with 4 patients showing complete response ($20\%$), 8 patients showing partial response ($40\%$), and 8 patients showing stable disease ($40\%$). The 1-year and 2-year survival rates were $70.0\%$ and $43.1\%$, respectively, and the median survival time was 20 months. The 1-year and 2-year disease-free survival rates were $65\%$ and $32.5\%$, respectively, and the median disease-free survival time was 19 months. Acute complications of the treatment were as follows: dyspepsia in 12 patients ($60\%$), nausea/emesis in 8 patients ($40\%$), and transient liver function impairment in 6 patients ($30\%$). However, there was no treatment-related death. Conclusion : The study indicates that fractionated SRT is a relatively safe and effective method for treating primary HCC. Thus, fractionated SRT may be suggested as a local treatment for small, solitary HCC lesions when the patient is inoperable or refuses surgery. We consider fractionated SRT a challenging treatment modality for HCC.

Effect of Air Flow Change on Voice Parameters : In Vivo Canine Laryngeal Model (생체 발성모형에서 발성시 공기양의 변화가 음성 지표에 미치는 영향)

  • 최홍식
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.5 no.1
    • /
    • pp.5-10
    • /
    • 1994
  • An in vivo canine model was prepared in two mongrel dogs under general intravenous (IV) anesthesia. A vertical skin incision was made on the neck, and the larynx and the trachea were dissected. Two tracheal openings were made: the lower one for insertion of the anesthesia tube and the upper one for delivery of air to the larynx to induce phonation. The external branches of the superior laryngeal nerves and the recurrent laryngeal nerves were identified bilaterally and received constant electrical stimulation. Subglottic pressure, fundamental frequency, intensity, and open quotient were measured while the air flow rate was varied between low, medium, and high. Glottic resistance was calculated. As the air flow rate increased, the subglottic pressure and the sound intensity increased, whereas the glottic resistance decreased. In the falsetto register, the fundamental frequency increased with increasing air flow, but in the modal register the increase in fundamental frequency was not statistically significant. The open quotient measured by electroglottography increased with increasing air flow.


Long Term Result and Clinical Evaluation of Primary Non-Small Cell Lung Cancer (원발성 비소세포성 폐암의 임상적 고찰과 장기성적)

  • 김양원;김윤규
    • Journal of Chest Surgery
    • /
    • v.29 no.1
    • /
    • pp.43-51
    • /
    • 1996
  • From March 1989 to October 1993, 57 patients were diagnosed with and operated on for primary non-small cell lung cancer, and were evaluated clinically. 1. There were 45 males and 12 females (M:F=3.8:1), and the peak age incidence was in the 6th decade of life (45.6%). 2. Among the preoperative diagnostic methods, the positive rates were 11% for sputum cytology, 50% for bronchial washing cytology, 73% for bronchoscopic biopsy, and 83% for CT-guided percutaneous needle aspiration biopsy. 3. Histopathologically, squamous cell carcinoma accounted for 56.1%, adenocarcinoma 22.8%, bronchioloalveolar cell carcinoma 1%, and undifferentiated large cell carcinoma 1.8%. 4. The operations were pneumonectomy in 35.1%, lobectomy in 38.6%, bilobectomy in 3.5%, segmentectomy in 7%, and exploratory thoracotomy in 15.8%, and the overall resectability was 84.2%. 5. In postoperative staging, stage I was 28.1%, stage II 22.8%, stage IIIa 31.6%, and stage IIIb 17.5%. 6. Postoperative complications developed in 11 cases (19.3%), and there was no operative mortality. 7. The survival rate in resectable cases was 87.0% at 1 year, 61.6% at 2 years, and 44.9% at 5 years. By stage, the 3-year survival rate was 75.8% in stage I, 16.9% in stage II, 60.9% in stage IIIa, and 50% in stage IIIb.


The Effectiveness of Electroglottographic Parameters in Differential Diagnosis of Laryngeal Cancer (후두암 감별진단에 있어 성문전도(Electroglottograph) 파라미터의 유용성)

  • 송인무;고의경;전경명;권순복;김기련;전계록;김광년;정동근;조철우
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.14 no.1
    • /
    • pp.16-25
    • /
    • 2003
  • Background and Objectives : Electroglottography (EGG) is a non-invasive method of monitoring vocal cord vibration by measuring the variation of physiological impedance across the vocal folds through the neck skin. It reveals in particular the vocal fold contact area and is widely used for basic laryngeal research, voice analysis, and synthesis. The purpose of this study is to investigate the effectiveness of EGG parameters in the differential diagnosis of laryngeal cancer. Materials and Methods : The author investigated 10 patients with laryngeal cancer and 25 patients with benign laryngeal disease who visited the Department of Otolaryngology, Pusan National University Hospital. The EGG equipment was devised in the author's department. Among the various EGG parameters, closed quotient (CQ), speed quotient (SQ), speed index (SI), jitter, shimmer, and F0 were determined by an analysis program written in MATLAB® 6.5 (MathWorks, Inc.). To differentiate laryngeal diseases from pathologic voice signals, the author applied the electroglottographic parameters to a neural network with a multilayer perceptron structure. Results : The SQ, SI, jitter, and shimmer values, but not the CQ and F0 values, showed remarkable differences between the benign and malignant laryngeal disease groups. With the artificial neural network, the rate of correctly differentiating laryngeal cancer was over 80% for SQ, SI, jitter, and shimmer, but not for CQ and F0. These results indicate that it is possible to discriminate between benign and malignant laryngeal diseases by EGG parameters using an artificial neural network. Conclusion : If additional EGG parameters that can reveal the pathology of laryngeal diseases are developed and the current classification algorithm is improved, the discrimination of laryngeal cancer will become much more accurate.
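
The abstract above names jitter, shimmer, and closed quotient without giving formulas. The sketch below uses commonly cited textbook definitions: local jitter and shimmer as the mean absolute cycle-to-cycle difference relative to the mean value, and closed quotient as the closed-phase duration divided by the cycle period. These definitions are an assumption for illustration; the study's MATLAB program may have computed the parameters differently, and the sample values are made up.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, as a percentage
    of the mean value. Applied to cycle periods this gives local jitter;
    applied to cycle peak amplitudes it gives local shimmer."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def closed_quotient(closed_time, period):
    """Fraction of one glottal cycle during which the vocal folds are in contact."""
    if period <= 0:
        raise ValueError("period must be positive")
    return closed_time / period


if __name__ == "__main__":
    periods = [0.0100, 0.0102, 0.0099, 0.0101]  # seconds per cycle (made-up values)
    amplitudes = [1.00, 0.97, 1.02, 0.99]       # arbitrary units (made-up values)
    print("jitter  %:", round(local_perturbation(periods), 3))
    print("shimmer %:", round(local_perturbation(amplitudes), 3))
    print("CQ       :", closed_quotient(0.0045, 0.0100))
```

Values like these would form the input features of a classifier such as the multilayer perceptron mentioned in the abstract.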
