Search | Korea Science

Spectral Shape Invariant Real-time Voice Change System (스펙트럼 형태 불변 실시간 음성 변환 시스템)

Kim Weon-Goo
- Journal of the Korean Institute of Intelligent Systems / v.15 no.1 / pp.48-52 / 2005
This paper proposes a spectral-shape-invariant real-time voice change method that converts a speaker's voice into a mechanical voice. LPC analysis and synthesis are used to preserve the spectrum of the voice while allowing the pitch of the synthesized speech to be changed freely. A gain matching method is applied to the excitation signal generator so that the changed voice sounds natural. Voice change experiments were conducted to evaluate the proposed method. The results showed that the original speech signal is changed into a mechanical voice signal that conveys the content of the speaker's speech correctly despite the drastic change in pitch. The system was implemented on a TI TMS320C6711 DSK board to verify that it runs in real time.
https://doi.org/10.5391/JKIIS.2005.15.1.048
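The abstract above describes LPC analysis and synthesis with a gain-matched excitation. The following is a minimal sketch of that general idea, not the paper's implementation: the function names, the impulse-train excitation, and the per-frame energy matching rule are all assumptions made for illustration.

```python
import math

def autocorrelation(frame, order):
    """Return autocorrelation lags r[0..order] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a, prediction_error).

    a[k-1] multiplies x[n-k] in the predictor x[n] ~ sum_k a[k-1] * x[n-k].
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # silent or perfectly predictable frame
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err

def resynthesize(frame, order, period):
    """Re-excite the frame's LPC filter with a pulse train of the given period.

    Hypothetical helper: `period` (in samples) sets the new pitch.
    """
    a, _ = levinson_durbin(autocorrelation(frame, order), order)
    # Analysis: inverse-filter the frame to get the residual and its energy.
    residual = [frame[n] - sum(a[k] * frame[n - 1 - k]
                               for k in range(order) if n - 1 - k >= 0)
                for n in range(len(frame))]
    residual_energy = sum(e * e for e in residual)
    # Gain matching (assumed rule): pulses carry the same energy as the residual.
    pulse_count = len(range(0, len(frame), period))
    gain = math.sqrt(residual_energy / pulse_count)
    excitation = [gain if n % period == 0 else 0.0 for n in range(len(frame))]
    # Synthesis: the all-pole filter keeps the original spectral envelope.
    out = []
    for n in range(len(frame)):
        out.append(excitation[n] + sum(a[k] * out[n - 1 - k]
                                       for k in range(order) if n - 1 - k >= 0))
    return out, excitation, residual_energy
```

Because only the excitation is replaced, the filter coefficients (and so the spectral envelope) stay those of the input frame, while `period` can be set independently of the original pitch.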

Extraction and Analysis of Voice Feature Parameter of Chungbuk News Announcers (충북방송 뉴스 진행자의 음성적 특징 추출 및 분석)

Kim, Bong-Hyun;Lee, Se-Hwan;Ka, Min-Kyoung;Cho, Dong-Uk;J.Bae, Young-Lae
- Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.363-364 / 2009
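As the broadcasting industry develops technically and structurally, viewers' standards rise, and the cultural industry changes rapidly, broadcasting continues to grow enormously in modern society. Amid these changes, viewers' standards and shifting interests remain a constant focus of attention, and it is the role of the broadcast presenter to understand them and lead the broadcast smoothly. In this paper, we therefore collected voice samples from presenters who anchor the news at the three broadcasting companies in the Chungbuk region, applied a variety of voice analysis parameters, and, based on the resulting values, carried out experiments to extract the characteristic features of the presenters' voices. In particular, to analyze the influence that can be conveyed through the voice, we applied voice analysis parameters such as pitch, jitter, shimmer, stability, and spectrogram, and compared and analyzed the results.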
https://doi.org/10.3745/PKIPS.y2009m11a.363

Analysis of Voice Color Similarity for the development of HMM Based Emotional Text to Speech Synthesis (HMM 기반 감정 음성 합성기 개발을 위한 감정 음성 데이터의 음색 유사도 분석)

Min, So-Yeon;Na, Deok-Su
- Journal of the Korea Academia-Industrial cooperation Society / v.15 no.9 / pp.5763-5768 / 2014
Maintaining a consistent voice color is important when a single synthesizer combines a normal voice with various emotional voices. When a synthesizer is built from recordings of excessively expressed emotions, the voice color cannot be maintained and each synthetic utterance can sound like the voice of a different speaker. In this paper, speech data were recorded and the change in voice color was analyzed in order to develop an emotional HMM-based speech synthesizer. Building a speech synthesizer requires recording a voice and constructing a database, and the recording process is particularly important for an emotional speech synthesizer. Monitoring is needed because it is quite difficult to define an emotion and to hold it at a particular level. The synthesizer uses a normal voice and three emotional voices (Happiness, Sadness, Anger), each with two levels, High and Low. To analyze the voice color of the normal and emotional voices, the average spectrum, measured as the accumulated spectrum of vowels, was used, and the F1 (first formant) calculated from the average spectrum was compared. The voice similarity of the Low-level emotional data was higher than that of the High-level emotional data, and the change in voice similarity can be used to monitor recordings with the proposed method.
https://doi.org/10.5762/KAIS.2014.15.9.5763

Performance Analysis of Speech Recognition by Increasing the Number of Reference Speaker (피춰 추출 관점에서 기준 화자 수 증가에 따른 음성 인식 성능 분석)

이철희
- Proceedings of the Korean Society of Broadcast Engineers Conference / 1998.06a / pp.111-114 / 1998
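Speech recognition involves comparing a given utterance with predefined reference utterances and finding the most similar one. Because speaking rate and loudness differ from speaker to speaker even for the same word, speaker-independent speech recognition can improve its performance by using utterances from multiple speakers as references. However, recognition performance shows a limit to how much it improves even as the number of speakers increases. The causes can be traced to the features currently extracted from speech not containing enough of the information needed for recognition, and to the efficiency of the recognition algorithm, among others. In this paper, Korean digit utterances spoken by 10 men and 10 women are used as the recognition targets; mel-cepstrum features are extracted and recognition is performed with DTW in order to analyze, from the viewpoint of feature extraction, how the recognition rate changes with the number of speakers and where its limit lies.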
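The recognition scheme in the abstract above, matching an utterance against reference utterances from several speakers with DTW, can be sketched as follows. This is an illustrative sketch only: the function names, the Euclidean local distance, and the unconstrained step pattern are assumptions, and the feature vectors stand in for mel-cepstrum frames that would be computed elsewhere.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean distance
            cost[i][j] = local + min(cost[i - 1][j],      # stretch seq_b frame
                                     cost[i][j - 1],      # stretch seq_a frame
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def recognize(utterance, references):
    """Return the label of the closest reference.

    references: list of (label, feature_sequence); a label may appear
    several times, once per reference speaker.
    """
    best_label, _ = min(references,
                        key=lambda ref: dtw_distance(utterance, ref[1]))
    return best_label
```

Adding more reference speakers simply lengthens `references`, which is why recognition cost grows with the number of speakers while accuracy depends on how much the features themselves discriminate.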

The action of laryngeal and strap muscle on pitch control (후두근 및 경부근이 pitch 조절에 미치는 영향)

홍기환;김영중;전동석
- Proceedings of the KSLP Conference / 1993.12a / pp.10-10 / 1993
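Many factors influence the mechanism of pitch control during phonation, but the principal ones are changes in the tension, mass, and length of the vocal folds, and of these the most important is the change in tension. As for what affects vocal fold tension, the intralaryngeal factors that change pitch are, as is well known, the increase in internal tension produced by the vocalis muscle and the increase in external tension produced by lengthening through the cricothyroid muscle; it is also well known that, as extralaryngeal factors, the suprahyoid and infrahyoid muscles affect pitch. (abridged)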

Quantification of Glottal Cycle According to the Variation of Frequency and Intensity in Normal Speaker (발성의 강도와 주파수 변화에 따른 성대 움직임의 정량적 분석)

손영익;이경아;류준선;백정환
- Proceedings of the KSLP Conference / 1996.11a / pp.92-92 / 1996
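Objective evaluation of the glottal cycle through quantification of videostroboscopic images is expected to play an important role in differentiating various disorders and in comparing results before and after treatment, yet reference values for normal phonation or pathological conditions, and their significance, have rarely been reported. The authors therefore quantified changes in the glottal cycle according to changes in the frequency and intensity of phonation in normal adults, with the aim of providing baseline data for future research and clinical application. (abridged)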

A Study on the Visual Speech Recognition based on the Variations of Lip Shapes (입모양 변화에 의한 영상음성 인식에 관한 연구)

이철우;계영철
- Proceedings of the KAIS Fall Conference / 2001.05a / pp.188-191 / 2001
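In this paper, we study a method of recognizing spoken utterances by analyzing changes in the speaker's lip shape. We compare and analyze how different choices of feature vector representing lip-shape changes affect recognition performance. Specifically, ASM (Active Shape Model) parameters and articulatory parameters were selected as feature vectors and their recognition performance was compared. Simulation results confirmed that using articulatory parameters yields better recognition performance and requires less computation.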

A Case Report Diagnosed with Rosai-Dorfman Disease by Voice Change (음성변화로 진단된 Rosai-Dorfman병 1예)

Hwang, Hye Jin;Lee, Eun Jung;Choi, Sung Eun;Choi, Hong-Shik
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.25 no.1 / pp.42-46 / 2014
Rosai-Dorfman disease is a rare disorder of unknown etiology, usually associated with benign proliferation of hematopoietic and fibrous tissue, that often manifests in the head and neck region. We report a case of extranodal Rosai-Dorfman disease presenting in the neck, subglottis, and nasal floor that was diagnosed through voice change.

Voice Source Estimation Using Robust Sequential SVD (견실 순차 특이치분해를 이용한 음원추정)

홍성훈
- Proceedings of the Acoustical Society of Korea Conference / 1993.06a / pp.75-79 / 1993
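This paper proposes a new sequential processing algorithm for estimating rapidly varying voice source waveforms. The method (1) examines the problems of RLS (recursive least squares), a representative existing sequential analysis method; (2) to address them, reconstructs the observation matrix with a reduced-rank SVD (singular value decomposition) of optimal order; and (3) applies a robustness concept to find the optimal vocal tract parameters and then applies an inverse filter to separate out the voice source effectively. When the voice source is estimated with the proposed method, rapidly varying source waveforms can be estimated well, and the vocal tract parameters, separated from the characteristics of the source, can also be estimated effectively. This work is essential for improving naturalness and realizing speaker individuality in speech synthesis, and can be used to represent diverse types of speech. It can also be used in speech coding, speaker recognition, and speech recognition.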
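For orientation, the RLS baseline that the abstract above takes as its starting point can be sketched as below. This is not the proposed robust sequential SVD method; it is a plain exponentially weighted RLS estimator of autoregressive (vocal-tract-like) coefficients, with the a priori prediction error serving as a rough inverse-filtered source estimate. The function name and the `forgetting` and `delta` parameters are assumptions for illustration.

```python
def rls_ar(signal, order, forgetting=0.99, delta=1000.0):
    """Track AR coefficients with RLS; returns (w, a priori residual).

    w[k] multiplies signal[n - 1 - k] in the one-step predictor.
    """
    w = [0.0] * order
    # P approximates the inverse of the weighted input correlation matrix.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    residual = []
    for n in range(order, len(signal)):
        phi = [signal[n - 1 - k] for k in range(order)]   # past samples
        P_phi = [sum(P[i][j] * phi[j] for j in range(order))
                 for i in range(order)]
        denom = forgetting + sum(phi[i] * P_phi[i] for i in range(order))
        gain = [v / denom for v in P_phi]
        err = signal[n] - sum(w[i] * phi[i] for i in range(order))
        w = [w[i] + gain[i] * err for i in range(order)]
        # P is symmetric, so phi^T P equals P_phi transposed.
        P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting
              for j in range(order)] for i in range(order)]
        residual.append(err)   # inverse-filter output: crude source estimate
    return w, residual
```

A forgetting factor below 1 lets the estimate follow a changing vocal tract at the price of noisier coefficients, which is one of the trade-offs a reduced-rank or robust formulation would aim to improve.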

Measurement of Voice Parameters Before and After Short-Term Endotracheal Intubation (단기간 기관내 삽관전, 후 음성지표의 측정)

서영일;남순열
- Proceedings of the KOR-BRONCHOESO Conference / 1997.04a / pp.116-116 / 1997
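Background and objectives: Endotracheal intubation performed for general anesthesia causes minor laryngeal injury through pressure and friction from contact between the tube and the inner surface of the vocal folds. The authors sought objective voice parameters that can measure the presence of injury and recovery by analyzing the voice before and after short-term endotracheal intubation. Subjects and methods: The subjects were 10 adult male and 15 adult female patients in whom an orotracheal tube was placed for general anesthesia during surgery for chronic otitis media. One day before surgery and 24 hours after surgery, each patient sustained the vowel "a", and the noise-to-harmonic ratio (NHR), jitter, shimmer, and fundamental frequency were measured and compared using the MDVP (Multi-Dimensional Voice Program) of the CSL 4300B (Kay Elemetrics Corp.). Results: In both men and women, jitter and shimmer tended to increase, from a mean of 0.70% to 1.06% and from 1.92% to 2.28% respectively, but the increases were not statistically significant. Fundamental frequency showed no change, going from a mean of 220 Hz to 221 Hz in women and from 125 Hz to 128 Hz in men, and NHR likewise averaged 0.11 with no observable change before and after surgery. Conclusion: These results suggest that vocal fold injury from short-term intubation of 2 to 6 hours is minor and recovers within 24 hours. Further studies measuring voice parameters after endotracheal intubation lasting more than 6 hours, or after long-term intubation lasting several days or more, are needed.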
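The jitter, shimmer, and fundamental frequency figures reported above are cycle-to-cycle perturbation measures. A minimal sketch of one common definition (mean absolute difference between consecutive cycles, relative to the mean) follows. It assumes the glottal periods and peak amplitudes have already been extracted, and it does not reproduce MDVP's exact algorithm; the function names are hypothetical.

```python
def perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_abs_diff = (sum(abs(a - b) for a, b in zip(values, values[1:]))
                     / (len(values) - 1))
    return 100.0 * mean_abs_diff / (sum(values) / len(values))

def voice_parameters(periods_s, amplitudes):
    """Return (jitter %, shimmer %, mean F0 in Hz) for a sustained vowel.

    periods_s: glottal cycle durations in seconds.
    amplitudes: peak amplitude of each cycle (any linear unit).
    """
    jitter = perturbation_percent(periods_s)      # period perturbation
    shimmer = perturbation_percent(amplitudes)    # amplitude perturbation
    mean_f0 = len(periods_s) / sum(periods_s)     # 1 / mean period
    return jitter, shimmer, mean_f0
```

Comparing these values before and after intubation, as in the study, only needs the same extraction settings for both recordings so that differences reflect the voice rather than the analysis.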
