Integrated Search | Korea Science

Voice Conversion Using Linear Multivariate Regression Model and LP-PSOLA Synthesis Method

권홍석;배건성
- The Journal of the Acoustical Society of Korea (한국음향학회지), Vol. 20, No. 3, pp. 15-23, 2001
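This paper describes a voice conversion technique that makes speech uttered by one speaker sound as if it had been uttered by another, and experiments with a conversion method that transforms the vocal tract parameters and the excitation signal parameters of the speakers independently. The vocal tract parameters are converted by extracting the LPC (Linear Predictive Coefficient) cepstrum from the input speech signal and applying it to a linear multivariate regression model. The excitation parameters are converted by extracting the residual signal and converting the average pitch period between speakers with the LP-PSOLA (Linear Predictive-Pitch Synchronous Overlap and Add) synthesis method. The experimental results show that speech converted with the linear multivariate regression model and the LP-PSOLA synthesis method is similar to the speech of the target speaker.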
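The vocal tract conversion step described above maps source-speaker feature vectors to target-speaker vectors with a linear multivariate regression. The following is a minimal sketch of such a mapping, assuming time-aligned pairs of source and target feature vectors and an ordinary least-squares fit solved through the normal equations; the abstract does not state the paper's fitting procedure, and the names `fit_linear_map` and `convert` are hypothetical.

```python
def fit_linear_map(X, Y):
    """Least-squares fit of Y ~ [X, 1] W (assumed formulation, not the paper's)."""
    Xa = [list(row) + [1.0] for row in X]      # append a bias term to each source vector
    d, k = len(Xa[0]), len(Y[0])
    # Normal equations: (Xa^T Xa) W = Xa^T Y
    A = [[sum(r[i] * r[j] for r in Xa) for j in range(d)] for i in range(d)]
    B = [[sum(r[i] * y[j] for r, y in zip(Xa, Y)) for j in range(k)] for i in range(d)]
    M = [A[i] + B[i] for i in range(d)]        # augmented matrix [A | B]
    # Gauss-Jordan elimination with partial pivoting (assumes A is non-singular)
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(d):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[d:] for row in M]              # W has shape (d, k)


def convert(x, W):
    """Apply the fitted map to one source feature vector."""
    xa = list(x) + [1.0]
    return [sum(xa[i] * W[i][j] for i in range(len(xa))) for j in range(len(W[0]))]
```

In practice the rows of `X` and `Y` would be LPC cepstrum vectors from the source and target speakers for matching frames; how those frames are aligned is left open here.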

The Relationship between The Voicing Method and Vocal Fold Nodule located in Different levels

안철민;문고정;정덕희
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics (대한후두음성언어의학회지), Vol. 13, No. 1, pp. 33-39, 2002
Background and Objectives: Vocal fold nodules, which are caused by excessive contact or vibration of the vocal folds, are classified as soft or hard according to their hardness or duration. Laryngologists sometimes observe nodules located at different levels of the vocal fold. The authors hypothesized that nodules located at different levels might have different causes, and therefore studied the relationship between voicing technique and vocal fold nodules located at different levels. Materials and Methods: One hundred forty-nine patients with vocal fold nodules were evaluated. The sites and shapes of the nodules were investigated using videostroboscopy. Videokymography (VKG) was also used to scan the center of the nodules during phonation, and the findings were classified into several types. The same procedures were performed on a normal subject while he simulated various types of voicing, and the two sets of findings were compared. Three types of lesion could be distinguished: the ML group, with lesions located from the middle to the lower level; the MH group, with lesions located from the middle to the upper level; and the HL group, with lesions located from the lower to the upper level of the vocal folds. Results: The VKG findings of the ML group were similar to those of simulated hard glottal attack and vocal fry. The VKG findings of the MH group were similar to those of simulated whispering or high-pitched voicing. The VKG findings of the HL group were similar to those of simulated loud voicing. Conclusions: The authors concluded that vocal fold nodules with different shapes and at different levels are related to different types of voicing.

Robust Pitch Detection Algorithm for Pathological Voice inducing Pitch Halving and Doubling

장승진;최성희;김효민;최홍식;윤영로
- Proceedings of the Korean Institute of Electrical Engineers (KIEE) Conference (대한전기학회:학술대회논문집), KIEE 2007 38th Summer Conference, pp. 1797-1798, 2007
In the field of voice pathology, diverse statistics extracted from pitch estimation are commonly used to assess voice quality. In this study, we propose a robust pitch detection algorithm (PDA) that can estimate the pitch of pathological voices with benign vocal fold lesions. We also compare the proposed algorithm with three established pitch detection algorithms: autocorrelation, the simplified inverse filtering technique, and a nonlinear state-space embedding method. Using a database of 99 pathological voices and 30 normal voices, pitch detection errors were analyzed between pathological and normal voices and among the types of pathological voices. Gross pitch error increased somewhat for pathological voices, with an especially large increase for the PDA based on nonlinear time series. When pathological voices were classified by aperiodicity and degree of chaos, pitch errors grew as the voice became more aperiodic and chaotic. Consequently, the severity of the tested voice should be assessed in order to obtain accurate pitch estimates.
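Autocorrelation, one of the baseline methods named above, and the gross pitch error measure can be sketched in a few lines. This is an illustrative baseline only, not the algorithm proposed in the paper; the search range of 60 to 400 Hz, the 20% error threshold, and the function names are assumptions.

```python
import math


def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate F0 (Hz) of one voiced frame from the autocorrelation peak."""
    lag_min = int(fs / fmax)                          # shortest period searched
    lag_max = min(int(fs / fmin), len(frame) - 1)     # longest period searched
    best_lag, best_r = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation of the frame at this lag (unnormalized sum)
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag


def gross_pitch_error(estimates, references, tol=0.2):
    """Fraction of frames whose estimate is off by more than tol * reference."""
    gross = sum(1 for e, r in zip(estimates, references) if abs(e - r) > tol * r)
    return gross / len(references)
```

Pitch halving and doubling both count as gross errors under this threshold, since an estimate of 100 Hz or 400 Hz against a 200 Hz reference deviates by far more than 20%.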

Pitch trajectories of English vowels produced by American men, women, and children

Yang, Byunggon
- Phonetics and Speech Sciences (말소리와 음성과학), Vol. 10, No. 4, pp. 31-37, 2018
Pitch trajectories reflect a continuous variation of vocal fold movements over time. This study examined the pitch trajectories of English vowels produced by 139 American English speakers, statistically analyzing their trajectories using the Generalized Additive Mixed Models (GAMMs). First, Praat was used to read the sound data of Hillenbrand et al. (1995). A pitch analysis script was then prepared, and six pitch values at the corresponding time points within each vowel segment were collected and checked. The results showed that the group of men produced the lowest pitch trajectories, followed by the groups of women, boys, then girls. The density line showed a bimodal distribution. The pitch values at the six corresponding time points formed a single dip, which changed gradually across the vowel segment from 204 to 193 to 196 Hz. The normality tests performed on the pitch data rejected the null hypothesis. Nonparametric tests were therefore conducted to discover the significant differences in the values among the four groups. The GAMMs, which analyzed all the pitch data, produced significant results among the pitch values at the six corresponding time points but not between the two groups of boys and girls. The GAMMs also revealed that the two groups were significantly different only at the first and second time points. Accordingly, the methodology of this study and its findings may be applicable to future studies comparing curvilinear data sets elicited by experimental conditions.
https://doi.org/10.13064/KSSS.2018.10.4.031

The Changes in the Closed Quotient of Trained Singers and Untrained Controls Under Varying Intensity at a Constant Vocal Pitch

김한수;전용선;정성민;조근경;박은희
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics (대한후두음성언어의학회지), Vol. 16, No. 1, pp. 28-32, 2005
Background and Objectives: The two most important factors in voice production are respiratory function, which is the power source of the voice, and glottic closure, which transforms the airflow into sound signals. The purpose of this study was to investigate the differences between trained singers and untrained controls under varying intensity at a constant vocal pitch by simultaneously using the airway interruption method and electroglottography (EGG). Materials and Methods: Twenty trained singers (10 male, 10 female) were studied under two different intensity conditions at a constant vocal pitch (/G/). Mean flow rate (MFR), subglottic pressure (Psub), and intensity were measured by aerodynamic testing with the Phonatory Function Analyzer. Closed quotient (CQ), jitter, and shimmer were also measured by electroglottography using Lx Speech Studio. These data were compared with those of normal controls. Results: MFR and Psub increased in the high-intensity condition in all subject groups, but the increase was not statistically significant. A statistically significant increase in CQ was observed in male trained singers in the high-intensity condition (untrained male: 51.31 ± 3.70%, trained male: 55.52 ± 6.07%, p = .039). Shimmer percent, one of the phonatory stability parameters, also decreased significantly in all subject groups (p < .001). Conclusion: The phonation of trained singers was more efficient than that of untrained controls. The results indicate that trained singers can increase loudness with little change in mean flow rate or subglottic pressure but with a greater increase in the glottic closed quotient.
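The perturbation and closure measures reported above can be illustrated with their common textbook definitions. This is a sketch under assumptions: the per-cycle periods, peak amplitudes, and closed-phase durations are taken as already extracted from the EGG signal, the "local" definitions used here may differ from those implemented in Lx Speech Studio, and the function names are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent.

    Applied to cycle periods this gives local jitter; applied to cycle
    peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def closed_quotient_percent(closed_durations, periods):
    """Mean share of each glottal cycle during which the glottis is closed, in percent."""
    ratios = [c / p for c, p in zip(closed_durations, periods)]
    return 100.0 * sum(ratios) / len(ratios)
```

A perfectly regular voice gives 0% for both perturbation measures, and a higher closed quotient means the vocal folds stay in contact for a larger part of each cycle.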

A Study on the CAI Development for Vocal Training in Applied Music

문원경;이승연
- Journal of the HCI Society of Korea (한국HCI학회논문지), Vol. 11, No. 3, pp. 13-22, 2016
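The apprenticeship-style approach to teaching singing and instruments in applied music has been accepted with little change since applied music education was introduced in Korea. Few discussions or proposals have addressed teaching methods other than "one-on-one private lessons" or "apprenticeship-style instruction by an assigned major-field instructor". Since applied music education was introduced in Korea in the late 1980s, the development of CAI (Computer Aided Instruction) courseware for applied music education has not been as active as in other fields. Applied music has, of course, seen rapid advances in computer-based music production and music for visual media. However, these advanced computer programs are not being actively applied to applied music education. This study investigates learning methods that use the advanced features of music production software to improve on traditional apprenticeship-style education in applied music singing. In particular, this paper uses auto tune, a pitch shift technology developed for correcting the pitch of recordings. With it, we propose a learning method that can improve pitch accuracy in vocal training by visually providing real-time pitch feedback or post-recording monitoring. Pitch accuracy is not the only criterion for judging singing ability, and pitch accuracy itself is affected by complex physical abilities such as phonation and pronunciation. Nevertheless, we expect this study to present one way in which computer-based education can help applied music vocal learners overcome constraints of time and space and train more efficiently.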

A Case Study on Vocal Aerobic Treatment Voice Therapy Development and Application for Classical Singers

유재연;이하나
- Journal of Rehabilitation Research (재활복지), Vol. 22, No. 1, pp. 157-168, 2018
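This study examined the effect of Vocal Aerobic Treatment (VAT), which is based on semi-occluded vocal tract exercises, on improving the voice of a soprano classical singer. The subject was one soprano singer who complained of voice problems caused by vocal fold nodules. Acoustic evaluation and subjective voice evaluation were carried out before and after treatment and the measurements were compared; VAT was administered twice a week for a total of 32 sessions. For the acoustic evaluation, MDVP (multi-dimensional voice program) and VRP (voice range profile) were used to assess pitch, voice quality, and vocal range. For the subjective evaluation, the SVHI (singing voice handicap index) was used to assess subjective voice satisfaction. In the pitch evaluation, the subject maintained a fundamental frequency (Fo) appropriate for a soprano singer after treatment. In the voice quality evaluation, jitter, shimmer, and noise-to-harmonic ratio (NHR) values were lower than before treatment. In the vocal range evaluation, the range widened and the number of semitones increased from 30 to 35. In the subjective evaluation, the total questionnaire score divided by the number of items decreased from 3.6 to 0.6, and the subject reported that the perceived severity of the voice problem was mild. Taken together, these results suggest that VAT is effective in improving the voice of classical singers. However, because this is a case study of the treatment effect of VAT on a single soprano singer, further studies of its effectiveness with more singers are needed. Follow-up research is also needed on voice care and voice therapy programs for a variety of professional voice users, not only classical singers.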
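The vocal range result above is expressed in semitones. As a small illustration of how a frequency range maps to a semitone count, the sketch below uses the standard equal-temperament relation of 12 semitones per octave; the function name and any frequencies passed to it are hypothetical, since the abstract reports only the semitone counts.

```python
import math


def semitone_range(f_low_hz, f_high_hz):
    """Width of a vocal range in semitones (12 semitones per doubling of frequency)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)
```

Under this relation, widening a range from 30 to 35 semitones corresponds to the ratio of highest to lowest frequency growing from about 5.7 to about 7.6.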
https://doi.org/10.16884/JRR.2018.22.1.157

On Altering the Pitch of Speech Signals in Waveform Coding: Altering Method by the LPC and the Pitch Halving

민경중
- Proceedings of the Acoustical Society of Korea Conference (한국음향학회:학술대회논문집), Acoustical Society of Korea 1991 Conference Proceedings, pp. 45-49, 1991
In speech synthesis, high-quality waveform coding is mainly used for synthesis by analysis. However, it is difficult to apply waveform coding to synthesis by rule, because the parameters of this coding are not separated into excitation parameters and vocal tract parameters. In this paper, we propose a new pitch alteration method that can change the pitch periods in waveform coding. The proposed method expands the pitch period using the LPC synthesis method and then compresses the period using the waveform halving technique. This makes it possible to use waveform coding for synthesis by rule in speech processing.

A Study on Number sounds Speaker recognition using the Pitch detection and the Fuzzified pattern

김연숙;김희주;김경재
- Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지), Vol. 8, No. 3, pp. 73-79, 2003
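This paper proposes a speaker recognition algorithm that incorporates pitch detection and fuzzified pattern matching. It uses a pitch pattern built from pitch, which expresses the individuality of a voice, and uses a binarized spectrum as the speech parameter. To cover the full range of temporal variation caused by nonlinear utterance timing, reference patterns are built with fuzzy membership functions, which can compensate for the ambiguity of the speech signal, and recognition is performed by fuzzified pattern matching.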

Analysis of Phonatory Aerodynamic & Electroglottography of a Countertenor

남도현;최성희;최재남;최홍식
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics (대한후두음성언어의학회지), Vol. 17, No. 1, pp. 43-48, 2006
Background and Objectives: Countertenors can produce a high vocal pitch like that of a female classical singer and use both the modal and falsetto registers. This study compared the phonatory characteristics of the modal and falsetto registers of a countertenor. Materials and Methods: A male countertenor with 8 years of experience was examined using videostroboscopy. His voice was analyzed with aerodynamic measures, namely fundamental frequency (F0), mean airflow rate (MFR), intensity (SPL), and subglottal air pressure (Psub), using a phonatory function analyzer (Nagashima); with acoustic measures, namely jitter, shimmer, HNR, and closed quotient (CQ), using electroglottography (EGG) with Lx Speech Studio (Laryngograph Ltd, UK); and with the voice range profile (VRP) of CSL (Kay Elemetrics). Results: On stroboscopy, the longitudinal length of the vocal folds increased in the falsetto register, and the upper margin of the vocal folds vibrated with incomplete closure of the true vocal folds. In the aerodynamic analysis, intensity was the same in the modal and falsetto registers, whereas MFR, Psub, and MPT were higher in the falsetto register. In the electroglottographic analysis, CQ was high in the modal register and was much higher in the high-pitch falsetto than in the loud falsetto. In the VRP, intensity was similar although F0 differed between the modal and falsetto registers. Conclusion: The results imply that the countertenor could produce a powerful voice quality by increasing respiratory pressure and respiratory volume even though glottal closure was incomplete. In addition, no change in the EGG waveform was observed, and the voice range was similar to that of an alto.
