• Title/Abstract/Keywords: Speech de-identification

3 search results (processing time: 0.015 s)

Comparison of Korean Speech De-identification Performance of Speech De-identification Model and Broadcast Voice Modulation

  • 김승민;박대얼;최대선
    • Smart Media Journal
    • /
    • Vol. 12, No. 2
    • /
    • pp.56-65
    • /
    • 2023
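  • In broadcasts such as news and investigative programs, the voices of informants are modulated to protect their identity. Pitch adjustment is the most widely used voice modulation method, but a voice close to the original can easily be restored by readjusting the pitch. Broadcast voice modulation therefore cannot properly protect a speaker's identity and is vulnerable from a security standpoint, so a new voice modulation method is needed to replace it. In this paper, we use the Lightweight speech de-identification model, whose de-identification performance was verified in the Voice Privacy Challenge, as the comparison model, and we conduct comparison experiments and an evaluation of de-identification performance against the pitch-based broadcast voice modulation method. Of the six modulation methods in the Lightweight speech de-identification model, we used the three with good de-identification performance: McAdams, Resampling, and Vocal Tract Length Normalization (VTLN). To compare de-identification performance on Korean speech, we conducted a human test and an EER (Equal Error Rate) test. In both the human test and the EER test, the VTLN modulation method showed higher de-identification performance than broadcast modulation. Consequently, the modulation methods of the Lightweight model have sufficient de-identification performance for Korean speech and could replace the security-vulnerable broadcast voice modulation.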
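The EER test mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a speaker verifier that outputs similarity scores where higher means "same speaker"; the function name `equal_error_rate` and the toy scores are hypothetical and are not taken from the paper. In de-identification evaluation, a higher EER for the verifier generally indicates that the modulated voice is harder to link to the original speaker.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Return (eer, threshold) for similarity scores (higher = same speaker).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    # Candidate thresholds: every distinct observed score, in ascending order.
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    # False rejection rate: same-speaker trials scoring below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: different-speaker trials at or above the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]

if __name__ == "__main__":
    # Toy scores only: well-separated trials versus fully overlapping trials.
    separable = equal_error_rate([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.4])
    overlapping = equal_error_rate([0.2, 0.4, 0.6, 0.8], [0.2, 0.4, 0.6, 0.8])
    print("separable EER:", separable[0])
    print("overlapping EER:", overlapping[0])
```

The threshold sweep over observed scores is a simple approximation; evaluation toolkits often interpolate between operating points instead.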

Perception of Spanish /l/ - /r/ distinction by native Japanese

  • Miguelina Guirao;Jorge A. Gurlekian;Maria A. Garcia Jurado
    • Korean Society of Phonetic Sciences and Speech Technology (KSPS): Conference Proceedings
    • /
    • KSPS October 1996 Conference Proceedings
    • /
    • pp.337-342
    • /
    • 1996
  • In previous works we have reported phonetic similarities between Japanese and Spanish vowels and syllabic sounds (1) (2) (3) (4). In the present communication we explore the relative importance of the duration of the consonantal segment in eliciting the Spanish /l/ - /r/ distinction by native Japanese talkers. Three Argentine and three trained native Japanese talkers recorded /l-r/ combined with /a/ in VCV sequences. Modifications of consonant duration and vowel context with transitions were made by editing natural /ala/ sounds. Mixed VCV sequences were produced by combining sounds of both languages. Perceptual tests were performed by presenting the speech material to trained and non-trained native Japanese listeners. In a first session a discrimination procedure was applied. The items were arranged in pairs and listeners were told to indicate the pair that sounded different. In the following session they were asked to identify and type the letter corresponding to each one of the items. Responses are examined in terms of the critical duration of the interval between vowels. Preliminary results indicate that the duration of intervocalic intervals was a relevant cue for the identification of /l/ and /r/. It seems that to differentiate the two sounds, Japanese listeners required relatively longer interval steps than the Argentine subjects. There was a tendency to confuse /l/ for /r/ more frequently than vice versa.


The identification of /l/ in Spanish and French

  • Jorge A. Gurlekian;Benoit Jacques;Miguelina Guirao
    • Korean Society of Phonetic Sciences and Speech Technology (KSPS): Conference Proceedings
    • /
    • KSPS October 1996 Conference Proceedings
    • /
    • pp.521-528
    • /
    • 1996
  • This presentation explores the perceptual characteristics of the lateral sound /l/ in CV syllables. In initial position we found that /l/ has well-marked formant transitions. Several questions then arise: 1) Are these formant structures dependent on the following vowel? 2) Do the formant transitions give an additional cue for identification? Considering that the French vocalic system presents a greater variety of vowels than Spanish, several experiments were designed to verify to what extent a more extensive range of vocalic timbres contributes to the perception of /l/. Natural emissions of /l/ produced in Argentine Spanish and Canadian French CV syllables were recorded, where V was successively /i, e, a, o, u/ for Spanish and /i, e, $\varepsilon$, a, $\alpha$, o, u, y, $\phi$/ for French. For each item, the segment C was maintained and V was replaced, by cutting and splicing, with each of the remaining vowels without transitions. Results of the identification tests for Spanish show that natural /l/ segments with low F1 and high formants F3, F4 can be clearly identified in the /i, e, u/ vowel contexts without transitions. For French subjects the combination of /l/ with a vowel without transitions yielded correct identifications for its own original vowel context in /e, $\varepsilon$, y, $\phi$/. For both languages, in all these combinations, F1 values remained rather steady along the syllable. In the case of /o, u/ the F2 difference very likely led to a variety of perceptions of the original /l/. For example, in /lu/, French subjects reported some identifications of /l/ as a vowel, mainly /y/. Our observations reinforce the importance of F1 as a relevant cue for /l/, and the incidence of the relative distance between the formant frequencies of both components.
