• Title/Abstract/Keywords: Phonetic Distance

42 results (processing time: 0.032 s)

명령어간 거리를 최대화하는 음성 게임 명령어의 선택 방법 (A Method for Selecting Voice Game Commands to Maximize the Command Distance)

  • 김상철
    • 한국게임학회 논문지 / Vol. 19, No. 4 / pp.97-108 / 2019
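  • Interest in voice game commands has grown recently because of the variety and convenience they bring to input methods. The recognition rate of voice commands depends not only on the performance of the recognition engine but also on the distance between commands. The command distance is the phonetic difference between the pronunciations of commands; the larger this distance, the higher the recognition rate. In this paper, given a set of candidate command words for each command, we model the problem of selecting the combination of command words that maximizes the average distance between commands as an IP (Integer Programming) problem. We also propose an SA (Simulated Annealing) based method for solving the command selection problem. We analyze the characteristics of our method based on experiments under various conditions, such as the number of commands and the allowed command length.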
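The selection problem described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the function names (`edit_distance`, `average_distance`, `select_commands`), the cooling schedule, and all parameter values are assumptions, and the Levenshtein distance over spellings is only a stand-in for a real phonetic distance between pronunciations.

```python
import math
import random
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a stand-in here for a true phonetic distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def average_distance(words):
    """Mean pairwise distance over the selected command words (needs >= 2 words)."""
    pairs = list(combinations(words, 2))
    return sum(edit_distance(a, b) for a, b in pairs) / len(pairs)


def select_commands(candidates, steps=2000, t_start=2.0, t_end=0.01, seed=0):
    """Pick one word per command by simulated annealing (hypothetical settings).

    candidates: list of lists, one list of candidate words per command.
    Returns the best selection seen and its average pairwise distance.
    """
    rng = random.Random(seed)
    current = [rng.choice(options) for options in candidates]
    current_score = average_distance(current)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Neighbor move: re-draw the word used for one randomly chosen command.
        i = rng.randrange(len(candidates))
        proposal = list(current)
        proposal[i] = rng.choice(candidates[i])
        score = average_distance(proposal)
        delta = score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = proposal, score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score
```

For example, `select_commands([["jump", "hop"], ["fire", "shoot"], ["run", "dash"]])` returns one word per command together with the average pairwise distance of that selection. Swapping `edit_distance` for a distance computed over phoneme sequences would bring the sketch closer to the setting the abstract describes.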

Secure Blocking + Secure Matching = Secure Record Linkage

  • Karakasidis, Alexandros;Verykios, Vassilios S.
    • Journal of Computing Science and Engineering / Vol. 5, No. 3 / pp.223-235 / 2011
  • Performing approximate data matching has always been an intriguing problem for both industry and academia. This task becomes even more challenging when the requirement of data privacy rises. In this paper, we propose a novel technique to address the problem of efficient privacy-preserving approximate record linkage. The secure framework we propose consists of two basic components. First, we utilize a secure blocking component based on phonetic algorithms statistically enhanced to improve security. Second, we use a secure matching component where actual approximate matching is performed using a novel private approach of the Levenshtein Distance algorithm. Our goal is to combine the speed of private blocking with the increased accuracy of approximate secure matching.
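The two-stage structure in this abstract (phonetic blocking, then approximate matching) can be illustrated without any of the privacy machinery. The sketch below is a plain, non-private analogue and makes several assumptions: it uses American Soundex as the phonetic algorithm, an ordinary Levenshtein distance for matching, and hypothetical names (`soundex`, `link`, `max_distance`). It does not reproduce the statistical enhancement or the private Levenshtein protocol that the paper proposes.

```python
from collections import defaultdict

# American Soundex digit codes for consonants; vowels, H, W, and Y have no code.
CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code of a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate equal codes; vowels do.
        if c not in "HW":
            prev = digit
    return (code + "000")[:4]


def edit_distance(a: str, b: str) -> int:
    """Plain (non-private) Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def link(records_a, records_b, max_distance=2):
    """Blocking step: group records_b by Soundex code.
    Matching step: compare only within a block, using edit distance."""
    blocks = defaultdict(list)
    for name in records_b:
        blocks[soundex(name)].append(name)
    matches = []
    for name in records_a:
        for candidate in blocks.get(soundex(name), []):
            if edit_distance(name.lower(), candidate.lower()) <= max_distance:
                matches.append((name, candidate))
    return matches
```

Blocking keeps the number of expensive pairwise comparisons small, since only names that share a phonetic code are compared; the matching step then filters out pairs that sound alike but are spelled too differently.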

음절말 자음 중화의 원인 (Why do Obstruents Neutralize in Syllable Final Position?)

  • 양순임
    • 대한음성학회지:말소리 / No. 41 / pp.31-47 / 2001
  • The purpose of this study is to explain the cause of obstruent neutralization in syllable-final position. Most previous phonological studies did not sufficiently reflect phonetic reality because of the limited use of the binary feature system. Using binary distinctive features, we cannot explain the cause of neutralization. To explain it, I use the multi-valued phonetic feature [vocal tract aperture]. By [vocal tract aperture] I mean the distance between articulators in the hold stage. In this study, I claim that the cause of neutralization is assimilation to [vocal tract aperture] 0 degree. The neutralized sounds become aplosives, as a consequence of assimilation to [vocal tract aperture].

모음 상승 현상의 음성적 고찰: 어미 {-고}의 실현을 중심으로 (A Phonetic Study of Vowel Raising: A Closer Look at the Realization of the Suffix {-go})

  • 이향원;신지영
    • 한국어학 / Vol. 81 / pp.267-297 / 2018
  • Vowel raising in Korean has been primarily treated as a phonological, categorical change. This study aims to show how the Korean connective suffix {-go} is realized in various environments, and propose a principle of vowel raising based on both acoustic and perceptual data. To that end, we used a corpus of spoken Korean to analyze the types of syntactic constructions, the realization of prosodic boundaries (IP and PP), and the types of boundary tone associated with {-go}. It was found that the vowel tends to be raised most frequently in utterance-final position, while in utterance-medial position the vowel was raised more when the syntactic and prosodic distance between {-go} and the following constituent was smaller. The results for boundary tone also showed a correlation between vowel raising and the discourse function of the boundary tone. In conclusion, we propose that vowel raising is not simply an optional phenomenon, but rather a type of phonetic reduction related to the comprehension of the following constituent.

켑스트럼 거리 기반의 음성/음악 판별 성능 향상 (Performance Improvement of Speech/Music Discrimination Based on Cepstral Distance)

  • 박슬한;최무열;김형순
    • 대한음성학회지:말소리 / No. 56 / pp.195-206 / 2005
  • Discrimination between speech and music is important in many multimedia applications. In this paper, focusing on the spectral change characteristics of speech and music, we propose a new method of speech/music discrimination based on cepstral distance. Instead of using the cepstral distance between frames at a fixed interval, the minimum of the cepstral distances among neighboring frames is employed to increase discriminability between fast-changing music and speech. In addition, to prevent speech segments that include short pauses from being misclassified as music, short pause segments are excluded from the cepstral distance computation. The experimental results show that the proposed method yields an error rate reduction of 68% in comparison with the conventional approach using cepstral distance.
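One possible reading of the "minimum of cepstral distances among neighboring frames" feature is sketched below. The abstract does not give the exact definition, so everything here is an assumption: the Euclidean form of the cepstral distance, the forward-looking neighborhood of `max_lag` frames, the `pause_flags` input, and the function names themselves.

```python
import math


def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstral coefficient vectors (assumed form)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def min_neighbor_distances(frames, max_lag=3, pause_flags=None):
    """For each frame t, return the minimum cepstral distance to frames
    t+1 .. t+max_lag. Frames flagged as short pauses are skipped both as
    the reference frame and as neighbors. Entries with no usable neighbor
    are None.
    """
    if pause_flags is None:
        pause_flags = [False] * len(frames)
    result = []
    for t, frame in enumerate(frames):
        if pause_flags[t]:
            result.append(None)
            continue
        neighbors = [
            frames[t + k]
            for k in range(1, max_lag + 1)
            if t + k < len(frames) and not pause_flags[t + k]
        ]
        if neighbors:
            result.append(min(cepstral_distance(frame, n) for n in neighbors))
        else:
            result.append(None)
    return result
```

Taking the minimum over several lags, rather than the distance at one fixed lag, yields a small value whenever any nearby frame resembles the current one. How such values are thresholded or aggregated into a speech/music decision is left open, since the abstract does not specify it.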

가상현실 환경에서의 3차원 사운드 생성을 위한 거리 변화에 따른 구조적 머리전달함수 모델 (A Range Dependent Structural HRTF Model for 3-D Sound Generation in Virtual Environments)

  • 이영한;김홍국
    • 대한음성학회지:말소리 / No. 59 / pp.89-99 / 2006
  • This paper proposes a new structural head-related transfer function (HRTF) model to produce sounds in a virtual environment. The proposed HRTF model generates 3-D sounds by using a head model, a pinna model, and the proposed distance model for azimuth, elevation, and distance, respectively, which are the three aspects of 3-D sound. In particular, the proposed distance model consists of a level normalization block, a distal region model, and a proximal region model. To evaluate the performance of the proposed model, we set up an experimental procedure in which each listener identifies the distance of 3-D sound sources generated by the proposed method at a predefined distance. The tests show that the proposed model provides an average distance error of 0.13 to 0.31 meters when the sound source is generated as if it were 0.5 to 2 meters away from the listener. This result is comparable to the average distance error of human listening for an actual sound source.

Phonetic Posterior Grams에 의해 조건화된 적대적 생성 신경망을 사용한 음성 변환 시스템 (Voice Conversion using Generative Adversarial Nets conditioned by Phonetic Posterior Grams)

  • 임진수;강천성;김동하;김경섭
    • 한국정보통신학회: Conference Proceedings / 한국정보통신학회 2018 Fall Conference / pp.369-372 / 2018
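  • This paper proposes a non-parallel voice conversion network that converts between input speech and target speech that are not mapped to each other. Existing voice conversion research has mainly trained models to minimize the distance error between the spectrograms before and after conversion. Because MSE tends to average images, this approach lowers the resolution of the generated spectrograms. In addition, because that research relied on parallel data, collecting the data was also difficult. In this paper, we use the phonetic PPGs of the input speech to train on non-parallel data, and use GAN training to generate clearer speech. To verify the effectiveness of the proposed method, we conducted MOS tests against a GMM-based model widely used in existing voice conversion systems, and obtained improved performance compared with the existing model.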

마이크로폰의 종류 및 설치거리에 따른 음성인식성능변화의 검토 (The Validation of Speech Recognition Performance Change according to the kind and established distance of the Microphone)

  • 김연화;이광현;최대림;김봉완;이용주
    • 대한음성학회: Conference Proceedings / 대한음성학회 October 2003 Conference Proceedings / pp.141-143 / 2003
  • Speech recognition performance depends on various factors. One of these factors is the characteristics and installation distance of the microphone used when speech data is collected. In the present experiment, test speech databases are therefore created for each microphone type and installation distance. Acoustic models are then built from these databases, and each acoustic model is evaluated on the data to determine how recognition performance varies with the microphone type and installation distance.

이웃 정보에 기초한 반모델을 이용한 발화 검증 (Utterance Verification Using Anti-models Based on Neighborhood Information)

  • 윤영선
    • 대한음성학회지:말소리 / No. 67 / pp.79-102 / 2008
  • In this paper, we investigate the relation between the Bayes factor and likelihood ratio test (LRT) approaches and apply the neighborhood information of the Bayes factor to building an alternate hypothesis model of the LRT system. To consider the neighborhood approaches, we contemplate a distance measure between models and algorithms to be applied. We also evaluate several methods to improve the performance of utterance verification using neighborhood information. Among these methods, the system which adopts anti-models built by collecting mixtures of neighborhood models obtains a maximum error rate reduction of 17% compared to the baseline, linear and weighted combination of neighborhood models.

음성학을 토대로 한 자음군 습득 모형 (Phonetically Based Consonant Cluster Acquisition Model)

  • 권보영
    • 대한음성학회: Conference Proceedings / 대한음성학회 2007 Joint Conference with 한국음성과학회, Proceedings / pp.109-113 / 2007
  • Second language learners' variable degree of production difficulty according to the cluster type has previously been accounted for in terms of sonority distance between adjacent segments. As an alternative to this previous model, I propose a Phonetically Based Consonant Cluster Acquisition Model (PCCAM) in which consonant cluster markedness is defined based on the articulatory and perceptual factors associated with each consonant sequence. The validity of PCCAM has been tested through Korean speakers' production of English consonant clusters.
