• Title/Summary/Keyword: Phonetic Distance

Search results: 42 (processing time: 0.029 seconds)

A Method for Selecting Voice Game Commands to Maximize the Command Distance (명령어간 거리를 최대화하는 음성 게임 명령어의 선택 방법)

  • Kim, Sangchul
    • Journal of Korea Game Society
    • /
    • v.19 no.4
    • /
    • pp.97-108
    • /
    • 2019
  • Interest in voice game commands has recently been growing because of the diversity and convenience of this input method, but recognition performance also depends on the distance between commands. The command distance is the phonetic difference between command utterances; as this distance increases, the recognition rate improves. In this paper, we propose an IP (Integer Programming) model of the problem of selecting a combination of commands from given candidate commands so as to maximize the average distance. We also propose an SA (Simulated Annealing)-based algorithm for solving the problem. We analyze the characteristics of our method through experiments under various conditions, such as the number of commands and the allowable command length.
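The selection procedure this abstract describes can be sketched with a small simulated-annealing loop. This is a hedged illustration, not the paper's implementation: plain Levenshtein edit distance stands in for the paper's phonetic distance, and the neighbor move, temperature schedule, and parameter values are assumptions made for the example.

```python
import math
import random

def levenshtein(a, b):
    # Edit distance, used here as a stand-in for phonetic distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def avg_distance(subset):
    # Average pairwise distance of a command combination (the objective).
    pairs = [(x, y) for i, x in enumerate(subset) for y in subset[i + 1:]]
    return sum(levenshtein(x, y) for x, y in pairs) / len(pairs)

def select_commands(candidates, k, steps=2000, t0=2.0, cooling=0.995, seed=0):
    # SA search: swap one selected command for an unselected candidate,
    # accepting worse combinations with a temperature-controlled probability.
    rng = random.Random(seed)
    current = rng.sample(candidates, k)
    score = avg_distance(current)
    best, best_score, t = list(current), score, t0
    for _ in range(steps):
        out = [c for c in candidates if c not in current]
        if not out:
            break
        cand = list(current)
        cand[rng.randrange(k)] = rng.choice(out)
        new_score = avg_distance(cand)
        if new_score >= score or rng.random() < math.exp((new_score - score) / t):
            current, score = cand, new_score
            if score > best_score:
                best, best_score = list(current), score
        t *= cooling
    return best, best_score
```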

Secure Blocking + Secure Matching = Secure Record Linkage

  • Karakasidis, Alexandros;Verykios, Vassilios S.
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.3
    • /
    • pp.223-235
    • /
    • 2011
  • Performing approximate data matching has always been an intriguing problem for both industry and academia. The task becomes even more challenging when data privacy requirements arise. In this paper, we propose a novel technique to address the problem of efficient privacy-preserving approximate record linkage. The secure framework we propose consists of two basic components. First, we utilize a secure blocking component based on phonetic algorithms, statistically enhanced to improve security. Second, we use a secure matching component in which the actual approximate matching is performed using a novel private variant of the Levenshtein distance algorithm. Our goal is to combine the speed of private blocking with the increased accuracy of approximate secure matching.
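The blocking-plus-matching pipeline can be illustrated without the privacy layer: Soundex-style phonetic blocking first narrows the candidate pairs, then edit-distance matching runs only within each block. This is a generic sketch of the two-component structure; the paper's actual contributions (the statistical security enhancement and the private Levenshtein protocol) are omitted, and the match threshold is an arbitrary assumption.

```python
from collections import defaultdict

def soundex(name):
    # Classic Soundex code: initial letter plus up to three digits.
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    name = name.lower()
    out, prev = name[0].upper(), codes.get(name[0], "")
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "hw":          # h/w do not break a run of equal codes
            prev = code
    return (out + "000")[:4]

def levenshtein(a, b):
    # Standard dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def link(records_a, records_b, threshold=2):
    # Blocking: candidate pairs are formed only within a phonetic block;
    # matching: edit distance is applied to the surviving pairs.
    blocks = defaultdict(list)
    for r in records_b:
        blocks[soundex(r)].append(r)
    return [(a, b) for a in records_a
            for b in blocks[soundex(a)] if levenshtein(a, b) <= threshold]
```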

Why do Obstruents Neutralize in Syllable Final Position? (음절말 자음 중화의 원인)

  • Yang Sun-Im
    • MALSORI
    • /
    • no.41
    • /
    • pp.31-47
    • /
    • 2001
  • The purpose of this study is to explain the cause of obstruent neutralization in syllable-final position. Most previous phonological studies did not sufficiently reflect phonetic reality because of their limited use of the binary feature system; binary distinctive features cannot explain the cause of neutralization. To explain it, I use the multi-valued phonetic feature [vocal tract aperture], by which I mean the distance between the articulators in the hold stage. In this study, I claim that the cause of neutralization is assimilation to [vocal tract aperture] degree 0. The neutralized sounds become aplosives (unreleased stops) as a consequence of assimilation in [vocal tract aperture].


A Phonetic Study of Vowel Raising: A Closer Look at the Realization of the Suffix {-go} (모음 상승 현상의 음성적 고찰: 어미 {-고}의 실현을 중심으로)

  • LEE, HYANG WON;Shin, Jiyoung
    • Korean Linguistics
    • /
    • v.81
    • /
    • pp.267-297
    • /
    • 2018
  • Vowel raising in Korean has been primarily treated as a phonological, categorical change. This study aims to show how the Korean connective suffix {-go} is realized in various environments, and propose a principle of vowel raising based on both acoustic and perceptual data. To that end, we used a corpus of spoken Korean to analyze the types of syntactic constructions, the realization of prosodic boundaries (IP and PP), and the types of boundary tone associated with {-go}. It was found that the vowel tends to be raised most frequently in utterance-final position, while in utterance-medial position the vowel was raised more when the syntactic and prosodic distance between {-go} and the following constituent was smaller. The results for boundary tone also showed a correlation between vowel raising and the discourse function of the boundary tone. In conclusion, we propose that vowel raising is not simply an optional phenomenon, but rather a type of phonetic reduction related to the comprehension of the following constituent.

Performance Improvement of Speech/Music Discrimination Based on Cepstral Distance (켑스트럼 거리 기반의 음성/음악 판별 성능 향상)

  • Park Seul-Han;Choi Mu Yeol;Kim Hyung Soon
    • MALSORI
    • /
    • no.56
    • /
    • pp.195-206
    • /
    • 2005
  • Discrimination between speech and music is important in many multimedia applications. In this paper, focusing on the spectral change characteristics of speech and music, we propose a new method of speech/music discrimination based on cepstral distance. Instead of using the cepstral distance between frames at a fixed interval, the minimum of the cepstral distances among neighboring frames is employed to increase the discriminability between fast-changing music and speech. In addition, to prevent speech segments containing short pauses from being misclassified as music, short-pause segments are excluded from the cepstral distance computation. The experimental results show that the proposed method yields an error rate reduction of 68% in comparison with the conventional approach using cepstral distance.

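The core measure of the paper above can be sketched as follows. This is a minimal illustration under assumed parameters (already-windowed frame matrix, 12 cepstral coefficients, a look-ahead range of 2 to 5 frames), not the paper's tuned pipeline, and the short-pause exclusion step is omitted.

```python
import numpy as np

def cepstra(frames, n_coef=12):
    # Real cepstrum per row: inverse FFT of the log magnitude spectrum,
    # keeping coefficients 1..n_coef (coefficient 0 is overall level).
    spec = np.abs(np.fft.rfft(frames, axis=1)) + 1e-10
    ceps = np.fft.irfft(np.log(spec), axis=1)
    return ceps[:, 1:n_coef + 1]

def min_neighbor_cepstral_distance(ceps, lo=2, hi=5):
    # For each frame, take the MINIMUM cepstral distance over a range of
    # look-ahead frames rather than one fixed interval, mirroring the
    # paper's use of the minimum distance among neighbor frames.
    out = []
    for i in range(len(ceps) - hi):
        out.append(min(float(np.linalg.norm(ceps[i] - ceps[i + k]))
                       for k in range(lo, hi + 1)))
    return np.array(out)
```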

A Range Dependent Structural HRTF Model for 3-D Sound Generation in Virtual Environments (가상현실 환경에서의 3차원 사운드 생성을 위한 거리 변화에 따른 구조적 머리전달함수 모델)

  • Lee, Young-Han;Kim, Hong-Kook
    • MALSORI
    • /
    • no.59
    • /
    • pp.89-99
    • /
    • 2006
  • This paper proposes a new structural head-related transfer function (HRTF) model to produce sounds in a virtual environment. The proposed HRTF model generates 3-D sounds by using a head model, a pinna model, and the proposed distance model, which handle azimuth, elevation, and distance, respectively, the three aspects of 3-D sound. In particular, the proposed distance model consists of a level normalization block, a distal-region model, and a proximal-region model. To evaluate the performance of the proposed model, we set up an experimental procedure in which each listener identifies the distance of 3-D sound sources generated by the proposed method at predefined distances. The tests show that the proposed model yields an average distance error of 0.13 to 0.31 meters when the sound source is rendered as if it were 0.5 meters to 2 meters away from the listener. This result is comparable to the average distance error of human listeners localizing an actual sound source.

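The distal-region part of a distance model like the one above can be sketched with a simple inverse-distance gain normalized at a reference distance. This is an illustrative assumption, not the paper's model, which additionally handles the proximal region and level normalization in its own way.

```python
import math

def distal_gain(distance_m, ref_m=1.0):
    # Inverse-distance amplitude law: roughly a 6 dB level drop per
    # doubling of distance, with gain 1.0 at the reference distance.
    return ref_m / max(distance_m, 1e-6)

def render_at_distance(samples, distance_m):
    # Scale a mono sample buffer to simulate a source at the given distance.
    g = distal_gain(distance_m)
    return [g * s for s in samples]
```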

Voice Conversion using Generative Adversarial Nets conditioned by Phonetic Posterior Grams (Phonetic Posterior Grams에 의해 조건화된 적대적 생성 신경망을 사용한 음성 변환 시스템)

  • Lim, Jin-su;Kang, Cheon-seong;Kim, Dong-Ha;Kim, Kyung-sup
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2018.10a
    • /
    • pp.369-372
    • /
    • 2018
  • This paper presents a non-parallel voice conversion network that converts between an unpaired source voice and target voice. Conventional voice conversion research used learning methods that minimize the spectrogram distance error; such methods not only lose spectrogram resolution by averaging pixels, but also rely on parallel data, which is hard to collect. This work instead uses PPGs (Phonetic Posteriorgrams), which represent the phonetic content of the input voice, together with a GAN learning method to generate clearer voices. To evaluate the proposed method, we conducted a MOS test against a GMM-based model and found that performance improved compared to the conventional methods.

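The conditioning idea in the abstract above can be made concrete in a few lines: the conversion network sees each acoustic frame concatenated with its PPG, so the mapping is driven by phonetic content rather than speaker identity. The feature sizes and the single linear layer below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

N_MEL, N_PPG = 80, 144        # hypothetical acoustic / posteriorgram sizes

def condition(acoustic, ppg):
    # Frame-wise concatenation of acoustic features and PPGs; both the
    # generator and the discriminator would receive this conditioned input.
    return np.concatenate([acoustic, ppg], axis=-1)

rng = np.random.default_rng(0)
W = rng.standard_normal((N_MEL, N_MEL + N_PPG)) * 0.01

def generate(acoustic_frames, ppg_frames):
    # Stand-in for the generator: one linear layer plus tanh squashing.
    return np.tanh(condition(acoustic_frames, ppg_frames) @ W.T)
```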

The Validation of Speech Recognition Performance Change according to the kind and established distance of the Microphone (마이크로폰의 종류 및 설치거리에 따른 음성인식성능변화의 검토)

  • Kim Yoen-Whoa;Lee Kwang-Hyun;Choi Dae-Lim;Kim Bong-Wan;Lee Yong-Ju
    • Proceedings of the KSPS conference
    • /
    • 2003.10a
    • /
    • pp.141-143
    • /
    • 2003
  • Speech recognition performance depends on various factors. One of them is the characteristics and placement distance of the microphone used when the speech data are collected. In the present experiment, test speech databases were therefore created for different microphone types and placement distances. Acoustic models were then built from these databases, and each acoustic model was assessed with the test data to determine recognition performance for the various microphones and microphone distances.


Utterance Verification Using Anti-models Based on Neighborhood Information (이웃 정보에 기초한 반모델을 이용한 발화 검증)

  • Yun, Young-Sun
    • MALSORI
    • /
    • no.67
    • /
    • pp.79-102
    • /
    • 2008
  • In this paper, we investigate the relation between the Bayes factor and likelihood ratio test (LRT) approaches and apply the neighborhood information of the Bayes factor to building the alternative-hypothesis model of the LRT system. To apply the neighborhood approaches, we consider a distance measure between models and the algorithms to be applied. We also evaluate several methods for improving the performance of utterance verification using neighborhood information. Among these methods, the system that adopts anti-models built by collecting mixtures of neighborhood models obtains a maximum error rate reduction of 17% compared to the baseline and to linear and weighted combinations of neighborhood models.

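The likelihood-ratio test with a neighborhood-based anti-model can be sketched with scalar Gaussians. This is a deliberate simplification: the paper works with HMM/GMM acoustic models and several anti-model construction schemes, while here the anti-model score is just the average likelihood over a few neighbor models.

```python
import math

def gauss_loglik(obs, mean, var):
    # Total log-likelihood of scalar observations under a 1-D Gaussian.
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
               for x in obs)

def log_mean_exp(vals):
    # Numerically stable log of the mean of exponentiated values.
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals) / len(vals))

def verify(obs, target, neighbor_models, threshold=0.0):
    # Anti-model score: average LIKELIHOOD over the target's neighborhood
    # models (log-mean-exp in the log domain), standing in for an anti-model
    # assembled from mixtures of neighborhood models.
    target_ll = gauss_loglik(obs, *target)
    anti_ll = log_mean_exp([gauss_loglik(obs, *m) for m in neighbor_models])
    llr = (target_ll - anti_ll) / len(obs)   # length-normalized LLR
    return llr, llr > threshold
```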

Phonetically Based Consonant Cluster Acquisition Model (음성학을 토대로 한 자음군 습득 모형)

  • Kwon, Bo-Young
    • Proceedings of the KSPS conference
    • /
    • 2007.05a
    • /
    • pp.109-113
    • /
    • 2007
  • Second language learners' variable degree of production difficulty across cluster types has previously been accounted for in terms of the sonority distance between adjacent segments. As an alternative to this previous model, I propose a Phonetically Based Consonant Cluster Acquisition Model (PCCAM), in which consonant cluster markedness is defined on the basis of the articulatory and perceptual factors associated with each consonant sequence. The validity of PCCAM has been tested through Korean speakers' production of English consonant clusters.

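The sonority-distance baseline that PCCAM responds to can be made concrete in a few lines. The sonority scale values and segment classes below are illustrative assumptions (analyses differ on the exact scale), not data from the paper:

```python
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "liquid": 4, "glide": 5}

SEGMENT_CLASS = {
    "p": "stop", "t": "stop", "k": "stop", "b": "stop", "d": "stop", "g": "stop",
    "s": "fricative", "f": "fricative", "v": "fricative", "z": "fricative",
    "m": "nasal", "n": "nasal", "l": "liquid", "r": "liquid",
    "w": "glide", "j": "glide",
}

def sonority_distances(cluster):
    # Sonority difference for each adjacent pair of consonants; in the
    # previous account, a larger sonority rise (e.g. /pl/) predicts an
    # easier onset cluster than a fall or plateau (e.g. /sp/).
    return [SONORITY[SEGMENT_CLASS[b]] - SONORITY[SEGMENT_CLASS[a]]
            for a, b in zip(cluster, cluster[1:])]
```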