• Title/Summary/Keyword: Voice Feature

Correlativity Analysis of Voice Waveform and Feature Parameter According to a Favorable Impression Research Result (호감도 조사 결과에 따른 음성 파형 및 특징 요소와의 상관성 분석)

  • Kim, Bong-Hyun;Lee, Se-Hwan;Ka, Min-Kyoung;Cho, Dong-Uk;J.Bae, Young-Lae
    • Annual Conference of KIPS / 2009.11a / pp.365-366 / 2009
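  • The voice reflects a person's emotions, character, personality, and many other individual traits. In other words, it is an important channel through which a person's internal and external information can be learned. This paper therefore extracts the voice-analytic features of attractive, pleasant voices and analyzes how they correlate with voices rated as likable. To this end, we collected male and female voices of five types with distinct auditory characteristics and had arbitrarily chosen subjects select the likable voices. We then applied pitch, intensity, and spectrogram analysis to extract the voice information values of the likable voices and analyzed the relationships among them.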

Analytical Voice Feature Values Extraction of Heart Sound Based on Donuibogam (동의보감에 근거한 심장 소리의 음성 분석학적 특징값 추출)

  • Minkyoung Ka;Bonghyun Kim;Sehwan Lee;jihyun Kwak;Dong-Uk Cho
    • Annual Conference of KIPS / 2008.11a / pp.125-128 / 2008
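  • In modern society, factors that harm health include smoking, diabetes, obesity, and stress. These factors are increasing the incidence of circulatory diseases, and mortality from heart disease in particular is gradually rising. To address this, this paper analyzes voice-analytic features for the early diagnosis of heart disease and extracts the resulting values. For this purpose, we formed subject groups of adult men living in the Daejeon area, consisting of patients with heart disease and healthy individuals with no cardiac abnormality, collected their voices, and extracted voice-analytic features. In particular, to verify the heart sound described in the Donguibogam from a speech-engineering perspective, we compare and analyze output values such as the fifth formant and jitter.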
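
The abstract above names jitter as one of the compared values but gives no formula. The following is a minimal sketch, assuming the commonly used local-jitter definition (mean absolute difference between consecutive pitch periods, divided by the mean period). The function name and the sample periods are hypothetical and are not taken from the paper.

```python
def local_jitter(periods):
    """Return local jitter for a list of consecutive pitch periods (seconds).

    Assumed definition: mean absolute difference between neighbouring
    periods, normalised by the mean period. The paper may use a variant.
    """
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    # Absolute change from each period to the next one.
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


# Made-up example: three periods of roughly 10 ms each.
example_periods = [0.010, 0.012, 0.010]
example_jitter = local_jitter(example_periods)
```

A perfectly periodic signal gives a jitter of zero, and larger values indicate more cycle-to-cycle irregularity in the pitch period.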

Study of Developing SOP for Extracting Stable Vocal Features for Accurate Diagnosis (음성의 안정적 변수 추출을 위한 SOP 개발 연구)

  • Kim, Keun-Ho;Jang, Jun-Su;Kim, Young-Su;Kim, Jong-Yeol
    • Journal of Physiology & Pathology in Korean Medicine / v.25 no.6 / pp.1108-1112 / 2011
  • In both traditional Korean medicine and Western medicine, voice can be widely used to classify the four constitution types and to recognize a person's health condition by extracting meaningful features as physical quantities. In this paper, we propose a method for updating the standard operating procedure (SOP) for acquiring and recording voices so that stable vocal features can be extracted, since such features are sensitive to variation in a subject's utterance. First, we obtained pitch frequencies from vowels and a sentence, and intensity from the sentence, as features of voices acquired under different utterance conditions. We then computed the deviation ratios of the features from their median values for each utterance condition and selected the condition that minimized the ratio as the new SOP. As a result, we decided on an SOP in which a subject utters vowels with a length of 2s~1s and sentences with an interval of over 2 s between them, after practice, in consideration of the deviation and qualitative requirements. Stable voice features obtained with the updated SOP lead to accurate diagnosis, and the procedure will be further developed and simplified for use in the u-Healthcare system of personalized medicine.
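
The selection step described above (compute each condition's deviation ratio from the median, then keep the condition with the smallest ratio) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact formula, so the ratio here is assumed to be the mean absolute deviation from the median divided by the median. The condition names and pitch values are invented.

```python
from statistics import median


def deviation_ratio(values):
    """Mean absolute deviation from the median, relative to the median.

    Assumed formula; the paper's exact definition is not given in the abstract.
    """
    med = median(values)
    if med == 0:
        raise ValueError("median is zero; ratio is undefined")
    return sum(abs(v - med) for v in values) / (len(values) * abs(med))


def most_stable_condition(measurements):
    """Return the utterance condition whose feature values deviate least."""
    return min(measurements, key=lambda cond: deviation_ratio(measurements[cond]))


# Hypothetical pitch measurements (Hz) for two recording conditions.
pitch_by_condition = {
    "no_practice": [118.0, 131.0, 124.0],
    "after_practice": [121.0, 123.0, 122.0],
}
best = most_stable_condition(pitch_by_condition)
```

With these invented numbers the "after_practice" condition has the smaller ratio, so it would be the one carried into the updated procedure.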

Speaker Identification Using Dynamic Time Warping Algorithm (동적 시간 신축 알고리즘을 이용한 화자 식별)

  • Jeong, Seung-Do
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.5 / pp.2402-2409 / 2011
  • The voice conveys information and also carries acoustic properties that distinguish one speaker from another. Speaker recognition is a method for determining who is speaking from the acoustic differences between speakers. It is roughly divided into two categories: speaker verification and speaker identification. Speaker verification confirms a claimed speaker's identity based only on that person's voice. Speaker identification, in contrast, finds the speaker by searching for the most similar model in a database previously built from multiple subordinate sentences. This paper composes feature vectors by extracting MFCCs and uses the dynamic time warping algorithm to compare the similarity between features. To describe common characteristics based on the phonological features of spoken words, two subordinate sentences per speaker are used as training data. It is thus possible to identify a speaker even when the spoken words differ from those previously stored in the database.
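
The comparison step described above can be illustrated with a textbook dynamic time warping routine. This is a minimal sketch, not the paper's implementation: it assumes each utterance is already a sequence of feature vectors (MFCC extraction is omitted), uses Euclidean distance between frames, and the `identify_speaker` helper and its template layout are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # frame-to-frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def identify_speaker(query, templates):
    """Return the speaker whose closest stored template has the lowest DTW cost.

    `templates` maps a speaker name to a list of feature sequences
    (hypothetical layout, e.g. two training sentences per speaker).
    """
    return min(
        templates,
        key=lambda name: min(dtw_distance(query, t) for t in templates[name]),
    )
```

Because DTW lets one frame align with several frames in the other sequence, two utterances with the same content spoken at different speeds can still receive a low cost.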

Implementation of Embedded Speech Recognition System for Supporting Voice Commander to Control an Audio and a Video on Telematics Terminals (텔레메틱스 단말기 내의 오디오/비디오 명령처리를 위한 임베디드용 음성인식 시스템의 구현)

  • Kwon, Oh-Il;Lee, Heung-Kyu
    • Journal of the Institute of Electronics Engineers of Korea TC / v.42 no.11 / pp.93-100 / 2005
  • In this paper, we implement an embedded speech recognition system that supports various application services, such as audio and video control, through a speech recognition interface in cars. The system is implemented and ported to a DSP board. Microphone type and speech codecs affect the accuracy of speech recognition. We also optimize the simulation and test environment to effectively remove the real noise in a car. We applied a noise suppression and feature compensation algorithm to increase the accuracy of speech recognition in a car, and we used context-dependent tied-mixture acoustic modeling. The performance evaluation showed high accuracy for the proposed system in an office environment and even in a real car environment.

An Efficient Dynamic Bandwidth Allocation Algorithm for VoDSL Services (VoDSL 서비스를 위한 효율적인 동적 대역폭 할당 알고리즘)

  • Kim, Hoon;Park, Jong-Dae;Nam, Sang-Sig;Park, Kwang-Chae
    • The Journal of Korean Institute of Communications and Information Sciences / v.27 no.1C / pp.48-58 / 2002
  • As internet traffic increases, accommodating it efficiently in the existing voice network has become an important pending problem for existing communication carriers. The next-generation exchange network takes the form of an integrated network that connects data traffic (internet services, etc.) with the existing voice network, and it can take very diverse forms depending on when the network is built and on the characteristics of the business. Strategies for evolving the existing circuit-based communication network into a packet-based one can be broadly divided into two according to where they are applied: the VoDSL approach (packetizing the access network) and the softswitch approach (packetizing the relay network first and then the access network). In this paper, we derive a desirable technology for constructing packet-based next-generation exchange networks within the structure of the existing communication network environment. We primarily study a device that offers the core techniques needed for VoDSL service, with realistic solutions.

Electromyographic evidence for a gestural-overlap analysis of vowel devoicing in Korean

  • Jun, Sun-A;Beckman, M.;Niimi, Seiji;Tiede, Mark
    • Speech Sciences / v.1 / pp.153-200 / 1997
  • In languages such as Japanese, short peripheral vowels are very commonly completely voiceless when surrounded by voiceless consonants. This phenomenon has also been reported in Montreal French, Shanghai Chinese, Greek, and Korean. Traditionally it has been described as a phonological rule that either categorically deletes the vowel or changes the [+voice] feature of the vowel to [-voice]. This analysis was supported by the observation of Sawashima (1971) and Hirose (1971) that there are two distinct EMG patterns for voiced and devoiced vowels in Japanese. Close examination of the phonetic evidence based on acoustic data, however, shows that these phonological characterizations are not tenable (Jun & Beckman 1993, 1994). In this paper, we examined the vowel devoicing phenomenon in Korean using data from EMG, fiberscopic, and acoustic recordings of 100 sentences produced by one Korean speaker. The results show that there is variability in the 'degree of devoicing' in both acoustic and EMG signals, and in the patterns of glottal closing and opening across different devoiced tokens. There seems to be no categorical difference between devoiced and voiced tokens, for either EMG activity events or glottal patterns. All of these observations support the notion that vowel devoicing in Korean cannot be described as the result of the application of a phonological rule. Rather, devoicing seems to be a highly variable 'phonetic' process, a more or less subtle variation in the specification of such phonetic metrics as degree and timing of glottal opening, or of associated subglottal pressure or intra-oral airflow associated with concurrent tone and stricture specifications. Some of the token-pair comparisons are amenable to an explanation in terms of gestural overlap and undershoot. However, the effect of gestural timing on vocal fold state seems to be a highly nonlinear function of the interaction among specifications for the relative timing of glottal adduction and abduction gestures, of the amplitudes of the overlapped gestures, of aerodynamic conditions created by concurrent oral tonal gestures, and so on. In summary, to understand devoicing, it will be necessary to examine its effect on the phonetic representation of events in many parts of the vocal tract, and at many stages of the speech chain between the motor intent and the acoustic signal that reaches the hearer's ear.

A Study on Infant Respiratory Diseases Diagnosis using Frequency Bandwidth Analysis of Crying Waveform (울음소리의 주파수 대역폭 분석을 이용한 소아호흡기 질환 진단에 관한 연구)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.12B / pp.1123-1130 / 2008
  • Diagnosing a baby's illness is inconvenient because infants cannot adequately express their condition and must be brought in for direct examination, while concern about infant health has grown amid changes in birth, marriage, and divorce rates. In this paper, as groundwork for a system that lets infant diseases be checked at home from the crying sound, we extract and analyze voice-analytic components, comparing normal infants with sick infants. In particular, this paper proposes a methodology for developing such a system by extracting features of the crying sound for infant respiratory diseases, namely infant cold, pneumonia, and asthma, which infants contract most easily. On the assumption that respiratory diseases affect all the vocal organs, we test an analysis method based on bandwidth, a phonetic analysis component, comparing normal infants with infants who have respiratory diseases. With this method, we found that the frequency bandwidth of infants suffering from respiratory diseases is narrower than that of normal infants.

Implementation of Real-time Vowel Recognition Mouse based on Smartphone (스마트폰 기반의 실시간 모음 인식 마우스 구현)

  • Jang, Taeung;Kim, Hyeonyong;Kim, Byeongman;Chung, Hae
    • KIISE Transactions on Computing Practices / v.21 no.8 / pp.531-536 / 2015
  • Speech recognition is an active research area in human-computer interface (HCI). The objective of this study is to control digital devices by voice. The mouse is a widely used computer peripheral in graphical user interface (GUI) computing environments. In this paper, we propose a method of controlling the mouse with the real-time speech recognition function of a smartphone. The processing steps are to extract the core voice signal after receiving a voice input of proper length in real time, to quantize it with a learned codebook after feature extraction with mel-frequency cepstral coefficients (MFCC), and finally to recognize the corresponding vowel using a hidden Markov model (HMM). A virtual mouse is then operated by mapping each vowel to a mouse command. Finally, we show various mouse operations on a desktop PC display using the implemented smartphone application.
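
Two of the steps above, quantizing feature vectors with a learned codebook and mapping a recognized vowel to a mouse command, can be sketched in a few lines. This is an illustration only: it assumes nearest-codeword (Euclidean) quantization, leaves out MFCC extraction and the HMM decoder entirely, and the vowel-to-command table is hypothetical, since the abstract does not list the actual mapping.

```python
import math


def quantize(frames, codebook):
    """Replace each feature vector with the index of its nearest codeword."""
    return [
        min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))
        for frame in frames
    ]


# Hypothetical mapping; the paper's actual assignment is not given.
VOWEL_TO_COMMAND = {
    "a": "move_left",
    "e": "move_right",
    "i": "move_up",
    "o": "move_down",
    "u": "click",
}


def command_for(vowel):
    """Look up the mouse command for a recognized vowel, or None if unmapped."""
    return VOWEL_TO_COMMAND.get(vowel)
```

The resulting index sequence is the kind of discrete observation sequence a discrete HMM would take as input for vowel recognition.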

Survey on Out-Of-Domain Detection for Dialog Systems (대화시스템 미지원 도메인 검출에 관한 조사)

  • Jeong, Young-Seob;Kim, Young-Min
    • Journal of Convergence for Information Technology / v.9 no.9 / pp.1-12 / 2019
  • Dialog systems have become a new way of communication between humans and computers. A dialog system takes human voice as input and gives a proper response in voice or performs an action. Although there are several well-known dialog system products (e.g., Amazon Echo, Naver Wave), they commonly suffer from the problem of out-of-domain utterances. If a system detects out-of-domain utterances poorly, user satisfaction is significantly harmed. Some studies have aimed at solving this problem, but it still needs intensive study. In this paper, we give an overview of previous studies of out-of-domain detection from three points of view: dataset, feature, and method. As there have been relatively few studies of this topic due to the lack of datasets, we believe that the most important next research step is to construct and share a large dataset for dialog systems, and thereafter to try state-of-the-art techniques on that dataset.