• Title/Summary/Keyword: 음성 특성 (speech characteristics)


Speech Database for 3-5 years old Korean Children (만 3-5세 유아의 한국어 음성 데이터베이스 구축)

  • Yoo, Jae-Kwon;Lee, Kyung-Ok;Lee, Kyoung-Mi
    • The Journal of the Korea Contents Association / v.12 no.4 / pp.52-59 / 2012
  • Children develop their language skills rapidly between the ages of 3 and 5. Supporting a child's language development through a variety of experiences requires age-appropriate content. Content with a speech interface for children is therefore needed, but no speech database of Korean children exists. In this paper, we develop a Korean speech database of children aged 3 to 5. To collect accurate children's speech, child education experts reviewed the database development process. The words for the database were selected from MCDI-K in two stages, and each child spoke each word three times. The collected speech was segmented by child and by word and stored in the database. The database will be made available over the web and, we hope, will serve as a foundation for developing child-oriented content.

Adaptive Noise Canceller by Weight Updating Control Method for Speech Enhancement (음성향상을 위한 가중치 갱신제어방식의 적응소음제거기)

  • Kim, Gyu-Dong;Lee, Yun-Jung;Kim, Pil-Un;Chang, Yong-Min;Cho, Jin-Ho;Kim, Myoung-Nam
    • Journal of Korea Multimedia Society / v.10 no.8 / pp.1004-1016 / 2007
  • In this paper, we propose a Weight-Update-Control Adaptive Noise Canceller that enhances speech when the environmental noise is stationary and a reference signal is hard to acquire. An adaptive noise canceller (ANC) needs a reference signal, but in a factory it is not easy to measure pure noise free of voice, because various mechanical noises and workers' voices are mixed together. A conventional ANC is therefore not well suited to reducing the background noise. We propose a method that uses an arbitrary constant as the input signal and feeds the microphone signal in as the reference signal. In non-speech intervals, the noise is eliminated using the updated weights. In speech intervals, the weights are fixed and a modified voice is obtained, and the voice is then restored through a transversal filter. The proposed method relies on the facts that factory noise is stationary and does not change over the span of a short conversation. MATLAB simulations confirmed that the proposed method is effective in reducing factory noise and achieves a high signal-to-noise ratio (SNR).
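The idea of adapting the weights only outside speech can be sketched as follows. This is a minimal illustration and not the authors' method: it uses a textbook LMS noise canceller with a separate noise reference, whereas the paper replaces the reference with an arbitrary constant and the microphone signal. The name `gated_lms`, the step size `mu`, the tap count, and the per-sample `is_speech` flags (assumed to come from some voice activity detector) are all assumptions made for illustration.

```python
import math
import random


def gated_lms(primary, reference, is_speech, taps=8, mu=0.01):
    """Textbook LMS noise canceller whose weights adapt only outside speech.

    primary   : microphone samples (speech + noise)
    reference : noise-correlated samples (assumed available in this sketch)
    is_speech : per-sample booleans from a hypothetical voice activity detector
    """
    w = [0.0] * taps
    history = [0.0] * taps              # recent reference samples, newest first
    out = []
    for d, x, speech in zip(primary, reference, is_speech):
        history = [x] + history[:-1]
        noise_estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - noise_estimate          # enhanced output sample
        if not speech:                  # update weights in non-speech intervals only
            w = [wi + mu * e * hi for wi, hi in zip(w, history)]
        out.append(e)
    return out, w


# Synthetic demo: 2000 noise-only samples, then 500 samples with a tone as "speech".
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2500)]
speech = [0.0] * 2000 + [math.sin(0.3 * t) for t in range(500)]
primary = [s + 0.5 * v for s, v in zip(speech, noise)]
flags = [t >= 2000 for t in range(2500)]
enhanced, weights = gated_lms(primary, noise, flags)
```

In the demo the weights settle during the noise-only stretch and stay frozen once the flags mark speech, so the tone passes through while the stationary noise is subtracted.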


Carving deleted voice data in mobile (삭제된 휴대폰 음성 데이터 복원 방법론)

  • Kim, Sang-Dae;Byun, Keun-Duck;Lee, Sang-Jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.22 no.1 / pp.57-65 / 2012
  • People leave voicemails or record phone conversations in their daily cell phone use. Important voice data is sometimes deleted by the user, either accidentally or deliberately to cover up criminal activity. In such cases, the deleted voice data must be recoverable for forensic purposes, since it can serve as evidence in a criminal case. Because cell phones store data in flash memory, where it is easily fragmented, voice data recovery is very difficult. However, if the deleted voice data has identifiable patterns, a significant amount of it can be recovered by analyzing images of the memory. Voice data comes in several formats, such as QCP, AMR, and MP4. This study investigates data recovery methods for the EVRC and AMR codecs in QCP files, Qualcomm's voice data format for cell phones.
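As a rough illustration of what an "identifiable pattern" can look like, the sketch below scans a raw memory image for two well-known file signatures: the `#!AMR\n` magic at the start of AMR narrowband files, and the RIFF container with form type `QLCM` that opens a QCP file. It is not the paper's recovery method, which has to cope with fragmentation and works at the codec level. The function name `find_voice_headers` and the sample image are hypothetical.

```python
def find_voice_headers(image: bytes):
    """Return sorted (offset, kind) pairs for AMR and QCP file headers in a raw image."""
    hits = []
    # AMR narrowband files start with the ASCII magic "#!AMR\n".
    pos = image.find(b"#!AMR\n")
    while pos != -1:
        hits.append((pos, "AMR"))
        pos = image.find(b"#!AMR\n", pos + 1)
    # QCP files are RIFF containers whose form type, at byte offset 8, is "QLCM".
    pos = image.find(b"RIFF")
    while pos != -1:
        if image[pos + 8:pos + 12] == b"QLCM":
            hits.append((pos, "QCP"))
        pos = image.find(b"RIFF", pos + 1)
    return sorted(hits)


# Hypothetical image: padding, an AMR header, filler, a QCP header, and an unrelated RIFF/WAVE header.
sample_image = (
    b"\x00" * 16
    + b"#!AMR\n"
    + b"\xff" * 10
    + b"RIFF" + (100).to_bytes(4, "little") + b"QLCM"
    + b"\x00" * 8
    + b"RIFF" + b"\x00" * 4 + b"WAVE"
)
headers = find_voice_headers(sample_image)
```

A header hit only marks a candidate start offset. Reassembling a deleted, fragmented recording would still require validating the frames that follow.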

The Effect of Perceived Anthropomorphic Characteristics on Continuous Usage Intention of Artificial Intelligence Voice Speaker : Based on the Integrated Adoption Model (인공지능 음성 스피커의 의인화 특성 지각 정도가 지속적 이용 의향에 미치는 영향: 통합 수용 모델을 기반으로)

  • Lee, Sungjoon
    • The Journal of the Korea Contents Association / v.21 no.11 / pp.41-55 / 2021
  • AI voice speakers have attracted growing attention and played an important role in forming an early market for AI-based goods and services and in driving their development. In this context, this research examined the factors affecting the continuous usage intention of AI voice speakers on the basis of an integrated adoption model, which combines two factors, perceived playfulness and innovation resistance, with the extended technology acceptance model. It also examined whether three perceived anthropomorphic features (perceived rational support, perceived intimacy, and perceived cognitive openness) influence the continuous usage intention of AI voice speakers. The data were collected through an online survey of respondents in their 20s and 30s who had experience using an AI voice speaker, and were analyzed using structural equation modeling (SEM). The results showed that perceived ease of use, perceived usefulness, perceived playfulness, and innovation resistance all had significant influences on the continuous usage intention of AI voice speakers. In addition, perceived rational support, perceived intimacy, and perceived cognitive openness all had significant influences on perceived ease of use, perceived usefulness, and perceived playfulness. The implications of these results are also discussed.

Modified HMM Decoder based on Observation Confidence for Speaker Identification (화자인식을 위한 관측신뢰도 기반 변형된 HMM 디코더)

  • Tariquzzaman, Md.;Min, So-Hui;Kim, Jin-Yeong;Na, Seung-Yu
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.443-446 / 2007
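  • Speech signals are distorted by noise or by the characteristics of the transmission channel, and distorted speech greatly degrades the performance of speech recognition and speaker recognition. To overcome this problem, this paper adapts the signal-to-noise ratio (SNR) based confidence weighting technique [1][2], originally applied to the Gaussian mixture model (GMM), to a hidden Markov model (HMM) decoder. The HMM decoder is modified by weighting the observation probability of each HMM state with the confidence proposed in [1]. To verify the performance of the proposed method, context-dependent speaker identification experiments were conducted using a Korean mobile phone speech database for speaker recognition built by ETRI. The experiments confirmed that the proposed method greatly improves the speaker recognition rate over the existing method.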
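One way to picture a confidence-weighted decoder is a log-domain Viterbi search in which each frame's observation log-likelihood is scaled by a per-frame confidence. The sketch below assumes that exponent-style weighting; the abstract does not give the exact form used in [1], so the weighting rule, the function name `weighted_viterbi`, and the toy model are assumptions for illustration only.

```python
import math


def weighted_viterbi(log_init, log_trans, log_obs, confidence):
    """Viterbi decoding where frame t's log-likelihoods are scaled by confidence[t].

    log_init[j]     : log initial probability of state j
    log_trans[i][j] : log transition probability from state i to state j
    log_obs[t][j]   : log observation likelihood of frame t in state j
    confidence[t]   : weight in [0, 1]; 0 makes frame t uninformative (assumed form)
    """
    n_states = len(log_init)
    score = [log_init[j] + confidence[0] * log_obs[0][j] for j in range(n_states)]
    back = []
    for t in range(1, len(log_obs)):
        prev = score
        pointers, score = [], []
        for j in range(n_states):
            best = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][j] + confidence[t] * log_obs[t][j])
        back.append(pointers)
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):     # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy two-state model: frame 2 is a corrupted frame that strongly favours state 1.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
clean = [math.log(0.8), math.log(0.2)]
noisy = [math.log(0.001), math.log(0.999)]
log_obs = [clean, clean, noisy, clean, clean]
path_plain = weighted_viterbi(log_init, log_trans, log_obs, [1, 1, 1, 1, 1])
path_weighted = weighted_viterbi(log_init, log_trans, log_obs, [1, 1, 0, 1, 1])
```

With full confidence the corrupted frame pulls the path into state 1 for one frame. With its confidence set to zero, the decoder stays in state 0 throughout.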


Prevalence of Voice Disorders and Characteristics of Korean Voice Handicap Index in the Elderly (노인 음성장애 출현율 및 음성장애지수 특성)

  • Song, Yun-Kyung
    • Phonetics and Speech Sciences / v.4 no.3 / pp.151-159 / 2012
  • The purpose of this study is to evaluate the prevalence of voice disorders and the Korean voice handicap index in the elderly. For this study, 169 elderly participants completed two types of questionnaires and produced a prolonged vowel /a/. Self-reported voice symptoms and the Korean voice handicap index were analyzed, and acoustic voice evaluation was performed with MDVP. The results showed that the self-reported prevalence of voice disorders in the elderly is significantly higher than that in adults. In the acoustic evaluation, 32.2% of the elderly men and 40.9% of the elderly women exceeded the thresholds for Jitter (%), Shimmer (%), and NHR. In addition, the Korean voice handicap index scores of the elderly women are significantly higher than those of adult women. These findings indicate a high frequency of voice disorders in the elderly and the need to focus on this group. Additional studies on the voice-related quality of life of the elderly are needed.
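For readers unfamiliar with the acoustic measures, the sketch below computes jitter (%) and shimmer (%) using the common relative definition: the mean absolute difference between consecutive cycles, divided by the mean value. It illustrates the definition only and is not MDVP's exact algorithm; the function name `relative_perturbation` and the sample periods and amplitudes are made up for the example.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, as a percentage of their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical glottal cycle measurements from a sustained /a/.
periods_ms = [5.0, 5.1, 4.9, 5.0]        # cycle lengths
peak_amplitudes = [1.0, 0.9, 1.1, 1.0]   # cycle peak amplitudes
jitter_percent = relative_perturbation(periods_ms)
shimmer_percent = relative_perturbation(peak_amplitudes)
```

The same function serves both measures because jitter applies the formula to cycle periods and shimmer applies it to cycle amplitudes.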

Voice Source Estimation Using Robust Sequential SVD (견실 순차 특이치분해를 이용한 음원추정)

  • 홍성훈
    • Proceedings of the Acoustical Society of Korea Conference / 1993.06a / pp.75-79 / 1993
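  • This paper proposes a new sequential processing algorithm for estimating rapidly varying voice source waveforms. We 1) examine the problems of RLS (recursive least squares), a representative existing sequential analysis method; 2) to address them, reconstruct the observation matrix using a reduced-rank SVD (singular value decomposition) of optimal order; and 3) apply a robustness concept to find the optimal vocal tract parameters, then apply an inverse filter to separate out the voice source effectively. With the proposed method, rapidly varying voice source waveforms are estimated well, and the vocal tract parameters, with the source characteristics separated out, are also estimated effectively. This work is essential for improving naturalness and realizing speaker individuality in speech synthesis, and can be used to represent diverse types of speech. It can also be used in speech coding, speaker recognition, and speech recognition.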


An Implementation of Car Number Retrieving System with Speech Recognition (음성인식을 이용한 차량 번호 조회 시스템의 구현)

  • Yoon Chul-Joong;Yoon Jeh-Seon;Hong Kwang-Seok
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.127-130 / 2000
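  • Speech recognition technology is widely used as an interface that offers convenience to users. Because it relies on speech, users do not need to learn a new way of operating a machine and can convey information quickly. In this paper, speech recognition is applied to a terminal for looking up vehicle license plate numbers. Input is simpler than on existing terminals, which makes the terminal convenient for users. It also helps avoid frequent errors and provides a function for easily correcting an error when one does occur.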


Qualitative Classification of Voice Quality of Normal Speech and Derivation of its Correlation with Speech Features (정상 음성의 목소리 특성의 정성적 분류와 음성 특징과의 상관관계 도출)

  • Kim, Jungin;Kwon, Chulhong
    • Phonetics and Speech Sciences / v.6 no.1 / pp.71-76 / 2014
  • In this paper, the voice quality of normal speech is qualitatively classified into five components: breathy, creaky, rough, nasal, and thin/thick voice. To determine whether subjective and objective measures of voice are correlated, each voice is perceptually evaluated on a 1/2/3 scale by speech processing specialists and acoustically analyzed with speech analysis tools such as Praat, MDVP, and VoiceSauce. The speech parameters include features related to the speech source and the vocal tract filter. The statistical analysis uses a non-parametric test for two independent samples. The results show a significant correlation between the speech feature parameters and the components of voice quality.

Analyzing the element of emotion recognition from speech (음성으로부터 감성인식 요소 분석)

  • 박창현;심재윤;이동욱;심귀보
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2001.12a / pp.199-202 / 2001
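  • In general, the elements from which human emotion can be recognized in a speech signal are (1) the words used in the conversation, (2) tone, (3) the pitch of the speech signal, (4) formant frequency, (5) speech speed, and (6) voice quality. For people, it is natural to perceive emotion through tone, words, speed, and voice quality rather than through analytical elements such as frequency, so these latter elements can naturally serve as important factors in classifying emotion. Previous work has mainly used these latter elements, but for a machine implementation it helps to be able to use the more engineering-oriented formant frequency. This research therefore aims to implement an emotion recognition system using pitch, formants, and speech speed extracted from the speech signal. As the first stage of that research, this paper identifies the distinctive characteristics of two extreme emotions, based on utterances blurted out in anger and short utterances used when happy.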
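Of the features listed, pitch is the simplest to illustrate. The sketch below estimates the fundamental frequency of a voiced frame by picking the autocorrelation peak within a plausible lag range. This is a standard textbook technique chosen for illustration; the abstract does not say how the authors extract pitch. The function name `autocorr_pitch` and the 50 to 400 Hz search range are assumptions.

```python
import math


def autocorr_pitch(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame by autocorrelation."""
    lag_min = int(sample_rate / f_max)   # shortest period considered
    lag_max = int(sample_rate / f_min)   # longest period considered
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(frame[t] * frame[t + lag] for t in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Demo: a 200 Hz sine sampled at 8 kHz for 0.1 s stands in for a voiced frame.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / rate) for t in range(800)]
f0 = autocorr_pitch(tone, rate)
```

Real speech would also need a voicing decision and some guard against octave errors, both of which this sketch omits.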
