Integrated Search | Korea Science

GMM Based Voice Conversion Using Kernel PCA

한준희;배재현;오영환
- 대한음성학회지:말소리 / No. 67 / pp.167-180 / 2008
This paper describes a novel spectral envelope conversion method based on the Gaussian mixture model (GMM). The core idea is to map source feature vectors from the input space to transformed feature vectors in a feature space, using kernel PCA (KPCA), so that a GMM can better model the source and target features. The quality of statistical modeling depends on the distribution and the dimension of the data. The proposed method transforms both the distribution and the dimension of the data, which makes it possible to model the same data under a different configuration. Because the converted feature vectors must lie in the input space, only the source feature vectors are mapped into the feature space using KPCA; the target feature vectors remain unchanged when the joint pdf of source and target features is modeled. The experimental results show that the proposed method outperforms the conventional GMM-based conversion method in various training environments.
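As a rough illustration of the feature-space mapping mentioned in this abstract, the sketch below builds and centers a kernel (Gram) matrix, which is the first step of kernel PCA. The RBF kernel, the `gamma` value, the sample vectors, and the function names are assumptions made for illustration; the abstract does not state which kernel or settings the authors used.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel between two vectors (the kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def centered_gram(vectors, gamma=0.5):
    """Gram matrix of the vectors, centered in feature space as KPCA requires."""
    n = len(vectors)
    K = [[rbf_kernel(u, v, gamma) for v in vectors] for u in vectors]
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    # K is symmetric, so the column means equal the row means.
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

# Hypothetical 2-D source feature vectors.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
Kc = centered_gram(source)
```

In a full KPCA step, the leading eigenvectors of the centered matrix would give the coordinates of each source vector in the feature space; that part is omitted here.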

Korean Voice Phishing Text Classification Performance Analysis Using Machine Learning Techniques

무사부부수구밀란두키스;진상윤;장대호;박동주
- 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2021년도 추계학술발표대회 / pp.297-299 / 2021
Text classification is one of the popular tasks in Natural Language Processing (NLP), used to classify text or documents in applications such as sentiment analysis and email filtering. Nowadays, state-of-the-art (SOTA) Machine Learning (ML) and Deep Learning (DL) algorithms are the core engines that perform these classification tasks with high accuracy. This paper benchmarks the performance of multiple SOTA algorithms on KorCCVi, the first known labeled Korean voice phishing dataset. Experimental results on a test set of 366 samples reveal which algorithm performs best in terms of training time and metrics such as accuracy and F1 score.
https://doi.org/10.3745/PKIPS.y2021m11a.297
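For readers unfamiliar with the two metrics named in this abstract, here is a minimal sketch of how accuracy and F1 score could be computed for a binary task such as phishing versus non-phishing. The label encoding (1 for the positive class), the sample labels, and the function name are assumptions for illustration and are not taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1

# Hypothetical labels: 1 = voice phishing, 0 = normal conversation.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
```

Accuracy counts all correct predictions, while F1 focuses on the positive class, which is why the two can diverge on imbalanced data.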

Voice Command Web Browser Using Variable Vocabulary Word Recognizer

이항섭
- 한국음향학회지 / Vol. 18, No. 2 / pp.48-52 / 1999
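This paper describes a voice command web browser that uses a variable vocabulary word recognizer, allowing information retrieval through Korean speech recognition within a web browser. The distinguishing feature of the system is that it can recognize, by voice, both the hypertext words that carry links on the displayed page and the web browser menus, so the browser can be operated by speech recognition as well as by mouse clicks. Because the recognition candidates extracted from the displayed documents are not fixed but change continually from document to document, a variable vocabulary word recognizer was used to recognize them. A variable vocabulary word recognizer can recognize arbitrary new vocabulary, independent of the training speech data, without retraining; trained on 3,848 POW (Phonetically Optimized Words) words, it achieves a word recognition rate of 93.8% on 32 words. The voice command web browser was developed using Netscape Navigator in a Windows 95/NT environment, with attention to usability so that users can use it immediately without learning a new voice-based interface. In on-line, environment-independent and speaker-independent experiments, the developed voice command web browser shows an average recognition performance of 90%.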
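The per-document candidate list described in this abstract can be pictured with a small sketch that collects the link text of a page as the vocabulary a recognizer would have to handle. This is only an illustration using Python's standard `html.parser`; the class name, the sample page, and the idea of mapping each phrase to its link target are assumptions, not the authors' implementation.

```python
from html.parser import HTMLParser

class LinkTextCollector(HTMLParser):
    """Collects the visible text of each <a href=...> element on a page."""

    def __init__(self):
        super().__init__()
        self.vocabulary = {}   # phrase -> link target
        self._href = None      # href of the link currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize whitespace so each phrase is a clean candidate.
            phrase = " ".join("".join(self._text).split())
            if phrase:
                self.vocabulary[phrase] = self._href
            self._href = None

# Hypothetical page content.
page = '<p><a href="/news">News</a> and <a href="/mail"> Web  Mail </a></p>'
collector = LinkTextCollector()
collector.feed(page)
vocabulary = collector.vocabulary
```

Each time a new page is loaded, the vocabulary is rebuilt, which is why a recognizer with a fixed word list would not be sufficient.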

Application of Machine Learning on Voice Signals to Classify Body Mass Index - Based on Korean Adults in the Korean Medicine Data Center

김준호;박기현;김호석;이시우;김상혁
- 사상체질의학회지 / Vol. 33, No. 4 / pp.1-9 / 2021
Objectives: The purpose of this study was to determine whether an individual's Body Mass Index (BMI) class could be predicted by applying machine learning to voice data collected at the Korean Medicine Data Center (KDC). Methods: We proposed a convolutional neural network (CNN)-based BMI classification model. The subjects were Korean adults who had completed voice recording and BMI measurement between 2006 and 2015, drawn from the data held at the Korean Medicine Data Center. Of these, 2,825 records were used for training to build the model, and 566 records were used to assess its performance. Mel-frequency cepstral coefficients (MFCC) extracted from vowel utterances were used as the input features of the CNN. The model was constructed to predict four groups defined by gender and BMI criteria: overweight male, normal male, overweight female, and normal female. Results & Conclusions: Performance was evaluated using F1-score and accuracy. For the four-group prediction, the average accuracy was 0.6016 and the average F1-score was 0.5922. Although the model performed well at gender discrimination, follow-up studies are needed to improve its ability to distinguish BMI within each gender. As research on deep learning is active, performance improvement is expected through future research.
https://doi.org/10.7730/JSCM.2021.33.4.1
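The four target groups named in this abstract combine gender with a BMI class. The sketch below shows one way such labels could be derived from height and weight using the standard formula BMI = weight (kg) / height (m)^2. The cutoff of 25.0 and the function names are assumptions for illustration; the abstract does not state which BMI criterion the authors applied.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2

def bmi_group(sex, weight_kg, height_m, overweight_threshold=25.0):
    """Return one of the four group labels, e.g. 'overweight male'.

    `overweight_threshold` is a hypothetical cutoff, not the paper's value.
    """
    is_overweight = bmi(weight_kg, height_m) >= overweight_threshold
    status = "overweight" if is_overweight else "normal"
    return f"{status} {sex}"

# Hypothetical subjects.
labels = [
    bmi_group("male", 80.0, 1.70),    # BMI about 27.7
    bmi_group("female", 55.0, 1.65),  # BMI about 20.2
]
```

In the study, a CNN predicts these labels from MFCC features of vowel utterances; the code above only covers how the reference labels might be formed.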

A Situational Training System for Food Serving in a Restaurant Based on Augmented Reality

정광일;김성진;김부년;김태영;임철수
- 한국게임학회 논문지 / Vol. 9, No. 1 / pp.135-142 / 2009
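With recent advances in information and communication technology and in welfare standards, various interface devices and training systems for people with disabilities are being studied, but training systems for people with developmental disabilities remain insufficient. This paper proposes an augmented reality based situational training system on the theme of 'restaurant serving', aimed at improving the ability of people with developmental disabilities to cope with situations and at training for social adaptation. The system places markers in the real training environment and builds an augmented reality space in which objects that are difficult to use in a real space are replaced with virtual objects, enabling safe, repeated training free from the constraints of the surrounding environment. The trainee wears an HMD, can look around the training space, and trains according to a scenario by acting on spoken prompts.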

Recognition of the Korean alphabet Using Neural Oscillator Phase model Synchronization

Kwon, Yong-Bum;Lee, Jun-Tak
- 한국지능시스템학회:학술대회논문집 / 한국퍼지및지능시스템학회 2003년도 ISIS 2003 / pp.315-317 / 2003
Neural oscillators are applied in oscillatory systems (image information analysis, voice recognition, etc.). If the established EBPA (Error Back Propagation Algorithm) is applied to an oscillatory system, it is difficult to estimate complicated input patterns. It therefore requires more training data, and its convergence speed is difficult to approximate. In this paper, I studied the neural oscillator in synchronized states with appropriate phase relations between neurons and recognized the Korean alphabet using Neural Oscillator Phase model Synchronization.
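To give a sense of what phase synchronization between oscillators means, the sketch below simulates two coupled phase oscillators. It uses Kuramoto-style sine coupling as a stand-in, because the abstract does not give the authors' model equations; the natural frequency, coupling strength, step size, and function name are all assumptions.

```python
import math

def simulate_phase_difference(theta1, theta2, omega=1.0, coupling=1.0,
                              dt=0.01, steps=2000):
    """Euler-integrate two identical phase oscillators and return the final
    absolute phase difference (Kuramoto-style coupling, assumed here)."""
    for _ in range(steps):
        d1 = omega + coupling * math.sin(theta2 - theta1)
        d2 = omega + coupling * math.sin(theta1 - theta2)
        theta1 += dt * d1
        theta2 += dt * d2
    # Wrap the difference into [-pi, pi) before taking its magnitude.
    diff = (theta2 - theta1 + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)

# Start one radian apart; the coupling pulls the phases together.
final_gap = simulate_phase_difference(0.0, 1.0)
```

With identical natural frequencies the phase difference shrinks toward zero, which is the synchronized state; a recognition scheme would then read patterns from which oscillators lock together.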

The Effect of Visual Feedback Intervention on Voice Pitch of Adults with Hearing Impairment

어수지;윤미선
- 음성과학 / Vol. 12, No. 4 / pp.215-226 / 2005
This study investigates the effect of a pitch treatment program using visual feedback for profoundly deaf adults. The Dr. Speech program was used as the training tool. The subjects were three profoundly deaf adults. Speech samples for evaluation were vowel prolongations and connected speech. Analysis followed a single-subject research design. All subjects showed treatment effects, reflected in lowered fundamental frequency and speaking fundamental frequency.

A Proposal on the Development of Bioterrorism Education for Public Health Personnel

김지희
- 한국방재학회:학술대회논문집 / 한국방재학회 2008년도 정기총회 및 학술발표대회 / pp.393-394 / 2008
Recently, keeping pace with globalization, many international conferences and athletic events have been held in Korea. After the September 11 attacks in New York in 2001, the Korean government dispatched the Zaytun Division to Iraq, which has led to concerns that Korea should prepare to defend against biological terrorism as soon as possible. It is important to develop bioterrorism emergency medical training for public health students in Korea, including paramedic students. I therefore propose the development of a bioterrorism education curriculum.

Accurate Speech Detection based on Sub-band Selection for Robust Keyword Recognition

지미경;김회린
- 대한음성학회:학술대회논문집 / 대한음성학회 2002년도 11월 학술대회지 / pp.183-186 / 2002
Speech detection is one of the important problems in real-time speech recognition. Accurate detection of speech boundaries is crucial to the performance of a speech recognizer. In this paper, we propose a speech detector based on Mel-band selection through training. To evaluate the proposed algorithm, we compare it with a conventional one, the so-called EPD-VAA (EndPoint Detector based on Voice Activity Detection). The proposed speech detector is trained to extract keyword speech better than other speech. EPD-VAA usually works well at high SNR but no longer works well at low SNR. The proposed algorithm, in contrast, pre-selects useful bands through keyword training and decides the speech boundary according to the energy level of the previously selected sub-bands. The experimental results show that the proposed algorithm outperforms EPD-VAA.
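The boundary decision described in this abstract, based on the energy of pre-selected sub-bands, can be sketched as follows. The selected bands and the threshold are hand-picked here purely for illustration; the paper selects bands through keyword training, which this sketch does not reproduce, and the function name and sample energies are hypothetical.

```python
def detect_speech_boundaries(band_energies, selected_bands, threshold):
    """Return (first_frame, last_frame) where the summed energy of the
    selected sub-bands exceeds the threshold, or None if no frame does.

    band_energies: list of frames, each a list of per-band energies.
    """
    active = [i for i, frame in enumerate(band_energies)
              if sum(frame[b] for b in selected_bands) > threshold]
    if not active:
        return None
    return active[0], active[-1]

# Hypothetical energies: band 2 carries constant noise, bands 0-1 carry speech.
frames = [
    [0.1, 0.2, 5.0],
    [0.1, 0.1, 5.0],
    [2.0, 3.0, 5.0],
    [2.5, 2.0, 5.0],
    [0.2, 0.1, 5.0],
]
boundaries = detect_speech_boundaries(frames, selected_bands=[0, 1], threshold=1.0)
```

Ignoring the noisy band is what keeps the detector from firing on every frame, which illustrates why band selection can help at low SNR.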

A Study on Speech Training Aids for the Deaf

안상필;이재혁;윤태성;박상희
- 대한전기학회:학술대회논문집 / 대한전기학회 1990년도 하계학술대회 논문집 / pp.47-50 / 1990
Deaf people cannot speak as clearly as hearing people because they lack auditory feedback on their own pronunciation; speech training is therefore required. In this study, the fundamental frequency, intensity, formant frequencies, vocal tract graphic, and vocal tract area function, extracted from the speech signal, are used as feature parameters. An AR model, whose coefficients are extracted using inverse filtering, is used as the speech generation model. To connect the vocal tract graphic with the speech parameters, articulation distances and articulation distance functions in 15 selected intervals are determined from the extracted vocal tract areas and formant frequencies.
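As background for the AR speech model mentioned in this abstract, the sketch below estimates AR (linear prediction) coefficients and then applies the inverse filter to obtain the prediction residual. It uses the autocorrelation method with the Levinson-Durbin recursion, a common textbook approach swapped in here because the abstract does not describe the authors' exact procedure; the function names and the test signal are assumptions, and the signal is assumed to be nonzero.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation values r[0..max_lag] of a finite signal."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        reflection = acc / error
        new_a = a[:]
        new_a[m] = reflection
        for k in range(1, m):
            new_a[k] = a[k] - reflection * a[m - k]
        a = new_a
        error *= 1.0 - reflection ** 2
    return a[1:]

def inverse_filter(signal, coeffs):
    """Residual e[n] = x[n] - sum_k a[k] * x[n-k] (samples before n=0 are zero)."""
    residual = []
    for n, sample in enumerate(signal):
        prediction = sum(c * signal[n - k]
                         for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(sample - prediction)
    return residual

# Hypothetical signal: impulse response of a first-order AR system with a = 0.5.
signal = [0.5 ** n for n in range(50)]
coeffs = levinson_durbin(autocorrelation(signal, 1), order=1)
residual = inverse_filter(signal, coeffs)
```

For this test signal the estimated coefficient is close to 0.5 and the residual is essentially a single impulse at the first sample, which is what a well-fitted inverse filter should leave behind.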

Search results: 177 items; processing time: 0.022 seconds
