Integrated Search | Korea Science

A Study of Speech Recognition in a High Speed Automobile

유봉근
- Acoustical Society of Korea: Conference Proceedings
- Acoustical Society of Korea, 15th Workshop on Speech Communication and Signal Processing, 1998 (KSCSP 98, Vol. 15, No. 1)
- pp. 65-69
- 1998
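This work uses speech recognition to control various in-vehicle convenience devices, for the safety and convenience of the driver of a car traveling at high speed, and is organized as a man-machine interface between driver and vehicle. The system allows speech input and output at all times while driving, without operating an auxiliary switch. A band-pass filter is used to select a model that is robust to the noise environment. The speech features are 13th-order perceptual linear predictive (PLP) parameters, and the recognition algorithm is one-stage dynamic programming. In off-line experiments on 33 frequently used vehicle-control commands in a car traveling at high speed, the recognition rate was 82.47% speaker-independent (Jungbu Expressway) and 94.44% speaker-dependent. A Voice Dialing function that places calls by voice was also implemented, to reduce accidents caused by using car phones and mobile phones in fast-moving vehicles.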

Performance Comparison of Automatic Detection of Laryngeal Diseases by Voice

강현민;김수미;김유신;김형순;조철우;양병곤;왕수건
- Journal of the Korean Society of Phonetic Sciences and Speech Technology: Malsori
- No. 45
- pp. 35-45
- 2003
Laryngeal diseases cause significant changes in the quality of speech production. Automatic detection of laryngeal diseases by voice is attractive because of its nonintrusive nature. In this paper, we apply speech recognition techniques to the detection of laryngeal cancer, and investigate which feature parameters and classification methods are appropriate for this purpose. Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are examined as feature parameters, and parameters reflecting the periodicity of speech and its perturbation are also considered. As classifiers, multilayer perceptron neural networks and Gaussian Mixture Models (GMM) are employed. According to our experiments, higher-order LPCC combined with the periodic information parameters yields the best performance.

An Implementation of integrated CAD system of IC design

공진흥;김성중;김재협
- Journal of the Korean Institute of Telematics and Electronics A
- Vol. 30A, No. 1
- pp. 73-85
- 1993
This paper presents the design and implementation of a CAD (Computer-Aided Design) system with tools and design environments for IC (Integrated Circuit) design. The CAD system can be easily installed at various sites with limited resources, since most of the CAD tools and design environments are available in the public domain, and Unix and X Window-based PC-386 machines and workstations are used as the hardware platform. To improve the flexibility of the CAD system, objects are defined in the context of tools and environments, and object tables are programmed to describe the integration of CAD tools and design environments. During execution, tool-objects handle inter-tool communication and use a round-robin mechanism to incrementally control the execution of CAD tools. The IC design of an LPC (Linear Predictive Coding) speech synthesizer is carried out to identify possible improvements and bugs in the CAD system.

An Implementation of a Multi-Channel Speech Surveillance System Over Telephone Lines

Kim, Sung-Soo
- The Journal of the Acoustical Society of Korea
- Vol. 17, No. 4E
- pp. 17-21
- 1998
This paper presents an implementation of a multi-channel speech surveillance system over telephone lines using TMS320C31 DSP chips. The speech arriving on each telephone line is first compressed in real time, simultaneously across lines, by the popular vector-sum excited linear predictive (VSELP) speech coding algorithm at a rate of 8 Kbps. The compressed speech bit streams are then multiplexed with those of other users. The multiplexed bit streams are transferred to the system storage equipment together with some other required information, so that a system operator can later monitor the stored speech data whenever necessary. The host program runs under Microsoft Windows95 for an efficient man-machine interface and future upgradeability. We have confirmed that the overall 64-channel system operates satisfactorily in real time. We have also checked a recording capability of up to approximately 2,880 total hours on a playback module and two removable backup drives.
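The storage implied by these figures can be estimated with a few lines of arithmetic. This is a rough sketch, not the authors' calculation: it assumes 8 Kbps means 8,000 bits per second, that the 2,880 hours are summed over all 64 channels, and it ignores multiplexing overhead and the "other required information" stored with the speech.

```python
# Back-of-the-envelope storage estimate for 8 Kbps coded speech.
BITRATE_BPS = 8_000   # assumption: 8 Kbps taken as 8,000 bits/s
TOTAL_HOURS = 2_880   # total recording hours quoted in the abstract
CHANNELS = 64         # channel count quoted in the abstract

bytes_per_hour = BITRATE_BPS // 8 * 3600         # bytes per hour of coded speech
total_bytes = bytes_per_hour * TOTAL_HOURS       # payload only, no overhead
hours_per_channel = TOTAL_HOURS / CHANNELS       # if every channel records continuously

print(f"{bytes_per_hour / 1e6:.1f} MB per hour of speech")
print(f"{total_bytes / 1e9:.3f} GB for {TOTAL_HOURS} hours")
print(f"{hours_per_channel:.0f} hours per channel with all {CHANNELS} channels active")
```

Under these assumptions the speech payload alone comes to roughly 10 GB.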

Speaker Verification Using Hidden LMS Adaptive Filtering Algorithm and Competitive Learning Neural Network

조성원;김재민
- The Transactions of the Korean Institute of Electrical Engineers D: Systems and Control
- Vol. 51, No. 2
- pp. 69-77
- 2002
Speaker verification can be classified into two categories: text-dependent and text-independent. In this paper, we discuss text-dependent speaker verification. A text-dependent speaker verification system determines whether the sound characteristics of the speaker match those of a specific person. We acquire the speaker data using a sound card in various noisy conditions, apply a new Hidden LMS (Least Mean Square) adaptive algorithm to it, and extract LPC (Linear Predictive Coding) cepstrum coefficients as feature vectors. Finally, we use a competitive learning neural network for speaker verification. The proposed hidden LMS adaptive filter, which uses a neural network, reduces noise and enhances features in various noisy conditions. We construct a separate neural network for each speaker, which makes it unnecessary to retrain the whole network for a newly added speaker and makes the system easy to expand. We show experimentally that the proposed method improves speaker verification performance.
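For background, the conventional LMS update that adaptive filters of this family build on can be sketched as follows. This is the textbook LMS algorithm, not the Hidden LMS variant proposed in the paper. The function name `lms_filter`, the step size `mu`, and the toy two-tap system are all illustrative assumptions.

```python
import random

def lms_filter(x, d, num_taps, mu):
    """Textbook LMS: adapt FIR weights w so that w * x tracks the desired signal d."""
    w = [0.0] * num_taps      # filter weights, start at zero
    buf = [0.0] * num_taps    # most recent inputs, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]                              # shift in the new sample
        y = sum(wi * bi for wi, bi in zip(w, buf))         # filter output
        e = dn - y                                         # estimation error
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]   # gradient-descent step
        errors.append(e)
    return w, errors

# Toy system identification: the unknown system is d[n] = 0.5*x[n] - 0.25*x[n-1].
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [0.5 * x[n] - 0.25 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
w, errors = lms_filter(x, d, num_taps=2, mu=0.05)
print([round(wi, 3) for wi in w])
```

With noise-free data the weights approach the coefficients of the toy system, and the error shrinks as the filter converges.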

A Practical Implementation of the LTJ Adaptive Filter and Its Application to the Adaptive Echo Canceller

유재하
- Speech Sciences
- Vol. 11, No. 2
- pp. 227-235
- 2004
In this paper, we propose a new practical implementation method for the lattice transversal joint (LTJ) adaptive filter that uses information from the speech codec, and we apply it to the adaptive echo cancellation problem to verify its efficiency. Real-time implementation of the LTJ adaptive filter is very difficult because of the high computational complexity of compensating the filter coefficients. When a speech codec is used, however, the complexity can be reduced, since linear predictive coding (LPC) coefficients are updated every frame or sub-frame instead of every sample. Furthermore, the LPC coefficients can be acquired from the speech decoder and transformed into reflection coefficients, which reduces the computational complexity of updating the reflection coefficients. The effectiveness of the proposed LTJ adaptive filter was verified by experiments on the convergence and tracking performance of the adaptive echo canceller.
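The transformation from LPC coefficients to reflection coefficients mentioned above is commonly done with the step-down (backward Levinson) recursion. The sketch below is that standard recursion, not necessarily the exact procedure used in the paper. It assumes the predictor polynomial is written A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the function names are illustrative.

```python
def reflection_to_lpc(ks):
    """Step-up recursion: reflection coefficients k_1..k_p -> LPC a_1..a_p."""
    a = []
    for k in ks:
        m = len(a)
        # a_i(new) = a_i + k * a_(m+1-i), then append a_(m+1) = k
        a = [a[i] + k * a[m - 1 - i] for i in range(m)] + [k]
    return a

def lpc_to_reflection(a):
    """Step-down recursion: LPC a_1..a_p -> reflection coefficients k_1..k_p."""
    a = list(a)
    ks = []
    while a:
        k = a[-1]                      # highest-order coefficient is k_m
        if abs(k) >= 1.0:
            raise ValueError("unstable predictor: |k| >= 1")
        ks.append(k)
        m = len(a) - 1
        # Peel off one order: a_i(m) = (a_i - k * a_(m+1-i)) / (1 - k^2)
        a = [(a[i] - k * a[m - 1 - i]) / (1.0 - k * k) for i in range(m)]
    ks.reverse()                       # collected from order p down to 1
    return ks

# Round trip on an arbitrary stable set of reflection coefficients.
ks = [0.5, -0.3, 0.2]
a = reflection_to_lpc(ks)
recovered = lpc_to_reflection(a)
print(a, recovered)
```

Because the conversion runs once per frame or sub-frame rather than once per sample, its cost is spread over many samples, which is the saving the abstract describes.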

Harmonic Peak Picking-based MVF Estimation for Improvement of HMM-based Speech Synthesis System Using TBE Model

박지훈;한민수
- Phonetics and Speech Sciences
- Vol. 4, No. 4
- pp. 79-86
- 2012
In the two-band excitation (TBE) model, maximum voiced frequency (MVF) is the most important feature of the excitation parameter because the synthetic speech quality depends on MVF. Thus, this paper proposes an enhanced MVF estimation scheme based on the peak picking method. In the proposed scheme, the local peak and the peak lobe are picked from the spectrum of a linear predictive residual signal. The normalized distance between neighboring peak lobes is calculated and utilized as a feature to estimate MVF. Experimental results of both objective and subjective tests show that the proposed scheme improves synthetic speech quality compared with that of the conventional one.
https://doi.org/10.13064/KSSS.2012.4.4.079

Non-linear Predictive Method using Simplified Morphological Polynomial Transform and Morphological Interpolation

김수현;한헌수;홍민철;차형태
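- The Korean Society of Broadcast and Media Engineers: Conference Proceedings
- Korean Society of Broadcast Engineers, 2002 Annual General Meeting and Conference
- pp. 81-84
- 2002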
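This paper proposes a nonlinear prediction method that uses a simplified Morphological Polynomial Transform and Morphological Interpolation. The morphological polynomial transform represents data as coefficients of structuring functions by means of morphological operations, and morphological interpolation interpolates using the coefficients produced by that transform. The transform is simplified so that it can be applied with integer arithmetic only, and the prediction method is based on a morphological interpolation that is better suited to images. Experiments verify that the proposed prediction method, combined with Huffman coding, can store images losslessly with few bits.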

Introduction to the Spectrum and Spectrogram

진성민
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
- Vol. 19, No. 2
- pp. 101-106
- 2008
Once the speech signal has been put into a form suitable for storage and analysis by computer, several different operations can be performed. Filtering, sampling and quantization are the basic operations in digitizing a speech signal. The waveform can be displayed, measured and even edited, and spectra can be computed using methods such as the Fast Fourier Transform (FFT), Linear Predictive Coding (LPC), the cepstrum and filtering. The digitized signal can also be used to generate spectrograms. The spectrograph provides major advantages for the study of speech. The author therefore introduces the basic techniques of acoustic recording and digital signal processing, and the principles of the spectrum and spectrogram.

A Study on the Phonemic Segmentation of an Initial Affricate

김기운;이기영;배철수;최갑석
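- Korean Institute of Electrical Engineers: Conference Proceedings
- Korean Institute of Electrical Engineers, Proceedings of the 1988 Electrical and Electronics Engineering Conference
- pp. 33-36
- 1988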
In this paper, the starting point of an affricate is detected from the first predictor coefficient of a 12-pole linear predictive coding (LPC) analysis, and phonemic segmentation is done by measuring short-time energy and zero-crossing rate. With this segmentation method, the duration of aspiration can be measured to determine whether an aspirate is present.
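The two frame-level measures named above can be sketched as follows. This is a minimal illustration under assumed settings, not the authors' implementation: the 8 kHz sampling rate, the 200-sample frames, and the synthetic two-tone test signal are all hypothetical, and the paper's thresholds are not given in the abstract.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into frames; a tail shorter than frame_len is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Synthetic stand-in for speech: a strong 200 Hz tone (vowel-like)
# followed by a weak 3 kHz tone (frication-like).
fs = 8000  # assumed sampling rate in Hz
vowel_like = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
fricative_like = [0.2 * math.sin(2 * math.pi * 3000 * n / fs) for n in range(400)]
frames = frame_signal(vowel_like + fricative_like, frame_len=200, hop=200)
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]
for energy, zcr in features:
    print(f"energy={energy:.3f}  zcr={zcr:.3f}")
```

High energy with a low zero-crossing rate suggests a voiced segment, while low energy with a high zero-crossing rate suggests frication or aspiration, which is why the pair is useful for locating segment boundaries.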
