Search | Korea Science

A Study of Speech Recognition in a High Speed Automobile (고속 주행중인 자동차 환경에서의 음성인식 연구)

유봉근
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.65-69 / 1998
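For the safety and convenience of the driver in a high-speed driving environment, this system controls various in-vehicle convenience devices using speech recognition, and is organized as a man-machine interface between the driver and the car. The system allows speech input and output at all times without operating any auxiliary switch while driving, uses a band-pass filter to select a model that is robust to the noise environment, and uses 13th-order perceptual linear predictive (PLP) coefficients as speech features with one-stage dynamic programming as the recognition algorithm. In off-line experiments on 33 frequently used vehicle control commands in a high-speed driving environment, recognition rates of 82.47% (speaker-independent, Jungbu Expressway) and 94.44% (speaker-dependent) were obtained. A Voice Dialing function that places calls by voice was also implemented, to reduce accidents caused by car phone and mobile phone use in vehicles traveling at high speed.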

Performance Comparison of Automatic Detection of Laryngeal Diseases by Voice (후두질환 음성의 자동 식별 성능 비교)

Kang Hyun Min;Kim Soo Mi;Kim Yoo Shin;Kim Hyung Soon;Jo Cheol-Woo;Yang Byunggon;Wang Soo-Geun
- MALSORI / no.45 / pp.35-45 / 2003
Laryngeal diseases cause significant changes in the quality of speech production. Automatic detection of laryngeal diseases by voice is attractive because it is non-intrusive. In this paper, we apply speech recognition techniques to the detection of laryngeal cancer and investigate which feature parameters and classification methods are appropriate for this purpose. Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are examined as feature parameters, and parameters reflecting the periodicity of speech and its perturbation are also considered. Multilayer perceptron neural networks and Gaussian Mixture Models (GMMs) are employed as classifiers. According to our experiments, higher-order LPCC combined with the periodicity parameters yields the best performance.

An Implementation of integrated CAD system of IC design (IC 설계용 집적형 캐드 시스템의 구현)

공진흥;김성중;김재협
- Journal of the Korean Institute of Telematics and Electronics A / v.30A no.1 / pp.73-85 / 1993
This paper presents the design and implementation of a CAD (Computer-Aided Design) system with tools and design environments for IC (Integrated Circuit) design. The CAD system can be easily installed at various sites with limited resources, since most of the CAD tools and design environments are available in the public domain, and Unix and X Window-based PC-386 machines and workstations are used as the hardware platform. To improve the flexibility of the CAD system, objects are defined in the context of tools and environments, and object tables are programmed to describe the integration of CAD tools and design environments. During execution, tool-objects handle inter-tool communication and use a round-robin mechanism to incrementally control the execution of CAD tools. The IC design of an LPC (Linear Predictive Coding) speech synthesizer is carried out to identify bugs in the CAD system and areas for improvement.

An Implementation of a Multi-Channel Speech Surveillance System Over Telephone Lines

Kim, Sung-Soo
- The Journal of the Acoustical Society of Korea / v.17 no.4E / pp.17-21 / 1998
This paper presents an implementation of a multi-channel speech surveillance system over telephone lines using TMS320C31 DSP chips. The speech arriving on each telephone line is first compressed, simultaneously and in real time, by the popular vector-sum excited linear predictive (VSELP) speech coding algorithm at a rate of 8 kbps. The compressed speech bit streams are then multiplexed with those of other users. The multiplexed bit streams are transferred to the system storage equipment together with other required information, so that a system operator can later monitor the stored speech data whenever necessary. The host program runs under Microsoft Windows 95 for an efficient man-machine interface and future upgradability. We have confirmed that the overall 64-channel system operates satisfactorily in real time. We have also verified a total recording capacity of up to approximately 2,880 hours on a playback module and two removable backup drives.
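The figures quoted in this abstract (8 kbps per channel, 64 channels, about 2,880 hours) can be related with some rough arithmetic. The sketch below is only an illustration: it assumes raw coded payload with no multiplexing or file-system overhead and takes 1 kbps as 1000 bit/s, neither of which the abstract states. The function names are hypothetical.

```python
# Back-of-the-envelope figures for the numbers quoted in the abstract.
# Assumptions (not from the paper): payload only, no framing or
# file-system overhead, and 1 kbps = 1000 bit/s.

BITRATE_BPS = 8_000   # 8 kbps per channel (from the abstract)
CHANNELS = 64         # 64-channel system (from the abstract)
TOTAL_HOURS = 2_880   # approximate total recording capacity (from the abstract)


def aggregate_bitrate_bps(channels: int, bitrate_bps: int) -> int:
    """Combined bit rate when every channel is active at once."""
    return channels * bitrate_bps


def storage_bytes(hours: float, bitrate_bps: int) -> float:
    """Bytes needed to hold `hours` of coded speech at `bitrate_bps`."""
    return hours * 3600 * bitrate_bps / 8


def hours_per_channel(total_hours: float, channels: int) -> float:
    """Recording time per channel if all channels record continuously."""
    return total_hours / channels


if __name__ == "__main__":
    print(aggregate_bitrate_bps(CHANNELS, BITRATE_BPS), "bit/s in total")
    print(storage_bytes(TOTAL_HOURS, BITRATE_BPS) / 1e9, "GB of payload")
    print(hours_per_channel(TOTAL_HOURS, CHANNELS), "hours per channel")
```

Under these assumptions the 64 channels together produce 512 kbit/s, and 2,880 hours of coded speech corresponds to roughly 10.4 GB of payload, or 45 hours per channel if all channels record continuously.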

Speaker Verification Using Hidden LMS Adaptive Filtering Algorithm and Competitive Learning Neural Network (Hidden LMS 적응 필터링 알고리즘을 이용한 경쟁학습 화자검증)

Cho, Seong-Won;Kim, Jae-Min
- The Transactions of the Korean Institute of Electrical Engineers D / v.51 no.2 / pp.69-77 / 2002
Speaker verification can be classified into two categories: text-dependent and text-independent. This paper addresses text-dependent speaker verification, in which the system determines whether the voice characteristics of a speaker match those of a specific claimed person. We collect speaker data with a sound card under various noisy conditions, apply a new Hidden LMS (Least Mean Square) adaptive algorithm to it, and extract LPC (Linear Predictive Coding) cepstrum coefficients as feature vectors. Finally, we use a competitive learning neural network for speaker verification. The proposed hidden LMS adaptive filter, which uses a neural network, reduces noise and enhances features under various noisy conditions. We construct a separate neural network for each speaker, so the whole network does not need to be retrained when a speaker is added, which makes the system easy to expand. Experiments show that the proposed method improves speaker verification performance.
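The abstract does not say how the LPC-cepstrum features are computed. As a point of reference only, the sketch below uses the standard textbook recursion that converts LPC coefficients to cepstral coefficients. It assumes the prediction-error filter is written as A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and the model is 1/A(z); other texts use the opposite sign for a_k. The function name is hypothetical, the gain term c_0 is omitted, and the paper's Hidden LMS step is not reproduced here.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c_1..c_n_ceps.

    `a` holds [a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k (assumed
    sign convention); the all-pole model is H(z) = 1 / A(z).
    Recursion: c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # index 0 is unused (gain term omitted)
    for n in range(1, n_ceps + 1):
        acc = 0.0
        for k in range(1, n):
            if n - k <= p:  # a_{n-k} is zero beyond order p
                acc += k * c[k] * a[n - k - 1]
        a_n = a[n - 1] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]


if __name__ == "__main__":
    # Single-pole model 1 / (1 - 0.5 z^-1): cepstrum is 0.5**n / n.
    print(lpc_to_cepstrum([-0.5], 3))
```

For a single pole at 0.5 the cepstrum is known in closed form (0.5^n / n), which gives a quick check of the recursion.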

A Practical Implementation of the LTJ Adaptive Filter and Its Application to the Adaptive Echo Canceller (LTJ 적응필터의 실용적 구현과 적응반향제거기에 대한 적용)

Yoo, Jae-Ha
- Speech Sciences / v.11 no.2 / pp.227-235 / 2004
In this paper, we propose a new practical implementation of the lattice transversal joint (LTJ) adaptive filter that uses information from a speech codec, and apply it to adaptive echo cancellation to verify its efficiency. Real-time implementation of the LTJ adaptive filter is very difficult because compensating the filter coefficients has high computational complexity. When a speech codec is used, however, the complexity can be reduced, since linear predictive coding (LPC) coefficients are updated once per frame or sub-frame rather than every sample. Furthermore, the LPC coefficients can be obtained from the speech decoder and transformed into reflection coefficients, which reduces the cost of updating the reflection coefficients. The effectiveness of the proposed LTJ adaptive filter was verified by experiments on the convergence and tracking performance of the adaptive echo canceller.
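The abstract mentions transforming LPC coefficients into reflection coefficients but does not give the procedure. One common way to do this is the step-down (backward Levinson) recursion, sketched below as an illustration rather than as the paper's implementation. It assumes A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, with the reflection coefficient of order m equal to the last coefficient of the order-m predictor; sign conventions vary between texts. Both function names are hypothetical.

```python
def lpc_to_reflection(a):
    """Step-down recursion: LPC coefficients [a_1..a_p] -> [k_1..k_p].

    Assumed convention: A(z) = 1 + sum_k a_k z^-k and k_m = a_m^(m).
    Each pass peels off one order:
        a_i^(m-1) = (a_i^(m) - k_m * a_(m-i)^(m)) / (1 - k_m**2)
    """
    cur = list(a)
    p = len(cur)
    ks = [0.0] * p
    for m in range(p, 0, -1):
        k = cur[m - 1]
        ks[m - 1] = k
        denom = 1.0 - k * k
        if m > 1 and abs(denom) < 1e-12:
            raise ValueError("|k| = 1: filter is not strictly minimum phase")
        # cur[i] is a_(i+1); its mirror a_(m-i-1) sits at index m - 2 - i.
        cur = [(cur[i] - k * cur[m - 2 - i]) / denom for i in range(m - 1)]
    return ks


def reflection_to_lpc(ks):
    """Step-up (Levinson) recursion, the inverse of lpc_to_reflection."""
    a = []
    for m, k in enumerate(ks, start=1):
        a = [a[i] + k * a[m - 2 - i] for i in range(m - 1)] + [k]
    return a


if __name__ == "__main__":
    ks = [0.5, -0.3, 0.2]
    a = reflection_to_lpc(ks)
    print(a, lpc_to_reflection(a))
```

Because the conversion runs once per frame or sub-frame rather than once per sample, its cost is spread over many samples, which is consistent with the complexity argument in the abstract.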

Harmonic Peak Picking-based MVF Estimation for Improvement of HMM-based Speech Synthesis System Using TBE Model (TBE 모델을 사용하는 HMM 기반 음성합성기 성능 향상을 위한 하모닉 선택에 기반한 MVF 예측 방법)

Park, Jihoon;Hahn, Minsoo
- Phonetics and Speech Sciences / v.4 no.4 / pp.79-86 / 2012
In the two-band excitation (TBE) model, maximum voiced frequency (MVF) is the most important feature of the excitation parameter because the synthetic speech quality depends on MVF. Thus, this paper proposes an enhanced MVF estimation scheme based on the peak picking method. In the proposed scheme, the local peak and the peak lobe are picked from the spectrum of a linear predictive residual signal. The normalized distance between neighboring peak lobes is calculated and utilized as a feature to estimate MVF. Experimental results of both objective and subjective tests show that the proposed scheme improves synthetic speech quality compared with that of the conventional one.
https://doi.org/10.13064/KSSS.2012.4.4.079

Non-linear Predictive Method using Simplified Morphological Polynomial Transform and Morphological Interpolation (간략화된 형상학적 다항식 변환과 형상학적 보간을 이용한 비선형 예측 방법)

김수현;한헌수;홍민철;차형태
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2002.11a / pp.81-84 / 2002
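This paper proposes a non-linear prediction method that uses a simplified Morphological Polynomial Transform and Morphological Interpolation. The morphological polynomial transform represents data as coefficients of structuring functions by means of morphological operations, and morphological interpolation interpolates using the coefficients produced by that transform. The transform is simplified so that it can be applied using integer arithmetic only, and a prediction method based on morphological interpolation, which is better suited to images, is used. Experiments verify that the proposed prediction method, combined with Huffman coding, can store images losslessly using few bits.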

Introduction to the Spectrum and Spectrogram (스팩트럼과 스팩트로그램의 이해)

Jin, Sung-Min
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.19 no.2 / pp.101-106 / 2008
Once the speech signal has been put into a form suitable for storage and analysis by computer, several different operations can be performed. Filtering, sampling, and quantization are the basic operations in digitizing a speech signal. The waveform can be displayed, measured, and even edited, and spectra can be computed using methods such as the Fast Fourier Transform (FFT), Linear Predictive Coding (LPC), the cepstrum, and filtering. The digitized signal can also be used to generate spectrograms. The spectrograph provides major advantages for the study of speech. The author therefore introduces the basic techniques of acoustic recording and digital signal processing, and the principles of the spectrum and the spectrogram.
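As a rough illustration of how a spectrogram follows from a digitized signal, the sketch below splits the samples into overlapping frames, applies a Hann window, and takes the magnitude spectrum of each frame. It uses a naive DFT from the standard library instead of an FFT so that it stays self-contained; an FFT computes the same values faster. The frame length, hop size, window choice, and function names are assumptions for illustration and are not taken from the article.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitude(frame):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) DFT, not an FFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum per frame (rows = time, cols = bins)."""
    window = hann(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        rows.append(dft_magnitude([s * w for s, w in zip(chunk, window)]))
    return rows


if __name__ == "__main__":
    fs = 8000  # sampling rate in Hz (illustrative value)
    tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]
    spec = spectrogram(tone)
    peak_bin = max(range(len(spec[0])), key=spec[0].__getitem__)
    print("peak at", peak_bin * fs / 64, "Hz")
```

Bin k of a frame of length N corresponds to the frequency k * fs / N, so a longer frame gives finer frequency resolution at the cost of coarser time resolution.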

A Study on the Phonemic Segmentation of an Initial Affricate (초성파찰음의 음소분류에 관한 연구)

Kim, Ki-Woon;Lee, Ki-Young;Bae, Chul-Soo;Choi, Kap-Seok
- Proceedings of the KIEE Conference / 1988.07a / pp.33-36 / 1988
In this paper, the starting point of an affricate is detected from the first predictor coefficient of a 12-pole linear predictive coding (LPC) analysis, and phonemic segmentation is performed by measuring short-time energy and zero-crossing rate. With this segmentation method, the duration of the aspirate can be measured to determine whether an aspirate is present.
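The two frame-level measures named in this abstract, short-time energy and zero-crossing rate, can be sketched as follows. This is a minimal illustration only: the frame length, hop size, function names, and the treatment of zero-valued samples as non-negative are assumptions, and the paper's decision thresholds and LPC-based onset detection are not reproduced.

```python
def split_frames(signal, frame_len, hop):
    """Cut the signal into frames of frame_len samples, hop samples apart."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero-valued samples are counted as non-negative (an assumption).
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for prev, cur in zip(frame, frame[1:])
                    if (prev >= 0) != (cur >= 0))
    return crossings / (len(frame) - 1)


def frame_features(signal, frame_len=80, hop=40):
    """(energy, zero-crossing rate) for every frame of the signal."""
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in split_frames(signal, frame_len, hop)]


if __name__ == "__main__":
    print(frame_features([1, -1] * 4, frame_len=4, hop=4))
```

Frication and aspiration noise typically shows a higher zero-crossing rate and lower energy than a following vowel, which is why these two measures are often used together for this kind of segmentation.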

