Search | Korea Science

A Study on the Speech Recognition Module Design Using HMM Speech Recognition Algorithm (HMM(Hidden Markov Model) 음성인식 알고리즘을 이용한 효율적인 음성인식 모듈 개발 설계에 관한 연구)

김정훈;류홍석;강재명;강성인;이상배
- Proceedings of the Korean Institute of Intelligent Systems Conference
- /
- 2002.12a
- /
- pp.337-340
- /
- 2002
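This paper describes the design of an embedded system for speaker-independent isolated-word recognition in a wheelchair system. Because real environments contain noise that lowers the recognition rate, noise was removed with spectral subtraction, the simplest of the noise-removal methods. The preprocessing stage used 12th-order LPC cepstrum, and a DHMM (Discrete Hidden Markov Model) was used as the front-end recognizer. Applying this algorithm requires vector quantization (VQ) to simplify the data. In addition, a neural network (MLP: Multi-Layer Perceptron) was used as a post-processing recognizer to improve the recognition rate. The vocabulary for the speaker-independent system consists of seven words in total, recorded from 25 male and female speakers. The hardware uses the TMS320C32, a 32-bit floating-point processor, with 4 Mbytes of memory, and the main board design is currently nearing completion.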
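
The vector quantization step mentioned in this abstract can be illustrated with a minimal sketch: each feature vector is replaced by the index of its nearest codeword, which yields the discrete symbols a DHMM consumes. The codebook, the 2-D feature values, and the function name `quantize` are hypothetical; the paper's actual codebook size and training procedure are not given here.

```python
import math

def quantize(vectors, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    symbols = []
    for v in vectors:
        # Euclidean distance from this vector to every codeword
        distances = [math.dist(v, c) for c in codebook]
        symbols.append(distances.index(min(distances)))
    return symbols

# Hypothetical 2-D codebook and features (assumption for illustration);
# a real system would use 12th-order cepstral vectors and a trained codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
features = [(0.1, -0.2), (0.9, 1.2), (-0.8, 0.7), (0.6, 0.7)]
symbols = quantize(features, codebook)
print(symbols)
```

The resulting symbol sequence stands in for the observation sequence that a discrete HMM would score.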

Noise Elimination Using Improved MFCC and Gaussian Noise Deviation Estimation

Sang-Yeob, Oh
- Journal of the Korea Society of Computer and Information
- /
- v.28 no.1
- /
- pp.87-92
- /
- 2023
With the continued development of speech recognition systems, recognition rates have improved rapidly, but such systems still cannot recognize speech accurately when noise in the use environment, including other voices, is mixed into the signal. To increase the vocabulary recognition rate when processing speech with environmental noise, the noise must be removed. Even with existing HMM, CHMM, GMM, and DNN models, unexpected noise occurs, and quantization noise is inherently added to the digital signal. When this happens, the source signal is altered or corrupted, which lowers the recognition rate. To solve this problem, the MFCC was improved so that features of the speech signal could be extracted efficiently for each voice frame. To remove noise from the speech signal, a noise removal method based on a Gaussian model with noise deviation estimation was improved and applied. The proposed model was evaluated using the cross-correlation coefficient as a measure of speech accuracy. The evaluation confirmed that the difference in the average value of the correlation coefficient improved by 0.53 dB.
https://doi.org/10.9708/jksci.2023.28.01.087

A Study on Image Coding using the Human Visual System and DCT (시각특성과 DCT를 이용한 영상부호화에 관한 연구)

남승진;최성남;전중남;박규태
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.17 no.4
- /
- pp.323-335
- /
- 1992
In this paper, an adaptive cosine transform coding scheme that incorporates human visual properties is investigated. Human vision is relatively sensitive to the mid-frequency band and insensitive to very low and very high frequency bands. This property has been modelled mathematically with the MTF (Modulation Transfer Function) through many psychovisual experiments. The DCT transforms energy in the spatial domain into the frequency domain, so it can exploit the MTF very efficiently. Another well-known visual characteristic is the spatial masking effect: noise is less visible in regions of high activity than in regions of low activity. The proposed coding scheme employs a quantization matrix that represents the spatial frequency response of human vision and adaptively adjusts the quality of an image. To compute the activity index of an image block, a simple operation is performed in the spatial domain, and according to this activity index, a block in a low-activity region is quantized more finely than one in a high-activity region. Results showed that, at low bit rates, the subjective quality of images reconstructed by the proposed coding scheme is more acceptable than that of a coding scheme without HVS properties.

Audio Fingerprint Extraction Method Using Multi-Level Quantization Scheme (다중 레벨 양자화 기법을 적용한 오디오 핑거프린트 추출 방법)

Song Won-Sik;Park Man-Soo;Kim Hoi-Rin
- The Journal of the Acoustical Society of Korea
- /
- v.25 no.4
- /
- pp.151-158
- /
- 2006
In this paper, we propose a new audio fingerprint extraction method, based on Philips' music retrieval algorithm, which uses the energy difference of neighboring filter banks and the probabilistic characteristics of music. Since the Philips method uses too many filter banks in a limited frequency band, its audio fingerprints may be highly sensitive to additive noise and too strongly correlated between neighboring bands. The proposed method improves robustness to noise by reducing the number of filter banks, while maintaining discriminative power by representing the energy difference of bands with 2 bits, where the quantization levels are determined by probabilistic characteristics. The correlation that exists among the 4 different levels in 2 bits is used not only in similarity measurement but also to reduce the search area efficiently. Experiments show that the proposed method is not only more robust to various environmental noises (street, department, car, office, and restaurant), but also takes less time for database search than the Philips method when the music is highly degraded.
https://doi.org/10.7776/ASK.2006.25.4.151
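
As a rough illustration of the 2-bit representation described in this abstract, the sketch below quantizes the energy difference between neighboring bands into four levels. The threshold value, the band energies, and the function name `two_bit_fingerprint` are assumptions made for illustration; the paper derives its quantization levels from probabilistic characteristics of music, which are not reproduced here.

```python
def two_bit_fingerprint(band_energies, threshold):
    """Quantize each neighboring-band energy difference to a 2-bit code (0..3)."""
    codes = []
    for current, neighbor in zip(band_energies, band_energies[1:]):
        diff = current - neighbor
        if diff < -threshold:
            codes.append(0)   # large negative difference
        elif diff < 0:
            codes.append(1)   # small negative difference
        elif diff < threshold:
            codes.append(2)   # small non-negative difference
        else:
            codes.append(3)   # large positive difference
    return codes

# Hypothetical band energies for one frame and an assumed threshold.
frame = [4.0, 1.0, 1.5, 1.4, 5.0]
codes = two_bit_fingerprint(frame, threshold=1.0)
print(codes)
```

A 1-bit scheme would keep only the sign of each difference; the two extra levels record whether the difference is large or small.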

Noise Modeling for CR Images of High-strength Materials (고강도매질 CR 영상의 잡음 모델링)

Hwang, Jung-Won;Hwang, Jae-Ho
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.45 no.5
- /
- pp.95-102
- /
- 2008
This paper presents an approach for modeling noise in Computed Radiography (CR) images of high-strength materials. The approach is specifically designed for types of noise with statistical and nonlinear properties. CR images are degraded even before they are encoded by computer processing. Various types of noise often contaminate radiography images, although they are detected on digitization. Quantum noise, which is Poisson distributed, is a shot noise, but the photon distribution on the Image Plate (IP) of a CR system is not always a Poisson process. The statistical properties are relative and case-dependent because of the material characteristics. The usual assumptions of Poisson, binomial, and Gaussian statistics are considered. Nonlinear effects are also represented in the statistical noise model. The model makes it possible to estimate the noise variance in regions from high to low intensity, specifying an analytical model. The analysis approach is tested on a database of steel tube step-wedge CR images. The results are available for comparative parameter studies that measure noise coherence, distribution, signal-to-noise ratio (SNR), and nonlinear interpolation.

A Study on the Performance of a Modified Binary Quantized First-Order DPLL (2단 양자화기를 사용한 1차 DPLL의 성능 개선에 관한 연구)

강치우;김진헌
- Journal of the Korean Institute of Telematics and Electronics
- /
- v.21 no.3
- /
- pp.6-12
- /
- 1984
The basic binary quantized first-order digital phase locked loop (DPLL) is modified in order to reduce the acquisition time and steady-state phase error. A loop that corrects the phase difference by detecting the falling zero-crossing time is added to improve performance, and the result is compared with that of the basic DPLL. Using a graphical method, the phase locking processes of the modified DPLL for a phase step and a frequency step input are depicted visually in the absence of noise. The performance of the modified DPLL for a sinusoidal input with added narrow-band random noise is evaluated using the Chapman-Kolmogorov equation. This approach is verified by direct computer simulation. The steady-state phase error and the average acquisition time of the modified DPLL are compared with those of the basic DPLL. It is shown that the acquisition time of the modified DPLL is shortened by about a factor of two; also, as the signal-to-noise ratio increases, the effect of the modification increases and the steady-state phase error approaches zero.

Novel Polar Transmitter with 2-Bit Sigma-Delta Modulation (2비트 시그마-델타 변조를 이용한 새로운 폴라 트랜스미터)

Lim, Ji-Youn;Cheon, Sang-Hoon;Kim, Kyeong-Hak;Hong, Song-Cheol;Kim, Dong-Wook
- The Journal of Korean Institute of Electromagnetic Engineering and Science
- /
- v.18 no.8
- /
- pp.970-976
- /
- 2007
This paper presents a novel polar transmitter architecture with a 2-bit sigma-delta modulator. In the proposed architecture, the 2-bit sigma-delta modulator is introduced to suppress the quantization noise of a conventional sigma-delta modulator. The power amplifier configuration is also modified into a binary form to accommodate the 2-bit digitized envelope signal. Ptolemy simulation results for the proposed structure show that the spectral property is greatly improved across the full transmit band of the EDGE system. The finer quantization of the 2-bit modulator lowers the noise level by 10 dB without increasing the over-sampling ratio, an improvement that would otherwise require doubling the over-sampling ratio. Dynamic range is also enhanced by up to 5 dB owing to the new form of the power amplifier in the transmitter.
https://doi.org/10.5515/KJKIEES.2007.18.8.970
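
The effect of a finer quantizer inside a sigma-delta loop can be sketched with a generic first-order modulator. This is not the paper's transmitter architecture: the first-order loop, the level sets `ONE_BIT` and `TWO_BIT`, and the constant test input are all assumptions chosen to keep the example short.

```python
def sigma_delta(samples, levels):
    """First-order sigma-delta modulator with a multi-level quantizer."""
    integrator = 0.0
    feedback = 0.0
    out = []
    for x in samples:
        # Accumulate the error between the input and the previous output
        integrator += x - feedback
        # Quantize the integrator state to the nearest allowed level
        feedback = min(levels, key=lambda q: abs(q - integrator))
        out.append(feedback)
    return out

ONE_BIT = [-1.0, 1.0]                # conventional 1-bit quantizer (assumed levels)
TWO_BIT = [-1.0, -1 / 3, 1 / 3, 1.0]  # 2-bit quantizer with four assumed levels

signal = [0.3] * 2000                # hypothetical constant input
y1 = sigma_delta(signal, ONE_BIT)
y2 = sigma_delta(signal, TWO_BIT)
mean1 = sum(y1) / len(y1)
mean2 = sum(y2) / len(y2)
worst1 = max(abs(v - 0.3) for v in y1)
worst2 = max(abs(v - 0.3) for v in y2)
print(mean1, mean2, worst1, worst2)
```

Both outputs average to roughly the input value, but the per-sample error of the 2-bit output is smaller, which is the kind of quantization-noise reduction the abstract refers to.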

Post-filtering in Low Bit Rate Moving Picture Coding, and Subjective and Objective Evaluation of Post-filtering (저 전송률 동화상 압축에서 후처리 방법 및 후처리 방법의 주관적 객관적 평가)

이영렬;김윤수;박현욱
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.24 no.8B
- /
- pp.1518-1531
- /
- 1999
Images reconstructed from highly compressed MPEG or H.263 data show noticeable degradations, such as blocking artifacts near block boundaries, corner outliers at the cross points of blocks, and ringing noise near image edges, because MPEG and H.263 quantize the transformed coefficients of 8$\times$8 pixel blocks. The authors have proposed a post-processing algorithm to reduce quantization effects, such as blocking artifacts, corner outliers, and ringing noise, in MPEG-decompressed images. Our signal-adaptive post-processing algorithm reduces the quantization effects adaptively by using both spatial frequency and temporal information extracted from the compressed data. The blocking artifacts are reduced by one-dimensional (1-D) horizontal and vertical low pass filtering (LPF), and the ringing noise is reduced by two-dimensional (2-D) signal-adaptive filtering (SAF). A comparison of the signal-adaptive post-processing algorithm and the MPEG-4 VM (Verification Model) post-processing algorithm is performed by computer simulation with several MPEG-4 image sequences, covering subjective quality evaluation using the modified single stimulus method (MSSM), objective quality evaluation (PSNR), and computational complexity analysis. According to the comparison, the subjective image qualities of both algorithms are similar, whereas the signal-adaptive post-processing algorithm performs better than the VM post-processing algorithm in PSNR and computational complexity.

Loop-Filtering for Reducing Corner Outlier (모서리 잡음 제거를 위한 Loop 필터링 기법)

홍윤표;전병우
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.41 no.5
- /
- pp.217-223
- /
- 2004
In block-based lossy video compression, severe quantization causes discontinuities along block boundaries, so that annoying blocking artifacts are visible in decoded video images. These blocking artifacts significantly decrease the subjective image quality. Many algorithms have been proposed to reduce the blocking artifacts in decoded images. However, studies on so-called corner outliers have been very limited. Corner outliers make image edges look disconnected from those of neighboring blocks at the cross points of block boundaries. To solve this problem, we propose a corner outlier detection and compensation algorithm applied as loop filtering in the spatial domain. Experimental results show that the proposed method provides much improved subjective image quality.

Efficient Correlation Noise Modeling and Performance Analysis for Distributed Video Coding System (분산 동영상 부호화 시스템을 위한 효과적인 상관 잡음 모델링 및 성능평가)

Moon, Hak-Soo;Lee, Chang-Woo;Lee, Seong-Won
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.36 no.6C
- /
- pp.368-375
- /
- 2011
In the distributed video coding system, the parity bits, which are generated in encoders, are used to reconstruct Wyner-Ziv frames. Since the original Wyner-Ziv frames are not known in decoders, the efficient correlation noise modeling for turbo or LDPC code is necessary. In this paper, an efficient correlation noise modeling method is proposed and the performance is analyzed. The method to estimate the quantization parameters for key frames, which are encoded using H.264 intraframe coding technique, is also proposed. The performance of the proposed system is evaluated by extensive computer simulations.
https://doi.org/10.7840/KICS.2011.36C.6.368

