Search | Korea Science

A Study on the Removal of Unusual Feature Vectors in Speech Recognition (음성인식에서 특이 특징벡터의 제거에 대한 연구)

Lee, Chang-Young
- The Journal of the Korea institute of electronic communication sciences, v.8 no.4, pp.561-567, 2013
Some feature vectors used in speech recognition are rare and unusual. Such patterns cause the parameters of the recognition system to overfit and thereby introduce structural risk that degrades recognition performance. In this paper, to remove these unusual patterns, we exclude vectors whose norms exceed a specified cutoff value before training the speech recognition system. The objective is to exclude as many unusual feature vectors as possible without significantly degrading the recognition error rate. To this end, we introduce a cutoff parameter and investigate its effect on speaker-independent recognition of isolated words using FVQ (Fuzzy Vector Quantization)/HMM (Hidden Markov Model). Experimental results show that roughly 3% to 6% of the feature vectors may be considered unusual and can therefore be excluded without degrading recognition accuracy.
https://doi.org/10.13067/JKIECS.2013.8.4.561
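
The norm-based exclusion described in this abstract can be illustrated with a minimal sketch. The abstract does not state which norm is used or how the cutoff is chosen, so the Euclidean norm, the function name `filter_by_norm`, the toy data, and the cutoff value below are all assumptions for illustration only, not the author's implementation.

```python
import math

def filter_by_norm(vectors, cutoff):
    """Split feature vectors into (kept, excluded) by Euclidean norm.

    A vector is excluded when its norm is strictly larger than `cutoff`.
    The choice of the Euclidean norm is an assumption.
    """
    kept, excluded = [], []
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v))
        (excluded if norm > cutoff else kept).append(v)
    return kept, excluded

# Hypothetical toy data: three ordinary vectors and one with a large norm.
features = [[0.5, 0.2], [0.1, -0.3], [0.4, 0.4], [3.0, 4.0]]
kept, excluded = filter_by_norm(features, cutoff=2.0)

# Fraction of vectors treated as unusual for this cutoff.
excluded_ratio = len(excluded) / len(features)
```

In a setting like the one the abstract describes, the cutoff would presumably be tuned so that the excluded fraction stays small while the recognition error rate is monitored; only the `kept` vectors would then be passed on to training.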

Video Watermarking Using Human Visual System and Wavelet Transform (인간 시각 시스템 및 웨이블릿 변환을 이용한 비디오 워터마킹)

권성근;김병주;김태수;이석환;권기룡;이건일
- Journal of Korea Multimedia Society, v.6 no.3, pp.436-443, 2003
A digital video watermarking algorithm is proposed that uses the human visual system (HVS) and the discrete wavelet transform (DWT). In this algorithm, each video frame is decomposed into four levels by the DWT, which reflects the characteristics of the human eye, and the watermark is embedded into the DWT coefficients using the HVS. For robustness, the lowest-level subbands, which represent the highest-frequency components, are excluded from the embedding step, and the watermark is embedded into the perceptually significant coefficients (PSCs) of the remaining subbands. PSCs of the baseband are selected according to coefficient amplitude, and PSCs of the high-frequency subbands are selected by successive subband quantization (SSQ). For invisibility and robustness, the watermark is embedded into the PSCs of the baseband and of the high-frequency subbands according to Weber's law and the spatial masking effect, respectively. We compared the performance of the proposed algorithm with that of a conventional watermarking algorithm by computer simulation. Experimental results show that the proposed algorithm achieves better invisibility and robustness than the conventional algorithm.

Analysis of the Effect on the Quantization of the Network's Outputs in the Neural Processor by the Implementation of Hybrid VLSI (하이브리드 VLSI 신경망 프로세서에서의 양자화에 따른 영향 분석)

Kwon, Oh-Jun;Kim, Seong-Woo;Lee, Jong-Min
- The KIPS Transactions:PartB, v.9B no.4, pp.429-436, 2002
To apply an artificial neural network to practical applications, it must be implemented in hardware. Among the possible technologies, hybrid VLSI is the most promising. When a trained network is implemented on hybrid neuro-chips, its neuron outputs and weights must be quantized. Unfortunately, this process distorts the network's outputs relative to the originally trained outputs. In this paper, we analyze in detail the statistical characteristics of this distortion. The analysis implies that, to reduce the distortion of the network's outputs, the network should be trained with normalized input patterns and driven toward a solution with small weights. To investigate the effectiveness of these findings, we performed an experiment on a time series prediction application. The experiment showed that a network trained by our method has smaller distortion than a regular network.
https://doi.org/10.3745/KIPSTB.2002.9B.4.429
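
The link between small weights and reduced output distortion can be seen in a single deterministic toy case. This sketch is not the paper's statistical analysis: it quantizes only the neuron outputs (not the weights), uses a plain uniform quantizer, and all names and values (`quantize`, `output_distortion`, the step size, the weight vectors) are hypothetical.

```python
def quantize(value, step):
    """Round a value to the nearest multiple of `step` (uniform quantizer)."""
    return round(value / step) * step

def weighted_sum(inputs, weights):
    """Pre-activation of one downstream neuron."""
    return sum(x * w for x, w in zip(inputs, weights))

def output_distortion(hidden, weights, step):
    """Absolute change in the weighted sum when hidden outputs are quantized."""
    exact = weighted_sum(hidden, weights)
    quantized = weighted_sum([quantize(h, step) for h in hidden], weights)
    return abs(exact - quantized)

# Hypothetical hidden-layer outputs and two weight vectors that differ
# only in scale (the second is one tenth of the first).
hidden = [0.13, 0.58, 0.91]
large_weights = [4.0, -3.0, 5.0]
small_weights = [0.4, -0.3, 0.5]
step = 0.25

d_large = output_distortion(hidden, large_weights, step)
d_small = output_distortion(hidden, small_weights, step)
```

Because the quantization error of each hidden output is multiplied by the outgoing weight, scaling the weights down by a factor of ten scales the distortion of the weighted sum down by the same factor in this example.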

Highly Reliable Digital Image Watermarking Based on HVS and DWT (HVS 및 DWT 기반의 고신뢰 디지털 영상 워터마킹)

권성근;권기구;하인성;권기룡;이건일
- The Journal of Korean Institute of Communications and Information Sciences, v.26 no.12A, pp.2100-2108, 2001
A digital image watermarking algorithm is proposed that uses the human visual system (HVS) and the discrete wavelet transform (DWT). In this algorithm, an image is decomposed into four levels by the DWT, which reflects the characteristics of the human eye, and the watermark is embedded into the DWT coefficients using the HVS. For robustness, the lowest-level subbands, which represent the highest-frequency components, are excluded from the embedding step, and the watermark is embedded into the perceptually significant coefficients (PSCs) of the remaining subbands. PSCs of the baseband are selected according to coefficient amplitude, and PSCs of the high-frequency subbands are selected by successive subband quantization (SSQ). For invisibility and robustness, the watermark is embedded into the PSCs of the baseband and of the high-frequency subbands according to Weber's law and the spatial masking effect, respectively. We compared the performance of the proposed algorithm with that of a conventional watermarking algorithm by computer simulation. Experimental results show that the proposed algorithm achieves better invisibility and robustness than the conventional algorithm.

Perceptual and Adaptive Quantization of Line Spectral Frequency Parameters (선 스펙트럼 주파수의 청각 적응 부호화)

한우진;김은경;오영환
- The Journal of the Acoustical Society of Korea, v.19 no.8, pp.68-77, 2000
Line spectral frequency (LSF) parameters have been widely used in low bit-rate speech coding because they represent the short-time speech spectrum efficiently. In this paper, a new distance measure based on the masking properties of the human ear is proposed for quantizing LSF parameters, whereas most conventional quantization methods are based on the weighted Euclidean distance measure. The proposed method derives the perceptual distance measure from the definition of the noise-to-mask ratio (NMR), which corresponds closely to the distortion actually perceived by the human ear, and uses it for quantizing LSF parameters. In addition, we propose an adaptive bit allocation scheme that reduces the average bit rate by allocating the minimum number of bits to the LSF parameters that maintains the perceptual transparency of a given speech frame. For performance evaluation, we report the ratio of perceptually transparent frames and the corresponding average bit rates for the conventional and proposed methods. By combining the proposed distance measure with the adaptive bit allocation scheme, the proposed system requires only 770 bps to obtain 95.5% perceptually transparent frames, whereas the conventional systems achieve 89.9% even at 1800 bps.

A Steganography Method Improving Image Quality and Minimizing Image Degradation (영상의 화질 개선과 열화측정 시간을 최소화하는 스테가노그라피 방법)

Choi, YongSoo;Kim, JangHwan
- Journal of Digital Contents Society, v.17 no.5, pp.433-439, 2016
In this paper, we propose an optimized steganography method that reduces the image degradation caused by existing data hiding techniques. The method operates in the compressed (JPEG) domain of an image. Most current information hiding methods change the coefficients to hide information, and several methods have tried to improve the performance of typical steganography methods such as F5, which includes matrix encoding. Those studies succeeded in reducing the distortion generated when data are hidden in the coefficients of the compressed domain. In this paper, we analyze the effect of the quantization table on data hiding in the compressed domain. As a result, we found that the distortion that occurs when steganography techniques are applied can be decreased. The proposed method provides slightly improved image quality (at most approximately 6.5%) for data hiding in the compressed domain. The developed algorithm can also help improve data hiding performance for compressed images other than JPEG.
https://doi.org/10.9728/dcs.2016.17.5.433

A Proposal on IT Based Method of Substantiation and Quantization for Pronunciation Accuracy Improvement Methods (IT 기술을 적용한 발음의 정확성 향상 방법들의 효용성 입증 및 정량화 방법 제안)

Kim, Bong-Hyun;Cho, Dong-Uk
- The Journal of Korean Institute of Communications and Information Sciences, v.36 no.8B, pp.979-985, 2011
One of the most important assets in modern society, which emphasizes NQ (Network Quotient), is communication skill. It is therefore necessary to express one's own ideas accurately and to maximize one's communication competence. To this end, efforts to improve pronunciation accuracy, such as pronunciation stretching practice and vocal cord reflex point acupressure therapy, have been made to improve communication competence in daily life. However, no objective and empirical method has yet been studied to verify whether these methods actually improve pronunciation accuracy. We therefore propose an IT-based method of substantiation and quantization for such pronunciation accuracy improvement methods. In this paper, voice analysis is performed on voice data samples from 30 males in their 20s, recorded before and after pronunciation stretching practice and vocal cord reflex point acupressure.
https://doi.org/10.7840/KICS.2011.36B.8.979

A new watermark for copyright protection of digital images (디지털 영상의 저작권 보호를 위한 새로운 서명 문양)

서정일;우석훈;원치선
- The Journal of Korean Institute of Communications and Information Sciences, v.22 no.8, pp.1814-1822, 1997
In this paper, we present a new digital signature for copyright protection of digital images. The proposed algorithm is designed to be more robust against both compression (quantization) errors and illegal signature attacks by a third party. More specifically, to maximize the watermarking effect, we embed the watermark by randomly adding or subtracting a fixed number instead of performing XOR operations. Also, to improve the reliability of watermark detection, we extract the watermark only from image blocks that are less sensitive to compression error. Furthermore, unrecovered compression errors are detected by hypothesis testing. Illegal signature attacks by a third party are also guarded against by probabilistic decisions based on the MSE between the original image and the signed image. Experimental results show that the proposed algorithm is more robust against quantization errors and illegal signature attacks by a third party.
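
The embedding step named in this abstract, randomly adding or subtracting a fixed number, might look roughly like the sketch below. The abstract gives no details on the random pattern, block selection, or hypothesis test, so the keyed pseudo-random sign pattern, the non-blind correlation score, and every name and value here (`sign_pattern`, `embed`, `correlation`, `key`, `delta`) are assumptions for illustration and not the authors' algorithm.

```python
import random

def sign_pattern(key, n):
    """Keyed pseudo-random sequence of +1/-1 values (assumed design)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(pixels, key, delta):
    """Add or subtract the fixed number `delta` at each pixel, clipped to 0..255."""
    signs = sign_pattern(key, len(pixels))
    return [min(255, max(0, p + s * delta)) for p, s in zip(pixels, signs)]

def correlation(original, signed, key):
    """Mean of sign * (signed - original); equals delta when nothing was clipped.

    This non-blind check needs the original image and is an assumption,
    not the detection procedure of the paper.
    """
    signs = sign_pattern(key, len(original))
    total = sum(s * (w - p) for s, w, p in zip(signs, signed, original))
    return total / len(original)

# Hypothetical 8-pixel grayscale strip, far enough from 0 and 255 to avoid clipping.
pixels = [100, 120, 130, 90, 60, 200, 180, 75]
signed = embed(pixels, key=42, delta=2)
score = correlation(pixels, signed, key=42)
```

With the correct key, every pixel difference lines up with its sign, so the score equals the embedded `delta`. A mismatched key would generally give a different sign pattern and a lower score, which is the kind of gap a statistical decision rule could exploit.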

The Effect of FIR Filtering and Spectral Tilt on Speech Recognition with MFCC (FIR 필터링과 스펙트럼 기울이기가 MFCC를 사용하는 음성인식에 미치는 효과)

Lee, Chang-Young
- The Journal of the Korea institute of electronic communication sciences, v.5 no.4, pp.363-371, 2010
In an effort to enhance the quality of feature vector classification and thereby reduce the recognition error rate in speaker-independent speech recognition, we study the effect of applying a spectral tilt to the Fourier magnitude spectrum en route to the extraction of MFCC. The effect of FIR filtering of the speech signal on speech recognition is also investigated in parallel. The proposed methods are evaluated in two independent ways: the Fisher discriminant objective function and a speech recognition test using a hidden Markov model with fuzzy vector quantization. In the experiments, an appropriate choice of the tilt factor yields about a 10% relative improvement in the recognition error rate over the conventional method.

Fault Diagnosis of a Rotating Blade using HMM/ANN Hybrid Model (HMM/ANN복합 모델을 이용한 회전 블레이드의 결함 진단)

Kim, Jong Su;Yoo, Hong Hee
- Transactions of the Korean Society for Noise and Vibration Engineering, v.23 no.9, pp.814-822, 2013
Pattern recognition methods have been used frequently in recent research on the fault diagnosis of mechanical systems. The hidden Markov model (HMM) and the artificial neural network (ANN) are typical examples of the pattern recognition methods employed for this purpose. In this paper, a hybrid method that combines HMM and ANN for the fault diagnosis of a mechanical system is introduced. A rotating blade of the kind used in wind turbines serves as the diagnosis target. Using the HMM/ANN hybrid model along with a numerical model of the rotating blade, the presence of a crack as well as its location and depth are identified. The effects of signal-to-noise ratio, crack location, and crack size on the success rate of the identification are also investigated.
https://doi.org/10.5050/KSNVE.2013.23.9.814

Search Result 128, Processing Time 0.028 seconds