Search | Korea Science

Dimensionality Reduction in Speech Recognition by Principal Component Analysis (음성인식에서 주 성분 분석에 의한 차원 저감)

Lee, Chang-Young
- The Journal of the Korea Institute of Electronic Communication Sciences, v.8 no.9, pp.1299-1305, 2013
In this paper, we investigate a method of reducing the computational cost of speech recognition through dimensionality reduction of MFCC feature vectors. Eigendecomposition of the covariance of the feature vectors yields a linear transformation that orders the vector components by variance. The first component has the largest variance and hence is the most important one for pattern classification. We therefore consider reducing the computational cost without degrading recognition performance by excluding the least-variance components. Experimental results show that the number of MFCC components can be reduced by about half without a significant adverse effect on the recognition error rate.
https://doi.org/10.13067/JKIECS.2013.8.9.1299
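The variance-ordered projection described in this abstract can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function name `pca_reduce`, the synthetic 13-dimensional frames standing in for MFCC vectors, and the choice to keep 6 components are all assumptions made for the example.

```python
import numpy as np


def pca_reduce(features, keep):
    """Project rows of `features` onto the `keep` highest-variance principal axes.

    Returns the reduced data and all eigenvalues (variances) in descending order.
    """
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)      # sample covariance (ddof=1)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    reduced = centered @ eigvecs[:, :keep]    # drop the least-variance components
    return reduced, eigvals


# Synthetic stand-in for 13-dimensional MFCC frames (hypothetical data, not speech).
rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 13)) * np.linspace(5.0, 0.5, 13)
frames = frames @ rng.normal(size=(13, 13))   # mix axes so raw components are correlated

reduced, variances = pca_reduce(frames, keep=6)
print(reduced.shape)                          # (500, 6)
print(variances[:6].sum() / variances.sum())  # fraction of total variance retained
```

Whether roughly half of the components suffices for a real recognizer depends on the data and the classifier; the retained-variance fraction printed here is only a rough proxy for that decision.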

An Effective Error-Concealment Approach for Video Data Transmission over Internet (인터넷상의 비디오 데이타 전송에 효과적인 오류 은닉 기법)

김진옥
- Journal of KIISE: Computing Practices and Letters, v.8 no.6, pp.736-745, 2002
In network delivery of compressed video, packets may be lost if the channel is unreliable, as on the Internet. Such losses tend to occur in bursts, like continuous bit-stream errors. In this paper, we propose an effective error-concealment approach that applies error-resilient video encoding against burst errors and reduces the complexity of error concealment at the decoder using data hiding. To improve the performance of error concealment, a temporally and spatially error-resilient video encoding approach is developed at the encoder to be robust against burst errors. For spatial error concealment, a block shuffling scheme is introduced to isolate erroneous blocks caused by packet losses. For temporal error concealment, we embed parity bits in the content data for motion vectors between intra frames or continuous inter frames and use them to recover lost packets at the decoder after transmission. When error concealment is performed on erroneous blocks of video data at the decoder, interpolating an erroneous video block from neighboring information is computationally costly. Therefore, a set of features is extracted at the encoder and embedded imperceptibly into the original media. If some part of the media data is damaged during transmission, the embedded features can be extracted and used to recover the lost data with bi-directional interpolation. The use of data hiding reduces complexity at the decoder. Experimental results suggest that our approach can achieve reasonable quality for packet loss of up to 30% over a wide range of video materials.

Channel Estimation for Block-Based Distributed Video Coding (블록 기반의 분산 비디오 코딩을 위한 채널 예측 기법)

Min, Kyung-Yeon;Park, Sea-Nae;Yoo, Sung-Eun;Sim, Dong-Gyu;Jeon, Byeung-Woo
- Journal of the Institute of Electronics Engineers of Korea SP, v.48 no.2, pp.53-64, 2011
In this paper, we propose a channel estimation method for side information, based on received motion vectors, for distributed video coding. The proposed decoder estimates the motion vectors of the side information and transmits them to the encoder. Because the proposed encoder uses the received motion vectors to generate side information identical to that at the decoder, it can assess the accuracy of the decoder's side information and transmit the result to the decoder. The proposed decoder can then estimate an accurate crossover probability from the received error information. Because the proposed method conducts correct belief propagation, the computational complexity of the channel decoder decreases and error correction capability improves significantly with fewer parity bits. Experimental results show that the proposed algorithm achieves better rate-distortion performance and is faster than several conventional distributed video coding methods.

Machine Learning-based Quality Control and Error Correction Using Homogeneous Temporal Data Collected by IoT Sensors (IoT센서로 수집된 균질 시간 데이터를 이용한 기계학습 기반의 품질관리 및 데이터 보정)

Kim, Hye-Jin;Lee, Hyeon Soo;Choi, Byung Jin;Kim, Yong-Hyuk
- Journal of the Korea Convergence Society, v.10 no.4, pp.17-23, 2019
In this paper, quality control (QC) is applied to each meteorological element, such as temperature, of weather data collected from seven IoT sensors. In addition, we propose a method for estimating the data regarded as errors by means of machine learning. The collected meteorological data were linearly interpolated based on the basic QC results, and then machine learning-based QC was performed. Support vector regression, decision table, and multilayer perceptron were used as machine learning techniques. We confirmed that the mean absolute error (MAE) of the machine learning models with basic QC is 21% lower than that of models without basic QC. In addition, compared with the other machine learning methods, the support vector regression model had an MAE 24% lower than that of the multilayer neural network and 58% lower than that of the decision table on average.
https://doi.org/10.15207/JKCS.2019.10.4.017
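The step of linearly interpolating samples flagged by a basic QC pass might look like the sketch below. The abstract does not say which checks make up the basic QC, so the simple range check, the function names, the thresholds, and the temperature readings are all hypothetical.

```python
def range_check(values, low, high):
    """Hypothetical basic QC: flag samples outside a physically plausible range."""
    return [not (low <= v <= high) for v in values]


def interpolate_flagged(values, is_error):
    """Replace flagged samples by linear interpolation between the nearest valid neighbours.

    Flagged samples at either end of the series are filled with the nearest valid value.
    """
    valid = [i for i, bad in enumerate(is_error) if not bad]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    result = list(values)
    for i, bad in enumerate(is_error):
        if not bad:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            result[i] = values[right]
        elif right is None:
            result[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            result[i] = values[left] + weight * (values[right] - values[left])
    return result


# Hypothetical temperature series in degrees Celsius with three implausible readings.
temps = [20.0, 20.5, 99.9, 99.9, 22.0, -80.0, 23.0]
flags = range_check(temps, low=-40.0, high=50.0)
cleaned = interpolate_flagged(temps, flags)
print(flags)
print(cleaned)
```

In the workflow the abstract describes, a series cleaned this way would then be passed to the machine learning models; that stage is not shown here.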

The Postprocessor of Automatic Segmentation for Synthesis Unit Generation (합성단위 자동생성을 위한 자동 음소 분할기 후처리에 대한 연구)

박은영;김상훈;정재호
- The Journal of the Acoustical Society of Korea, v.17 no.7, pp.50-56, 1998
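This paper studies postprocessing to compensate for phoneme boundary errors of an automatic phoneme segmenter. It is part of research motivated by the need for large amounts of phoneme-level segmented and phoneme-labeled data in phonetic and linguistic research for speech synthesis, prosody modeling, and automatic synthesis-unit generation. In particular, manual segmentation and labeling is hard to keep consistent and takes a long time, so automatic segmentation technology is becoming more important. This paper therefore proposes a postprocessor that can reduce the error range of automatically segmented boundaries, and describes how the automatic segmentation results can then be used directly as synthesis units and how the approach is useful for building large prosody databases for synthesis. The proposed postprocessor trains a multi-layer perceptron (MLP) on the feature vectors of manually adjusted data, and then extracts new phoneme boundaries using the MLP together with the automatic segmentation results of the spoken language translation system developed at ETRI (Electronics and Telecommunication Research Institute). On a synthesis database of isolated-word utterances, the segmentation results corrected by the postprocessor showed about a 25% improvement over the segmentation rate of the spoken language translation system, and the absolute error (|Hand label position - Auto label position|) improved by about 39%. This shows that an MLP-based postprocessor can reduce the range of automatic segmentation errors and can be used for building large prosody databases for synthesis and for automatic generation of synthesis units.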

Edge-Directional Joint Disparity-Motion Estimation of Stereoscopic Sequences (경계 방향성을 고려한 스테레오 동영상의 움직임-변이 동시추정 기법)

김용태;서형갑;박창섭;이재호;손광훈
- Journal of Broadcast Engineering, v.9 no.3, pp.196-206, 2004
This paper presents an efficient joint disparity-motion estimation algorithm for a stereo sequence CODEC. Disparity vectors are estimated from the left and right motion vectors and the previous disparity vectors for every frame. In order to obtain more accurate disparity vectors, we include a spatial prediction process after the joint estimation. From joint estimation and spatial prediction, we can obtain accurate disparity vectors and thereby increase coding efficiency. Finally, we propose backward quadtree decomposition, which helps the encoder obtain a more detailed disparity vector map without transmitting additional coding bits for quadtree information. We confirmed the superior performance of the proposed method through computer simulation.

Segmentation of Continuous Speech based on PCA of Feature Vectors (주요고유성분분석을 이용한 연속음성의 세그멘테이션)

신옥근
- The Journal of the Acoustical Society of Korea, v.19 no.2, pp.40-45, 2000
In speech corpus generation and speech recognition, it is sometimes necessary to segment the input speech data without any prior knowledge. One method for this kind of segmentation, often called blind segmentation or acoustic segmentation, is to find boundaries that minimize the Euclidean distances among the feature vectors of each segment. However, this metric alone is prone to errors because of fluctuations or variations of the feature vectors within a segment. In this paper, we introduce principal component analysis to take the trend of the feature vectors into consideration, so that the proposed distance measure is the distance between feature vectors and their projected points on the principal components. The proposed distance measure is applied in the LBDP (level building dynamic programming) algorithm in a continuous speech segmentation experiment. The results were promising, with a 3-6% reduction in deletion rate compared to the pure Euclidean measure.
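The idea of measuring distance to a projection on the principal components, rather than to the segment mean, can be illustrated with a small sketch. It assumes only the first principal axis is used and leaves out the LBDP search entirely; the abstract does not say how many components the paper projects onto, and the function name and toy data are hypothetical.

```python
import numpy as np


def projection_distance(segment, vector):
    """Distance from `vector` to its projection on the first principal axis of `segment`.

    `segment` is an (n_frames, n_dims) array of feature vectors; the axis passes
    through the segment mean.
    """
    mean = segment.mean(axis=0)
    centered = segment - mean
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    offset = vector - mean
    projected = mean + (offset @ axis) * axis
    return float(np.linalg.norm(vector - projected))


# Toy segment whose feature vectors drift steadily along one direction.
segment = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

on_trend = np.array([4.0, 8.0])    # continues the drift
off_trend = np.array([3.5, 2.0])   # leaves the drift direction

print(projection_distance(segment, on_trend))    # close to 0
print(projection_distance(segment, off_trend))   # clearly positive
print(np.linalg.norm(on_trend - segment.mean(axis=0)))  # plain Euclidean distance to the mean
```

The frame that continues the trend is far from the segment mean in plain Euclidean terms but close to the principal axis, which is the kind of within-segment variation the abstract says the Euclidean measure handles poorly.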

Learning and Performance Comparison of Multi-class Classification Problems based on Support Vector Machine (지지벡터기계를 이용한 다중 분류 문제의 학습과 성능 비교)

Hwang, Doo-Sung
- Journal of Korea Multimedia Society, v.11 no.7, pp.1035-1042, 2008
The support vector machine, as a binary classifier, has been shown in various experiments to surpass other classifiers only in binary classification problems. Even though its theory is based on the maximal margin classifier, the support vector machine approach cannot be easily extended to multi-class classification problems. In this paper, we review techniques for extending the support vector machine to multi-class classification and compare their performance. Depending on how the training data are decomposed, the support vector machine is easily adapted to a multi-class classification problem without modifying the intrinsic characteristics of the binary classifier. The performance is evaluated on a collection of benchmark data sets and compared according to the selected learning strategies, the training time, and the results of a neural network with backpropagation learning. The experiments suggest that the support vector machine is applicable and effective in general multi-class classification problems when compared to the results of the neural network.
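Decomposing the training data so that an unmodified binary classifier can handle several classes is commonly done with one-vs-rest or one-vs-one splits. The sketch below shows only the decomposition step, with no SVM training; the abstract does not name the strategies the paper reviews, so both functions and the toy data are assumptions for illustration.

```python
from itertools import combinations


def one_vs_rest(labels):
    """One binary labelling per class: +1 for that class, -1 for all other classes."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def one_vs_one(samples, labels):
    """One binary problem per class pair, keeping only the samples of those two classes."""
    classes = sorted(set(labels))
    problems = {}
    for a, b in combinations(classes, 2):
        problems[(a, b)] = [
            (x, 1 if y == a else -1)
            for x, y in zip(samples, labels)
            if y in (a, b)
        ]
    return problems


# Hypothetical three-class toy data set.
samples = [(0.1, 0.2), (0.9, 0.8), (0.5, 0.1), (0.2, 0.1)]
labels = ["a", "b", "c", "a"]

ovr = one_vs_rest(labels)
ovo = one_vs_one(samples, labels)
print(ovr["a"])          # [1, -1, -1, 1]
print(len(ovo))          # 3 pairwise problems for 3 classes
print(ovo[("a", "b")])   # only the samples labelled "a" or "b"
```

With k classes, one-vs-rest trains k binary classifiers on all of the data, while one-vs-one trains k(k-1)/2 classifiers on smaller subsets, which is one reason training time can differ between strategies.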

Developing a New Algorithm for Conversational Agent to Detect Recognition Error and Neologism Meaning: Utilizing Korean Syllable-based Word Similarity (대화형 에이전트 인식오류 및 신조어 탐지를 위한 알고리즘 개발: 한글 음절 분리 기반의 단어 유사도 활용)

Jung-Won Lee;Il Im
- Journal of Intelligence and Information Systems, v.29 no.3, pp.267-286, 2023
Conversational agents such as AI speakers use voice conversation for human-computer interaction. Voice recognition errors often occur in conversational situations. Recognition errors in user utterance records can be categorized into two types. The first type is misrecognition errors, where the agent fails to recognize the user's speech entirely. The second type is misinterpretation errors, where the user's speech is recognized and services are provided, but the interpretation differs from the user's intention. Of these, misinterpretation errors require separate error detection because they are recorded as successful service interactions. In this study, various text separation methods were applied to detect misinterpretation. For each of these text separation methods, the similarity of consecutive speech pairs was calculated using word embedding and document embedding techniques, which convert words and documents into vectors. This approach goes beyond simple word-based similarity calculation to explore a new method for detecting misinterpretation errors. The research method involved using real user utterance records to train and develop a detection model by applying patterns of misinterpretation error causes. The results revealed that the most significant analysis result was obtained through initial consonant extraction for detecting misinterpretation errors caused by the use of unregistered neologisms. Through comparison with other separation methods, different error types could be observed. This study has two main implications. First, for misinterpretation errors that are difficult to detect due to lack of recognition, the study proposed diverse text separation methods and found a novel method that improved performance remarkably. Second, if this is applied to conversational agents or voice recognition services requiring neologism detection, patterns of errors occurring from the voice recognition stage can be specified. The study proposed and verified that even if not categorized as errors, services can be provided according to user-desired results.
https://doi.org/10.13088/jiis.2023.29.3.267
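Initial consonant extraction from Hangul can be sketched with Unicode arithmetic alone, since precomposed syllables are laid out from U+AC00 in blocks of 588 per initial consonant. The paper compares utterances with word and document embeddings; the `difflib` ratio below is only a simple stand-in similarity, and the function names and example words are assumptions for this illustration.

```python
from difflib import SequenceMatcher

# The 19 initial consonants (choseong) in Unicode order.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
HANGUL_BASE = 0xAC00        # first precomposed syllable, "가"
SYLLABLE_COUNT = 11172      # 19 initials * 21 medials * 28 finals
PER_INITIAL = 588           # 21 medials * 28 finals


def initial_consonants(text):
    """Replace each precomposed Hangul syllable with its initial consonant.

    Characters outside the Hangul syllable block are kept unchanged.
    """
    out = []
    for ch in text:
        index = ord(ch) - HANGUL_BASE
        if 0 <= index < SYLLABLE_COUNT:
            out.append(CHOSEONG[index // PER_INITIAL])
        else:
            out.append(ch)
    return "".join(out)


def consonant_similarity(a, b):
    """Stand-in similarity between two utterances based on their initial consonants."""
    return SequenceMatcher(None, initial_consonants(a), initial_consonants(b)).ratio()


print(initial_consonants("신조어"))            # ㅅㅈㅇ
print(initial_consonants("한글"))              # ㅎㄱ
print(consonant_similarity("한글", "학교"))    # 1.0, both reduce to ㅎㄱ
```

Reducing words to initial consonants makes differently spelled but similar-sounding utterances comparable, which may help explain why this separation method stood out for unregistered neologisms in the study's results.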

A Motion Vector Recovery Method based on Optical Flow for Temporal Error Concealment in the H.264 Standard (H.264에서 에러은닉을 위한 Optical Flow기반의 움직임벡터 복원 기법)

Kim, Dong-Hyung;Jeong, Je-Chang
- The Journal of Korean Institute of Communications and Information Sciences, v.31 no.2C, pp.148-155, 2006
To improve coding efficiency, the H.264 standard uses new coding tools that were not used in previous coding standards. Among these tools, motion estimation with smaller block sizes leads to higher correlation between the motion vectors of neighboring blocks. This characteristic of H.264 is useful for motion vector recovery. In this paper, we propose a motion vector recovery method based on optical flow. Because the proposed method estimates the optical flow velocity vector from a more accurate initial value and limits the optical flow region to a 16$\times$16 block, it reduces the computational complexity of estimating the optical flow velocity. Simulation results show that the proposed method gives higher objective and subjective video quality than previous methods.

Search Result 215, Processing Time 0.026 seconds
