Search | Korea Science

Estimation and Weighting of Sub-band Reliability for Multi-band Speech Recognition (다중대역 음성인식을 위한 부대역 신뢰도의 추정 및 가중)

조훈영;지상문;오영환
The Journal of the Acoustical Society of Korea, v.21 no.6, pp.552-558, 2002
Multi-band speech recognition, based on Fletcher's human speech recognition (HSR) model, has recently been studied intensively. As an automatic speech recognition (ASR) technique, it splits the frequency domain into several sub-bands and recognizes each sub-band independently. The sub-band likelihood scores are weighted according to sub-band reliability and recombined to make a final decision. This approach is known to be robust in noisy environments. When the noise is stationary, the sub-band SNR can be estimated from noise information in non-speech intervals. When the noise is non-stationary, however, obtaining the sub-band SNR is not feasible. This paper proposes inverse sub-band distance (ISD) weighting, in which a distance for each sub-band is calculated by stochastic matching of the input feature vectors against hidden Markov models, and the inverse of that distance is used as the sub-band weight. Experiments with 1500-1800 Hz band-limited white noise and classical guitar sound showed that the proposed method represents sub-band reliability effectively and improves performance under both stationary and non-stationary band-limited noise.
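The weighting step described in this abstract can be illustrated with a minimal sketch. It assumes the per-band distances are already available (the stochastic matching against the HMMs is not reproduced here), that the weights are normalized to sum to one, and that the sub-band scores are log-likelihoods combined by a weighted sum. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def isd_weights(distances, eps=1e-12):
    """Turn per-band distances into normalized inverse-distance weights."""
    inverse = [1.0 / (d + eps) for d in distances]  # eps guards against d == 0
    total = sum(inverse)
    return [v / total for v in inverse]  # normalization is an assumption


def combine_scores(band_log_likelihoods, weights):
    """Weighted sum of sub-band log-likelihoods for each candidate word."""
    n_words = len(band_log_likelihoods[0])
    return [
        sum(w * band[i] for w, band in zip(weights, band_log_likelihoods))
        for i in range(n_words)
    ]


def recognize(band_log_likelihoods, distances):
    """Return the index of the word with the highest combined score."""
    scores = combine_scores(band_log_likelihoods, isd_weights(distances))
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical numbers: 3 sub-bands, 2 candidate words.
# The third band is corrupted by noise: large distance, misleading scores.
band_log_likelihoods = [[-10.0, -12.0], [-11.0, -13.0], [-30.0, -5.0]]
distances = [1.0, 1.0, 10.0]

weights = isd_weights(distances)
best_word = recognize(band_log_likelihoods, distances)
```

With these made-up values, equal weights would let the corrupted band dominate and select word 1, whereas the inverse-distance weights shrink its influence and word 0 is selected.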

Optimization of Mutual Information for Multiresolution Image Registration (다해상도 영상정합을 위한 상호정보 최적화)

Hong, Helen;Kim, Myoung-Hee
Journal of the Korea Computer Graphics Society, v.7 no.1, pp.37-49, 2001
We propose an optimization of mutual information for multiresolution image registration, so that complementary information from multi-modality images can be presented in an integrated form. The method uses mutual information as the cost function to measure the statistical dependency, or information redundancy, between the intensities of corresponding pixels in the two images, which is assumed to be maximal when the images are geometrically aligned. In the experiments, we validate accuracy by visual inspection and robustness by varying the initial conditions and adding additive noise. Because the method works on the native images rather than on previously extracted features, little user interaction is required to perform the registration. In addition, the use of non-parametric density estimation and stochastic multiresolution optimization leads to robust density estimation and convergence.
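As a rough illustration of the cost function, the sketch below estimates mutual information from a joint intensity histogram of corresponding pixels. The histogram estimator is a simplification chosen here for brevity and is not necessarily the non-parametric estimator used in the paper. The function name, the bin count, and the 8-bit intensity range are assumptions.

```python
import math
from collections import Counter


def mutual_information(img_a, img_b, bins=8, max_value=256):
    """Estimate mutual information (in nats) between two images.

    img_a and img_b are flat lists of intensities in [0, max_value),
    where element i of each list refers to the same pixel position.
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    width = max_value / bins
    qa = [min(int(v / width), bins - 1) for v in img_a]  # quantize to bins
    qb = [min(int(v / width), bins - 1) for v in img_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram of intensity pairs
    marg_a = Counter(qa)          # marginal histograms
    marg_b = Counter(qb)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((marg_a[a] / n) * (marg_b[b] / n)))
    return mi


# Toy 4-pixel example: the same image compared with itself, and with an
# unrelated arrangement of the same intensities.
reference = [0, 0, 255, 255]
mi_aligned = mutual_information(reference, [0, 0, 255, 255])
mi_misaligned = mutual_information(reference, [0, 255, 0, 255])
```

In this toy case the aligned pair yields a higher value than the unrelated pair, which is the property a registration optimizer would exploit when searching over transformations.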

Content-based Video Retrieval for Illegal Copying Contents Detection using Hashing (Hashing을 이용한 불법 복제 콘텐츠 검출을 위한 내용 기반 영상 검색)

Son, Heusu;Byun, Sung-Woo;Lee, Soek-Pil
The Transactions of The Korean Institute of Electrical Engineers, v.67 no.10, pp.1358-1363, 2018
As Internet usage grows and digital media become more diversified, digital content has become much easier to distribute and share, which makes it easier to access the desired content. At the same time, there is an increasing need to protect the copyright of digital works. Several methods for protecting ownership are in common use, but each has disadvantages. Among them, watermarking methods have the advantage of being invisible, but they are vulnerable to external attacks such as noise and signal processing. In this paper, we propose a method for detecting illegal content that is robust against such attacks. We extract HSV and LBP features from images and use Euclidean-based hashing techniques to shorten the search time over high-dimensional, near-duplicate videos. In the experiments, the proposed method showed higher detection rates than watermarking techniques on images with fabrications or deformations.
https://doi.org/10.5370/KIEE.2018.67.10.1358

A Performance Analysis of the SIFT Matching on Simulated Geospatial Image Differences (공간 영상 처리를 위한 SIFT 매칭 기법의 성능 분석)

Oh, Jae-Hong;Lee, Hyo-Seong
Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, v.29 no.5, pp.449-457, 2011
Multi-temporal and multi-sensor geospatial image applications increasingly require automated image processing, which makes an automated, highly invariant image matching technique a critical ingredient. Such images are likely to differ geometrically and spectrally because of differences in sensor, acquisition geometry, season, and weather. Among the many image matching techniques, SIFT (Scale Invariant Feature Transform) is popular because it is recognized as very robust to diverse imaging conditions, and it therefore has high potential for geospatial image processing. This paper presents performance test results for SIFT on geospatial imagery, obtained by simulating image differences in shear, scale, rotation, intensity, noise, and spectral content. Because a geospatial image application often requires a number of good matching points across the images, the number of matching points was analyzed together with their positional accuracy. The test results show that SIFT is highly invariant but cannot overcome significant image differences. In addition, it does not guarantee outlier-free matching, so the use of an outlier removal technique such as RANSAC (RANdom SAmple Consensus) is highly recommended.
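The RANSAC recommendation can be illustrated with a minimal sketch. It assumes the simplest possible geometric model, a pure 2D translation between the two images, so that a single match is enough to form a hypothesis. Real geospatial imagery would typically need an affine or projective model. The function name, the threshold, and the match coordinates are hypothetical.

```python
import math
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Reject outlier matches assuming a pure translation between images.

    matches is a list of ((x1, y1), (x2, y2)) point pairs.
    Returns the refined (dx, dy) shift and the list of inlier matches.
    """
    if not matches:
        return None, []
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Hypothesize a shift from one randomly sampled match.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Collect the matches that agree with this shift within the threshold.
        inliers = [
            m for m in matches
            if math.hypot(m[1][0] - m[0][0] - dx, m[1][1] - m[0][1] - dy)
            <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the shift as the mean displacement over the consensus set.
    n = len(best_inliers)
    dx = sum(m[1][0] - m[0][0] for m in best_inliers) / n
    dy = sum(m[1][1] - m[0][1] for m in best_inliers) / n
    return (dx, dy), best_inliers


# Hypothetical matches: eight consistent with a (5, -3) shift, two outliers.
points = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (20, 5), (5, 20), (15, 15)]
matches = [((x, y), (x + 5, y - 3)) for x, y in points]
matches += [((1, 1), (40, 40)), ((2, 8), (-30, 12))]

shift, inliers = ransac_translation(matches)
```

The two inconsistent matches fall outside the consensus set and are discarded, leaving the eight matches that agree on the shift.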
https://doi.org/10.7848/ksgpc.2011.29.5.449

Early warning of hazard for pipelines by acoustic recognition using principal component analysis and one-class support vector machines

Wan, Chunfeng;Mita, Akira
Smart Structures and Systems, v.6 no.4, pp.405-421, 2010
This paper proposes a method for early warning of hazards to pipelines. Many pipelines transport dangerous contents, so any damage might lead to catastrophic consequences. Most such damage results from third-party activities nearby, mainly construction work. To prevent accidents and disasters, detecting potential hazards from third-party activities is indispensable. This paper focuses on recognizing running construction machines, because they indicate construction activity. Acoustic information is used for the recognition, and a novel pipeline monitoring approach is proposed. Principal Component Analysis (PCA) is applied, and the obtained eigenvalues are regarded as a signature and used to build feature vectors. A one-class Support Vector Machine (SVM) is used as the classifier. The denoising ability of PCA makes the method robust to noise interference, while the classification ability of the SVM provides good recognition results. Related issues such as standardization are also studied and discussed. On-site experiments were conducted, and the results demonstrate the effectiveness of the proposed early warning method. Possible hazards can thus be prevented and the integrity of pipelines ensured.
https://doi.org/10.12989/sss.2010.6.4.405

Extraction of the License Plate Region Using HoG and AdaBoost (HoG와 AdaBoost를 이용한 번호판 영역 추출)

Lew, Sheen;Yi, Cui-Sheng;Lee, Wan-Joo;Lee, Byeong-Rae;Min, Kyoung-Won;Kang, Hyun-Chul
Journal of Digital Contents Society, v.10 no.4, pp.597-604, 2009
To improve a license plate recognition system, correct extraction of the license plate region is as important as character recognition. In this paper, we analyze and classify the error patterns that occur during plate region extraction and improve the extraction using HoG (histogram of gradient) features and AdaBoost. The results show that the HoG feature is robust to noise and to various plate types, and is also very effective at extracting regions that previously failed to be extracted.

Detection Algorithm of Lenslet Array Spot Pattern for Acquisition of Laser Wavefront (레이저 파면 획득용 Lenslet Array 점 패턴 검출 알고리즘)

Lee, Jae-Il;Lee, Young-Cheol;Huh, Joon
Journal of the Korea Institute of Military Science and Technology, v.8 no.4 s.23, pp.110-119, 2005
In this paper, a new detection algorithm is proposed for finding the positions of the lenslet array spot pattern used to acquire a laser wavefront. Based on an analysis of the required signal processing characteristics, we categorized and designed four main signal processing functions. The proposed algorithm was designed to be robust against variation in the geometrical form of the spot and was implemented with a semi-automatic thresholding capability based on CCD noise analysis. For performance evaluation, we made qualitative and quantitative comparisons with Carvalho's recently published algorithm. On the given experimental spot images, the proposed algorithm could detect spots whose S/N is one third of the lowest S/N that Carvalho's algorithm can detect, and it reached a detection precision of 0.1 pixel at that S/N. Functionally, the proposed algorithm could separate all valid spots locally. These results indicate that the proposed algorithm achieves superior spot location precision over a wider S/N range.

A Robust Content-Based Image Retrieval Technique for Distorted Query Image (변형된 질의 영상에 강한 내용 기반 영상 검색 기법)

김익재;이제호;권용무;박상희
Journal of Broadcast Engineering, v.2 no.1, pp.74-83, 1997
We propose a composite feature measure that combines the color and shape features of an image for image retrieval. We improve retrieval performance through efficient color quantization using the Lloyd-Max quantizer and through a histogram matrix matching method that considers the spatial correlation of quantized color groups. We also supplement the color information with shape information using the Improved Moment Invariants. We tested our technique on an image database consisting of 200 actual trademark images. Our experimental results showed that our approach improved performance compared to the previous method under various conditions, such as rotated, translated, noise-added, and gamma-corrected images. The efficiency of retrieval is found to be very high and experimental results are

Recognition Performance Improvement of Unsupervised Limabeam Algorithm using Post Filtering Technique

Nguyen, Dinh Cuong;Choi, Suk-Nam;Chung, Hyun-Yeol
IEMEK Journal of Embedded Systems and Applications, v.8 no.4, pp.185-194, 2013
In distant-talking environments, speech recognition performance degrades significantly due to noise and reverberation. Recent work by Michael L. Seltzer shows that in microphone array speech recognition, the word error rate can be significantly reduced by adapting the beamformer weights to generate a sequence of features that maximizes the likelihood of the correct hypothesis. This approach is called the Likelihood Maximizing Beamforming (Limabeam) algorithm, and one way to implement it is Unsupervised Limabeam (USL), which can improve recognition performance in any environment. Our investigation of USL showed that, because the optimization depends strongly on the transcription output of the first recognition step, the output becomes unstable, which may lower performance. To improve the recognition performance of USL, post-filtering techniques can be employed to obtain a more accurate transcription from the first step. In this work, as a post-filtering technique for the first recognition step of USL, we propose adding a Wiener filter combined with the Feature Weighted Mahalanobis Distance. We also suggest an alternative way to implement the Limabeam algorithm efficiently for a Hidden Markov Network (HM-Net) speech recognizer. Speech recognition experiments in a real distant-talking environment confirm the efficacy of the Limabeam algorithm in the HM-Net speech recognition system and the improved performance of the proposed method.
https://doi.org/10.14372/IEMEK.2013.8.4.185

Vector Quantization based Speech Recognition Performance Improvement using Maximum Log Likelihood in Gaussian Distribution (가우시안 분포에서 Maximum Log Likelihood를 이용한 벡터 양자화 기반 음성 인식 성능 향상)

Chung, Kyungyong;Oh, SangYeob
Journal of Digital Convergence, v.16 no.11, pp.335-340, 2018
Commercial speech recognition systems with high recognition accuracy use a learning model trained on speaker-dependent isolated data. However, their recognition performance decreases depending on the quantity of data in noisy environments. In this paper, we propose a vector quantization based method for improving speech recognition performance using the maximum log likelihood in a Gaussian distribution. The proposed method configures a learning model that increases recognition accuracy for similar speech by combining vector quantization and maximum log likelihood with a speech feature extraction method. Speech features are extracted based on the hidden Markov model. The method can improve the accuracy of inaccurate speech models produced by an existing system, so that a robust model for speech recognition can be constructed. The proposed method shows improved recognition accuracy in a speech recognition system.
https://doi.org/10.14400/JDC.2018.16.11.335
