Search | Korea Science

Guitar Tab Digit Recognition and Play using Prototype based Classification

Baek, Byung-Hyun;Lee, Hyun-Jong;Hwang, Doosung
- Journal of the Korea Society of Computer and Information / Vol. 21, No. 9 / pp. 19-25 / 2016
This paper presents a method for recognizing and playing tab chords from guitar sheet music. The chord area of an input image is segmented by adjusting the image saturation and applying the GrabCut algorithm. Template matching then detects the starting sections of the tab within the segmented area. A virtual block method is introduced to search for blanks over the chord lines and extract tab fret segments, which avoids the computational cost of removing the tab lines. In the experiments, prototype-based classification outperforms the Bayesian method and the nearest neighbor rule trained on the whole training set, and it performs similarly to the support vector machine. The prediction rate is about 99.0%, and the proportion of selected prototypes is below 3.0%.
https://doi.org/10.9708/jksci.2016.21.9.019

Korean Digit Recognition Using Cepstrum coefficients and Frequency Sensitive Competitive Learning (Cepstrum 계수와 Frequency Sensitive Competitive Learning 신경회로망을 이용한 한국어 인식.)

Lee, Su-Hyuk;Cho, Seong-Won;Choi, Gyung-Sam
- Proceedings of the KIEE Conference / KIEE 1994 Fall Conference Proceedings, Society Headquarters / pp. 329-331 / 1994
In this paper, we present a speaker-dependent Korean isolated digit recognition system. In the preprocessing step, LPC cepstral coefficients are extracted from the speech signal and used as the input to a Frequency Sensitive Competitive Learning (FSCL) neural network. Postprocessing is based on the winning-neuron histogram. Experimental results indicate that commercial auto-dial telephones are feasible.
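
The FSCL step in the abstract above can be illustrated with a minimal sketch. It assumes the commonly described form of frequency-sensitive competitive learning, in which each unit's distance to the input is scaled by how often that unit has already won. The names `fscl_train` and `winner_histogram`, the learning rate, the number of units, and the toy two-dimensional frames are all hypothetical and are not taken from the paper.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fscl_train(frames, n_units, epochs=10, lr=0.1, seed=0):
    """Train codebook vectors with frequency-sensitive competition."""
    rng = random.Random(seed)
    weights = [list(rng.choice(frames)) for _ in range(n_units)]
    wins = [1] * n_units  # start at 1 so the scaled distance is never zeroed out
    for _ in range(epochs):
        for x in frames:
            # Winner minimises (win count * distance), penalising frequent winners.
            k = min(range(n_units), key=lambda i: wins[i] * sq_dist(x, weights[i]))
            weights[k] = [w + lr * (xi - w) for w, xi in zip(weights[k], x)]
            wins[k] += 1
    return weights, wins

def winner_histogram(frames, weights):
    """Count how often each unit is the nearest one over the frames of an utterance."""
    hist = [0] * len(weights)
    for x in frames:
        k = min(range(len(weights)), key=lambda i: sq_dist(x, weights[i]))
        hist[k] += 1
    return hist

# Toy "cepstral" frames (assumed data, two loose clusters).
frames = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
weights, wins = fscl_train(frames, n_units=2)
hist = winner_histogram(frames, weights)
```

How the histogram is then mapped to a digit label is not stated in the abstract, so the sketch stops at producing it.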

A study on the recognition performance of connected digit telephone speech for MFCC feature parameters obtained from the filter bank adapted to training speech database (훈련음성 데이터에 적응시킨 필터뱅크 기반의 MFCC 특징파라미터를 이용한 전화음성 연속숫자음의 인식성능 향상에 관한 연구)

Jung Sung Yun;Kim Min Sung;Son Jong Mok;Bae Keun Sung;Kang Jeom Ja
- Proceedings of the KSPS conference / KSPS May 2003 Conference Proceedings / pp. 119-122 / 2003
In general, triangular filters are used in the filter bank when MFCCs are computed from the spectrum of a speech signal. In [1], a new feature extraction approach is proposed that uses filter shapes obtained from the spectrum of the training speech data. In this approach, principal component analysis is applied to the spectrum of the training data to obtain the filter coefficients. In this paper, we carry out speech recognition experiments using the approach of [1] on a large amount of telephone speech data, namely the Korean connected-digit telephone speech database released by SITEC. Experimental results are discussed along with our findings.

Performance Comparison of Machine Learning Algorithms for TAB Digit Recognition (타브 숫자 인식을 위한 기계 학습 알고리즘의 성능 비교)

Heo, Jaehyeok;Lee, Hyunjung;Hwang, Doosung
- KIPS Transactions on Software and Data Engineering / Vol. 8, No. 1 / pp. 19-26 / 2019
In this paper, the classification performance of learning algorithms is compared for TAB digit recognition. The TAB digits segmented from TAB musical notes contain TAB lines and musical symbols. A labeling method and a non-linear filter are designed and applied to extract only the fret digits. Shift operations in four directions are applied to generate more data. The selected models are the Bayesian classifier, support vector machine, prototype-based learning, multi-layer perceptron, and convolutional neural network. The results show that the mean accuracy of the Bayesian classifier is about 85.0%, while that of the others exceeds 99.0%. In addition, the convolutional neural network outperforms the others in terms of generalization and data preprocessing steps.
https://doi.org/10.3745/KTSDE.2019.8.1.19
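
The four-direction shift used for data generation in the abstract above can be sketched as follows. The abstract does not give the shift size or the padding value, so this is only an illustration: the one-pixel step, the zero padding, and the function names `shift` and `augment_four_directions` are assumptions.

```python
def shift(image, dy, dx, fill=0):
    """Return a copy of a 2-D list shifted by (dy, dx); vacated cells get `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out

def augment_four_directions(image, step=1):
    """Generate up, down, left and right shifted copies of a digit image."""
    offsets = [(-step, 0), (step, 0), (0, -step), (0, step)]
    return [shift(image, dy, dx) for dy, dx in offsets]

# Toy 3x3 "digit" with a single foreground pixel in the centre (assumed data).
digit = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
up, down, left, right = augment_four_directions(digit)
```

Each input image yields four extra samples under these assumptions, so adding them to the originals gives a set five times the original size.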

The Optimal and Complete Prompts Lists Generation Algorithm for Connected Spoken Word Speech Corpus (연결 단어 음성 인식기 학습용 음성DB 녹음을 위한 최적의 대본 작성 알고리즘)

유하진
- The Journal of the Acoustical Society of Korea / Vol. 23, No. 2 / pp. 187-191 / 2004
This paper describes an efficient algorithm for generating compact and complete prompt lists for a connected spoken word speech corpus. To build a connected spoken digit recognizer, speech data must be acquired in various contexts. However, in many speech databases the lists are produced by random generators. We provide an efficient algorithm that generates compact and complete lists of digits in various contexts. The paper includes a proof of the optimality and completeness of the algorithm.

An Implementation of User Identification System Using Hybrid Biometric Distances (복합 생체 척도 거리를 이용한 사용자 인증시스템의 구현)

주동현;김두영
- Journal of the Institute of Convergence Signal Processing / Vol. 3, No. 2 / pp. 23-29 / 2002
In this paper, we propose a user identification system that uses hybrid biometric information and a non-contact IC card to improve accuracy. The hybrid biometric information consists of the user's face image, iris image, and 4-digit voice password. The non-contact IC card provides the user's base information. If the distance between the sample hybrid biometric information corresponding to the user's base information and the measured biometric information is less than a given threshold, the identification is accepted; otherwise it is rejected. Experimental results show that the proposed method achieves a better identification rate than the conventional identification method.

A study on the connected-digit recognition using MLP-VQ and Weighted DHMM (MLP-VQ와 가중 DHMM을 이용한 연결 숫자음 인식에 관한 연구)

Chung, Kwang-Woo;Hong, Kwang-Seok
- Journal of the Korean Institute of Telematics and Electronics S / Vol. 35S, No. 8 / pp. 96-105 / 1998
This paper proposes a weighted DHMM (WDHMM) that uses MLP-VQ to improve a speaker-independent connected-digit recognition system. The output distribution of an MLP neural network is a probability distribution that expresses the degree of similarity between patterns through a non-linear mapping between the input patterns and the learned patterns. MLP-VQ, proposed in this paper, generates codewords from the index of the output node with the highest value in the MLP output distribution. Unlike conventional VQ, MLP-VQ allows the degree of similarity between the current input pattern and each learned class pattern to be reflected in the recognition model. WDHMM is also proposed. It uses the MLP output distribution to weight the symbol generation probability of the DHMMs. The proposed method shortens the time required for HMM parameter estimation and recognition because, unlike the conventional SCHMM, it does not need to model the symbol generation probability as a multi-dimensional normal distribution. It also improves recognition by 14.7% over the DHMM at the cost of a small increase in computation, because it reflects phone class relations in the recognition model. The results show that the speaker-independent connected-digit recognition rate using MLP-VQ and WDHMM is 84.22%.
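
The two ideas in the abstract above can be illustrated with a small sketch. The codeword selection follows the abstract directly (the index of the highest MLP output). The abstract does not give the weighting formula used in WDHMM, so `weighted_emission` shows only one plausible form, a sum of discrete emission probabilities weighted by the MLP outputs, and should be read as an assumption. All names and numbers are hypothetical.

```python
def mlp_vq_codeword(mlp_output):
    """MLP-VQ: the codeword is the index of the largest MLP output."""
    return max(range(len(mlp_output)), key=lambda k: mlp_output[k])

def weighted_emission(mlp_output, emission_row):
    """Assumed weighting: sum over symbols of MLP output times the state's emission probability."""
    return sum(p * b for p, b in zip(mlp_output, emission_row))

# Assumed MLP output distribution over three phone classes for one frame.
mlp_output = [0.1, 0.7, 0.2]
# Assumed discrete emission probabilities b_j(k) of one HMM state.
emission_row = [0.5, 0.25, 0.25]

codeword = mlp_vq_codeword(mlp_output)
score = weighted_emission(mlp_output, emission_row)
```

A plain DHMM would use only `emission_row[codeword]`. The weighted form lets the other classes contribute in proportion to their MLP outputs.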

Speech Recognition in Noisy Environments using Wiener Filtering (Wiener Filtering을 이용한 잡음환경에서의 음성인식)

Kim, Jin-Young;Eom, Ki-Wan;Choi, Hong-Sub
- Speech Sciences / Vol. 1 / pp. 277-283 / 1997
In this paper, we present a robust recognition algorithm based on Wiener filtering as a research tool for developing a Korean speech recognition system. We apply Wiener filtering in the cepstrum domain, because the frequency-domain method is computationally expensive and complex. The effectiveness of the method is evaluated on speaker-independent isolated Korean digit recognition tasks using discrete HMM speech recognition systems. In these tasks, we use a 12th-order weighted cepstrum as the feature vector and add computer-simulated white Gaussian noise at different levels to clean speech signals for recognition experiments under noisy conditions. Experimental results show that the presented algorithm improves recognition by as much as 5% to 20% compared with the spectral subtraction method.
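
The noisy test conditions described in the abstract above, white Gaussian noise added to clean speech at different levels, can be sketched as follows. The abstract does not state the noise levels or how they were set, so specifying the level as a signal-to-noise ratio in dB, the function name `add_white_noise`, and the synthetic sine "speech" are all assumptions.

```python
import math
import random

def add_white_noise(signal, snr_db, rng=None):
    """Add zero-mean white Gaussian noise so the result has roughly the requested SNR (dB)."""
    rng = rng or random.Random(0)
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, computed from the clean signal and the added noise."""
    noise = [n - c for c, n in zip(clean, noisy)]
    p_signal = sum(c * c for c in clean) / len(clean)
    p_noise = sum(e * e for e in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)

# Synthetic stand-in for a clean speech signal (assumed data).
clean = [math.sin(2 * math.pi * 5 * n / 200) for n in range(2000)]
noisy = add_white_noise(clean, snr_db=10)
snr = measured_snr_db(clean, noisy)
```

The measured SNR differs slightly from the target because the noise is a finite random sample.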

ON IMPROVING THE PERFORMANCE OF CODED SPECTRAL PARAMETERS FOR SPEECH RECOGNITION

Choi, Seung-Ho;Kim, Hong-Kook;Lee, Hwang-Soo
- Proceedings of the Acoustical Society of Korea Conference / Acoustical Society of Korea, 15th Workshop on Speech Communication and Signal Processing, 1998 (KSCSP 98, Vol. 15, No. 1) / pp. 250-253 / 1998
In digital communication networks, speech recognition systems conventionally reconstruct speech and then extract feature parameters. In this paper, we consider an approach that incorporates speech coding parameters directly into the speech recognizer. Most speech coders employed in these networks represent spectral parameters as line spectral pairs (LSPs). To improve the recognition performance of the LSP-based speech recognizer, we introduce two methods: one devises weighted distance measures for LSPs, and the other transforms LSPs into a new feature set called the pseudo-cepstrum (PCEP). Experiments on speaker-independent connected-digit recognition show that the weighted distance measures significantly improve recognition accuracy compared with the unweighted LSP distance. Using PCEP improves performance further. Compared with conventional methods employing mel-frequency cepstral coefficients, the proposed methods achieve higher recognition accuracy.

Segmentation and Recognition Methods for Touching Handwritten Digit String (접촉된 숫자열의 분할 및 인식 기법)

송성일;김황수
- Proceedings of the Korean Information Science Society Conference / Korean Information Science Society, Proceedings of the 2002 Fall Conference, Vol. 29, No. 2 (2) / pp. 481-483 / 2002
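This paper introduces segmentation and recognition methods for unconstrained offline handwritten digit strings that contain touching digits. The system consists of a module that extracts touching components from a digit string, a module that segments the touching digits, and a module that combines the final segmentation results. The methods are applied to NIST data to demonstrate the effectiveness of the proposed segmentation and recognition approach.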

Search results: 202 items (processing time: 0.027 s)
