Search | Korea Science

Lightweight Deep Learning Model of Optical Character Recognition for Laundry Management (세탁물 관리를 위한 문자인식 딥러닝 모델 경량화)

Im, Seung-Jin;Lee, Sang-Hyeop;Park, Jang-Sik
- Journal of the Korean Society of Industry Convergence / v.25 no.6_3 / pp.1285-1291 / 2022
In this paper, we propose a lightweight deep learning model that recognizes laundry management codes from input images in a low-cost, low-power embedded environment. Laundry franchise companies mainly use barcode recognition-based systems to record consignee information and laundry information for laundry collection management. Conventional barcode-based laundry collection management systems incur barcode printing costs, and because barcodes become damaged and contaminated, entire barcode books must be reprinted at a cost of 1 billion won annually, a cost that needs to be reduced. This is also difficult to do. Recognition performance is improved by applying a 7-layer VGG model, a reduced variant of the VGGNet model, for number recognition. In the number recognition experiment on service parts drawings, the proposed method achieved an F1-score of 0.95, a significant improvement over the conventional method.
https://doi.org/10.21289/KSIC.2022.25.6.1285

Performance Improvement of Continuous Digits Speech Recognition using the Transformed Successive State Splitting and Demi-syllable pair (반음절쌍과 변형된 연쇄 상태 분할을 이용한 연속 숫자음 인식의 성능 향상)

Kim Dong-Ok;Park No-Jin
- Journal of the Korea Institute of Information and Communication Engineering / v.9 no.8 / pp.1625-1631 / 2005
This paper describes an optimization of the language model and the acoustic model that improves speech recognition of Korean unit digits. Recognition errors of the language model are decreased by analyzing the grammatical features of Korean unit digits, and the model is then composed of FSN nodes with disyllables. The acoustic model uses demi-syllable pairs to decrease recognition errors caused by inaccurate division of phones and syllables, which results from monosyllables, short pronunciation, and articulation. We used the k-means clustering algorithm with transformed successive state splitting at the feature level to model the features of the recognition unit efficiently. In experiments, the proposed language model raised the recognition rate by 10.5%. The acoustic model with demi-syllable pairs increased the recognition rate by 12.5%, and transformed successive state splitting improved it by 1.5%.

A Noise Robust Speech Recognition Method Using Model Compensation Based on Speech Enhancement (음성 개선 기반의 모델 보상 기법을 이용한 강인한 잡음 음성 인식)

Shen, Guang-Hu;Jung, Ho-Youl;Chung, Hyun-Yeol
- The Journal of the Acoustical Society of Korea / v.27 no.4 / pp.191-199 / 2008
In this paper, we propose an MWF-PMC noise processing method for speech recognition in noisy environments. It enhances the input speech with Mel-warped Wiener Filtering (MWF) at the pre-processing stage and compensates the recognition model with PMC (Parallel Model Combination) at the post-processing stage. The PMC uses the residual noise extracted from the silence regions of the enhanced speech to compensate the clean speech model, and the method is therefore expected to improve speech recognition performance in noisy environments. For the recognition experiments, we down-sampled the KLE PBW (Phoneme Balanced Words) 452-word speech data to 8 kHz and created noisy speech at 5 SNR levels, i.e., 0 dB, 5 dB, 10 dB, 15 dB and 20 dB, by adding Subway, Car and Exhibition noise to clean speech. The recognition results confirmed the effectiveness of the proposed MWF-PMC method, which showed improved overall recognition performance compared with the existing combined methods.
https://doi.org/10.7776/ASK.2008.27.4.191

Martial Arts Moves Recognition Method Based on Visual Image

Husheng, Zhou
- Journal of Information Processing Systems / v.18 no.6 / pp.813-821 / 2022
Visual image technology is becoming increasingly sophisticated and plays a significant role in fields such as intelligent monitoring, life entertainment, and medical rehabilitation. Recognizing Wushu, or martial arts, movements through the use of visual image technology helps promote and develop Wushu. In order to segment and extract the signals of Wushu movements, this study analyzes the denoising of the original data using the wavelet transform and provides a sliding window data segmentation technique. The Wushu movement recognition model is built based on the hidden Markov model (HMM). The HMM is trained with the Baum-Welch algorithm, which is then enhanced using the frequency weighted training approach and the mean training method. To identify the dynamic Wushu movement, the Viterbi algorithm is used to determine the probability of the optimal state sequence for each Wushu movement model. In light of the foregoing, an HMM-based martial arts movement recognition model is developed. The recognition accuracy of the HMM model increases to 99.60% when the number of samples is 4,000, which is greater than the accuracy of the SVM (by 0.94%), the CNN (by 1.12%), and the BP (by 1.14%). These results suggest that the proposed system for recognizing martial arts movements is reliable and effective, and that it may contribute to the development of martial arts.
https://doi.org/10.3745/JIPS.02.0188

A Study on the Multilingual Speech Recognition using International Phonetic Language (IPA를 활용한 다국어 음성 인식에 관한 연구)

Kim, Suk-Dong;Kim, Woo-Sung;Woo, In-Sung
- Journal of the Korea Academia-Industrial cooperation Society / v.12 no.7 / pp.3267-3274 / 2011
Recently, speech recognition technology has developed dramatically with the growth of user environments on various mobile devices and the influence of a variety of speech recognition software. However, in multi-language speech recognition, a lack of understanding of multi-language lexical models and limited system capacity hinder improvement of the recognition rate. It is not easy to represent speech in multiple languages with a single acoustic model, and systems that use several acoustic models have lower recognition rates. It is therefore necessary to research and develop a multi-language speech recognition system that represents speech in various languages with a single acoustic model. Building on research into using a multi-language acoustic model on mobile devices, this paper studies a system that recognizes Korean and English using the International Phonetic Language (IPA). Focusing on finding an IPA model that covers both Korean and English phonemes, we obtained a speech recognition rate of 94.8% in Korean and 95.36% in English.
https://doi.org/10.5762/KAIS.2011.12.7.3267

Recognition of Conducting Motion using HMM (HMM을 이용한 지휘 동작의 인식)

문형득;구자영
- Journal of the Korea Society of Computer and Information / v.9 no.1 / pp.25-30 / 2004
In this paper, a method for recognizing beats from a sequence of images of a person conducting is proposed. Hand position is detected using color discrimination and symbolized by quantization, so that a motion of the conductor is represented as a sequence of symbols. An HMM (Hidden Markov Model), which is well suited to recognizing sequence patterns with some level of variation, is used to recognize the symbol sequence as the motion for a beat.

Implementation of the Auditory Sense for the Smart Robot: Speaker/Speech Recognition (로봇 시스템에의 적용을 위한 음성 및 화자인식 알고리즘)

Jo, Hyun;Kim, Gyeong-Ho;Park, Young-Jin
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.05a / pp.1074-1079 / 2007
We introduce a speech/speaker recognition algorithm for isolated words. In speaker verification, a Gaussian Mixture Model (GMM) is generally used to model the feature vectors of reference speech signals. On the other hand, a Dynamic Time Warping (DTW) based template matching technique was proposed for isolated word recognition several years ago. We combine these two concepts in a single method and implement it in a real-time speaker/speech recognition system. The proposed method guarantees that a small number of reference utterances (5 or 6 training repetitions) is enough to build a reference model that achieves 90% recognition performance.

A study on the speech recognition by HMM based on multi-observation sequence (다중 관측열을 토대로한 HMM에 의한 음성 인식에 관한 연구)

정의봉
- Journal of the Korean Institute of Telematics and Electronics S / v.34S no.4 / pp.57-65 / 1997
The purpose of this paper is to propose an HMM (hidden Markov model) based on multi-observation sequences for isolated word recognition. The proposed model generates the MSVQ codebook by dividing each word into several sections and then dividing the training data into several sections. Then, the sequential values of the multi-observation for each section are obtained by weighting the distance vectors from lower values to higher ones. Thereafter, the sequence with a high probability value is used during recognition. 146 DDD area names are selected as the target recognition vocabulary, and 10 LPC cepstrum coefficients are used as the feature parameters. In addition to the speech recognition experiments with the proposed model, experiments with DP, MSVQ, and general HMM were carried out for comparison, using the same data under the same conditions. The experimental results show that the HMM based on multi-observation sequences proposed in this paper is superior to the methods using DP, MSVQ, and general HMM in both recognition rate and recognition time.

Implementation of Hidden Markov Model based Speech Recognition System for Teaching Autonomous Mobile Robot (자율이동로봇의 명령 교시를 위한 HMM 기반 음성인식시스템의 구현)

조현수;박민규;이민철
- 제어로봇시스템학회:학술대회논문집 / 2000.10a / pp.281-281 / 2000
This paper presents an implementation of a speech recognition system for teaching an autonomous mobile robot. Using human speech as the teaching method provides a more convenient user interface for the mobile robot. To make teaching the mobile robot easy, this study investigates an autonomous mobile robot with a speech recognition function. In the speech recognition system, a speech recognition algorithm using an HMM (Hidden Markov Model) is presented to recognize Korean words. A filter-bank analysis model is used as the spectral analysis method for feature extraction. A recognized word is converted to a command for controlling robot navigation.

Efficient Continuous Vocabulary Clustering Modeling for Tying Model Recognition Performance Improvement (공유모델 인식 성능 향상을 위한 효율적인 연속 어휘 군집화 모델링)

Ahn, Chan-Shik;Oh, Sang-Yeob
- Journal of the Korea Society of Computer and Information / v.15 no.1 / pp.177-183 / 2010
In a continuous vocabulary recognition system based on statistical methods, vocabulary recognition is performed using probability distributions, and modeling uses phoneme clustering to estimate probability parameters from samples. During vocabulary search, undefined and inserted phonemes in the estimated probability parameters lower the recognition rate, and a Gaussian model built from a single clustering has the drawback that its accuracy cannot be ensured. To improve on this, we propose a mixed clustering modeling approach in which the Gaussian mixture model of the probability distribution is optimized using similarity measures based on the Euclidean and Bhattacharyya distances, and the system searches for phoneme probability models within the clustered models. The system achieved a vocabulary-dependent recognition rate of 98.63% and a vocabulary-independent recognition rate of 97.91%.
https://doi.org/10.9708/jksci.2010.15.1.177

Search Result 3,415, Processing Time 0.029 seconds
