• Title/Summary/Keyword: Automatic Phone Recognition

Search Result 29, Processing Time 0.024 seconds

Smart Phone Road Signs Recognition Model Using Image Segmentation Algorithm

  • Huang, Ying;Song, Jeong-Young
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.887-890
    • /
    • 2012
  • Image recognition is one of the most important research directions of pattern recognition. Image based road automatic identification technology is widely used in current society, the intelligence has become the trend of the times. This paper studied the image segmentation algorithm theory and its application in road signs recognition system. With the help of image processing technique, respectively, on road signs automatic recognition algorithm of three main parts, namely, image segmentation, character segmentation, image and character recognition, made a systematic study and algorithm. The experimental results show that: the image segmentation algorithm to establish road signs recognition model, can make effective use of smart phone system and application.

  • PDF

Automatic Recognition of Bank Security Card Using Smart Phone (스마트폰을 이용한 은행 보안카드 자동 인식)

  • Kim, Jin-Ho
    • The Journal of the Korea Contents Association
    • /
    • v.16 no.12
    • /
    • pp.19-26
    • /
    • 2016
  • Among the various services for mobile banking, user authentication method using bank security card is still very useful. We can use mobile banking easily and safely in case of saving encoded security codes in smart phone and entering codes automatically whenever user authentication is required without bank security card. In this paper automatic recognition algorithm of security codes of bank security card is proposed in oder to enroll the encoded security codes into smart phone using smart phone camera. Advanced adaptive binarization is used for extracting digit segments from various background image pattern and adaptive 2-dimensional layout analysis method is developed for segmentation and recognition of damaged or touched digits. Experimental results of proposed algorithm using Android and iPhone, show excellent security code recognition results.

Analyzing vowel variation in Korean dialects using phone recognition

  • Jooyoung Lee;Sunhee Kim;Minhwa Chung
    • Phonetics and Speech Sciences
    • /
    • v.15 no.4
    • /
    • pp.101-107
    • /
    • 2023
  • This study aims to propose an automatic method of detecting vowel variation in the Korean dialects of Gyeong-sang and Jeol-la. The method is based on error patterns extracted using phone recognition. Canonical and recognized phone sequences are compared, and statistical analyses distinguish the vowels appearing in both dialects, the dialect-common vowels, and the vowels with high mismatch rates for each dialect. The dialect-common vowels show monophthongization of diphthongs. The vowels unique to the dialects are /we/ to [e] and /ʌ/ to [ɰ] for Gyeong-sang dialect, and /ɰi/ to [ɯ] in Jeol-la dialect. These results corroborate previous dialectology reports regarding phonetic realization of the Korean dialects. The current method provides a possibility of automatic explanation of the dialect patterns.

Verification of Normalized Confidence Measure Using n-Phone Based Statistics

  • Kim, Byoung-Don;Kim, Jin-Young;Na, Seung-You;Choi, Seung-Ho
    • Speech Sciences
    • /
    • v.12 no.1
    • /
    • pp.123-134
    • /
    • 2005
  • Confidence measure (CM) is used for the rejection of mis-recognized words in an automatic speech recognition (ASR) system. Rahim, Lee, Juang and Cho's confidence measure (RLJC-CM) is one of the widely-used CMs [1]. The RLJC-CM is calculated by averaging phone-level CMs. An extension of the RLJC-CM was achieved by Kim et al [2]. They devised the normalized CM (NCM), which is a statistically normalized version of the RLJC-CM by using the tri-phone based CM normalization. In this paper we verify the NCM by generalizing tri-phone to n-phone unit. To apply various units for the normalization, mono-phone, tri-phone, quin-phone and $\infty$-phone are tested. By the experiments in the domain of the isolated word recognition we show that tri-phone based normalization is sufficient enough to enhance the rejection performance of the ASR system. Also we explain the NCM in regard to two class pattern classification problems.

  • PDF

A Phonetics Based Design of PLU Sets for Korean Speech Recognition (한국어 음성인식을 위한 음성학 기반의 유사음소단위 집합 설계)

  • Hong, Hye-Jin;Kim, Sun-Hee;Chung, Min-Hwa
    • MALSORI
    • /
    • no.65
    • /
    • pp.105-124
    • /
    • 2008
  • This paper presents the effects of different phone-like-unit (PLU) sets in order to propose an optimal PLU set for the performance improvement of Korean automatic speech recognition (ASR) systems. The examination of 9 currently used PLU sets indicates that most of them include a selection of allophones without any sufficient phonetic base. In this paper, a total of 34 PLU sets are designed based on Korean phonetic characteristics arid the effects of each PLU set are evaluated through experiments. The results show that the accuracy rate of each phone is influenced by different phonetic constraint(s) which determine(s) the PLU sets, and that an optimal PLU set can be anticipated through the phonetic analysis of the given speech data.

  • PDF

Comparison of Phone Boundary Alignment between Handlabels and Autolabels

  • Jang, Tae-Yeoub;Chung, Hyun-Song
    • Speech Sciences
    • /
    • v.10 no.1
    • /
    • pp.27-39
    • /
    • 2003
  • This study attempts to verify the reliability of automatically generated segment labels as compared to those obtained by conventional labelling by hand. First of all, an autolabeller is constructed using the standard HMM speech recognition technique. For evaluation, we compare the automatically generated labels with manually annotated labels for the same speech data. The comparison is performed by calculating the temporal difference between an autolabel boundary and its corresponding hand label boundary. When the mismatched duration between two labels falls within 10 msec, we consider the autolabel as correct. The results suggest that overall 78% of autolabels are correctly obtained. It is found that the boundary of obstruents is better aligned than that of sonorants and vowels. In case of stop sound classes, strong stops in manner-of-articulation wise and velar stops in place-of-articulation wise show better performance in boundary alignment. The result suggests that more phone-specific consideration is necessary to improve autosegmentation performance.

  • PDF

The Character Recognition System of Mobile Camera Based Image (모바일 이미지 기반의 문자인식 시스템)

  • Park, Young-Hyun;Lee, Hyung-Jin;Baek, Joong-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.5
    • /
    • pp.1677-1684
    • /
    • 2010
  • Recently, due to the development of mobile phone and supply of smart phone, many contents have been developed. Especially, since the small-sized cameras are equiped in mobile devices, people are interested in the image based contents development, and it also becomes important part in their practical use. Among them, the character recognition system can be widely used in the applications such as blind people guidance systems, automatic robot navigation systems, automatic video retrieval and indexing systems, automatic text translation systems. Therefore, this paper proposes a system that is able to extract text area from the natural images captured by smart phone camera. The individual characters are recognized and result is output in voice. Text areas are extracted using Adaboost algorithm and individual characters are recognized using error back propagated neural network.

Vowel Fundamental Frequency in Manner Differentiation of Korean Stops and Affricates

  • Jang, Tae-Yeoub
    • Speech Sciences
    • /
    • v.7 no.1
    • /
    • pp.217-232
    • /
    • 2000
  • In this study, I investigate the role of post-consonantal fundamental frequency (F0) as a cue for automatic distinction of types of Korean stops and affricates. Rather than examining data obtained by restricting contexts to a minimum to prevent the interference of irrelevant factors, a relatively natural speaker independent speech corpus is analysed. Automatic and statistical approaches are adopted to annotate data, to minimise speaker variability, and to evaluate the results. In spite of possible loss of information during those automatic analyses, statistics obtained suggest that vowel F0 is a useful cue for distinguishing manners of articulation of Korean non-continuant obstruents having the same place of articulation, especially of lax and aspirated stops and affricates. On the basis of the statistics, automatic classification is attempted over the relevant consonants in a specific context where the micro-prosodic effects appear to be maximised. The results confirm the usefulness of this effect in application for Korean phone recognition.

  • PDF

Study on the Situational satisfaction survey of Smart Phone based on voice recognition technology (스마트 폰 음성 인식 서비스의 상황별 만족도 조사)

  • Lee, Yoon-jeong;Kim, Seung-In
    • Journal of Digital Convergence
    • /
    • v.15 no.8
    • /
    • pp.351-357
    • /
    • 2017
  • The purpose of this study is to analyze the relationship between users' expectation and satisfaction through analyzing smart phone voice recognition service and surveying the situational satisfaction of voice recognition service. First, the concept and current status of speech recognition service were investigated through literature research, and the questionnaire survey was carried out through questionnaires based on the second and third principles. As a result, the user is most likely to use the smartphone voice recognition service when making a telephone call. Also, it was found that the service is the most used at home and the service is used most when the hand is not available. Through these results, various methods such as personalization service, condition recognition function, automatic emergency recognition, unlocking by voice could be derived. Based on this research, it is expected that it will be used effectively for improving smartphone voice recognition service and developing wearable device in the future.

Modified Phonetic Decision Tree For Continuous Speech Recognition

  • Kim, Sung-Ill;Kitazoe, Tetsuro;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea
    • /
    • v.17 no.4E
    • /
    • pp.11-16
    • /
    • 1998
  • For large vocabulary speech recognition using HMMs, context-dependent subword units have been often employed. However, when context-dependent phone models are used, they result in a system which has too may parameters to train. The problem of too many parameters and too little training data is absolutely crucial in the design of a statistical speech recognizer. Furthermore, when building large vocabulary speech recognition systems, unseen triphone problem is unavoidable. In this paper, we propose the modified phonetic decision tree algorithm for the automatic prediction of unseen triphones which has advantages solving these problems through following two experiments in Japanese contexts. The baseline experimental results show that the modified tree based clustering algorithm is effective for clustering and reducing the number of states without any degradation in performance. The task experimental results show that our proposed algorithm also has the advantage of providing a automatic prediction of unseen triphones.

  • PDF