• Title/Summary/Keyword: Automatic Phone Recognition

Search Results: 29

Model based Stress Decision Method (모델 기반의 강세 판정 방법)

  • Kim, Woo-Il;Koh, Hoon;Ko, Han-Seok
    • Speech Sciences, v.7 no.4, pp.49-57, 2000
  • This paper proposes an effective decision method for evaluating the 'stress position'. Conventional methods usually extract acoustic parameters and compare them to references on an absolute scale, which produces unstable results as testing conditions change. To cope with this environmental dependency, the proposed method is model-based and determines the stressed interval by relative comparison over candidates. The stressed/unstressed models are induced from normal phone models by adaptive training. The experimental results indicate that the proposed method is promising and useful for automatic detection of stress positions, and that generating the stressed/unstressed models by adaptive training is effective.
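The relative-comparison idea described in the abstract can be sketched as follows; the interval names, log-likelihood values, and the simple margin criterion are illustrative assumptions, not details taken from the paper:

```python
# Hedged sketch: score each candidate interval with stressed and
# unstressed models and pick the interval whose stressed-vs-unstressed
# log-likelihood margin is largest, instead of thresholding absolute
# scores against a fixed reference (all numbers are toy values).

def pick_stressed(candidates, stressed_ll, unstressed_ll):
    """candidates: interval ids; *_ll: log-likelihood per interval."""
    margin = {c: stressed_ll[c] - unstressed_ll[c] for c in candidates}
    return max(candidates, key=lambda c: margin[c])

# three candidate syllable intervals with toy model scores
cands = ["syl1", "syl2", "syl3"]
s_ll = {"syl1": -10.0, "syl2": -7.5, "syl3": -9.0}
u_ll = {"syl1": -9.0, "syl2": -9.5, "syl3": -8.5}
print(pick_stressed(cands, s_ll, u_ll))  # 'syl2' (largest margin)
```

Because only the relative margins between the two adapted models matter, a uniform shift in all scores (e.g., from a channel change) leaves the decision unchanged, which is the claimed robustness.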


Automatic Error Correction System for Erroneous SMS Strings (SMS 변형된 문자열의 자동 오류 교정 시스템)

  • Kang, Seung-Shik;Chang, Du-Seong
    • Journal of KIISE:Software and Applications, v.35 no.6, pp.386-391, 2008
  • Spoken-word errors that violate grammatical or writing rules occur frequently in communication environments such as mobile phones and messengers. These unexpected errors cause problems for language processing systems in many applications such as speech recognition, text-to-speech conversion, and so on. In this paper, we propose and implement an automatic correction system for ill-formed words and word-spacing errors in SMS sentences, which have been the major source of poor accuracy. We experimented with three methods of constructing the word-correction dictionary and evaluated their results: (1) manual construction of error words from the vocabulary list of ill-formed communication language, (2) automatic construction of the error dictionary from a manually constructed corpus, and (3) a context-dependent method of automatic error-dictionary construction.
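The dictionary-based correction step common to all three construction methods can be sketched as below; the two toy entries and the function names are illustrative assumptions, not entries from the paper's dictionary:

```python
# Sketch of dictionary-based correction of ill-formed SMS words:
# each token found in the error dictionary is replaced by its
# normalized form, all other tokens pass through unchanged.

ERROR_DICT = {  # ill-formed form -> normalized form (toy Korean examples)
    "방가": "반가워",   # clipped greeting
    "ㄱㅅ": "감사",     # consonant-only abbreviation
}

def correct(tokens):
    """Replace each token found in the error dictionary."""
    return [ERROR_DICT.get(tok, tok) for tok in tokens]

print(correct(["방가", "친구야"]))  # ['반가워', '친구야']
```

The three methods in the paper differ only in how `ERROR_DICT` is populated (manually, from a corpus, or context-dependently), not in this lookup step.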

Study on the Improvement of Speech Recognizer by Using Time Scale Modification (시간축 변환을 이용한 음성 인식기의 성능 향상에 관한 연구)

  • Lee, Ki-Seung
    • The Journal of the Acoustical Society of Korea, v.23 no.6, pp.462-472, 2004
  • In this paper, a method is proposed for compensating for the performance degradation of automatic speech recognition (ASR) that is mainly caused by speaking-rate variation. Before the new method is introduced, a quantitative analysis of the performance of an HMM-based ASR system according to speaking rate is performed. This analysis shows significant performance degradation for rapidly spoken speech signals. A quantitative measure that represents speaking rate is then introduced. Time scale modification (TSM) is employed to compensate for the speaking-rate difference between input speech signals and training speech signals. Finally, a method for compensating for the degradation caused by speaking-rate variation is proposed, in which TSM is applied selectively according to speaking rate. ASR experiments on 10-digit mobile phone numbers confirm that the error rate was reduced by 15.5% when the proposed method was applied to high-speaking-rate speech signals.
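The selective-TSM decision described in the abstract can be sketched as follows; the rate measure (phones per second), the thresholds, and the names are assumptions for illustration, not the paper's actual values:

```python
# Sketch of the selective TSM idea: estimate speaking rate and,
# only when it exceeds a threshold, time-stretch the input toward
# the training-set rate before recognition (hypothetical numbers).

TRAIN_RATE = 12.0       # assumed average phones/sec of the training corpus
FAST_THRESHOLD = 14.0   # assumed cutoff for "rapid" speech

def speaking_rate(num_phones, duration_sec):
    """Phones per second as a simple speaking-rate measure."""
    return num_phones / duration_sec

def tsm_factor(rate):
    """Stretch factor handed to a TSM algorithm (e.g., SOLA/WSOLA).
    Factor > 1 slows the utterance toward the training rate; TSM is
    skipped entirely for normal- and slow-rate speech."""
    if rate <= FAST_THRESHOLD:
        return 1.0          # selective: leave normal speech untouched
    return rate / TRAIN_RATE

print(tsm_factor(speaking_rate(60, 5.0)))   # 12 phones/s -> 1.0
print(tsm_factor(speaking_rate(80, 5.0)))   # 16 phones/s -> ~1.33
```

Applying TSM only above the threshold mirrors the paper's finding that the gain comes from fast speech, while stretching normal-rate speech would add distortion for no benefit.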

Taking a Jump Motion Picture Automatically by using Accelerometer of Smart Phones (스마트폰 가속도계를 이용한 점프동작 자동인식 촬영)

  • Choi, Kyungyoon;Jun, Kyungkoo
    • Journal of KIISE, v.41 no.9, pp.633-641, 2014
  • This paper proposes algorithms to detect a jump motion and automatically take a picture when the jump reaches its top. Based on the algorithms, we build a jump-shot system using accelerometer-equipped smart phones. Since jump motion may vary with a person's physical condition, gender, and age, it is critical to identify common features that are independent of such differences. The detection algorithm also needs to work in real time because of the short duration of a jump. We propose two different algorithms that satisfy these requirements and develop the system as a smart phone application. Through a series of experiments, we show that the system successfully detects the jump motion and takes a picture when it reaches the top.
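One rate-independent feature such a detector could exploit is the free-fall signature of the flight phase; the sketch below, including the threshold and the midpoint-as-apex heuristic, is an assumption for illustration rather than the paper's published algorithm:

```python
# Hedged sketch: during the flight phase of a jump the accelerometer
# magnitude drops toward 0 g (free fall); the apex is near the
# midpoint of that low-magnitude interval (threshold is assumed).

G = 9.81
FREEFALL_THRESHOLD = 0.3 * G   # assumed: |a| below this => airborne

def detect_jump_apex(samples, timestamps):
    """Return the timestamp at which to trigger the shutter, or None.
    samples: accelerometer magnitudes (m/s^2); timestamps: seconds."""
    airborne = [i for i, a in enumerate(samples) if a < FREEFALL_THRESHOLD]
    if not airborne:
        return None
    start, end = airborne[0], airborne[-1]
    return (timestamps[start] + timestamps[end]) / 2.0  # midpoint ~ apex

# toy trace: standing (1 g), flight (~0 g), landing spike, standing
ts = [i * 0.01 for i in range(8)]
acc = [9.8, 9.8, 1.0, 0.5, 0.6, 1.2, 20.0, 9.8]
print(detect_jump_apex(acc, ts))  # ~0.035 (apex near flight midpoint)
```

The free-fall magnitude is close to zero regardless of the jumper's strength or age, which is the kind of person-independent feature the abstract calls for.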

Design of Smart Home Network System based on ZigBee Topology (ZigBee 토폴로지를 이용한 스마트 홈 네트워크 시스템 설계)

  • Liu, Dan;Kim, Gwang-Jun;Lee, Jin-Woo
    • The Journal of the Korea institute of electronic communication sciences, v.7 no.3, pp.537-543, 2012
  • A smart home system integrates automatic control systems, computer network systems, and network communication technology into an intelligent home control system. An intelligent household gives users more convenient means of managing domestic equipment; for example, household devices can be controlled in or away from the house by wireless remote control, touch-screen phone, the Internet, or speech recognition, and scene operations can make multiple devices act in concert. In this paper, we propose a system in which the various kinds of household equipment communicate with one another and interact according to their states without explicit user commands, bringing the user the greatest possible efficiency, convenience, comfort, and safety.

Design and Implementation of E-mail Client based on Automatic Feeling Recognition (인간의 감정을 자동 인식하는 전자메일 클라이언트의 설계 및 구현)

  • Kim, Na-young;Lee, Sang-kon
    • The Journal of Korean Association of Computer Education, v.12 no.2, pp.61-75, 2009
  • Because of the Internet and cellular phones, modern people use e-mail clients routinely for general communication. Mail clients serve private and business correspondence, advertisement, news searching, and business letters, and this wide use also has side effects, since people may send important documents through them. It is therefore important to make an e-mail client intelligent, and we believe many natural language processing techniques for handling human emotion must be provided in the client. We design a new mail client around six kinds of sender emotional information: delight, anger, sadness, the message to express, manner of talking, and a discomfort index. Before an e-mail is sent, the client suggests correcting offensive words so that the receiver does not feel bad. We present a proper sending/receiving process for users of the newly designed e-mail client.
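The pre-send correction step mentioned in the abstract can be sketched as below; the lexicon, the replacement map, and the function name are toy assumptions, not the paper's actual resources:

```python
# Illustrative sketch of the pre-send check: flag words from an
# (assumed) offensive-word list and suggest milder replacements
# before the mail is sent; the lexicon is a one-entry placeholder.

BAD_WORDS = {"idiot": "friend"}  # toy offensive-word -> suggestion map

def presend_check(text):
    """Return (flagged?, suggested text) for an outgoing message."""
    words = text.split()
    fixed = [BAD_WORDS.get(w.lower(), w) for w in words]
    return (fixed != words, " ".join(fixed))

print(presend_check("hello idiot"))  # (True, 'hello friend')
```

A real client would draw the lexicon and suggestions from the emotion-classification components the paper describes rather than a static word map.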


Design of Smart Device Assistive Emergency WayFinder Using Vision Based Emergency Exit Sign Detection

  • Lee, Minwoo;Mariappan, Vinayagam;Mfitumukiza, Joseph;Lee, Junghoon;Cho, Juphil;Cha, Jaesang
    • Journal of Satellite, Information and Communications, v.12 no.1, pp.101-106, 2017
  • Emergency exit signs are installed to mark escape routes in buildings such as shopping malls, hospitals, industrial sites, and government complexes, and in various other places, to help people escape easily during emergencies. Under conditions such as smoke, fire, bad lighting, or crowd stampedes, it is difficult for people to recognize the exit signs and emergency doors and leave the building. This paper proposes automatic emergency exit sign recognition to find the exit direction using a smart device. The proposed approach develops a computer-vision-based smart phone application that detects emergency exit signs with the device camera and guides the escape direction in visible and audible output formats. A CAMShift object-tracking approach is used to detect the exit sign, and the direction information is extracted using a template matching method. The direction information is stored in text format and synthesized into an audible acoustic signal using text-to-speech; the synthesized signal is rendered on the smart device speaker as escape guidance for the user. The results are analyzed and conclusions drawn regarding the selection of visual elements, the appearance design of EXIT signs, and their placement in the building, which is valuable and can be commonly referred to in wayfinder systems.
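The direction-extraction step can be sketched as below; the tiny 3x3 "templates" and the match score are deliberately simplified stand-ins (a real implementation would use something like OpenCV's `cv2.matchTemplate` on the region that CAMShift localized):

```python
# Hedged sketch of the template-matching step: after the exit sign is
# localized (the paper uses CAMShift), match the arrow region against
# left/right templates; the 3x3 patterns here are illustrative only.

LEFT  = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]  # placeholder arrow patterns
RIGHT = [[0, 1, 0], [0, 1, 1], [0, 1, 0]]

def score(patch, template):
    """Count of element-wise matches, a crude stand-in for a
    normalized cross-correlation score."""
    return sum(p == t for row_p, row_t in zip(patch, template)
                      for p, t in zip(row_p, row_t))

def direction(patch):
    """Pick the template with the higher match score."""
    return "LEFT" if score(patch, LEFT) > score(patch, RIGHT) else "RIGHT"

print(direction([[0, 1, 0], [0, 1, 1], [0, 1, 0]]))  # RIGHT (exact match)
```

The chosen direction string is exactly what would then be handed to the text-to-speech stage for audible guidance.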

Pronunciation Variation Patterns of Loanwords Produced by Korean and Grapheme-to-Phoneme Conversion Using Syllable-based Segmentation and Phonological Knowledge (한국인 화자의 외래어 발음 변이 양상과 음절 기반 외래어 자소-음소 변환)

  • Ryu, Hyuksu;Na, Minsu;Chung, Minhwa
    • Phonetics and Speech Sciences, v.7 no.3, pp.139-149, 2015
  • This paper aims to analyze pronunciation variations of loanwords produced by Korean speakers and to improve the performance of loanword pronunciation modeling in Korean by using syllable-based segmentation and phonological knowledge. The loanword text corpus used for our experiment consists of 14.5k words extracted from frequently used words in the set-top box, music, and point-of-interest (POI) domains. First, pronunciations of loanwords in Korean are obtained by manual transcription and used as target pronunciations. The target pronunciations are compared with the standard pronunciations using confusion matrices to analyze the pronunciation variation patterns of loanwords. Based on the confusion matrices, three salient pronunciation variations are identified: tensification of the fricative [s] and derounding of the rounded vowels [ɥi] and [wɛ]. In addition, a syllable-based segmentation method incorporating phonological knowledge is proposed for loanword pronunciation modeling. Performance of the baseline and the proposed method is measured using phone error rate (PER), word error rate (WER), and F-score at various context spans. Experimental results show that the proposed method outperforms the baseline. We also observe that performance degrades when the training and test sets come from different domains, which implies that loanword pronunciations are influenced by data domains. It is noteworthy that pronunciation modeling for loanwords is enhanced by reflecting phonological knowledge. The loanword pronunciation modeling in Korean proposed in this paper can be used for automatic speech recognition of application interfaces such as navigation systems and set-top boxes, and for computer-assisted pronunciation training for Korean learners of English.
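The syllable-based units such a G2P model operates on can be obtained arithmetically from Unicode; the sketch below shows standard Hangul syllable decomposition (this is the general Unicode algorithm, not the paper's specific segmentation method):

```python
# Minimal sketch of syllable-level grapheme extraction for Korean G2P:
# precomposed Hangul syllables (U+AC00..U+D7A3) decompose arithmetically
# into onset/nucleus/coda jamo, the units a syllable-based model uses.

ONSETS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")          # 19 initials
NUCLEI = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")      # 21 vowels
CODAS  = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals

def decompose(syllable):
    """Split one Hangul syllable into (onset, nucleus, coda)."""
    idx = ord(syllable) - 0xAC00
    return (ONSETS[idx // (21 * 28)],
            NUCLEI[(idx % (21 * 28)) // 28],
            CODAS[idx % 28])

print(decompose("한"))  # ('ㅎ', 'ㅏ', 'ㄴ')
```

A syllable-based G2P model then maps these jamo sequences, syllable by syllable, to phone sequences, where phonological rules (e.g., the tensification noted above) adjust the output across syllable boundaries.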

A Collaborative Video Annotation and Browsing System using Linked Data (링크드 데이터를 이용한 협업적 비디오 어노테이션 및 브라우징 시스템)

  • Lee, Yeon-Ho;Oh, Kyeong-Jin;Sean, Vi-Sal;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems, v.17 no.3, pp.203-219, 2011
  • Previously, common users just wanted to watch video contents without any specific requirement or purpose. Today, however, while watching a video users attempt to learn more about the things that appear in it. Therefore, with the increasing use of multimedia on internet-capable devices such as computers, smart TVs, and smart phones, the demand for finding multimedia and browsing information about the objects users want is spreading. To meet these requirements, labor-intensive annotation of objects in video content is inevitable, and many researchers have actively studied methods of annotating the objects that appear in a video. In keyword-based annotation, related information about an object appearing in the video content is added immediately, and annotation data including all related information about the object must be managed individually; users have to input all related information directly. Consequently, when a user browses for information related to the object, only the limited resources that exist in the annotated data can be found. Placing annotations on objects also demands a huge workload from the user. To reduce this workload and minimize the annotation effort, existing object-based annotation attempts automatic annotation using computer vision techniques such as object detection, recognition, and tracking. With such techniques, the wide variety of objects appearing in video content must all be detected and recognized, which remains a difficult problem for automated annotation. To overcome these difficulties, we propose a system consisting of two modules.
The first module is the annotation module, which enables many annotators to collaboratively annotate the objects in the video content so that semantic data can be accessed using Linked Data. Annotation data managed by the annotation server are represented using an ontology so that the information can easily be shared and extended. Since the annotation data do not include all the relevant information about an object, existing objects in Linked Data and objects that appear in the video content are simply connected to each other to obtain all related information. In other words, annotation data containing only a URI and metadata such as position, time, and size are stored on the annotation server; when the user needs other related information about the object, it is retrieved from Linked Data through the relevant URI. The second module enables viewers to browse interesting information about an object using the annotation data collaboratively generated by many users while watching the video. With this system, a query is automatically generated through simple user interaction, all related information is retrieved from Linked Data, and finally the additional information about the object is offered to the user. In the future Semantic Web environment, our proposed system is expected to establish a better video content service environment by offering users relevant information about the objects that appear on the screen of any internet-capable device such as a PC, smart TV, or smart phone.
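The URI-plus-metadata annotation model and its browse-time enrichment can be sketched as below; the in-memory dictionary is a stand-in for a real Linked Data endpoint (e.g., a SPARQL service such as DBpedia), and all record fields and names are illustrative assumptions:

```python
# Hedged sketch of the two-module split: the annotation server stores
# only a URI plus position/time/size metadata, and all other facts are
# resolved from Linked Data at browse time (mocked here with a dict).

LINKED_DATA = {  # toy substitute for a Linked Data endpoint
    "http://dbpedia.org/resource/Eiffel_Tower":
        {"label": "Eiffel Tower", "location": "Paris"},
}

def annotate(uri, t, x, y, w, h):
    """What the annotation server stores: URI + metadata only."""
    return {"uri": uri, "time": t, "pos": (x, y), "size": (w, h)}

def browse(annotation):
    """Resolve the URI against Linked Data to enrich the annotation."""
    facts = LINKED_DATA.get(annotation["uri"], {})
    return {**annotation, **facts}

ann = annotate("http://dbpedia.org/resource/Eiffel_Tower",
               12.5, 40, 30, 100, 200)
print(browse(ann)["label"])  # Eiffel Tower
```

Keeping only the URI on the annotation server is what lets the related information grow or change in Linked Data without any annotation ever being re-edited.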