• Title/Summary/Keyword: Speech-Recognition

Design of Smart Home Network System based on ZigBee Topology (ZigBee 토폴로지를 이용한 스마트 홈 네트워크 시스템 설계)

  • Liu, Dan;Kim, Gwang-Jun;Lee, Jin-Woo
    • The Journal of the Korea institute of electronic communication sciences / v.7 no.3 / pp.537-543 / 2012
  • A smart home system is a networked intelligent home control system that integrates automatic control systems, computer network systems, and network communication technology. An intelligent household gives users more convenient ways to manage domestic equipment, for example by controlling household devices through an in-house wireless remote control, a touch screen, a phone, the Internet, or speech recognition. It can also run scene operations so that several devices act in linkage. In this paper, we propose a system in which the various devices in an intelligent household communicate with each other and operate interactively according to their states without user commands, giving the user the greatest possible efficiency, convenience, comfort, and safety.

Study on the multi-functional Cradle by Voice Recognitions (다기능성을 가진 음성 인식 요람 연구)

  • Park, Kwang-Sung;Ahn, Sang-jin;Cho, Kyeong-Rok;Choi, Si-On;Park, Yong-Wook
    • The Journal of the Korea institute of electronic communication sciences / v.12 no.4 / pp.701-706 / 2017
  • In this study, a cradle that was previously driven by remote control or by hand was modified so that a motor drives it in response to voice recognition and an app. In addition, a temperature and humidity sensor was mounted in the cradle so that its temperature and humidity can be checked on an LCD. The values obtained from the sound sensor according to the sound level were denoted a, b, and c; when their sum exceeded 1150, the sound was recognized as the baby's crying, and a notification and alarm were sent to the app.
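
The crying-detection rule in this abstract can be sketched as follows. This is a minimal illustration only: the abstract does not say what a, b, and c represent, so treating them as three sound-level readings, and reading "over 1150" as strictly greater than 1150, are assumptions. The name `is_crying` is hypothetical.

```python
CRY_THRESHOLD = 1150  # threshold value quoted in the abstract


def is_crying(a: int, b: int, c: int, threshold: int = CRY_THRESHOLD) -> bool:
    """Return True when the summed readings exceed the threshold.

    a, b, c are assumed to be three sound-level readings; the abstract
    does not define them more precisely.
    """
    return a + b + c > threshold


# Hypothetical readings: 400 + 400 + 400 = 1200, which is above 1150.
loud = is_crying(400, 400, 400)
# 300 + 400 + 450 = 1150, which is not strictly above the threshold.
borderline = is_crying(300, 400, 450)
```

In a real device the app notification would be triggered when `is_crying` returns True; that part is omitted because the abstract gives no interface details.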

Measurement and Analysis of Arousal While Experiencing Light-Field Display Device

  • Choi, Hyun-Jun;Kim, Noo-Ree;Park, Hyun-Rin
    • Journal of information and communication convergence engineering / v.18 no.3 / pp.188-193 / 2020
  • In this paper, we examine whether experiencing 3D images through a light-field display device produces a different level of user arousal than experiencing 2D images. In our experiment, the Looking Glass™ (LG) was used as the light-field display device that provided 3D images, and 2D images were provided as digital and printed images. Each subject's facial behavior during each media experience was recorded for analysis, and the degree of arousal was measured with FaceReader™. For the image presented first among the three kinds of images, there was a statistically significant difference in the degree of arousal between the three media. However, no significant differences were found between the three media for the other images. This may be because, owing to habituation, arousal did not increase from the second image onward when experienced through the LG. In conclusion, the arousal effect of the 3D imaging experience may appear at the beginning but does not persist.

An Architecture for Mobile Instruction: Application to Mathematics Education through the Web

  • Kim, Steven H.;Kwon, Oh-Nam;Kim, Eun-Jung
    • Research in Mathematical Education / v.4 no.1 / pp.45-55 / 2000
  • The rapid proliferation of wireless networks provides a ubiquitous channel for delivering instructional materials at the convenience of the user. By delivering content through portable devices linked to the Internet, the full spectrum of multimedia capabilities is available for engaging the user's interest. This capability encompasses not only text but also images, video, speech generation, and voice recognition. Moreover, the incorporation of machine learning capabilities at the source provides the ability to tailor the material to the general level of expertise of the user as well as the immediate needs of the moment: for instance, a request for information regarding a particular city might be covered by a leisurely presentation if solicited from the home, but more tersely if the user happens to be driving a car. This paper presents a system architecture to support mobile instruction in conjunction with knowledge-based tutoring capabilities. For concreteness, the general concepts are examined in the context of a system for mathematics education on the Web.

The development of the anomia assessment battery based on the psycholinguistic processing (언어심리학을 기반으로 한 명칭성 실어증 평가도구 개발)

  • Jung, Jae-Bum;Pyun, Sung-Bom;Sohn, Hyo-Jung;Gee, Sung-Woo;Cho, Sung-Ho;Nam, Ki-Chun
    • Proceedings of the KSPS conference / 2007.05a / pp.158-162 / 2007
  • Anomia, or word-finding difficulty, is one of the most common features of aphasia. Previous studies support the view that picture naming consists of three stages, in order: object recognition, semantic processing, and phonological output. Anomic patients show many different symptoms, which means that anomia can be subdivided into several symptom groups. Our anomia assessment battery consists of several parts: (1) a picture naming set, (2) a picture-word matching task, (3) a lexical decision task for mental lexicon damage, (4) a naming task for phonological lexicon damage, and (5) a semantic decision task. Pictures and words were selected on the basis of usage frequency, semantic category, and word length. We administered this battery to many anomic aphasics and subdivided the patients into several groups. We hope that our anomia evaluation set will be useful for evaluating anomic aphasics.

Design and Implementation of Mobile Communication System for Hearing-impaired Person (청각 장애인을 위한 모바일 통화 시스템 설계 및 구현)

  • Yun, Dong-Hee;Kim, Young-Ung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.16 no.5 / pp.111-116 / 2016
  • According to the Ministry of Science, ICT and Future Planning's survey of the information gap, the smartphone ownership rate among disabled people is about one-third of that among non-disabled people, so people with disabilities have significantly less access to information. In this paper, we develop an application, CallHelper, that makes mobile voice calls more convenient for hearing-impaired people. CallHelper runs automatically when a call comes in, converts the caller's voice to text shown on the mobile screen, and displays the emotion inferred from the caller's voice as emoticons. It also saves the voice, the converted text, and the emotion data so that they can be played back.

A clustering algorithm of statistical language model and its application on speech recognition (통계적 언어 모델의 clustering 알고리즘과 음성인식에의 적용)

  • Kim, Woo-Sung;Koo, Myoung-Wan
    • Annual Conference on Human and Language Technology / 1996.10a / pp.145-152 / 1996
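  • Developing a continuous speech recognition system requires a language model that exploits the grammatical constraints of the language. A language model based on grammatical rules has the drawback that an expert must hand-craft every rule. A statistical language model needs no hand-crafted grammatical information, but because all of that information must be learned through training, the amount of training data required grows enormously. A class-based language model can achieve a similar effect with a smaller amount of data. Because it also reduces the search space when coupled with speech recognition, it is very useful for implementing real-time systems. Here, we applied an algorithm that finds classes automatically to the corpus of a hotel reservation system and analyzed the results. Because the corpus itself has clear grammatical regularities, the results were similar to those obtained with heuristically assigned classes, but the algorithm should be very useful as the corpus grows. Applying the algorithm to a heuristically assigned initial map yielded a slight performance improvement. Finally, integrating the model with a speech recognition system gave similar results and showed that research is needed on reflecting acoustic characteristics in the language model as well.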
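
To illustrate why a class-based model needs less data, here is a minimal sketch of the standard class-based bigram estimate, P(w2 | w1) ≈ P(c2 | c1) · P(w2 | c2). It is not the paper's clustering algorithm: the word-to-class map is supplied by hand here, and the toy hotel-reservation sentences, class names, and function names are all hypothetical.

```python
from collections import Counter


def train_class_bigram(sentences, word2class):
    """Build a class-based bigram estimator from tokenized sentences."""
    class_bigrams = Counter()   # counts of (left class, right class)
    class_context = Counter()   # counts of a class appearing as left context
    word_counts = Counter()     # counts of each word
    class_counts = Counter()    # counts of each class
    for sent in sentences:
        classes = [word2class[w] for w in sent]
        for w, c in zip(sent, classes):
            word_counts[w] += 1
            class_counts[c] += 1
        for c1, c2 in zip(classes, classes[1:]):
            class_bigrams[(c1, c2)] += 1
            class_context[c1] += 1

    def prob(prev_word, word):
        """Estimate P(word | prev_word) as P(c2 | c1) * P(word | c2)."""
        c1, c2 = word2class[prev_word], word2class[word]
        if class_context[c1] == 0 or class_counts[c2] == 0:
            return 0.0
        p_class = class_bigrams[(c1, c2)] / class_context[c1]
        p_word = word_counts[word] / class_counts[c2]
        return p_class * p_word

    return prob


# Hypothetical toy data, assumed for illustration only.
word2class = {"book": "VERB", "reserve": "VERB",
              "single": "ROOM", "double": "ROOM", "room": "NOUN"}
sentences = [["book", "single", "room"], ["reserve", "double", "room"]]
prob = train_class_bigram(sentences, word2class)
p_unseen = prob("book", "double")  # the pair "book double" never occurs
```

In this toy corpus the word pair "book double" is never observed, yet the estimate is nonzero because the class pair VERB followed by ROOM is observed. That sharing of statistics across class members is the data-efficiency argument the abstract makes.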

Post-Processing of Speech Recognition Using Phonological Variables and Improved Edit-distance (발음 변이와 개선된 편집 거리를 이용한 음성 인식 후처리)

  • Kim, Yejin;Park, Youngmin;Kang, Sangwoo;Jung, Sangkeon;Lee, Cheongjae;Seo, Jungyun
    • Annual Conference on Human and Language Technology / 2014.10a / pp.9-12 / 2014
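  • This paper proposes a post-processing method for misrecognized proper nouns. Statistical approaches to speech recognition post-processing have recently been studied actively. For proper nouns, however, statistical methods are difficult to apply effectively because collecting large amounts of data is costly. We therefore propose an improved edit-distance algorithm that takes pronunciation variation into account. We evaluated the correction performance for misrecognized proper nouns, and the P@3 result showed a 55% improvement over the comparison model.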
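
One way to picture an edit distance that accounts for pronunciation variation is to charge less for substituting sounds that are easily confused. The sketch below is a generic weighted Levenshtein distance, not the paper's algorithm: the confusable pairs, the 0.5 substitution cost, the romanized names, and the function names are all assumptions made for illustration.

```python
def weighted_edit_distance(src, tgt, confusable, cheap=0.5):
    """Levenshtein distance with a reduced cost for confusable substitutions.

    confusable is a set of frozensets, each holding two symbols that are
    assumed to be easily confused in pronunciation.
    """
    prev = [float(j) for j in range(len(tgt) + 1)]
    for i, s in enumerate(src, start=1):
        curr = [float(i)]
        for j, t in enumerate(tgt, start=1):
            if s == t:
                sub = 0.0
            elif frozenset((s, t)) in confusable:
                sub = cheap  # cheaper substitution for a confusable pair
            else:
                sub = 1.0
            curr.append(min(prev[j] + 1.0,        # deletion
                            curr[j - 1] + 1.0,    # insertion
                            prev[j - 1] + sub))   # substitution or match
        prev = curr
    return prev[-1]


def top_k(recognized, lexicon, confusable, k=3):
    """Return the k lexicon entries closest to the recognized string."""
    return sorted(
        lexicon,
        key=lambda name: weighted_edit_distance(recognized, name, confusable),
    )[:k]


# Hypothetical confusable pairs and proper-noun lexicon (romanized).
confusable = {frozenset("kg"), frozenset("td")}
candidates = top_k("gim", ["kim", "tim", "lee", "gin"], confusable)
```

Returning the three best candidates mirrors the P@3 measure mentioned in the abstract, under which a correction counts as successful when the right name appears among the top three.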

Optimal Design of a MEMS-type Piezoelectric Microphone (MEMS 구조 압전 마이크로폰의 최적구조 설계)

  • Kwon, Min-Hyeong;Ra, Yong-Ho;Jeon, Dae-Woo;Lee, Young-Jin
    • Journal of Sensor Science and Technology / v.27 no.4 / pp.269-274 / 2018
  • Microphones with high sensitivity and a high signal-to-noise ratio (SNR) are essential for a broad range of automatic speech recognition applications. Piezoelectric microphones have several advantages over conventional capacitor microphones, including high stiffness and high SNR. In this study, we designed a new piezoelectric membrane structure using the finite element method (FEM) and an optimization technique to improve the sensitivity of the transducer, which has a high-quality AlN piezoelectric thin film. The simulation demonstrated that the sensitivity depends critically on the inner radius of the top electrode, the outer radius of the membrane, and the thickness of the piezoelectric film in the microphone. The optimized piezoelectric transducer structure showed a much higher sensitivity than the conventional piezoelectric transducer structure. This study provides a clear path to realizing micro-scale high-sensitivity piezoelectric microphones that have a simple manufacturing process, a wide frequency range, and a low DC bias voltage.

ACOUSTIC FEATURES DIFFERENTIATING KOREAN MEDIAL LAX AND TENSE STOPS

  • Shin, Ji-Hye
    • Proceedings of the KSPS conference / 1996.10a / pp.53-69 / 1996
  • Much research has been done on the cues differentiating the three Korean stops in word-initial position. This paper focuses on a more neglected area: the acoustic cues differentiating the medial tense and lax unaspirated stops. Eight adult native speakers of Korean, four males and four females, pronounced sixteen minimal pairs containing the two series of medial stops with different preceding vowel qualities. The average duration of vowels before lax stops is 31 msec longer than before their tense counterparts (70 msec for lax vs 39 msec for tense). In addition, the average duration of the stop closure of tense stops is 135 msec longer than that of lax stops (69 msec for lax vs 204 msec for tense). These durational differences are so large that they may be phonologically determined, not phonetically. Moreover, vowel duration varies with the speaker's sex: female speakers have 5 msec shorter vowel duration before both stops. The quality of voicing, tense or lax, is also a cue to these two stop types, as it is in initial position, but the relative duration of the stops appears to be a much more important cue. The duration of the stop changes stop perception, while that of the preceding vowel does not. The consequences of these results for the phonological description of Korean, as well as for the synthesis and automatic recognition of Korean, will be discussed.
