• Title/Summary/Keyword: Visual speech recognition


A Real-time Bus Arrival Notification System for Visually Impaired Using Deep Learning (딥 러닝을 이용한 시각장애인을 위한 실시간 버스 도착 알림 시스템)

  • Seyoung Jang;In-Jae Yoo;Seok-Yoon Kim;Youngmo Kim
    • Journal of the Semiconductor & Display Technology / v.22 no.2 / pp.24-29 / 2023
  • In this paper, we propose a real-time bus arrival notification system that uses deep learning to guarantee movement rights for the visually impaired. In modern society, location information for public transportation lets users quickly obtain transit information and use public transportation easily. However, because the existing public transportation information system is visual, the visually impaired cannot use it. In Korea, various laws have been amended since the 'Act on the Promotion of Transportation for the Vulnerable' was enacted in June 2012 as the Act on the Movement Rights of the Blind, but the visually impaired still experience inconvenience when using public transportation. In particular, with the current system a visually impaired person cannot determine whether a bus is coming soon, is arriving now, or has already arrived. We use deep learning to learn bus numbers and identify the numbers of approaching buses. Finally, we propose a method that uses TTS technology to notify the visually impaired by voice that the bus is coming.


Consideration for cognitive effects in smart environments for effective UXD(User eXperience Design) (스마트환경의 효과적인 UXD를 위한 인지작용 고찰)

  • Lee, Chang Wook;Chung, Jean-Hun
    • Journal of Digital Convergence / v.11 no.2 / pp.397-405 / 2013
  • With the technological advances of the 21st century, wireless Internet technology has rapidly taken hold in smart environments. In such an environment, users encounter many smart devices and much smart content. This study analyzes the smart environment and smart devices and reports on their cognitive effects on users. Cognitive effects play a very important role in observing behavior, in technology and user-centered system design, and in educating users. After a theoretical review of UX (User eXperience) and UXD (User eXperience Design), the study examines the cognitive effects of effective UXD through case analysis of its technical, visual, and interaction aspects. The results indicate that, on the visual side, design should build on what users already know from experience, and that sound-based interaction between user and machine, as well as machine-to-machine interaction, should be provided; technologies such as location-based services and speech recognition can improve user convenience. This research is intended to aid understanding of the smart environment and to support effective UXD.

The impact of Digital Video Effects and subtitles on evaluation and agenda recognition in TV News (TV뉴스의 어깨걸이와 자막이 뉴스에 대한 평가와 의제 인식에 미치는 영향)

  • Bae, Jin-Ah
    • Journal of Digital Contents Society / v.18 no.3 / pp.465-473 / 2017
  • An experiment was conducted to investigate how the DVEs and subtitles shown alongside anchor speech in TV news relate to news evaluation, trust, and agenda recognition. A total of 120 university students watched four types of news that differed in the content of their DVEs and subtitles, and then rated the fairness, sensibility, and irritability of the news. The content of the DVEs and subtitles was related to the irritability rating and to trust in the news, but not to the fairness and sensibility ratings. When the DVEs and subtitles emphasized the stimulating aspects of the issue, irritability was rated high and trust was low. The evaluation of news fairness affected trust in the news: the more fair participants rated the news, the more they trusted it. Contrary to the study's assumption, DVEs and subtitles were not the main factors influencing perception of the news agenda; viewers tended to perceive the agenda based on the news content itself.

Design of CNN-based Braille Conversion and Voice Output Device for the Blind (시각장애인을 위한 CNN 기반의 점자 변환 및 음성 출력 장치 설계)

  • Seung-Bin Park;Bong-Hyun Kim
    • Journal of Internet of Things and Convergence / v.9 no.3 / pp.87-92 / 2023
  • As times change, information becomes more diverse, and so do the methods of obtaining it. About 80% of the information gained in life is acquired through vision. However, visually impaired people have limited ability to interpret visual materials, which is why Braille, a writing system for the blind, appeared. Yet the Braille decoding rate among the blind is only 5%, and as demand from blind people for various forms of platforms and materials grows over time, products for the blind are being developed and produced. One example is Braille books, which seem to have more disadvantages than advantages, and access to information remains far more difficult for the blind than for non-disabled people. In this paper, we designed a CNN-based Braille conversion and voice output device to make it easier for visually impaired people to obtain information than with conventional methods. The device aims to improve quality of life by using camera recognition to convert books, text images, or handwritten images that are not in Braille into Braille, and by providing a function that converts them into voice according to the needs of the blind.

Contextual In-Video Advertising Using Situation Information (상황 정보를 활용한 동영상 문맥 광고)

  • Yi, Bong-Jun;Woo, Hyun-Wook;Lee, Jung-Tae;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.8 / pp.3036-3044 / 2010
  • With the rapid growth of video data services, demand is increasing for advertisements or additional information tied to a particular video scene. However, applying automated visual analysis or speech recognition directly to videos is limited in practice at the current level of technology, and video metadata such as title, category information, or summary does not reflect the content of continuously changing scenes. This work presents a new video contextual advertising system that serves relevant advertisements for a given scene by leveraging the scene's situation information inferred from video scripts. Experimental results show that using situation information extracted from scripts leads to better performance and to the display of more relevant advertisements to the user.

Expansion of Sensibility Area and Industrial Application in the Convergence Era - With Special Reference to Analysis of the Internet Arts of Sommerer and Mignonneau - (컨버전스시대 감성영역의 확장과 산업활용 -Sommerer와 Mignonneau의 인터넷 아트 분석을 중심으로-)

  • Kim, Hee-Young;Lee, Yong-Jae
    • The Journal of the Korea Contents Association / v.10 no.12 / pp.146-154 / 2010
  • Recently, 'convergence' and 'communication' have been keywords in many areas. Artists and engineers have begun to communicate with each other through collaboration based on new technologies. One of the exemplary technologies of this era of convergence is the fusion of the five senses, used both by Internet Art and by industrial technologies such as car navigation systems and the iPhone. Sommerer and Mignonneau's Internet Art works ≪Riding the Net≫, ≪The Living Room≫, and ≪The Living Web≫ use the Internet and five-sense fusion technology to translate not only sound into visual images but also tactile senses into tempo-spatial representations. Likewise, industrial technologies such as car navigation systems and the iPhone employ the five-sense fusion technology of speech recognition, which expands the realm of the senses in technology as seen in Internet Art. As examined in this study, the development of art and technology through their convergence will open up a new dimension of digital art and the culture technology industry.

Development and validation of a Korean Affective Voice Database (한국형 감정 음성 데이터베이스 구축을 위한 타당도 연구)

  • Kim, Yeji;Song, Hyesun;Jeon, Yesol;Oh, Yoorim;Lee, Youngmee
    • Phonetics and Speech Sciences / v.14 no.3 / pp.77-86 / 2022
  • In this study, we report the validation results of the Korean Affective Voice Database (KAV DB), an affective voice database available for scientific and clinical use, comprising a total of 113 validated affective voice stimuli. The KAV DB includes audio recordings of two actors (one male and one female), each uttering 10 semantically neutral sentences with the intention of conveying six different affective states (happiness, anger, fear, sadness, surprise, and neutral). The database was organized into three separate voice stimulus sets in order to validate the KAV DB. Participants rated the stimuli on six rating scales corresponding to the six targeted affective states by using a 100 horizontal visual analog scale. The KAV DB showed high internal consistency for voice stimuli (Cronbach's α=.847). The database had high sensitivity (mean=82.8%) and specificity (mean=83.8%). The KAV DB is expected to be useful for both academic research and clinical purposes in the field of communication disorders. The KAV DB is available for download at https://kav-db.notion.site/KAV-DB-7539a36abe2e414ebf4a50d80436b41a.

Increase of Spoken Number of Syllables Using MIT(Melody Intonation Therapy) : Case Studies on older adult with stroke and aphasia (MIT(Melodic Intonation Therapy) 중심의 음악활동을 이용한 실어증을 가진 뇌졸중 노인의 음절 수 증가에 대한 사례 연구)

  • Hong, Do Kyoung
    • Journal of Music and Human Behavior / v.2 no.2 / pp.57-67 / 2005
  • Most stroke patients have not only physical difficulties but also speech and neurological disorders due to hemiplegia, and such unexpected changes cause psychological maladjustment and absent-mindedness. In particular, reduced physical ability can lead to serious emotional problems arising from failure or frustration in daily life. Treatment of stroke patients generally emphasizes physical rehabilitation, but these patients also have considerable speech disorders such as aphasia or articulation disorders. Impaired cognitive function, mental disorders such as hypochondria, and even visual and auditory disorders also appear. It is therefore effective to integrate verbal remediation with other treatments in the medical care environment. Because patients with language disorders very often withdraw psychologically, music therapy, which offers rich emotional experience, is well suited to aphasia patients. This study primarily investigated the effect of 10 treatment sessions on the total number of spoken syllables, and aimed to let patients confirm their own worth through success on the given tasks and gain confidence in their abilities. All 10 sessions were scored according to the MIT manual and improvement was measured; that is, accomplishment was analyzed within each level to show detailed changes in the total number of spoken syllables. The results of the program, which progressed from 2 syllables to 4 syllables, are summarized as follows. Subject A completed Level I in the preliminary stage. With 2 syllables, Subject A advanced to Level III in the fifth session and to Level IV in the seventh; with 3 syllables, to Level III in the seventh session and to Level IV in the ninth; with 4 syllables, the success rate was low (8%) in the first session but increased considerably by the sixth session after repeated practice, and Subject A advanced to Level III in the eighth session and to Level IV in the tenth.
Subject B also completed Level I in the preliminary stage. With 2 syllables, Subject B advanced to Level III in the fourth session and to Level IV in the sixth; with 3 syllables, to Level III in the fifth session and to Level IV in the seventh; with 4 syllables, the success rate was low (10%) in the first session and increased considerably by the fifth session, and Subject B advanced to Level III in the seventh session but did not reach Level IV by the tenth session. In sum, music therapy using MIT did not produce a statistically meaningful result, but the total number of spoken syllables and the task success rate improved overall. Therefore, music intervention using MIT has a positive effect on the verbal ability of patients with Broca's aphasia and on their language rehabilitation.
