• Title/Summary/Keyword: Audio-Visual Information

New Interactive TV Service Model based on the MPEG-4 System

  • Kim, Jongho;Jechang Jeong
    • Proceedings of the IEEK Conference / 2002.07a / pp.125-128 / 2002
  • This paper proposes a new interactive TV service model. The MPEG-4 system is specified for composing and managing various object streams, including user interactions, and the proposed data broadcasting model uses it to support user interaction. We evaluate the feasibility of the proposed service model with a simulation player. The player supports an MPEG-2 TS that carries MPEG-2 video and AC-3 audio streams as the main service, MPEG-4 system data as interactive services, and user-specific EPG information, XML data, and similar content as supplementary services. The player also supports a multi-channel environment. Audio and visual data are synchronized using the DTS and PTS carried in the TS.
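
The timestamp-based synchronization mentioned in this abstract can be illustrated with a minimal sketch. It relies on two facts about MPEG-2 systems: PTS and DTS values count ticks of a 90 kHz clock and are 33 bits wide. Everything else is an assumption for illustration and not taken from the paper: the function names, the choice of audio as the master clock, and the 40 ms tolerance.

```python
PTS_CLOCK_HZ = 90_000   # MPEG-2 PTS/DTS tick rate
PTS_WRAP = 1 << 33      # PTS/DTS are 33-bit counters and wrap around


def av_offset_ms(audio_pts: int, video_pts: int) -> float:
    """Signed audio-minus-video offset in milliseconds.

    The modular difference tolerates a single 33-bit wraparound.
    """
    diff = (audio_pts - video_pts) % PTS_WRAP
    if diff >= PTS_WRAP // 2:
        diff -= PTS_WRAP
    return diff * 1000.0 / PTS_CLOCK_HZ


def sync_action(audio_pts: int, video_pts: int, tolerance_ms: float = 40.0) -> str:
    """Hypothetical policy with audio as the master clock."""
    offset = av_offset_ms(audio_pts, video_pts)
    if offset > tolerance_ms:
        return "drop_frame"   # video lags behind audio: skip ahead
    if offset < -tolerance_ms:
        return "hold_frame"   # video runs ahead of audio: wait
    return "present"


print(sync_action(99_000, 90_000))
```

A real player would compare the timestamps against a clock recovered from the stream's PCR rather than comparing two PTS values directly; the sketch only shows the arithmetic.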

Design and Implementation of HTML5 based Authoring Tool for Audio-Visual Book (HTML5 기반스토리텔링 비디오북 저작 툴의 설계/구현)

  • Kim, Tae-hyun;Shim, Jae-Youn;Kim, Seong-Whan
    • Annual Conference of KIPS / 2013.11a / pp.1442-1445 / 2013
  • Portable devices such as smartphones and smart pads are spreading and the e-book market is growing, but existing publishers have technical difficulty converting paper books to e-books. This paper proposes an HTML5-based e-book authoring tool to address this problem. HTML5 is a new version of HTML that provides video and audio elements and is supported by many web browsers, so content can be used anywhere and at any time on a device with an HTML5-compatible browser. Building on these characteristics, we propose a video book authoring tool that also supports text and images.

MPEG-4 BIFS Optimization for Interactive T-DMB Content (지상파 DMB 컨텐츠의 MPEG-4 BIFS 최적화 기법)

  • Cha, Kyung-Ae
    • Journal of Korea Society of Industrial Information Systems / v.12 no.1 / pp.54-60 / 2007
  • The Digital Multimedia Broadcasting (DMB) system was developed to deliver high-quality multimedia content to mobile environments. The system adopts the MPEG-4 standard for its main video, audio, and other media formats. To provide interactive content, it also adopts the MPEG-4 scene description, which specifies the spatio-temporal layout and behavior of individual objects. The more interactive the content, the higher the bitrate the scene description requires. However, the bandwidth available for metadata such as the scene description is limited in the mobile environment. Moreover, the DMB terminal renders each media stream according to the scene description, so the binary format for scene (BIFS) stream that carries the scene description must be decoded and parsed before any media data is presented. A delay in transmitting the BIFS stream therefore delays the presentation of the whole audio-visual scene, even when the audio and video streams are encoded at very low bitrates. This paper presents an optimization technique that adapts the BIFS stream to the expected bitrate without wasting bandwidth and that avoids transmission delay of the initial scene description for interactive DMB content.

Development of Audio-visual Aids of Death Education for Hospice Patients and Their Families (호스피스 환자와 가족을 위한 임종교육 시청각 자료 개발)

  • Seo, Mi-Suk;Kang, Yu Jung;Yoon, Ji Yoon;Kim, Tae Yeon;Cho, Hye Jun;Park, So Yeon;Lee, Si Yeon;Jang, Ji Hye;Kim, Yu Jin;Kang, Mi Teum
    • Journal of Hospice and Palliative Care / v.19 no.3 / pp.240-248 / 2016
  • Purpose: Patients and their caretakers need to understand the various problems and requirements of the dying process so that they can prepare for death during their remaining life. Accordingly, a systematic audio-visual resource was developed to educate hospice patients and their families in the palliative care ward about the process of dying. Methods: To develop the audio-visual resource, an initial education material was produced in the form of simple and accessible PowerPoint handouts based on a literature study. The program was then completed through five rounds of expert advice, revision, update, and evaluation. Results: The final version of the program was filmed with the cooperation of the medical literature information division. Using the program, patients and families were educated through five phases over three sessions, for a total of 26 minutes and 34 seconds. Conclusion: The significance of this study lies in the fact that it was conducted after the establishment of the palliative care ward, which made it easier for nurses to provide the education. The program is expected to be usable by hospice specialists as well as nurses as an education resource for hospice patients and their families.

A Survey on the Application Possibility of Mass Media for Environmental Education (대중매체의 환경교육적 활용 가능성에 관한 고찰)

  • Lee, Jae-Yeong;Kim, In-Ho;Lee, Seon-Gyeong
    • Hwankyungkyoyuk / v.9 no.1 / pp.30-38 / 1996
  • The purpose of this study was to survey teachers' and students' awareness of mass media as a source for school environmental education. A questionnaire was given to 179 teachers who participated in certificate in-service training for the "Environment" subject and to 635 students (primary: 177, middle: 179, high school: 279). The results were as follows. First, most teachers (86.6%) judged that the effects of mass media on students were high and positive in terms of school environmental education, and accordingly rated the necessity and possibility of applying mass media to environmental education as high. Second, many teachers judged that more environment-related programs had to be produced and disseminated (57.0%), and that teachers had to be informed about them in order to activate school environmental education (44.1%). Third, both teachers (87.1%) and students (70.4%) judged that audio-visual media such as television, video, and movies were better than other media for environmental education because they could be more realistic and dynamic (T: 48.0%, S: 41.7%). Fourth, statistical analysis showed that students' friendliness toward, trust in, and preference for media differed by school level. However, the relationships between factors could not be analyzed because of the limited sample.

Lip and Voice Synchronization Using Visual Attention (시각적 어텐션을 활용한 입술과 목소리의 동기화 연구)

  • Dongryun Yoon;Hyeonjoong Cho
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.166-173 / 2024
  • This study explores lip-sync detection, focusing on the synchronization between lip movements and voices in videos. Typically, lip-sync detection techniques crop the facial area of a given video and feed the lower half of the cropped box to the visual encoder to extract visual features. To place more emphasis on the articulatory region of the lips for more accurate lip-sync detection, we propose using a pre-trained visual attention-based encoder. The Visual Transformer Pooling (VTP) module, originally designed for the lip-reading task of predicting the script from visual information alone without audio, is employed as the visual encoder. Our experimental results demonstrate that, despite having fewer learnable parameters, the proposed method outperforms the latest model, VocaList, on the LRS2 dataset, achieving a lip-sync detection accuracy of 94.5% based on five context frames. Moreover, our approach is approximately 8% more accurate than VocaList in lip-sync detection even on Acappella, a dataset not used for training.

Implementation of an Intelligent Audio Graphic Equalizer System (지능형 오디오 그래픽 이퀄라이저 시스템 구현)

  • Lee Kang-Kyu;Cho Youn-Ho;Park Kyu-Sik
    • Journal of the Institute of Electronics Engineers of Korea SP / v.43 no.3 s.309 / pp.76-83 / 2006
  • The main purpose of an audio equalizer is to let the user tailor the acoustic frequency response for more comfortable sound; applications range from large-scale audio systems to portable audio devices such as mobile MP3 players. Until now, audio equalizers have required manual setting of the frequency bands to obtain suitable sound quality for each music genre. In this paper, we propose an intelligent audio graphic equalizer system that automatically classifies the music genre through music content analysis and then, during playback, boosts the sound with the frequency gains assigned to the classified genre. To reproduce comfortable sound, the genre is determined by a two-step hierarchical algorithm consisting of coarse-level and fine-level classification. This prevents the annoying sound caused by sudden changes of the equalizer gains at the beginning of playback. Each stage of the music classification experiments shows a success rate of at least 80%, with complete genre classification and equalizer operation within 2 sec. A simple software graphical user interface for the 3-band automatic equalizer is implemented in Visual C on a personal computer.
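
The idea of moving from a coarse-level setting to a fine-level one without abrupt gain changes can be sketched as follows. This is only an illustration under stated assumptions: the preset names and dB values are invented, the classifiers themselves are omitted, and the per-update step limit is one possible way to smooth the transition, not a mechanism described in the abstract.

```python
# Hypothetical 3-band presets (low, mid, high gains in dB); illustrative values only.
COARSE_PRESETS = {
    "vocal": (0.0, 2.0, 1.0),
    "instrumental": (2.0, 0.0, 2.0),
}
FINE_PRESETS = {
    "rock": (4.0, 0.0, 3.0),
    "classical": (2.0, 0.0, 1.0),
}


def ramp_gains(current, target, max_step_db=1.0):
    """Move each band toward its target by at most max_step_db per update."""
    return tuple(
        c + max(-max_step_db, min(max_step_db, t - c))
        for c, t in zip(current, target)
    )


def settle(current, target, max_step_db=1.0, max_updates=100):
    """Return the gain settings visited on the way from current to target."""
    steps = []
    for _ in range(max_updates):
        if current == target:
            break
        current = ramp_gains(current, target, max_step_db)
        steps.append(current)
    return steps


flat = (0.0, 0.0, 0.0)
# Stage 1: a coarse-level result is available early, so its preset is applied first.
coarse_path = settle(flat, COARSE_PRESETS["instrumental"])
# Stage 2: the fine-level result refines the gains starting from the coarse setting.
fine_path = settle(coarse_path[-1], FINE_PRESETS["rock"])
print(coarse_path, fine_path)
```

Starting the fine-level adjustment from the coarse-level gains keeps each change small, which is the effect the two-step scheme aims for.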

Multipoint multimedia communication service in broadband ISDN part I: a conversational communication on DAVIC STB environment (광대역ISDN상의 다지점 멀티미디어 통신서비스 I부:DAVIC 표준 STB에서의 대화형 멀티미디어통신)

  • 황대환;이종형;박영덕;조규섭
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.4 / pp.821-835 / 1998
  • The Digital Audio-Visual Council (DAVIC), which was established to develop useful multimedia communication services, has completed specifications for providing on-demand services such as Movie on Demand (MoD) and teleshopping and for accommodating Internet services. It is now working to support conversational communication services such as Plain Old Telephone Service (POTS), video telephony, and video teleconferencing. In this paper, we propose an efficient terminal architecture that can provide conversational multimedia communication services in DAVIC Set-Top Box (STB) environments. To apply the implemented conversational terminal to a multipoint communication environment, we consider the Quality of Service (QoS) factors that determine the grade of a conversational communication service. We also present an interworking scheme and system structure that satisfy QoS by using a new MPEG video bridge, which guarantees the end-to-end delay requirement, a major element of QoS for real-time communication, without degrading visual quality.

Lip Reading Method Using CNN for Utterance Period Detection (발화구간 검출을 위해 학습된 CNN 기반 입 모양 인식 방법)

  • Kim, Yong-Ki;Lim, Jong Gwan;Kim, Mi-Hye
    • Journal of Digital Convergence / v.14 no.8 / pp.233-243 / 2016
  • Because speech recognition performs poorly in noisy environments, Audio Visual Speech Recognition (AVSR) systems, which combine speech information with visual information, have been proposed since the mid-1990s, and lip reading has played a significant role in them. This study aims to improve the recognition rate of uttered words using only lip shape detection, toward an efficient AVSR system. After preprocessing for lip region detection, Convolutional Neural Network (CNN) techniques are applied for utterance period detection and lip shape feature vector extraction, and Hidden Markov Models (HMMs) are then used for recognition. Utterance period detection achieves a 91% success rate, which is higher than that of general threshold methods. In lip reading recognition, the user-dependent experiment records a recognition rate of 88.5% and the user-independent experiment 80.2%, both improvements over previous studies.

Design and Implementation of Emergency Recognition System based on Multimodal Information (멀티모달 정보를 이용한 응급상황 인식 시스템의 설계 및 구현)

  • Kim, Eoung-Un;Kang, Sun-Kyung;So, In-Mi;Kwon, Tae-Kyu;Lee, Sang-Seol;Lee, Yong-Ju;Jung, Sung-Tae
    • Journal of the Korea Society of Computer and Information / v.14 no.2 / pp.181-190 / 2009
  • This paper presents a multimodal emergency recognition system based on visual information, audio information, and gravity sensor information. It consists of a video processing module, an audio processing module, a gravity sensor processing module, and a multimodal integration module. The video processing module and the gravity sensor processing module each detect actions such as moving, stopping, and fainting and transfer them to the multimodal integration module. The multimodal integration module detects an emergency by fusing the transferred information and verifies it by asking a question and recognizing the answer via the audio channel. The experimental results show that the recognition rate is 91.5% with the video processing module alone and 94% with the gravity sensor processing module alone, but reaches 100% when both sources of information are combined.
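
The detect, fuse, and verify flow described in this abstract can be sketched in a few lines. The abstract does not give the fusion rule or the verification dialogue, so the rule below (suspect an emergency when either module reports fainting), the question text, and the accepted answers are all hypothetical, as are the names `Action`, `fuse`, and `recognize_emergency`.

```python
from enum import Enum
from typing import Callable, Optional


class Action(Enum):
    MOVING = "moving"
    STOPPING = "stopping"
    FAINTING = "fainting"


def fuse(video: Action, gravity: Action) -> bool:
    """Hypothetical fusion rule: suspect an emergency if either module reports fainting."""
    return Action.FAINTING in (video, gravity)


def recognize_emergency(
    video: Action,
    gravity: Action,
    ask_user: Callable[[str], Optional[str]],
) -> bool:
    """Fuse both detections, then verify a suspected emergency over the audio channel.

    ask_user stands in for the speech prompt and speech recognizer; it returns
    the recognized answer, or None when no answer was heard.
    """
    if not fuse(video, gravity):
        return False
    answer = ask_user("Are you all right?")
    # Hypothetical policy: silence or anything other than a clear "yes" confirms it.
    return answer is None or answer.strip().lower() != "yes"


print(recognize_emergency(Action.FAINTING, Action.STOPPING, lambda q: None))
```

Passing the audio interaction in as a callable keeps the fusion logic testable without a microphone or recognizer.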