• Title/Summary/Keyword: audio visual

MPEG-4 BIFS Optimization for Interactive T-DMB Content (지상파 DMB 컨텐츠의 MPEG-4 BIFS 최적화 기법)

  • Cha, Kyung-Ae
    • Journal of Korea Society of Industrial Information Systems / v.12 no.1 / pp.54-60 / 2007
  • The Digital Multimedia Broadcasting (DMB) system was developed to deliver high-quality multimedia content in mobile environments. It adopts the MPEG-4 standard for its main video, audio, and other media formats. To provide interactive content, it also adopts the MPEG-4 scene description, which specifies the spatio-temporal layout and behavior of individual objects. The more interactive the content, the higher the bitrate the scene description requires. However, the bandwidth that can be allocated to metadata such as the scene description is limited in the mobile environment. Moreover, the DMB terminal renders each media stream according to the scene description, so the Binary Format for Scenes (BIFS) stream that carries the scene description must be decoded and parsed before any media data can be presented. Consequently, a transmission delay in the BIFS stream delays the presentation of the whole audio-visual scene, even when the audio and video streams are encoded at very low bitrates. This paper presents an effective optimization technique that adapts the BIFS stream to the target bitrate without wasting bandwidth and avoids transmission delay in the initial scene description of interactive DMB content.
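
The dependence of start-up delay on the bandwidth given to the BIFS stream can be illustrated with simple arithmetic. The sketch below is not the paper's optimization technique; the scene size and bitrates are hypothetical values chosen only to show that the initial scene description must arrive in full before anything can be rendered.

```python
def bifs_startup_delay(scene_bytes, bifs_bitrate_bps):
    """Seconds needed to deliver the initial scene description.

    Assumes the whole BIFS scene must be received before rendering starts
    and that the stream is sent at a constant bitrate (both simplifications).
    """
    return scene_bytes * 8 / bifs_bitrate_bps


# Hypothetical 6,000-byte initial scene at three candidate BIFS bitrates.
for rate_bps in (4_000, 8_000, 16_000):
    delay = bifs_startup_delay(6_000, rate_bps)
    print(f"{rate_bps} bit/s -> {delay:.1f} s before the scene can be shown")
```

Under these assumptions, halving the scene size has the same effect on start-up delay as doubling the allocated bitrate, which is why shrinking the initial scene description matters when metadata bandwidth is fixed.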

Multi-modal Detection of Anchor Shot in News Video (다중모드 특징을 사용한 뉴스 동영상의 앵커 장면 검출 기법)

  • Yoo, Sung-Yul;Kang, Dong-Wook;Kim, Ki-Doo;Jung, Kyeong-Hoon
    • Journal of Broadcast Engineering / v.12 no.4 / pp.311-320 / 2007
  • This paper presents an efficient algorithm for detecting anchor shots in news video. We observed the audio-visual characteristics of news video and propose several low-level features suited to anchor shot detection. The proposed algorithm consists of three stages: pause detection, audio cluster classification, and matching with motion activity. Audio features are used together with a motion feature to improve indexing accuracy, and simulation results show that the performance of the proposed algorithm is quite satisfactory.
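
The abstract does not say how its first stage, pause detection, is carried out. One common approach, shown here purely as an assumed illustration, is to flag runs of low short-time energy in the audio track. The frame length, threshold, and minimum run length below are hypothetical parameters, not values from the paper.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_pauses(samples, frame_len=160, energy_threshold=1e-4, min_frames=3):
    """Return (start_frame, end_frame) pairs, end exclusive, of low-energy runs."""
    energies = frame_energies(samples, frame_len)
    pauses = []
    run_start = None
    for idx, energy in enumerate(energies):
        if energy < energy_threshold:
            if run_start is None:
                run_start = idx          # a quiet run begins here
        else:
            if run_start is not None and idx - run_start >= min_frames:
                pauses.append((run_start, idx))
            run_start = None
    # Close a quiet run that lasts until the end of the signal.
    if run_start is not None and len(energies) - run_start >= min_frames:
        pauses.append((run_start, len(energies)))
    return pauses


# Synthetic 16 kHz signal: 0.1 s tone, 0.05 s silence, 0.1 s tone.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
signal = tone + [0.0] * 800 + tone
print(detect_pauses(signal))  # [(10, 15)]
```

In a pipeline like the one described, the segments between detected pauses would then be passed on to the audio clustering stage.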

Study on Nutrition Education for Elementary Schools in the Kyungnam Area (경남 일부지역 초등학교의 영양교육 실시현황)

  • 윤현숙;노정숙;허은실
    • Korean Journal of Community Nutrition / v.5 no.1 / pp.63-73 / 2000
  • The purpose of this study was to investigate the status of nutrition education at elementary schools. A total of 226 elementary school teachers in the cities of Changwon and Milyang participated. The results are as follows. The average score on a test of nutrition knowledge was 4.40 out of 10, and teachers in Milyang districts scored significantly higher on nutrition knowledge than teachers in the Changwon rural and Milyang rural districts. Only 9.0% of the teachers had received nutrition education training. Of all respondents, 64.1% had experience teaching nutrition, and 91.0% of that teaching took place as part of physical education and home economics. The main information sources for nutrition education were guide books and magazine and newspaper articles. Nutrition education was taught mainly by lecture (85.0%), but the preferred teaching methods were small group discussion (44.3%), role-playing (22.9%), and lecture (21.4%). Audio-visual aids were used by 45.5% of the teachers; the most commonly used were VTR (43.1%) and charts (22.4%), whereas the preferred aids were VTR (71.9%) and actual models (14.1%).

The Comparison Study between standardizations of Visual-Audial Sensibility (시청각 감성 지표에 관한 비교 연구)

  • 이동춘;윤훈용;이상도;부진후;심정훈;강재철;황성환
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 1999.11a / pp.348-351 / 1999
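  • Index development related to audio-visual sensibility was carried out for three areas: three-dimensional audio-visual environment presentation technology, the development of audio products that make use of audio-visual sensibility, and audio-visual sensibility measurement technology and database (DB) development. The development of 3D audio-visual environment presentation technology consists of the development of VR environment presentation technology, an evaluation stage using a simulator, and research on the characteristics of human spatial perception. Accordingly, the indexing process yielded a total of eight indices: three related to the VR presentation system, three for VR evaluation, and two informational items. The audio development using audio-visual sensibility and the sensibility measurement technology and DB development each consisted of a subjective evaluation of audio-visual sensibility, followed by product development and database construction based on the results; 6 and 13 indices, respectively, were completed for these two studies. The two studies differed somewhat in the stimuli presented and the experimental methods used for sensibility measurement, but were similar in their use of the semantic differential (SD) scale based on elicited sensibility vocabulary, biosignal measurement, data processing methods, and evaluation criteria. Therefore, in addition to developing indices for each study, a systematic index standardization process that compares and analyzes the relationships among the indices appears to be necessary.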

Multipoint multimedia communication service in broadband ISDN part I: a conversational communication on DAVIC STB environment (광대역ISDN상의 다지점 멀티미디어 통신서비스 I부:DAVIC 표준 STB에서의 대화형 멀티미디어통신)

  • 황대환;이종형;박영덕;조규섭
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.4 / pp.821-835 / 1998
  • The Digital Audio-Visual Council (DAVIC), established to develop useful multimedia communication services, has completed specifications for on-demand services such as Movie on Demand (MoD) and teleshopping, and for accommodating Internet service. It is now working to support conversational communication services such as Plain Old Telephone Service (POTS), video telephony, and video teleconferencing. In this paper, we propose an efficient terminal architecture that can provide conversational multimedia communication services in DAVIC Set-Top Box (STB) environments. To apply the implemented conversational terminal to a multipoint communication environment, we considered the Quality of Service (QoS) factors that determine the grade of a conversational communication service. We also present an inter-working scheme and system structure that satisfy QoS by using a new MPEG video bridge, which guarantees the end-to-end delay requirement, a major element of QoS for real-time communication, without degrading visual quality.

Lip Reading Method Using CNN for Utterance Period Detection (발화구간 검출을 위해 학습된 CNN 기반 입 모양 인식 방법)

  • Kim, Yong-Ki;Lim, Jong Gwan;Kim, Mi-Hye
    • Journal of Digital Convergence / v.14 no.8 / pp.233-243 / 2016
  • Because speech recognition performs poorly in noisy environments, Audio Visual Speech Recognition (AVSR) systems, which combine speech and visual information, have been proposed since the mid-1990s, and lip reading has played a significant role in them. This study aims to improve the word recognition rate using lip shape detection alone, for an efficient AVSR system. After preprocessing for lip region detection, Convolutional Neural Network (CNN) techniques are applied for utterance period detection and lip shape feature vector extraction, and Hidden Markov Models (HMMs) are then used for recognition. Utterance period detection achieves a 91% success rate, higher than general threshold methods. In lip reading recognition, the user-dependent experiment records a recognition rate of 88.5% and the user-independent experiment 80.2%, both improvements over previous studies.

New Interactive TV Service Model based on the MPEG-4 System

  • Kim, Jongho;Jechang Jeong
    • Proceedings of the IEEK Conference / 2002.07a / pp.125-128 / 2002
  • In this paper, a new interactive TV service model is proposed. The MPEG-4 system is specified for composing and managing various object streams, including user interactions. In our proposal, a data broadcasting model that supports user interaction is designed using the MPEG-4 system. We evaluate the feasibility of the proposed service model with a simulation player. The player supports an MPEG-2 TS that carries MPEG-2 video and AC-3 audio streams as the main service, MPEG-4 system data as interactive services, and user-specific EPG information, XML data, and the like as supplementary services. The player also supports a multi-channel environment. Audio and visual data are synchronized using the DTS and PTS in the TS.
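
In an MPEG-2 transport stream, PTS and DTS values are 33-bit counters expressed in units of a 90 kHz clock. The sketch below shows how a player might compare an audio PTS with a video PTS to measure their offset. It illustrates the timestamp arithmetic only and is not the simulation player described in the abstract; the function names are hypothetical.

```python
PTS_CLOCK_HZ = 90_000   # PTS/DTS tick rate in MPEG-2 systems
PTS_WRAP = 1 << 33      # timestamps are 33-bit and wrap around


def pts_to_seconds(pts):
    """Convert a PTS or DTS tick count to seconds."""
    return pts / PTS_CLOCK_HZ


def av_offset_ms(audio_pts, video_pts):
    """Signed audio-minus-video offset in milliseconds.

    The modulo arithmetic keeps the result correct when one timestamp
    has wrapped past 2**33 and the other has not.
    """
    diff = (audio_pts - video_pts) % PTS_WRAP
    if diff >= PTS_WRAP // 2:
        diff -= PTS_WRAP
    return diff * 1000 / PTS_CLOCK_HZ


print(pts_to_seconds(180_000))              # 2.0
print(av_offset_ms(183_600, 180_000))       # 40.0 (audio is 40 ms ahead)
print(av_offset_ms(900, PTS_WRAP - 900))    # 20.0 (across the wraparound)
```

A player would typically delay or drop frames on one stream until this offset falls within a tolerance it chooses; that policy is outside what the abstract describes.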

Implementation of the Broadcasting System for Digital Media Contents (디지털 미디어 콘텐츠 방송 시스템 구현)

  • Shin, Jae-Heung;Kim, Hong-Ryul;Lee, Sang-Cheal
    • The Transactions of The Korean Institute of Electrical Engineers / v.57 no.10 / pp.1883-1887 / 2008
  • Most digital media content is composed of video, audio, picture, and animation information. The quality with which video and audio information is recognized can vary with the characteristics or understanding of the receiver, whereas visual information in the form of text gives people the clearest and most accurate means of recognizing information. In this paper, we propose a new broadcasting system (BSDMC) that conveys the meaning of digital media content clearly and accurately. We implement general-purpose components that display video, pictures, text, and symbols simultaneously. A multimedia content broadcasting system can be developed easily by plugging these components into an application development tool and calling them with the proper parameters. The components are built on an object-oriented framework with a modular structure, which increases reusability and allows other applications to be developed quickly and reliably.

Audio-Based Human-Robot Interaction Technology (오디오 기반 인간로봇 상호작용 기술)

  • Kwak, K.C.;Kim, H.J.;Bae, K.S.;Yoon, H.S.
    • Electronics and Telecommunications Trends / v.22 no.2 s.104 / pp.31-37 / 2007
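  • Human-robot interaction (HRI) technology is a core technology of intelligent service robots: it covers the design, implementation, and evaluation of robot systems and interaction environments that allow cognitive and emotional interaction through various communication channels such as robot cameras, microphones, and other sensors. Among audio-based HRI technologies, this article surveys domestic and international trends in sound localization and speaker recognition, and covers the audio-visual sound localization and text-independent speaker recognition technologies that the ETRI Intelligent Robot Research Division is currently working to commercialize. It also examines scenarios that combine these technologies with speech recognition, face detection, and face recognition so that they can be used effectively in home environments.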

A Link Layer Design for DisplayPort Interface

  • Jin, Hyun-Bae;Yoon, Kwang-Hee;Kim, Tae-Ho;Jang, Ji-Hoon;Song, Byung-Cheol;Kang, Jin-Ku
    • Journal of IKEEE / v.14 no.4 / pp.297-304 / 2010
  • This paper presents a link layer design for the DisplayPort interface with a state machine based on packet processing. The DisplayPort link layer provides isochronous video/audio transport service, link service, and device service. The merged video/audio main link and the AUX channel controller are implemented with 7,648 LUTs (Look-Up Tables), 6,020 registers, and 821,760 block memory bits, synthesized on an FPGA board, and operate at 203.32 MHz.