• Title/Summary/Keyword: visual-audio


Conformance Test for MPEG-4 Shape Decoders (MPEG-4 Shape Decoder의 적합성 검사)

  • 황혜전;박인수;박수현;이병욱
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.6B / pp.1060-1067 / 2000
  • MPEG-4 visual coding is an object-based system. The current video coding standards, H.261, MPEG-1, and MPEG-2, encode frame by frame. MPEG-4, on the other hand, separately encodes several objects, such as video objects and audio objects, in the same frame. Each transmitted object is decoded and composed into one frame. Shape coding is the process of coding visual objects in a frame. In this paper we present a conformance test method for MPEG-4 shape decoders. The paper reviews the basic shape decoding standard and proposes conformance test methods for the BAB (binary alpha block) type decoder and the CAE (context-based arithmetic encoding) decoder for intra and inter VOPs (video object planes). Our test generates all possible cases of shape motion vector difference and context.
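The exhaustive test generation mentioned at the end of this abstract can be illustrated with a minimal sketch. It assumes a 10-pixel context template for intra CAE and a 9-pixel template for inter CAE, and it uses a small, hypothetical range for the shape motion vector difference. The function names and the range are illustrative assumptions and are not taken from the paper.

```python
from itertools import product

INTRA_CONTEXT_BITS = 10  # assumed template size for intra CAE contexts
INTER_CONTEXT_BITS = 9   # assumed template size for inter CAE contexts


def all_contexts(num_bits):
    """Yield every context as a tuple of binary template pixels."""
    return product((0, 1), repeat=num_bits)


def context_index(bits):
    """Pack template pixels into an integer, first pixel as the most significant bit."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


def all_mvd_cases(max_abs):
    """Yield every (dx, dy) motion vector difference within +/- max_abs (hypothetical range)."""
    span = range(-max_abs, max_abs + 1)
    return product(span, span)


intra_cases = [context_index(c) for c in all_contexts(INTRA_CONTEXT_BITS)]
inter_cases = [context_index(c) for c in all_contexts(INTER_CONTEXT_BITS)]
mvd_cases = list(all_mvd_cases(2))
```

A conformance bitstream generator would then need to exercise each of these cases at least once; how the paper constructs such bitstreams is not stated in the abstract.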


System Design of High-Definition Media Transceiver based on Power Line Communication and Its Performance Analysis (전력선 통신 기반 HD급 미디어 전송 시스템 설계 및 성능 분석)

  • Kim, Ji-Hyoung;Kim, Kwan-Woong;Kim, Yong-K.
    • The Transactions of The Korean Institute of Electrical Engineers / v.59 no.1 / pp.192-196 / 2010
  • Advances in Power Line Communication (PLC) modem technology, which now exceeds 200 Mbps, have made high-speed multimedia data transmission possible. A key strength of PLC is that it requires no additional wiring: high-quality data can be transmitted over existing electrical cables. It therefore has several advantages over existing wired and wireless communication technologies. In this paper we develop a high-quality media transmitter-receiver that combines HomePlug AV, a 200 Mbps class PLC technology, with HDMI interface technology. A video test generator was used to evaluate video performance, and Smart Live 6 software was used to assess audio performance. HD-class images captured at the PLC receiver showed no observable deterioration compared with the original images. For audio, measurements of phase and magnitude confirmed that more than 90% of the acoustic signal was transmitted and received normally. These results indicate that HD-class media service over PLC is feasible.

Pilot Study for Analysis of TV Ads of Local Governments (지방자치단체 광고효용성에 대한 탐색적 연구: KTX 광고노출 환경을 중심으로)

  • Song, Seungyeol;Lim, Sang Guk;Kim, Jung Kyu
    • Journal of Korea Multimedia Society / v.23 no.1 / pp.43-49 / 2020
  • Despite the rapid growth of local governments' advertising expenditure, few studies have focused on the effectiveness of these ads. One medium used by local governments is the Korea Express Train (KTX), where ads are shown on the video monitors in the train coaches. Unfortunately, ads on the KTX are mostly shown without audio. The current study therefore examined the effectiveness of these ads, using transportation theory and content analysis methodology. We established two units of analysis (camera and subtitles) and analyzed 107 local government ads. The camera analysis shows that local governments' festival and tour promotion ads more often employ dynamic angles such as drone shots and long shots. The subtitle analysis shows that many of the ads use large titles and subtitles, which can prevent viewers from seeing the visual shots. For the special case of audio-less KTX ads, this study recommends an emphasis on subtitles, which will enhance the effectiveness of the ad messages.

How of Improve an identity of mobile device interface and usability? (모바일 디바이스의 인터페이스 아이덴티티 개선 및 사용성 증대방안)

  • Song, Sang-Gon;Kim, Young-Sun;Choo, Hee-Jeong;Kang, Tae-Young;Hong, Noh-Kyung
    • 한국HCI학회:학술대회논문집 / 2008.02b / pp.140-145 / 2008
  • Mobile devices are constrained by a small physical display size and limited interaction. One of the most important issues for mobile devices is expressing the identity of the company behind the product, which can appeal to users through consistency. We therefore extract and integrate identity elements from the user experience, including the Graphical User Interface, Information Architecture, and Audio User Interface. The study was conducted by a task force team of User Interface practitioners from the divisions that manage each product. This paper describes the methods and processes attempted in order to establish consistency principles for the user experience while enhancing the distinct characteristics of each product. The results and practical experience obtained through these processes are also presented.


L2 Proficiency Effect on the Acoustic Cue-Weighting Pattern by Korean L2 Learners of English: Production and Perception of English Stops

  • Kong, Eun Jong;Yoon, In Hee
    • Phonetics and Speech Sciences / v.5 no.4 / pp.81-90 / 2013
  • This study explored how Korean L2 learners of English use multiple acoustic cues (VOT and F0) in perceiving and producing the English alveolar stop voicing contrast. Thirty-four 18-year-old high-school students participated in the study. Their English proficiency level was classified as either 'high' (HEP) or 'low' (LEP) according to high-school English level standardization. Thirty synthesized syllables, created by combining six VOT steps with five F0 steps, were presented as audio stimuli. The listeners judged how close each audio stimulus was to /t/ or /d/ in L2 using a visual analogue scale. The L2 /d/ and /t/ productions collected from 22 learners (12 HEP, 10 LEP) were acoustically analyzed by measuring VOT and F0 at the vowel onset. Results showed that LEP listeners attended to F0 in the stimuli more sensitively than HEP listeners, suggesting that HEP listeners could inhibit less important acoustic dimensions better than LEP listeners in their L2 perception. The L2 production patterns also exhibited a group difference, in that HEP speakers used the VOT dimension (the primary cue in L2) more effectively than LEP speakers. Taken together, the study showed that the relative cue-weighting strategies in L2 perception and production are closely related to the learner's L2 proficiency level, in that more proficient learners had better control over inhibiting and enhancing the relevant acoustic parameters.
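The stimulus design described in this abstract (six VOT steps crossed with five F0 steps, giving thirty stimuli) can be sketched as a simple grid. The endpoint values and the even spacing below are hypothetical; the abstract gives only the number of steps, not the actual VOT or F0 values.

```python
from itertools import product


def linear_steps(start, stop, count):
    """Return `count` evenly spaced values from start to stop, inclusive."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


# Hypothetical endpoints; only the step counts (6 and 5) come from the abstract.
vot_ms = linear_steps(10.0, 60.0, 6)
f0_hz = linear_steps(180.0, 260.0, 5)

# Full crossing of the two cue dimensions: one stimulus per (VOT, F0) pair.
stimuli = [{"vot_ms": v, "f0_hz": f} for v, f in product(vot_ms, f0_hz)]
```

Crossing the two dimensions fully is what allows the relative weight of each cue to be estimated from the listeners' responses.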

Development of Seismic Recorder for Long-term Observation of Microearthquakes (미소지진(微小地震) 장기관측(長期觀測)을 위한 지진기록계(地震記錄計)의 개발(開發))

  • Kim, Sung Kyun;Cho, Kyu Jang;Chung, Bu Heung;Moon, Chang Bae;Sin, In Chul;Sung, Rack Hoon
    • Economic and Environmental Geology / v.21 no.2 / pp.185-191 / 1988
  • A two-channel seismic recorder suitable for long-term observation of microearthquakes was developed. The recorder uses direct analogue recording on cassette tape, with the amplifier circuits and motor unit of an audio cassette recorder modified for this purpose. It provides a continuous record of 10 days using a 12 V DC battery (100 Ah) and a standard 60-minute cassette tape. Binary-coded time signals for date, hour, and minute are generated once a minute by the timing system, and an absolute time input via radio is also available for measuring time drift. For seismic signal processing, the analogue signals from an audio cassette player pass through an A/D converter, and the digitized data are stored on a personal computer. Visual records can then be obtained using the computer's graphics mode. Basic programs "ADCONVO" and "DRAWO" were written to perform A/D conversion, create data files, and visualize the signals. Some sample signals reproduced from the recorded tape are presented.
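The figures quoted in this abstract imply some rough design constraints, worked out in the short sketch below. It assumes that the full 100 Ah is usable over the 10 days and that the tape's nominal 60 minutes is stretched over the whole recording period in a single pass. Neither assumption is stated in the abstract, so the results are upper bounds and order-of-magnitude estimates only.

```python
RECORD_DAYS = 10               # from the abstract
BATTERY_CAPACITY_AH = 100.0    # from the abstract
BATTERY_VOLTAGE_V = 12.0       # from the abstract
TAPE_NOMINAL_MINUTES = 60.0    # from the abstract

record_hours = RECORD_DAYS * 24

# Upper bound on average current if the whole battery capacity is usable (assumption).
max_avg_current_a = BATTERY_CAPACITY_AH / record_hours
max_avg_power_w = max_avg_current_a * BATTERY_VOLTAGE_V

# Tape slowdown if 60 nominal minutes cover the full period in one pass (assumption).
tape_slowdown_factor = record_hours * 60 / TAPE_NOMINAL_MINUTES

print(f"average current <= {max_avg_current_a:.3f} A")
print(f"average power   <= {max_avg_power_w:.2f} W")
print(f"tape slowdown   ~ {tape_slowdown_factor:.0f}x")
```

Under these assumptions the recorder must average no more than about 0.42 A, and the tape transport must run far slower than normal audio speed, which is consistent with the abstract's mention of a modified motor unit.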


A Study on the Development of Web-based Full Motion Video E-mail System using MPEG-4 (웹을 기반으로 한 MPEG-4 동영상 E-mail 시스템의 개발)

  • 고재승
    • Journal of the Korea Computer Industry Society / v.3 no.3 / pp.283-294 / 2002
  • With worldwide use of the Internet, the time is right for web-based video e-mail systems. However, video data is so large that compression is needed for transmission over the web. In this paper, my colleagues and I implement a full motion video e-mail system using MPEG-4, the international standard for audio-visual data. The system is built as a web-based ActiveX control, so it is easily accessible over the web, and it applies real-time audio-video compression. With this system, anyone can send video e-mail for free to anywhere in the world. The main application areas are multimedia mailing services, web-based video advertising, remote education, remote medical services, and shopping mall construction.


A Study of 3D Sound Modeling based on Geometric Acoustics Techniques for Virtual Reality (가상현실 환경에서 기하학적 음향 기술 기반의 3차원 사운드 모델링 기술에 관한 연구)

  • Kim, Cheong Ghil
    • Journal of Satellite, Information and Communications / v.11 no.4 / pp.102-106 / 2016
  • With the popularity of smartphones and the help of high-speed wireless communication technology, high-quality multimedia content has become common on mobile devices. In particular, the release of the Oculus Rift has opened a new era of virtual reality technology in the consumer market. At the same time, 3D audio technology, which is currently used to make computer games more realistic, will soon be applied to the next generation of mobile phones and is expected to offer a more expansive experience than its visual counterpart. This paper surveys concepts, algorithms, and systems for modeling 3D sound in virtual environment applications. To do this, we first introduce an important design principle for audio rendering based on physics-based geometric algorithms and multichannel technologies, and then introduce an audio rendering pipeline for a scene graph-based virtual reality system and a hardware architecture for modeling sound propagation.
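The abstract does not name a specific algorithm. As an illustration of what a geometric acoustics technique looks like, the sketch below uses the image-source method, one widely used approach, to compute the direct path and a single first-order wall reflection. The wall position (the plane x = 0), the reflection coefficient, the 1/distance gain model, and all function names are assumptions made for this example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def mirror_across_wall_x0(point):
    """Mirror a 3D point across a wall lying in the plane x = 0 (assumed wall position)."""
    x, y, z = point
    return (-x, y, z)


def propagation_path(source, listener, gain_factor=1.0):
    """Return delay and 1/distance gain for a straight path from source to listener."""
    length = math.dist(source, listener)
    return {"delay_s": length / SPEED_OF_SOUND_M_S, "gain": gain_factor / length}


def direct_and_first_reflection(source, listener, wall_reflection=0.8):
    """Direct path plus one wall reflection, modeled as a path from the mirrored source."""
    direct = propagation_path(source, listener)
    image = mirror_across_wall_x0(source)
    reflected = propagation_path(image, listener, wall_reflection)
    return direct, reflected


direct, reflected = direct_and_first_reflection((1.0, 0.0, 0.0), (3.0, 0.0, 0.0))
```

An audio renderer would apply each path's delay and gain to the source signal and sum the results; higher-order reflections follow by mirroring the image sources again.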

Korean Emotional Speech and Facial Expression Database for Emotional Audio-Visual Speech Generation (대화 영상 생성을 위한 한국어 감정음성 및 얼굴 표정 데이터베이스)

  • Baek, Ji-Young;Kim, Sera;Lee, Seok-Pil
    • Journal of Internet Computing and Services / v.23 no.2 / pp.71-77 / 2022
  • In this paper, a database is collected to extend a speech synthesis model into one that synthesizes speech according to emotion and generates facial expressions. The database is divided into male and female data and consists of emotional speech and facial expressions. Two professional actors of different genders speak sentences in Korean. The sentences are divided into four emotions: happiness, sadness, anger, and neutrality. Each actor performs about 3300 sentences per emotion. The 26468 sentences collected by filming these performances do not overlap and contain expressions that match the corresponding emotion. Since building a high-quality database is important for the performance of future research, the database is assessed on emotional category, intensity, and genuineness. To determine the accuracy for each data modality, the database is divided into audio-video data, audio data, and video data.

Design and Implementation of Emergency Recognition System based on Multimodal Information (멀티모달 정보를 이용한 응급상황 인식 시스템의 설계 및 구현)

  • Kim, Eoung-Un;Kang, Sun-Kyung;So, In-Mi;Kwon, Tae-Kyu;Lee, Sang-Seol;Lee, Yong-Ju;Jung, Sung-Tae
    • Journal of the Korea Society of Computer and Information / v.14 no.2 / pp.181-190 / 2009
  • This paper presents a multimodal emergency recognition system based on visual information, audio information, and gravity sensor information. It consists of a video processing module, an audio processing module, a gravity sensor processing module, and a multimodal integration module. The video processing module and the gravity sensor processing module each detect actions such as moving, stopping, and fainting and transfer them to the multimodal integration module. The multimodal integration module detects an emergency by fusing the transferred information and verifies it by asking a question and recognizing the answer via the audio channel. The experimental results show that the recognition rate of the video processing module alone is 91.5% and that of the gravity sensor processing module alone is 94%, but when the two sources of information are combined, the recognition rate reaches 100%.
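The detect-then-verify flow described in this abstract can be sketched as follows. The abstract does not specify the fusion rule, the verification question, or the accepted answers, so the "any module reports fainting" rule, the `ask_user` callback, and the list of reassuring answers are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Observation:
    source: str  # "video" or "gravity"
    action: str  # "moving", "stopping", or "fainting"


# Hypothetical answers that cancel a suspected emergency.
REASSURING_ANSWERS = {"yes", "i am fine"}


def suspect_emergency(observations: Iterable[Observation]) -> bool:
    """Assumed fusion rule: suspect an emergency if any module reports fainting."""
    return any(obs.action == "fainting" for obs in observations)


def recognize_emergency(
    observations: Iterable[Observation],
    ask_user: Callable[[str], Optional[str]],
) -> bool:
    """Fuse module outputs, then verify through the audio channel.

    `ask_user` is a hypothetical callback that plays a question and returns
    the recognized spoken answer, or None when no answer is heard.
    """
    if not suspect_emergency(observations):
        return False
    answer = ask_user("Are you all right?")
    if answer is None:
        return True  # no response is treated as confirmation of the emergency
    return answer.strip().lower() not in REASSURING_ANSWERS
```

Keeping the fusion rule and the verification step as separate functions mirrors the two roles the abstract assigns to the multimodal integration module.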