• Title/Summary/Keyword: Object-based Audio

Image Enhancement Techniques for MPEG-4 (MPEG-4 영상의 화질 개선에 관한 연구)

  • 김태근;신정호;백준기
    • Journal of Broadcast Engineering
    • /
    • v.2 no.2
    • /
    • pp.169-181
    • /
    • 1997
  • In this paper, we propose and discuss image enhancement techniques for MPEG-4, a very low bit-rate, content-based, and object-based hierarchical audio-visual coding standard. The proposed enhancement technique removes undesired artifacts arising in the compression procedure and increases resolution in both the spatial and temporal domains. To remove undesired artifacts, we divide the MPEG-4 video algorithm into two parts: an MPEG-2-like part and a new part. To remove artifacts caused by the first part, we adopt the conventional blocking artifact algorithm developed for MPEG-2. To remove artifacts caused by the second part, we provide a new degradation model and propose the corresponding image restoration method. To increase the resolution of MPEG-4 images, we propose a general framework for a multichannel image interpolation process that includes both spatial and temporal interpolation. As the MPEG-4 standard is under development, various sophisticated techniques are being considered, but research on image enhancement techniques has received relatively little attention. For this reason, additional image enhancement techniques will become a very important issue in the realization phase of MPEG-4.

A study on searching image by cluster indexing and sequential I/O (연속적 I/O와 클러스터 인덱싱 구조를 이용한 이미지 데이타 검색 연구)

  • Kim, Jin-Ok;Hwang, Dae-Joon
    • The KIPS Transactions:PartD
    • /
    • v.9D no.5
    • /
    • pp.779-788
    • /
    • 2002
  • Searching multimedia data such as images, video, and audio raises many technically difficult issues because such data are massive and more complex than simple text-based data. Because exact matching is not suited to multimedia feature matrices, similarity retrieval has been studied as a way to extract basic features from multimedia data automatically and to search the data using those features. In this paper, data clustering and cluster indexing are proposed as a fast similarity-retrieval method for multimedia data. The approach clusters similar images on adjacent disk cylinders and then builds indexes to access the clusters. To minimize search cost, hashing is used to index the clusters. In addition, to reduce I/O time, the proposed search takes just one I/O to look up the location of the cluster containing similar objects and one sequential file I/O to read that cluster. The proposed scheme addresses the multi-dimensionality problem through clustering and cluster indexing, and achieves higher search efficiency than content-based image retrieval that uses only clustering or only an indexing structure.
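
The lookup pattern described in this abstract (one index lookup to locate a cluster, then one sequential read of that cluster) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the in-memory `dict` stands in for the on-disk hash index, `io.BytesIO` stands in for the file laid out on adjacent cylinders, and the nearest-centroid rule, function names, and sample data are all assumptions.

```python
import io
import math


def nearest_cluster(query, centroids):
    """Return the id of the centroid closest to the query feature vector."""
    return min(centroids, key=lambda cid: math.dist(query, centroids[cid]))


def build_store(clusters):
    """Write each cluster contiguously and record (offset, length) per cluster."""
    store = io.BytesIO()
    index = {}  # hash index: cluster id -> (byte offset, byte length)
    for cid, image_ids in clusters.items():
        payload = "\n".join(image_ids).encode("utf-8")
        index[cid] = (store.tell(), len(payload))
        store.write(payload)
    return store, index


def read_cluster(store, index, cid):
    """One index lookup, then one sequential read of the whole cluster."""
    offset, length = index[cid]
    store.seek(offset)
    return store.read(length).decode("utf-8").split("\n")


# Hypothetical two-dimensional features and image ids, for illustration only.
centroids = {"warm": (0.9, 0.1), "cool": (0.1, 0.8)}
clusters = {"warm": ["img_001", "img_007"], "cool": ["img_003"]}
store, index = build_store(clusters)
similar = read_cluster(store, index, nearest_cluster((0.8, 0.2), centroids))
```

Storing each cluster as one contiguous byte range is what lets the read be a single sequential operation, regardless of how many images the cluster holds.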

Content Based Video Retrieval by Example Considering Context (문맥을 고려한 예제 기반 동영상 검색 알고리즘)

  • 박주현;낭종호;김경수;하명환;정병희
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.30 no.12
    • /
    • pp.756-771
    • /
    • 2003
  • A digital video library system that manages a large amount of multimedia information requires efficient and effective retrieval methods. In this paper, we propose and implement a new video search and retrieval algorithm that compares the query video shot with the video shots in the archive in terms of foreground object, background image, audio, and context. The foreground object is the region of the video image that changes across the successive frames of the shot, the background image is the remaining region of the video image, and the context is the relationship between the low-level features of adjacent shots. Comparing these features reflects the process of filming a moving picture, and it lets the user easily submit a query focused on the desired features of the target video clips by adjusting the feature weights used in the comparison. Although the proposed search and retrieval algorithm cannot fully reflect the high-level semantics of the submitted query video, it tries to reflect the user's requirements as far as possible by considering the context of video clips and by adjusting its weight in the comparison.
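
The user-adjustable weighting described in this abstract can be illustrated with a weighted sum of per-feature similarities. The abstract does not say how each feature is compared, so the cosine similarity, the feature vectors, and all names below are assumptions chosen only to show how changing the weights changes the ranking.

```python
import math

FEATURES = ("foreground", "background", "audio", "context")


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def shot_similarity(query, candidate, weights):
    """Weighted average of per-feature similarities between two shots."""
    total = sum(weights[name] for name in FEATURES)
    if total == 0:
        raise ValueError("at least one feature weight must be positive")
    score = sum(
        weights[name] * cosine_similarity(query[name], candidate[name])
        for name in FEATURES
    )
    return score / total


def rank_shots(query, archive, weights):
    """Return archive shot ids ordered from most to least similar."""
    return sorted(
        archive,
        key=lambda shot_id: shot_similarity(query, archive[shot_id], weights),
        reverse=True,
    )
```

Raising the weight of one feature, for example `audio`, moves shots that match the query on that feature toward the top of the ranking without changing the stored features.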

Online Monitoring System based notifications on Mobile devices with Kinect V2 (키넥트와 모바일 장치 알림 기반 온라인 모니터링 시스템)

  • Niyonsaba, Eric;Jang, Jong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.6
    • /
    • pp.1183-1188
    • /
    • 2016
  • Kinect sensor version 2 is a camera released by Microsoft as a computer vision and natural user interface device for game consoles such as the Xbox One. It acquires color images, depth images, audio input, and skeletal data at a high frame rate. In this paper, we use the depth image to build a surveillance system for a defined area within the Kinect's field of view. Using a computer vision library (Emgu CV), when an object is detected in the target area, it is tracked and the Kinect camera captures an RGB image and sends it to a database server. A mobile application for the Android platform was also developed to notify the user that the Kinect has sensed unusual motion in the target region and to display the RGB image of the scene. The user receives the notification in real time and can react appropriately, for example when valuables are kept in the monitored area or when the area is a restricted zone.

MPEG-4 based XMT APIs for Scene Description (장면 기술을 위한 MPEG-4 기반 XMT API 구현)

  • 정예선;김규헌;기명석
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2001.11b
    • /
    • pp.91-94
    • /
    • 2001
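  • Unlike conventional systems, which treat the scene itself as a single component, the MPEG-4 system treats each encoded or decoded audio/visual (A/V) object making up the scene as a unit, and is characterized by composing (scene composition) and presenting scenes of diverse multimedia content from those objects. This object-based nature enables varied interactivity with users and allows convenient editing and reuse of content, so it is expected to play an important role in producing next-generation digital broadcast content. The object-based A/V editing tool is an authoring/editing tool intended to ease the production of next-generation digital broadcast content based on MPEG-4, and it uses both the BIFS (Binary Format for Scene description) and XMT (eXtensible MPEG-4 Textual format) formats to represent scenes. Because BIFS represents authored results in binary form, it is well suited to transmission, but it makes intermediate results hard to inspect and complicates interoperability and exchange with other existing applications. In contrast, XMT is based on XML, which is attracting attention as a next-generation markup language, so authors can easily understand the authored results, and interoperability and exchange with other applications such as SMIL and X3D are also easier. XMT comes in two forms depending on the description method, XMT-A and XMT-O. The XMT-A format is built on X3D (Extensible 3D), which evolved from VRML, incorporates the features of the MPEG-4 system, and maps one-to-one to BIFS. XMT-O, on the other hand, is based on SMIL 2.0, which represents multimedia documents as web documents, so it was developed with a focus on the content author rather than on the features of the MPEG-4 system. Authoring content with XMT requires an API that can easily store and manipulate the authoring information entered through the user interface and output it as an XMT file. In this paper, we therefore define an API conforming to the XMT grammar, based on the DOM (Document Object Model), the standard interface used with XML, for storing and manipulating data in an XMT-form intermediate representation, and we also implement an API for generating XMT files. The API presented in this paper was applied to the object-based authoring/editing tool and used to produce a variety of multimedia content.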
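
As a rough illustration of a DOM-based authoring API of the kind this abstract describes (store authoring information in a DOM tree, manipulate it, then serialize it to a file), the sketch below uses Python's standard `xml.dom.minidom`. The `SceneBuilder` class and the element and attribute names (`Scene`, `Video`, `src`, `start`) are hypothetical placeholders and do not follow the actual XMT-A or XMT-O grammar or the paper's API.

```python
from xml.dom.minidom import Document


class SceneBuilder:
    """Hypothetical wrapper that keeps authoring state in a DOM tree."""

    def __init__(self):
        self.doc = Document()
        # "Scene" is a placeholder root element, not the real XMT root.
        self.root = self.doc.createElement("Scene")
        self.doc.appendChild(self.root)

    def add_object(self, kind, object_id, **attributes):
        """Store one authored A/V object as a DOM element."""
        node = self.doc.createElement(kind)
        node.setAttribute("id", object_id)
        for name, value in attributes.items():
            node.setAttribute(name, str(value))
        self.root.appendChild(node)
        return node

    def update_attribute(self, object_id, name, value):
        """Change one attribute of a stored object; return False if not found."""
        for node in self.root.childNodes:
            if node.getAttribute("id") == object_id:
                node.setAttribute(name, str(value))
                return True
        return False

    def to_xml(self):
        """Serialize the intermediate DOM tree to XML text."""
        return self.doc.toprettyxml(indent="  ")


builder = SceneBuilder()
builder.add_object("Video", "v1", src="clip.mp4")
builder.update_attribute("v1", "start", 5)
xml_text = builder.to_xml()
```

Keeping the authoring state in a DOM tree means that editing operations and file output share one data structure, which matches the two roles the abstract assigns to the API.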

Investigation of Indicator Kriging for Evaluating Proper Rock Mass Classification based on Electrical Resistivity and RMR Correlation Analysis (RMR과 전기비저항의 상관성 해석에 기초하여 지시크리깅을 적용한 최적 암반 분류 기법 고찰)

  • Lee, Kyung-Ju;Ha, Hee-Sang;Ko, Kwang-Buem;Kim, Ji-Soo
    • Tunnel and Underground Space
    • /
    • v.19 no.5
    • /
    • pp.407-420
    • /
    • 2009
  • In this study, a geostatistical technique using indicator kriging was applied to evaluate the optimal rock mass classification by integrating various kinds of information, such as borehole data and geophysical data. To obtain an optimal kriging result, a suitable technique is needed to integrate the hard (borehole) and soft (geophysical) data effectively. The model parameters of the variogram must also be determined beforehand. An iterative non-linear inversion method was implemented to determine the model parameters of the theoretical variogram. To verify the algorithm, the behaviour of the objective function and the precision of convergence were investigated, revealing that the gradient of the range is extremely small. The algorithm was applied to field data from a mountainous area planned for large-scale tunnel construction. As soft data, resistivity information from an AMT survey was combined with RMR information from borehole data, which serves as hard data. Finally, RMR profiles were constructed and interpreted at the tunnel elevation and at the upper 1D level.
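
Two standard building blocks of indicator kriging are the indicator transform (1 where a value is at or below a cutoff, 0 otherwise) and the experimental semivariogram of those indicators, which a theoretical variogram model is then fitted to. The sketch below covers only those two steps for one-dimensional positions. It does not reproduce the paper's inversion or kriging, and the sample RMR values, cutoff, and lag tolerance are illustrative assumptions.

```python
def indicator(values, threshold):
    """Indicator transform: 1 if the value is at or below the cutoff, else 0."""
    return [1 if v <= threshold else 0 for v in values]


def experimental_variogram(positions, indicators, lag, tolerance):
    """Semivariance of indicator pairs whose separation is within lag +/- tolerance."""
    total, pairs = 0.0, 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            separation = abs(positions[i] - positions[j])
            if abs(separation - lag) <= tolerance:
                total += (indicators[i] - indicators[j]) ** 2
                pairs += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return total / (2 * pairs) if pairs else None


# Hypothetical borehole samples: depth in metres and RMR at each depth.
depths = [0, 10, 20, 30]
rmr = [35, 38, 62, 70]
flags = indicator(rmr, threshold=40)
gamma_10 = experimental_variogram(depths, flags, lag=10, tolerance=1)
gamma_20 = experimental_variogram(depths, flags, lag=20, tolerance=1)
```

In this toy data the semivariance grows with lag, which is the behaviour a fitted variogram model summarizes through parameters such as the range and sill.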

Changes in the Multimedia Service Environment and the COSMOS Multimedia Operating System (멀티미디어 서비스의 환경변화 및 COSMOS 멀티미디어 운영체제)

  • 송동호;임영환
    • Information and Communications Magazine
    • /
    • v.11 no.6
    • /
    • pp.37-54
    • /
    • 1994
  • Technical innovation in multimedia data processing brings new multimedia services. Multimedia services are classified into five groups: TVs, computers, telecommunications, peripherals, and software. This paper surveys these services from various aspects and discusses the computer area in particular detail. The major subsystems needed to provide the services, such as high-speed networks, operating systems, and intelligent-agent-based user interfaces, are discussed. Multimedia operating systems in particular are the most actively investigated research area, as the infrastructure that lets multimedia computer systems provide integrated multimedia services. The trends in new multimedia operating systems are therefore analyzed, and COSMOS (Collaborative Object Sharing for Multimedia Operating System) multimedia group presentation is discussed. The characteristics, model, and abstract data structure of COSMOS are described. A performance analysis of a three-person conference system using audio, video, and a shared graphic editor on COSMOS shows that an integrated multimedia operating system approach leads to changes in new multimedia service environments.

Real-time monitoring system with Kinect v2 using notifications on mobile devices (Kinect V2를 이용한 모바일 장치 실시간 알림 모니터링 시스템)

  • Eric, Niyonsaba;Jang, Jong Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2016.05a
    • /
    • pp.277-280
    • /
    • 2016
  • A real-time remote monitoring system is valuable in many surveillance situations, because it keeps a person informed of what is happening at the monitored locations. Kinect v2 is a new kind of camera that gives computers eyes and can generate different kinds of data, such as color and depth images, audio input, and skeletal data. In this paper, we use the depth image of the Kinect v2 sensor to build a monitoring system for a space covered by the Kinect. Within the space covered by the Kinect camera, we define a target area to monitor as a depth range by setting minimum and maximum distances. Using a computer vision library (Emgu CV), when an object is tracked in the target space, the Kinect camera captures the full color image and sends it to a database, and at the same time the user receives a notification on a mobile device wherever there is internet access.
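
Defining a target area by minimum and maximum distance amounts to thresholding each depth pixel against that range. The sketch below shows the idea in plain Python on a tiny nested list. It is not the Emgu CV pipeline used in the paper, and the millimetre units, the `min_pixels` noise threshold, and the sample frame are assumptions.

```python
def target_mask(depth_frame, min_mm, max_mm):
    """Mark pixels whose depth falls inside the monitored distance range."""
    return [[min_mm <= d <= max_mm for d in row] for row in depth_frame]


def object_in_target(depth_frame, min_mm, max_mm, min_pixels=3):
    """Report an object when enough pixels lie inside the target range."""
    mask = target_mask(depth_frame, min_mm, max_mm)
    hits = sum(cell for row in mask for cell in row)
    return hits >= min_pixels


# Hypothetical 3x4 depth frame in millimetres; 0 stands for "no reading".
frame = [
    [0, 0, 4000, 4000],
    [0, 1500, 1600, 4000],
    [0, 1550, 1580, 4000],
]
detected = object_in_target(frame, min_mm=1000, max_mm=2000)
```

A positive result from such a check is the point at which a system of this kind would capture the color image and trigger the notification.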

A Study for the Accessibility of Camera-Based Mobile Applications on Touch Screen Devices for Blind People (스마트기기에서 시각장애인을 위한 카메라기반 인식 소프트웨어 인터페이스의 접근성 연구)

  • Choi, Yoonjung;Hong, Ki-Hyung
    • Journal of the HCI Society of Korea
    • /
    • v.7 no.2
    • /
    • pp.49-56
    • /
    • 2012
  • Camera-based mobile applications for tasks such as color, pattern, and object reading can improve the quality of life of blind people. However, currently available camera-based applications are uncomfortable for blind users, because they do not reflect the accessibility requirements of the blind, especially on touch screens. We investigated the accessibility requirements of rapidly growing camera-based mobile applications on touch screen devices for the blind. To identify these requirements, we conducted usability testing of color reading applications with three different types of interfaces on Android OS. The results of the usability testing were as follows: (1) users preferred a shallow menu hierarchy, (2) initial audio help was more useful than just-in-time help, (3) users needed both manual and automatic camera shooting modes, although they preferred manual to automatic mode, (4) users wanted the OS-supported screen reader function to be turned off while the color reading application was running, and (5) users required tactile feedback to identify the touch screen boundary. We designed a new user interface for blind people by applying the identified accessibility requirements. Usability testing of the new user interface with 10 blind people showed that the identified accessibility requirements are very useful accessibility guidelines for camera-based mobile applications.

Development of a Web-based Presentation Attitude Correction Program Centered on Analyzing Facial Features of Videos through Coordinate Calculation (좌표계산을 통해 동영상의 안면 특징점 분석을 중심으로 한 웹 기반 발표 태도 교정 프로그램 개발)

  • Kwon, Kihyeon;An, Suho;Park, Chan Jung
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.2
    • /
    • pp.10-21
    • /
    • 2022
  • There are few automated methods, other than observation by colleagues or professors, for improving attitude in formal presentations such as job interviews and project result presentations at a company. Previous studies have reported that a speaker's stable speech and gaze handling affect delivery in a presentation. Other studies show that proper feedback on one's presentation increases the presenter's ability to present. In this paper, considering these positive effects of correction, we developed a program that intelligently corrects the poor presentation habits and attitudes of college students through facial analysis of videos, and we analyzed the program's performance. The proposed program was developed as a web-based system that checks for the use of redundant words, performs facial recognition, and converts the presentation content to text. To this end, an artificial intelligence model for classification was developed, and after the video object was extracted, facial feature points were recognized based on their coordinates. Then, using 4000 facial data samples, the performance of the proposed algorithm was compared with that of facial recognition using Teachable Machine. The program can help presenters by correcting their presentation attitude.