• Title/Summary/Keyword: Audio Information Retrieval

A Study on the Effective Utilization of Media for Open Education (열린교육을 위한 열린매체의 활용에 관한 연구)

  • Joo, Young-Ju
    • Journal of the Korean Institute of Educational Facilities / v.5 no.3 / pp.107-115 / 1998
  • Open education suits the current educational reality, which calls for liberalization, individualization, and creativity, and its effectiveness is maximized when instructional media are fully utilized. As is well known, many types of instructional media can promote open education, such as print material, audio material, still pictures, movies, computers, and multimedia. The main criteria for choosing effective instructional media for open education are ease of supplying and retrieving information and the promotion of more frequent interaction among participants. In addition, the method of use, cost, curriculum content, and school culture should also be considered when selecting the right instructional media.

Application of Speech Recognition with Closed Caption for Content-Based Video Segmentations

  • Son, Jong-Mok;Bae, Keun-Sung
    • Speech Sciences / v.12 no.1 / pp.135-142 / 2005
  • An important aspect of video indexing is the ability to divide video into meaningful segments, i.e., content-based video segmentation. Since the audio signal in the sound track is synchronized with the image sequences in a video program, the speech signal in the sound track can be used for this segmentation. In this paper, we propose a new approach to content-based video segmentation that uses closed captions to construct a recognition network for speech recognition. Accurate time information for video segmentation is then obtained from the speech recognition process. In a segmentation experiment on TV news programs, we successfully produced 56 video summaries from 57 TV news stories, which suggests that the proposed scheme is very promising for content-based video segmentation.

Content-based Music Retrieval using TIP-indexing Techniques and Features of Audio files (TIP-인덱싱 기법과 오디오 화일의 특징계수를 이용한 내용기반 음악 검색)

  • Kim Young-In
    • Proceedings of the Korea Society for Industrial Systems Conference / 2006.05a / pp.201-204 / 2006
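  • Recently, many studies have been carried out on content-based music information retrieval systems. As a result, research on content-based retrieval methods that use audio data for natural music information retrieval is actively under way, and such systems use large numbers of music feature coefficients for retrieval. However, little research has used the TIP-index file, which was proposed as a method for storing and retrieving large volumes of continuous feature coefficients. In this paper, we propose a music information retrieval method that uses the TIP-index file, a technique for efficiently indexing continuous feature coefficients. We extracted feature coefficients from music audio files of various genres, built a TIP-index, and ran experiments; the results show that the proposed method can achieve good performance in music information retrieval.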

MultiFormat motion picture storage subsystem using DirectShow Filters for a Multichannel Visual Monitoring System (다채널 영상 감시 시스템을 위한 다중 포맷 동영상 저장 DirectShow Filter설계 및 구현)

  • 정연권;하상석;정선태
    • Proceedings of the IEEK Conference / 2002.06d / pp.113-116 / 2002
  • Windows provides DirectShow for efficient multimedia stream processing such as multimedia capture, storage, and display. Many video and audio codecs are now built for use in the DirectShow framework, and Windows also supports many codecs and formats (MPEG4, H.263, WMV, WMA, ASF, etc.) in addition to many useful tools for multimedia stream processing. DirectShow can therefore be used effectively to develop Windows-based multimedia streaming applications such as visual monitoring systems, which need to store real-time video data for later retrieval. In this paper, we present our work on a DirectShow filter system that supports storing motion pictures with various motion picture codecs. Our DirectShow filter system also provides motion detection as an additional function.

XML Repository System Using DBMS and IRS

  • Kang, Hyung-Il;Yoo, Jae-Soo;Lee, Byoung-Yup
    • International Journal of Contents / v.3 no.3 / pp.6-14 / 2007
  • In this paper, we design and implement an XML Repository System (XRS) that exploits the advantages of DBMSs and IRSs. Our scheme uses BRS to support full-text indexing and content-based queries efficiently, and ORACLE to store XML documents, multimedia data, DTDs, and structure information. We design databases to manage XML documents that include audio, video, and images as well as text. We employ the non-composition model when storing XML documents in ORACLE. We represent structure information as ETID (Element Type Id), SORD (Sibling ORDer), and SSORD (Same Sibling ORDer). ETID is a unique value assigned to each element of the DTD. SORD represents the order among sibling nodes, and SSORD represents the order among sibling nodes with the same element. To evaluate our XRS, we perform experiments on document loading time, document extraction time, and content retrieval time. The experiments show that our XRS outperforms existing XML document management systems and that it supports various types of queries.
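
The ETID, SORD, and SSORD labels described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `label_children`, the 1-based numbering, and assigning ETIDs on first appearance (rather than from a DTD) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def label_children(parent, etid_table):
    """Return (tag, ETID, SORD, SSORD) for each direct child of `parent`.

    Assumptions (not from the abstract): numbering starts at 1, and an
    ETID is handed out the first time a tag is seen instead of being
    read from a DTD.
    """
    same_tag_count = defaultdict(int)
    rows = []
    for sord, child in enumerate(parent, start=1):
        # SSORD counts only the siblings that share this element type.
        same_tag_count[child.tag] += 1
        # setdefault keeps an existing ETID and assigns a new one otherwise.
        etid = etid_table.setdefault(child.tag, len(etid_table) + 1)
        rows.append((child.tag, etid, sord, same_tag_count[child.tag]))
    return rows


doc = ET.fromstring("<article><title/><para/><image/><para/></article>")
etids = {}
rows = label_children(doc, etids)
for row in rows:
    print(row)
# ('title', 1, 1, 1)
# ('para', 2, 2, 1)
# ('image', 3, 3, 1)
# ('para', 2, 4, 2)
```

The second `para` shares ETID 2 with the first, is the fourth sibling overall (SORD 4), and is the second `para` among its siblings (SSORD 2).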

Design and Implementation of Multimedia Data Retrieval System using Image Caption Information (영상 캡션 정보를 이용한 멀티미디어 데이터 검색 시스템의 설계 및 구현)

  • 이현창;배상현
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.3 / pp.630-636 / 2004
  • With the growing use of audio and video data, presenting multimedia content and retrieving, storing, and manipulating multimedia data have become a focus of recent work. A display system for multimedia data should make it easy to retrieve and access the content that users want to present. This study covers the design and implementation of a system that retrieves documents containing multimedia data based on the document text or on the caption information of the multimedia data. It develops a filtering step that retrieves every keyword in the caption information of the multimedia data and in the document text. The system is also designed to retrieve large amounts of data quickly by using an inverted file structure that can be used with a B+ tree.
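
The inverted file idea mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the clip identifiers and caption strings are made up, the tokenizer handles only lowercase ASCII words, and an in-memory `dict` stands in for the B+ tree that the paper pairs with its inverted file.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric keywords (ASCII only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(captions):
    """Map each keyword to the set of item ids whose caption contains it."""
    index = defaultdict(set)
    for item_id, text in captions.items():
        for word in tokenize(text):
            index[word].add(item_id)
    return index


def search(index, query):
    """Return the ids whose captions contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical caption data, for illustration only.
captions = {
    "clip1": "Evening news anchor in studio",
    "clip2": "Studio concert, evening session",
    "clip3": "Morning news weather report",
}
index = build_inverted_index(captions)
print(sorted(search(index, "evening studio")))  # ['clip1', 'clip2']
print(sorted(search(index, "news evening")))    # ['clip1']
```

A disk-based system would keep the keyword dictionary in an ordered structure such as a B+ tree so that lookups stay fast when the vocabulary does not fit in memory.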

Rough Set-Based Approach for Automatic Emotion Classification of Music

  • Baniya, Babu Kaji;Lee, Joonwhoan
    • Journal of Information Processing Systems / v.13 no.2 / pp.400-416 / 2017
  • Music emotion is an important component in the field of music information retrieval and computational musicology. This paper proposes an approach for automatic emotion classification based on rough set (RS) theory. In the proposed approach, four different sets of music features are extracted, representing dynamics, rhythm, spectrum, and harmony. From these features, five different statistical parameters are considered as attributes, including central moments of each feature up to the $4^{th}$ order and covariance components between features. The large number of attributes is controlled by the RS-based approach, which removes superfluous features and retains the indispensable ones. In addition, the RS-based approach makes it possible to visualize which attributes play a significant role in the generated rules and to determine the strength of each rule for classification. Experiments were performed to find out which audio features, and which of the statistical parameters derived from them, are important for emotion classification. The resulting indispensable attributes and the usefulness of the covariance components are also discussed. The overall classification accuracy with all statistical parameters was comparatively better than that of existing methods on a pair of datasets.
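
The statistical attributes named in this abstract (central moments up to the 4th order and covariance between features) can be computed as in the sketch below. The abstract does not list the five parameters exactly, so treating them as the mean, the 2nd to 4th central moments, and a covariance is an assumption, as are the population (divide by n) normalization and the toy feature values.

```python
def mean(values):
    return sum(values) / len(values)


def central_moment(values, order):
    """Population central moment: average of (x - mean) ** order."""
    mu = mean(values)
    return sum((v - mu) ** order for v in values) / len(values)


def covariance(xs, ys):
    """Population covariance between two equally long feature tracks."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def summarize(values):
    """Per-feature attributes: mean plus 2nd to 4th central moments."""
    return {
        "mean": mean(values),
        "m2": central_moment(values, 2),  # variance
        "m3": central_moment(values, 3),  # unnormalized skewness
        "m4": central_moment(values, 4),  # unnormalized kurtosis
    }


# Hypothetical frame-level feature tracks, for illustration only.
energy = [1.0, 2.0, 3.0, 4.0]
tempo = [2.0, 4.0, 6.0, 8.0]

stats = summarize(energy)
cov = covariance(energy, tempo)
print(stats)  # {'mean': 2.5, 'm2': 1.25, 'm3': 0.0, 'm4': 2.5625}
print(cov)    # 2.5
```

Each feature then contributes a handful of attributes, and each pair of features contributes one covariance, which is how the attribute count grows large enough to need the reduction step the abstract describes.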

Speaker Tracking Using Eigendecomposition and an Index Tree of Reference Models

  • Moattar, Mohammad Hossein;Homayounpour, Mohammad Mehdi
    • ETRI Journal / v.33 no.5 / pp.741-751 / 2011
  • This paper focuses on online speaker tracking for telephone conversations and broadcast news. Since online operation imposes limitations on the tracking strategy, such as data insufficiency, a reliable approach is needed to compensate for this shortage. In this framework, a set of reference speaker models is used as side information to facilitate online tracking. To improve the indexing accuracy, adaptation approaches in the eigenvoice decomposition space are proposed. We believe that the eigenvoice adaptation techniques help to embed the speaker space in the models and hence enrich the generality of the selected speaker models. An index structure of the reference models is also proposed to speed up the search in the model space. The proposed framework is evaluated on the 2002 Rich Transcription Broadcast News and Conversational Telephone Speech corpus as well as a synthetic dataset. The indexing errors of the proposed framework on telephone conversations, broadcast news, and the synthetic dataset are 8.77%, 9.36%, and 12.4%, respectively. With the index tree structure, the run time of the proposed framework is improved by 22%.

A Study on Music Retrieval method based on Audio Contents Feature Analysis (오디오 멜로디 추출 기반 특징 분석을 이용한 음악검색 방법에 관한 연구)

  • Song, Chai-Jong;Lee, Sek-Phil
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.441-443 / 2011
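  • This paper describes techniques and research on a music retrieval method based on audio feature analysis. We extract audio features from polyphonic music using three main algorithms and propose a fusion matching method based on three different matching algorithms. The audio features are the main melody and segmentation information obtained by analyzing the music structure. For the music DB, 2,000 songs were selected across a wide range from eight genres, following the genres provided by a music portal service. To develop the feature extraction and matching algorithms, 100 songs were chosen from the 2,000-song music DB in proportion to the genres, and 1,200 humming queries were recorded from 24 people. Three of the 24 majored in music at university, and the rest had no musical training. An analysis of the 1,200 queries showed that about 60% hummed or sang the beginning of the song and about 30% hummed the highlight part. The remaining 10% or so sang the part they were most confident with. Based on this analysis, the most important task is to find the melody at the beginning of a song accurately. Retrieval results were evaluated using MRR. Using a MIDI DB performed slightly better than extracting the melody directly from polyphonic audio, but the difference was marginal. With the developed algorithms, we built a client program for PCs and an Android app.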
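
The abstract reports results with MRR, which is read here as mean reciprocal rank, the usual measure for query-by-humming evaluation. A minimal sketch of that measure follows; the query ids, song ids, and rankings are made up for illustration, and scoring a query whose target is missing from the ranking as 0 is an assumption.

```python
def mean_reciprocal_rank(ranked_results, ground_truth):
    """Average of 1/rank of the correct song over all queries.

    A query whose correct song is absent from its ranking contributes 0.
    """
    total = 0.0
    for query_id, ranking in ranked_results.items():
        target = ground_truth[query_id]
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(ranked_results)


# Hypothetical retrieval output for four humming queries.
ranked_results = {
    "hum1": ["songA", "songB", "songC", "songD"],  # correct at rank 1
    "hum2": ["songC", "songB", "songA", "songD"],  # correct at rank 2
    "hum3": ["songA", "songB", "songD", "songE"],  # correct song missing
    "hum4": ["songA", "songB", "songC", "songD"],  # correct at rank 4
}
ground_truth = {"hum1": "songA", "hum2": "songB", "hum3": "songC", "hum4": "songD"}

mrr = mean_reciprocal_rank(ranked_results, ground_truth)
print(mrr)  # 0.4375
```

Here the four queries contribute 1, 0.5, 0, and 0.25, so the mean is 0.4375. A value of 1.0 would mean every query returned the correct song first.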

Energy and Statistical Filtering for a Robust Audio Fingerprinting System (강인한 오디오 핑거프린팅 시스템을 위한 에너지와 통계적 필터링)

  • Jeong, Byeong-Jun;Kim, Dae-Jin
    • The Journal of the Korea Contents Association / v.12 no.5 / pp.1-9 / 2012
  • The popularity of digital music and smartphones has led to the development of various noise-robust, real-time audio fingerprinting systems. In particular, Multiple Hashing (MLH), a fingerprint algorithm, is robust to noise and has an elaborate structure. In this paper, we propose a filter engine based on MLH to achieve better performance. In this approach, we compose an energy-intensive filter to improve the accuracy of Q/R from the music database and a statistical filter to remove continuity and redundancy. The energy-intensive filter uses the property of the Discrete Cosine Transform (DCT) of gathering energy into low-order bits, and the statistical filter uses the correlation between the information of searched fingerprints. Experimental results show the superiority of the proposed algorithm, which combines energy and statistical filtering, in noisy environments. The proposed filter engine is more robust to noise than Philips Robust Hash (PRH) and more compact than MLH.
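
The energy-gathering property of the DCT that this abstract relies on can be shown with a short Python sketch. It does not reproduce MLH or the proposed filter engine, whose details are not given here; it only computes an orthonormal DCT-II and measures how much of a signal's energy falls in the lowest-order coefficients. The function names and test signals are assumptions chosen for illustration.

```python
import math


def dct2_orthonormal(signal):
    """Orthonormal DCT-II, so the sum of squared coefficients equals
    the sum of squared samples (Parseval)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal)
        )
        coeffs.append(scale * acc)
    return coeffs


def low_order_energy_ratio(signal, keep):
    """Fraction of total energy held by the first `keep` coefficients."""
    coeffs = dct2_orthonormal(signal)
    total = sum(c * c for c in coeffs)
    if total == 0.0:
        return 0.0
    return sum(c * c for c in coeffs[:keep]) / total


smooth = [float(i) for i in range(8)]              # slowly varying ramp
jagged = [1.0 if i % 2 == 0 else -1.0 for i in range(8)]  # rapid alternation

smooth_ratio = low_order_energy_ratio(smooth, keep=2)
jagged_ratio = low_order_energy_ratio(jagged, keep=2)
print(smooth_ratio > jagged_ratio)  # True
```

For the smooth ramp, the first two coefficients carry nearly all of the energy, while for the rapidly alternating signal they carry very little. A filter can exploit this by keeping or weighting the low-order coefficients.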