• Title/Summary/Keyword: Audio indexing


Automatic Indexing Algorithm of Golf Video Using Audio Information (오디오 정보를 이용한 골프 동영상 자동 색인 알고리즘)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.441-446 / 2009
  • This paper proposes an automatic indexing algorithm for golf video that uses audio information. The input stream is first demultiplexed into video and audio streams. An AdaBoost-cascade classifier then classifies the continuous audio stream into five segment types: announcer's speech recorded in the studio, music accompanying the display of players' names on screen, audience reaction to the play, reporter's speech with field background, and field noise such as wind or waves. Golf swing sounds, including drive, iron, and putting shots, are detected by impulse onset detection followed by modulation spectrum verification. The detected swings and applause are then used to index action and highlight units. Compared with video-based semantic analysis, the main advantage of the proposed system is its low computational cost, which makes it easier to deploy on embedded consumer electronic devices for fast browsing.
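
The abstract does not describe how its impulse onset detector works, so the following is only a generic sketch of the idea: flag a frame as an onset when its short-time energy jumps sharply relative to the previous frame. The function names, the frame length, the energy ratio, and the toy signal are all assumptions made for illustration, and the modulation spectrum verification stage is not modeled.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def impulse_onsets(samples, frame_len=4, ratio=8.0, floor=1e-6):
    """Indices of frames whose energy exceeds `ratio` times the previous frame's.

    `floor` keeps near-silent frames from triggering on tiny fluctuations.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    energies = frame_energies(samples, frame_len)
    onsets = []
    for k in range(1, len(energies)):
        if energies[k] > ratio * max(energies[k - 1], floor):
            onsets.append(k)
    return onsets


# Toy signal: two quiet frames, one sharp hit, one quiet frame.
quiet = [0.01, -0.01, 0.01, -0.01]
signal = quiet * 2 + [0.9, -0.8, 0.5, -0.3] + quiet
print(impulse_onsets(signal))  # prints [2]
```

A practical detector would likely use overlapping frames and an adaptive threshold; the fixed ratio here only keeps the example short.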

Multi-modal Detection of Anchor Shot in News Video (다중모드 특징을 사용한 뉴스 동영상의 앵커 장면 검출 기법)

  • Yoo, Sung-Yul;Kang, Dong-Wook;Kim, Ki-Doo;Jung, Kyeong-Hoon
    • Journal of Broadcast Engineering / v.12 no.4 / pp.311-320 / 2007
  • This paper presents an efficient algorithm for detecting anchor shots in news video. We observed the audiovisual characteristics of news video and propose several low-level features suited to anchor shot detection. The algorithm consists of three stages: pause detection, audio cluster classification, and matching with motion activity. Audio features are used alongside the motion feature to improve indexing accuracy, and simulation results show that the proposed algorithm performs satisfactorily.

Retrieval of Player Event in Golf Videos Using Spoken Content Analysis (음성정보 내용분석을 통한 골프 동영상에서의 선수별 이벤트 구간 검색)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.674-679 / 2009
  • This paper proposes a method for retrieving player events in golf videos that combines two functions: detecting player names in speech and detecting sound events in the audio. The system consists of an indexing module and a retrieval module. At indexing time, audio segmentation and noise reduction are applied to the audio stream demultiplexed from the golf video. The noise-reduced speech is then fed into a speech recognizer, which outputs spoken descriptors. Player names and sound events are indexed by these spoken descriptors. At search time, the text query is converted into phoneme sequences. The lists for each query term are retrieved through a description matcher to identify full and partial phrase hits. For player name retrieval, the paper compares word-based, phoneme-based, and hybrid approaches.

A Personal Videocasting System with Intelligent TV Browsing for a Practical Video Application Environment

  • Kim, Sang-Kyun;Jeong, Jin-Guk;Kim, Hyoung-Gook;Chung, Min-Gyo
    • ETRI Journal / v.31 no.1 / pp.10-20 / 2009
  • This paper proposes a video broadcasting system between a home-server-type device and a mobile device. The home-server-type device automatically extracts semantic information from video content such as news, soccer matches, and baseball games. The indexing results are used to convert the original video content into a digested or arranged format. From the mobile device, a user can make recording requests to the home-server-type device and then watch and navigate the recorded video content in digested form. The novelty of this study lies in actually implementing the proposed system by combining the currently available IT environment with indexing algorithms. The implementation is demonstrated along with experimental results for the automatic video indexing algorithms. The overall performance of the developed system is compared with existing state-of-the-art personal video recording products.


Dimension-Reduced Audio Spectrum Projection Features for Classifying Video Sound Clips

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.25 no.3E / pp.89-94 / 2006
  • For audio indexing and targeted search of specific audio or corresponding visual content, the MPEG-7 standard has adopted a sound classification framework in which dimension-reduced Audio Spectrum Projection (ASP) features are used to train continuous hidden Markov models (HMMs) for classifying various sounds. MPEG-7 employs Principal Component Analysis (PCA) or Independent Component Analysis (ICA) for the dimension reduction. Other well-established techniques include Non-negative Matrix Factorization (NMF), Linear Discriminant Analysis (LDA), and the Discrete Cosine Transform (DCT). In this paper, we compare the performance of different dimension reduction methods with Gaussian mixture models (GMMs) and HMMs for classifying video sound clips.

Performance Analysis of the Time-series Pattern Index File for Content-based Music Genre Retrieval (내용기반 음악장르 검색에서 시계열 패턴 인덱스 화일의 성능 분석)

  • Kim, Young-In;Kim, Seon-Jong
    • Journal of Korea Society of Industrial Information Systems / v.11 no.5 / pp.18-27 / 2006
  • The rapid growth in the amount of music data demands a new method for efficient similarity retrieval of music genres using audio features in music databases. Building such similarity retrieval requires indexing techniques that treat audio features as time-series patterns, together with data mining technologies. In this paper, we address the development of a system that retrieves music of similar genre based on these indexing techniques. We first propose the structure of a content-based music genre retrieval system based on the time-series pattern index file and data mining technologies. We then implement the time-series pattern index file using audio features and present a performance analysis of the index file for similar-genre retrieval. The experiments are performed on real data to verify the performance of the proposed method.


A Comparison of Speech/Music Discrimination Features for Audio Indexing (오디오 인덱싱을 위한 음성/음악 분류 특징 비교)

  • 이경록;서봉수;김진영
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.10-15 / 2001
  • This paper compares combinations of features for speech/music discrimination, the task of classifying audio signals as speech or music. Audio signals are classified into three classes (speech, music, speech and music) and two classes (speech, music). Experiments were carried out on three types of features (Mel-cepstrum, energy, and zero-crossings) to find the best feature combination for speech/music discrimination. We use a Gaussian Mixture Model (GMM) as the discrimination algorithm and combine the different features into a single vector before modeling the data with the GMM. With three classes, the best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%). With two classes, the best result is achieved using either Mel-cepstrum and energy, or Mel-cepstrum, energy, and zero-crossings, in a single feature vector (speech: 98.9%, music: 100%).
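
As a rough illustration of combining features into a single vector per frame, the sketch below computes two of the three feature types named in the abstract, short-time energy and zero-crossing rate. The function names and toy frames are assumptions for illustration; Mel-cepstrum extraction and the GMM classifier are omitted, and nothing here reproduces the paper's actual feature definitions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def frame_features(frame):
    """Concatenate the per-frame features into one vector: [energy, ZCR]."""
    return [short_time_energy(frame), zero_crossing_rate(frame)]


# Two toy frames with equal energy but different sign-change behavior.
slow = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
fast = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(frame_features(slow))
print(frame_features(fast))  # prints [1.0, 1.0]
```

The two frames are indistinguishable by energy alone, which is one reason several features would be combined before modeling.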


A PROPOSAL OF SEMI-AUTOMATIC INDEXING ALGORITHM FOR MULTI-MEDIA DATABASE WITH USERS' SENSIBILITY

  • Mitsuishi, Takashi;Sasaki, Jun;Funyu, Yutaka
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2000.04a / pp.120-125 / 2000
  • We propose a semi-automatic and dynamic indexing algorithm for multimedia databases (e.g., movie files, audio files), for which it is difficult to create indexes that express emotional or abstract content. The algorithm adapts to each user's sensibility by using the users' histories of access to the database. We first categorize the data simply, then create a vector space of each user's interests (user model) from the history of which categories the accessed data belong to, and a vector space for each data item (title model) from the history of which users have accessed it. By repeating this procedure, we can create suitable indexes that reflect the emotional content of each data item. In this paper, we define the recurrence formulas based on the proposed algorithm and show its effectiveness through simulation results.


A Practical Digital Video Database based on Language and Image Analysis

  • Liang, Yiqing
    • Proceedings of the Korea Database Society Conference / 1997.10a / pp.24-48 / 1997
  • Supported by: DARPA's Image Understanding (IU) program under the "Video Retrieval Based on Language and Image Analysis" project; DARPA's Computer Assisted Education and Training Initiative (CAETI) program. Objective: develop practical systems for automatic understanding and indexing of video sequences using both audio and video tracks. (omitted)


A Content-based Audio Retrieval System Supporting Efficient Expansion of Audio Database (음원 데이터베이스의 효율적 확장을 지원하는 내용 기반 음원 검색 시스템)

  • Park, Ji Hun;Kang, Hyunchul
    • Journal of Digital Contents Society / v.18 no.5 / pp.811-820 / 2017
  • Content-based audio retrieval is one of the main functions of audio services, and techniques that extract fingerprints from audio sources and then store and index them in a database are widely used for it. However, as fingerprints of new audio sources are continually inserted into the database, both space efficiency and retrieval performance gradually deteriorate. Techniques are therefore needed that support efficient expansion of the audio database without periodic reorganization, which would increase system operation cost. In this paper, we design a content-based audio retrieval system, based on Shazam's fingerprinting algorithm, that solves this problem by using MapReduce and a NoSQL database in a cluster computing environment, and we evaluate its performance through a detailed set of experiments on real-world audio data.
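
For orientation, here is a minimal in-memory sketch of the general peak-pair fingerprinting idea: hash pairs of spectral peaks, store them in an index, and match a query by voting on consistent time offsets. It is not the system described in the abstract. A plain dictionary stands in for the NoSQL store, there is no MapReduce stage, and the peak lists, `fan_out`, `max_dt`, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fingerprints(peaks, fan_out=3, max_dt=10):
    """Pair each (time, freq) peak with up to `fan_out` later peaks.

    Yields ((f1, f2, dt), t1): a hashable key plus the anchor time.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are farther away
            if dt > 0:
                yield (f1, f2, dt), t1
                paired += 1
                if paired == fan_out:
                    break


def build_index(tracks):
    """Map each fingerprint key to the (track_id, anchor_time) pairs holding it."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for key, t in fingerprints(peaks):
            index[key].append((track_id, t))
    return index


def best_match(index, query_peaks):
    """Vote on (track_id, time offset); return the most supported pair or None."""
    votes = Counter()
    for key, tq in fingerprints(query_peaks):
        for track_id, t in index.get(key, []):
            votes[(track_id, t - tq)] += 1
    if not votes:
        return None
    (track_id, offset), _ = votes.most_common(1)[0]
    return track_id, offset


# Hypothetical peak lists; the query is track "a" starting 2 time units in.
tracks = {
    "a": [(0, 10), (2, 20), (5, 30), (7, 15), (9, 40)],
    "b": [(0, 11), (3, 25), (4, 31), (8, 12)],
}
index = build_index(tracks)
query = [(0, 20), (3, 30), (5, 15), (7, 40)]
print(best_match(index, query))  # prints ('a', 2)
```

Because new tracks only append entries under their own keys, inserting into this kind of index does not require rewriting existing entries, which is the property a key-value store can exploit as the database grows.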