• Title/Summary/Keyword: Audio Information Retrieval

Audio Fingerprint Retrieval Method Based on Feature Dimension Reduction and Feature Combination

  • Zhang, Qiu-yu;Xu, Fu-jiu;Bai, Jian
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.2 / pp.522-539 / 2021
  • Existing audio fingerprint methods perform poorly on long speech segments: the fingerprint dimension is too large, robustness is poor, and retrieval accuracy and efficiency are low. To address these problems, a robust audio fingerprint retrieval method based on feature dimension reduction and feature combination is proposed. First, the Mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC) of the original speech are extracted, and the MFCC and LPCC feature matrices are combined. Second, the columns of the combined matrix are reduced with a dimension reduction method based on information entropy, and the rows of the resulting matrix are then reduced with a dimension reduction method based on energy features. Finally, the audio fingerprint is constructed from the reduced feature combination matrix. When a user submits a speech query, the normalized Hamming distance is used for matching. Experimental results show that the proposed method produces a smaller audio fingerprint and is more robust for long speech segments, and that it retrieves more efficiently while maintaining high recall and precision.
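
The matching step named in this abstract, normalized Hamming distance between binary fingerprints, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the dictionary used as a fingerprint database, and the acceptance threshold of 0.35 are all assumptions made for the example.

```python
def normalized_hamming(a, b):
    """Fraction of positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    if not a:
        raise ValueError("fingerprints must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_match(query, database, threshold=0.35):
    """Return (key, distance) of the closest stored fingerprint.

    `database` maps a hypothetical item key to its fingerprint.
    `threshold` is an assumed cut-off: if the closest distance is
    above it, the key is reported as None (no match).
    """
    best_key, best_dist = None, None
    for key, fingerprint in database.items():
        dist = normalized_hamming(query, fingerprint)
        if best_dist is None or dist < best_dist:
            best_key, best_dist = key, dist
    if best_dist is not None and best_dist <= threshold:
        return best_key, best_dist
    return None, best_dist


if __name__ == "__main__":
    db = {"clip_a": [1, 0, 1, 1], "clip_b": [0, 1, 0, 0]}
    print(best_match([1, 0, 1, 0], db))
```

A distance of 0 means identical fingerprints and 1 means every bit differs, so the threshold trades recall against precision.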

Pretreatment For The Problem Solution Of Contents-Based Music Retrieval (내용 기반 음악 검색의 문제점 해결을 위한 전처리)

  • Chung, Myoung-Beom;Sung, Bo-Kyung;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.12 no.6 / pp.97-104 / 2007
  • This paper describes a problem with the feature extraction techniques used for content-based analysis, classification, and retrieval of audio data, and proposes a preprocessing procedure for a new content-based retrieval method. Because the feature vector changes with the sampling value, existing audio analysis can judge the same music to be different music. We therefore propose a method that extracts waveform information from PCM data so that audio data in various formats can be retrieved by content. With this method, audio data sampled in different formats can be recognized as the same data, and the method may be applied to content-based music retrieval systems. To verify the performance of the method, we ran an experiment comparing feature extraction using the STFT with waveform information extraction using PCM data. The results indicate that the proposed method is more effective.

An Efficient Audio Indexing Scheme based on User Query Patterns (사용자 질의 패턴을 이용한 효율적인 오디오 색인기법)

  • 노승민;박동문;황인준
    • Journal of KIISE:Databases / v.31 no.4 / pp.341-351 / 2004
  • With the popularity of digital audio content, querying and retrieving audio content efficiently from a database has become essential. In this paper, we propose a new indexing scheme that retrieves audio content efficiently by using audio portions that have been queried frequently. The scheme is based on the observation that users tend to memorize and query a small number of audio portions. Detecting and indexing such portions enables fast retrieval and performs better than audio retrieval based on sequential search. Moreover, the scheme is independent of the underlying retrieval system, so it can work together with any other audio retrieval system. We have implemented a prototype system and demonstrated its performance gain through experiments.

Musician Search in Time-Series Pattern Index Files using Features of Audio (오디오 특징계수를 이용한 시계열 패턴 인덱스 화일의 뮤지션 검색 기법)

  • Kim, Young-In
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.69-74 / 2006
  • The recent development of multimedia content-based retrieval technologies has drawn considerable attention to musician retrieval using features of digital audio data, one area of music information retrieval. However, indexing techniques for music databases have not been studied thoroughly. In this paper, we present a musician retrieval technique based on audio features that uses space split methods in a time-series pattern index file. We use audio features to retrieve the musician and a time-series pattern index file to search for candidate musicians. Experimental results show that the time-series pattern index file using the rotational split method is efficient for musician retrieval.

Retrieval of Player Event in Golf Videos Using Spoken Content Analysis (음성정보 내용분석을 통한 골프 동영상에서의 선수별 이벤트 구간 검색)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.674-679 / 2009
  • This paper proposes a method of player event retrieval in golf videos that combines two functions: detecting the player name in speech information and detecting sound events in audio information. The system consists of an indexing module and a retrieval module. At indexing time, audio segmentation and noise reduction are applied to the audio stream demultiplexed from the golf videos. The noise-reduced speech is then fed into a speech recognizer, which outputs spoken descriptors. The player name and sound event are indexed by the spoken descriptors. At search time, the text query is converted into phoneme sequences. The lists for each query term are retrieved through a description matcher to identify full and partial phrase hits. For the retrieval of the player name, this paper compares the results of word-based, phoneme-based, and hybrid approaches.

Representative Melodies Retrieval using Waveform and FFT Analysis of Audio (오디오의 파형과 FFT 분석을 이용한 대표 선율 검색)

  • Chung, Myoung-Bum;Ko, Il-Ju
    • Journal of KIISE:Software and Applications / v.34 no.12 / pp.1037-1044 / 2007
  • Recent content-based music retrieval systems extract a representative melody from each piece of music and index the music by it to reduce search time. Existing studies have extracted the representative melody from MIDI data, which has the weakness that only MIDI data can be used. This paper therefore proposes a representative melody retrieval method that uses digital signal processing and can be applied to all audio file formats. First, we use the Fast Fourier Transform (FFT) to find the tempo and the nodes needed for representative melody retrieval. We then measure how often high values appear in the PCM data of each node. The point where high values are most concentrated is taken as the starting point of the representative melody, and the eight nodes from that starting point form the representative melody section of the audio data. To verify the performance of the method, we selected one thousand songs and ran an experiment to extract a representative melody from each. The accuracy of the extracted representative melodies was 79.5% among the 737 songs for which a tempo was found.

A Study on Improvement of Retrieval Algorithm for Audio Response Service (음성정보 서비스의 검색 알고리즘 개선 연구)

  • Jeong, Yoo-Hyeon;Kim, Soon-Hyop
    • The Journal of the Acoustical Society of Korea / v.16 no.5 / pp.92-95 / 1997
  • Telephone pushbuttons consist only of the digits 0~9, #, and *, so it is difficult for users to enter the variety of query commands needed for information retrieval in an audio response service. We propose a new retrieval algorithm for audio response services that uses sequences of Korean initial sounds. Users who do not know the retrieval code can reach a service by pressing the telephone digit buttons that correspond to the initial sounds of its name.
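
The idea of keying a service name by its Korean initial sounds can be sketched as below. Extracting the initial consonant from a precomposed Hangul syllable by Unicode arithmetic is standard, but the consonant-to-digit table `KEYPAD` is purely hypothetical: the abstract does not give the mapping the authors used, and the function names are likewise assumptions.

```python
# The 19 initial consonants, in Unicode composition order.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"

HANGUL_BASE = 0xAC00   # first precomposed syllable
HANGUL_LAST = 0xD7A3   # last precomposed syllable
PER_INITIAL = 21 * 28  # syllables per initial consonant (588)

# Hypothetical keypad layout, for illustration only.
KEYPAD = {
    "ㄱ": "1", "ㄲ": "1", "ㅋ": "1",
    "ㄴ": "2",
    "ㄷ": "3", "ㄸ": "3", "ㅌ": "3",
    "ㄹ": "4",
    "ㅁ": "5",
    "ㅂ": "6", "ㅃ": "6", "ㅍ": "6",
    "ㅅ": "7", "ㅆ": "7",
    "ㅇ": "8",
    "ㅈ": "9", "ㅉ": "9", "ㅊ": "9",
    "ㅎ": "0",
}


def initials(name):
    """Return the initial consonant of each Hangul syllable in `name`.

    Characters that are not precomposed Hangul syllables are skipped.
    """
    out = []
    for ch in name:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            out.append(CHOSEONG[(code - HANGUL_BASE) // PER_INITIAL])
    return "".join(out)


def to_digits(name):
    """Map a service name to the digit sequence a caller would press."""
    return "".join(KEYPAD[c] for c in initials(name))


if __name__ == "__main__":
    print(initials("날씨"), to_digits("날씨"))
```

Because several consonants share a key in this assumed layout, different names can map to the same digit sequence, so a real service would need a way to disambiguate collisions.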

A Study on the input buffer for efficient processing of MPEG Audio bitstream (MPEG Audio 비트스트림의 효율적 처리를 위한 입력 버퍼에 관한 연구)

  • 임성룡;공진흥
    • Proceedings of the IEEK Conference / 2000.06b / pp.181-184 / 2000
  • In this paper, we describe the design of an input buffer system that efficiently handles an MPEG audio bitstream by demultiplexing it into header, side information, and audio data. To overcome the limitations of fixed-word manipulation in bitstream demultiplexing, we propose a new variable-length bit retrieval system comprising an FSM sequencer that supports the MPEG audio frame format, a serial buffer that demultiplexes the audio stream, and a FIFO circular buffer that holds the header and side information.
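
Variable-length bit retrieval, as opposed to fixed-word access, can be illustrated in software with a small bit reader. This sketch is not the hardware design described in the abstract: the class name `BitReader` and its interface are assumptions, and the example reads only the first few fields of an MPEG audio frame header (11-bit sync word, 2-bit version, 2-bit layer, 1-bit protection flag, 4-bit bitrate index).

```python
class BitReader:
    """Read fields of arbitrary bit width from a byte string, MSB first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits):
        """Return the next `nbits` bits as an unsigned integer."""
        if nbits < 0 or self.pos + nbits > len(self.data) * 8:
            raise EOFError("not enough bits left in the buffer")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]             # byte holding this bit
            bit = (byte >> (7 - (self.pos & 7))) & 1    # MSB-first within the byte
            value = (value << 1) | bit
            self.pos += 1
        return value


if __name__ == "__main__":
    # Example header bytes chosen for illustration.
    reader = BitReader(bytes([0xFF, 0xFB, 0x90, 0x00]))
    sync = reader.read(11)       # all ones marks a frame boundary
    version = reader.read(2)
    layer = reader.read(2)
    protection = reader.read(1)
    bitrate_index = reader.read(4)
    print(hex(sync), version, layer, protection, bitrate_index)
```

Reading bit by bit keeps the sketch short; a practical decoder would fetch whole words and shift, which is closer in spirit to the buffered design the abstract describes.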

XCRAB : A Content and Annotation-based Multimedia Indexing and Retrieval System (XCRAB :내용 및 주석 기반의 멀티미디어 인덱싱과 검색 시스템)

  • Lee, Soo-Chelo;Rho, Seung-Min;Hwang, Een-Jun
    • The KIPS Transactions:PartB / v.11B no.5 / pp.587-596 / 2004
  • In recent years, a new framework has been developed that aims to bring a unified and global approach to indexing, browsing, and querying various digital multimedia data such as audio, video, and images. This framework partitions each media stream into smaller units based on actual physical events. These physical events within each media stream can then be effectively indexed for retrieval. In this paper, we present a new approach that exploits audio, image, and video features to segment and analyze audio-visual data. Integrating audio and visual analysis can overcome the weakness of previous approaches that were based on image or video analysis only. We implement a web-based multimedia data retrieval system called XCRAB and report its experimental results.

Analysis of Pre-Processing Methods for Music Information Retrieval in Noisy Environments using Mobile Devices

  • Kim, Dae-Jin;Koo, Ddeo-Ol-Ra
    • International Journal of Contents / v.8 no.2 / pp.1-6 / 2012
  • Recently, content-based music information retrieval (MIR) systems for mobile devices have attracted great interest. However, music retrieval systems are greatly affected by background noise when music is recorded in noisy environments. Therefore, we evaluated various pre-processing methods using the Philips method to determine which one yields the most robust music retrieval in such environments. We found that dynamic noise reduction (DNR) is the best pre-processing method for a music retrieval system in noisy environments.