• Title/Summary/Keyword: Audio Indexing


Expected Matching Score Based Document Expansion for Fast Spoken Document Retrieval (고속 음성 문서 검색을 위한 Expected Matching Score 기반의 문서 확장 기법)

  • Seo, Min-Koo;Jung, Gue-Jun;Oh, Yung-Hwan
    • Proceedings of the KSPS conference / 2006.11a / pp.71-74 / 2006
  • Much work has been done on retrieving audio segments that contain uncaptioned human speech. To retrieve newly coined words and proper nouns, subwords are commonly used as indexing units in conjunction with query or document expansion. Of these, document expansion with subwords has the serious drawback of large computational overhead. In this paper, we therefore propose Expected Matching Score based document expansion, which effectively reduces computational overhead without much loss in retrieval precision. Experiments show a 13.9-fold speedup at the cost of a 0.2% loss in retrieval precision.


Content-based Music Retrieval using TIP-indexing Techniques and Features of Audio files (TIP-인덱싱 기법과 오디오 화일의 특징계수를 이용한 내용기반 음악 검색)

  • Kim Young-In
    • Proceedings of the Korea Society for Industrial Systems Conference / 2006.05a / pp.201-204 / 2006
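  • Recently, much research has been carried out on content-based music information retrieval systems. As a result of these efforts, research on content-based retrieval methods that use audio data for natural music information retrieval is actively under way, and such systems use large volumes of music feature coefficients for retrieval. However, research using the TIP-index file, which was proposed as a method for storing and retrieving large volumes of continuous feature coefficients, remains scarce. In this paper, we propose a music information retrieval method that uses the TIP-index file, a technique for efficiently indexing continuous feature coefficients. We extracted feature coefficients from music audio files of various genres, built a TIP-index, and ran experiments. The experimental results indicate that the proposed method can perform well in music information retrieval.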


Retrieval Efficiency Analysis For Audio Data Indexing (오디오 데이터 인덱싱의 검색 효율 분석)

  • Cho, Yong-Choon;Lee, Bae-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2000.10b / pp.1297-1300 / 2000
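  • This paper describes an indexing method that uses the wavelet transform to retrieve audio data, one kind of multimedia data. Because of its inherent characteristics, audio data makes it difficult to construct an index that gives good retrieval efficiency. The wavelet-based index described here indexes the data without dividing it into blocks, and we describe the retrieval efficiency obtained with this method. Specifically, of the high-frequency and low-frequency parts at the final level of the wavelet decomposition, the high-frequency part is used to determine a block by string matching, and the low-frequency part is used for a detailed comparison on the determined block. The experiments comprise one that determines appropriate comparison coefficients and one that shows how the retrieval rate changes with query length. The conclusion describes future directions and applications of the proposed method.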
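
The two-stage search described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes an unnormalized Haar wavelet (pairwise averages and differences), a three-symbol quantization of the high-frequency coefficients, and a query aligned to the decimation grid. The function names, threshold, and tolerance are all hypothetical.

```python
def haar_last_level(samples, levels):
    """Return (approximation, detail) coefficients of the last Haar level."""
    approx = list(samples)
    detail = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / 2 for a, b in pairs]   # high-frequency part
        approx = [(a + b) / 2 for a, b in pairs]   # low-frequency part
    return approx, detail


def to_symbols(detail, threshold=0.5):
    """Quantize detail coefficients to a string over {'+', '-', '0'} (assumed scheme)."""
    return "".join(
        "+" if d > threshold else "-" if d < -threshold else "0" for d in detail
    )


def search(data, query, levels=1, tolerance=1e-9):
    """Stage 1: string-match the high-frequency symbols to pick a candidate block.
    Stage 2: compare low-frequency coefficients on that block in detail."""
    data_approx, data_detail = haar_last_level(data, levels)
    query_approx, query_detail = haar_last_level(query, levels)
    haystack, needle = to_symbols(data_detail), to_symbols(query_detail)
    start = haystack.find(needle)
    while start != -1:
        window = data_approx[start:start + len(query_approx)]
        if all(abs(x - y) <= tolerance for x, y in zip(window, query_approx)):
            return start * 2 ** levels  # sample offset in the original signal
        start = haystack.find(needle, start + 1)
    return -1


data = [0, 0, 4, 0, 1, 5, 8, 2, 3, 3, 4, 0]
query = [1, 5, 8, 2]
offset = search(data, query)  # offset of the query inside data, or -1
```

The cheap string match narrows the search to a few candidate positions, so the numeric comparison only runs where the coarse shape already agrees.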


A Digital Library Prototype for Access to Diverse Collections (다양한 장서 접근을 위한 디지털 도서관의 프로토타입 구축)

  • Choi Won-Tae
    • Journal of the Korean Society for Library and Information Science / v.32 no.2 / pp.295-307 / 1998
  • This article is an overview of the digital library project, indicating what roles Korea's diverse digital collections may play. Our digital library prototype has a simple architecture consisting of digital repositories, filters, indexing and searching, and clients. Digital repositories include various types of materials and databases. The role of the filters is to recognize the format of a document collection and mark the structural components of each of its documents. We use a database management system (ORACLE and ConText) that supports user-defined functions and access methods, which allows us to easily incorporate new object analysis, structuring, and indexing technology into a repository. Clients can be considered browsers or viewers designed for different document data types, such as image, audio, video, SGML, PDF, and KORMARC. The combination of navigational tools supports a variety of approaches to identifying collections and to browsing or searching for individual items. The search interface was implemented using HTML forms and the World Wide Web's CGI mechanism.


Music Source Signature Indexing Method for Quick Search (빠른 검색을 위한 음원 시그니처 인덱싱 방법)

  • Kim, Sang-Kyun;Lee, Kyoung-Sik
    • Journal of Broadcast Engineering / v.26 no.3 / pp.321-326 / 2021
  • Blockchain is growing in value as a platform for the safe transmission of capital transactions and secure data. It also has potential as a new platform that can safely store large amounts of data such as videos, music, and photos, and safely manage transaction details and service usage specifications. Because large-capacity media data cannot be stored in a block, research was conducted on the performance of storing sound source information in a block and retrieving the stored sound source data by using a distributed storage system (IPFS) and the hash information of the sound source signature data. In this paper, we propose a sound source signature indexing method using a Bloom filter that improves on the search speed reported in previous studies. The experiments confirmed that search performance improves from the existing O(n) to O(1).
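
To illustrate why a Bloom filter gives lookups whose cost does not grow with the number of stored signatures, here is a minimal generic sketch. The abstract does not state the filter size, the number of hash functions, or the hashing scheme, so the 1024-bit array, the three salted SHA-256 hashes, and the signature strings below are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over byte strings (illustrative, not the paper's code)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means definitely absent; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical signatures standing in for hashed sound source signature data.
index = BloomFilter()
for signature in (b"sig-track-001", b"sig-track-002"):
    index.add(signature)

known = index.might_contain(b"sig-track-001")
```

Each query evaluates a fixed number of hashes and bit tests, independent of how many signatures were added. The trade-off is that a positive answer can be a false positive, so a real system would still confirm hits against the stored data.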

Metadata Design and Machine Learning-Based Automatic Indexing for Efficient Data Management of Image Archives of Local Governments in South Korea (국내 지자체 사진 기록물의 효율적 관리를 위한 메타데이터 설계 및 기계학습 기반 자동 인덱싱 방법 연구)

  • Kim, InA;Kang, Young-Sun;Lee, Kyu-Chul
    • Journal of Korean Society of Archives and Records Management / v.20 no.2 / pp.67-83 / 2020
  • Many local governments in Korea provide online services that give people easy access to audio-visual archives of events occurring in their area. However, the way local governments currently manage these archives causes problems with compatibility with other organizations and with search convenience, because standard metadata is lacking and image information is underused. To solve these problems, we propose a metadata design and a machine learning-based automatic indexing technology for the efficient management of the image archives of local governments in Korea. We design metadata items specialized for the image archives of local governments to improve compatibility, and we include elements that represent the basic information and characteristics of images in the metadata items, enabling efficient management. In addition, the text and objects in images, which carry information that reflects events and categories, are automatically indexed using machine learning, improving search convenience for users. Lastly, we developed a program that automatically extracts text and objects from image archives using the proposed method and stores the extracted contents and basic information in the metadata items we designed.

Estimation of Lifetime Data Storage Capacity for Human Senses (인간 감각 정보를 위한 평생 기억용량 평가)

  • You, Young-Gap;Song, Young-Jun;Kim, Dong-Woo
    • The Journal of the Korea Contents Association / v.9 no.1 / pp.23-29 / 2009
  • This paper presents a capacity estimate for a storage system that accumulates all data sensed during the lifetime of an individual human being. The calculation assumes modern data compression and data collection schemes based on wearable or implanted devices in a ubiquitous computing environment. More than 76% of the storage is found to be used for video data of common TV image quality. The remaining storage is for data from the other senses, including the auditory, taste, olfactory, and tactile systems, in addition to indexing information. A total of around 600 terabytes is needed to cover 100 years of human life, including the fetal period.
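
As a back-of-the-envelope check on what these figures imply, the short calculation below derives the video share in bytes and the average sustained data rate. It is only a sketch: it assumes decimal terabytes, treats "more than 76%" as exactly 76%, and uses a 365.25-day year, none of which the abstract specifies.

```python
TOTAL_BYTES = 600e12                    # "around 600 terabytes" (decimal tera assumed)
VIDEO_SHARE = 0.76                      # "more than 76%", taken here as a lower bound
YEARS = 100                             # lifetime covered, per the abstract
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: Julian year

video_bytes = TOTAL_BYTES * VIDEO_SHARE       # storage used by video
other_bytes = TOTAL_BYTES - video_bytes       # other senses plus indexing information
avg_bytes_per_second = TOTAL_BYTES / (YEARS * SECONDS_PER_YEAR)
avg_megabits_per_second = avg_bytes_per_second * 8 / 1e6
```

Under these assumptions, video accounts for about 456 TB, and the whole archive corresponds to an average of roughly 190 kB/s, or about 1.5 Mbit/s, sustained over the 100 years.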

XML Repository System Using DBMS and IRS

  • Kang, Hyung-Il;Yoo, Jae-Soo;Lee, Byoung-Yup
    • International Journal of Contents / v.3 no.3 / pp.6-14 / 2007
  • In this paper, we design and implement an XML Repository System (XRS) that exploits the advantages of DBMSs and IRSs. Our scheme uses BRS to support full-text indexing and content-based queries efficiently, and ORACLE to store XML documents, multimedia data, DTDs, and structure information. We design databases to manage XML documents that include audio, video, and images as well as text. We employ the non-composition model when storing XML documents in ORACLE. We represent structure information as ETID (Element Type Id), SORD (Sibling ORDer), and SSORD (Same Sibling ORDer). ETID is a unique value assigned to each element of the DTD. SORD represents the order among sibling nodes, and SSORD represents the order among sibling nodes with the same element. To show the superiority of our XRS, we perform experiments measuring document loading time, document extraction time, and content retrieval time. The experiments show that our XRS outperforms existing XML document management systems. We also show through performance experiments that it supports various types of queries.
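
The three structure labels can be illustrated with a small sketch that walks an XML tree and assigns ETID, SORD, and SSORD to every node. This is one reading of the definitions in the abstract, not the XRS implementation: the 1-based numbering, the assignment of ETIDs in first-seen order (the paper assigns them from the DTD), and the function name `label_nodes` are assumptions.

```python
import xml.etree.ElementTree as ET


def label_nodes(xml_text):
    """Return (tag, ETID, SORD, SSORD) tuples for every element, in document order."""
    root = ET.fromstring(xml_text)
    etids = {}  # element type name -> ETID (assigned in first-seen order here)
    rows = []

    def visit(node, sord, ssord):
        etid = etids.setdefault(node.tag, len(etids) + 1)
        rows.append((node.tag, etid, sord, ssord))
        same_tag_count = {}
        for position, child in enumerate(node, start=1):
            # SORD: position among all siblings; SSORD: position among same-tag siblings.
            same_tag_count[child.tag] = same_tag_count.get(child.tag, 0) + 1
            visit(child, position, same_tag_count[child.tag])

    visit(root, 1, 1)
    return rows


rows = label_nodes("<doc><title/><sec/><fig/><sec/></doc>")
# The second <sec> is the 4th sibling overall but the 2nd <sec>:
# rows[-1] == ("sec", 3, 4, 2)
```

Storing these three values per node lets a relational table answer both "the n-th child" and "the n-th child of a given type" without reassembling the document.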

Speaker-Dependent Emotion Recognition For Audio Document Indexing

  • Hung LE Xuan;QUENOT Georges;CASTELLI Eric
    • Proceedings of the IEEK Conference / summer / pp.92-96 / 2004
  • Research on emotion is currently of great interest in speech processing as well as in human-machine interaction. In recent years, a growing body of research on emotion synthesis and emotion recognition has been developed for different purposes. Each approach uses its own methods and its own parameters measured on the speech signal. In this paper, we propose using a short-time parameter, MFCCs (Mel-Frequency Cepstral Coefficients), and a simple but efficient classification method, Vector Quantization (VQ), for speaker-dependent emotion recognition. Many other features (energy, pitch, zero crossing, phonetic rate, LPC, ...) and their derivatives are also tested and combined with MFCCs in order to find the best combination. Other models, GMM and HMM (discrete and continuous Hidden Markov Models), are studied as well, in the hope that the use of continuous distributions and the temporal behaviour of this feature set will improve the quality of emotion recognition. The maximum accuracy in recognizing five different emotions exceeds 88% using only MFCCs with the VQ model. This simple but efficient approach gives a result much better than that obtained on the same database in a human evaluation, in which listeners judged the sentences without being allowed to go back or compare them [8], and it compares favourably with other approaches.
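
A common way to use VQ as a classifier is to train one codebook per class and assign an utterance to the class whose codebook quantizes its frames with the least distortion. The sketch below follows that general scheme, which the abstract does not confirm in detail. It does not compute MFCCs: the 2-D vectors stand in for MFCC frames, and the emotion labels, codebook size, and k-means initialisation are all made up for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(frames, size, iterations=10):
    """Plain k-means (Lloyd) codebook, initialised from the first `size` frames."""
    codebook = [list(f) for f in frames[:size]]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for frame in frames:
            nearest = min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))
            clusters[nearest].append(frame)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def distortion(frames, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def classify(frames, codebooks):
    """Pick the label whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda label: distortion(frames, codebooks[label]))


# Toy 2-D "frames" per emotion; real input would be MFCC vectors per speech frame.
training = {
    "neutral": [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (1.0, 1.1), (1.2, 0.9), (0.9, 1.0)],
    "anger": [(5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (6.0, 6.2), (6.1, 5.9), (5.9, 6.0)],
}
codebooks = {label: train_codebook(frames, size=2) for label, frames in training.items()}
predicted = classify([(5.1, 5.0), (6.0, 6.0)], codebooks)
```

Because the codebooks are trained per speaker and per emotion, this kind of classifier is inherently speaker-dependent, which matches the setting named in the title.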


An Exploratory Investigation on Multimedia Information Needs and Searching Behavior among College Students (멀티미디어 정보요구와 검색행태에 관한 탐색적 연구)

  • Chung, Eun-Kyung
    • Journal of the Korean Society for Library and Information Science / v.46 no.3 / pp.251-270 / 2012
  • Multimedia needs and searching have become important in everyday life, especially among the younger generation. Multimedia needs and searching behaviors differ from textual information needs and searching behaviors in a wide variety of ways. By interviewing and observing college students from 20 areas in Seoul, this study aims to improve the understanding of users' multimedia needs and of how users search for multimedia. The findings are presented in terms of searching sources, multimedia needs, relevance criteria, and searching barriers. For multimedia, the primary searching sources are Naver and Google, and distinguishing features are presented for each multimedia type. When multimedia needs are categorized as generic, specific, or abstract, most are classified as specific rather than generic, although there are differences depending on the type of multimedia. In addition, relevance criteria and searching barriers reflect the characteristics of the individual multimedia types. The findings demonstrate that distinct indexing and searching environments for each type of multimedia might be necessary to improve the quality of multimedia searching.