• Title/Summary/Keyword: 자동 키워드추출 (automatic keyword extraction)

A Comparative Analysis of Content-based Music Retrieval Systems (내용기반 음악검색 시스템의 비교 분석)

  • Ro, Jung-Soon
    • Journal of the Korean Society for Information Management / v.30 no.3 / pp.23-48 / 2013
  • This study compared and analyzed 15 CBMR (Content-based Music Retrieval) systems accessible on the web in terms of DB size and type, query type, access point, input and output type, and search functions. It also reviewed the features of music information and the techniques used in CBMR systems for transforming or transcribing music sources, extracting and segmenting melodies, extracting and indexing music features, and matching. The analysis found that text information retrieval techniques (inverted indexing, N-gram indexing, Boolean search, truncation, keyword and phrase search, normalization, filtering, browsing, exact matching, similarity measures using edit distance, sorting, etc.) are applied to enhance CBMR; that efforts are being made to increase DB size and usability; and that problems remain in extracting melodies, deleting stop notes in queries, and using solfege as pitch information.
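
The abstract lists edit distance as one of the text-retrieval similarity measures carried over to melody matching. The following is a minimal sketch of that idea, not the method of any of the surveyed systems: melodies are assumed to be sequences of solfege syllables, every edit costs 1, and the names `edit_distance`, `rank_melodies`, and the toy database are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]                          # deleting the first i items of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,         # delete x
                            curr[j - 1] + 1,     # insert y
                            prev[j - 1] + cost)) # match or substitute
        prev = curr
    return prev[-1]


def rank_melodies(query, database):
    """Return melody names ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: edit_distance(query, database[name]))


# Toy data (assumed): each melody is a list of solfege syllables.
database = {
    "tune_a": ["do", "re", "mi", "sol"],
    "tune_b": ["sol", "fa", "mi", "re", "do"],
}
query = ["do", "re", "mi", "fa"]
ranking = rank_melodies(query, database)
```

A real system would likely compare pitch intervals or contours rather than raw syllables, so that a transposed query still matches.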

Representative Keyword Extraction from Few Documents through Fuzzy Inference (퍼지추론을 이용한 소수 문서의 대표 키워드 추출)

  • 노순억;김병만;허남철
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.9 / pp.837-843 / 2001
  • In this work, we propose a new method of extracting and weighting representative keywords (RKs) from a few documents that might interest a user. To extract RKs, we first extract candidate terms and then choose from them, through fuzzy inference, a number of terms called initial representative keywords (IRKs). Then, by expanding and reweighting the IRKs using term co-occurrence similarity, the final RKs are obtained. The performance of our approach is heavily influenced by the effectiveness of the IRK selection method, so we choose fuzzy inference because it is more effective in handling the uncertainty inherent in selecting representative keywords of documents. The problem addressed in this paper can be viewed as that of calculating the center of document vectors. So, to show the usefulness of our approach, we compare it with two well-known methods, Rocchio and Widrow-Hoff, on a number of document collections. The results show that our approach outperforms the other approaches.
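
To make the "center of document vectors" framing concrete, here is a minimal sketch of the plain centroid baseline, which is what Rocchio reduces to when there is no initial query and only positive (relevant) documents are given. It is not the paper's fuzzy-inference method; the function names and the toy documents are assumptions for illustration.

```python
from collections import Counter


def centroid(docs):
    """Average raw term-frequency vector over a list of tokenized documents."""
    total = Counter()
    for tokens in docs:
        total.update(tokens)
    n = len(docs)
    return {term: count / n for term, count in total.items()}


def top_keywords(docs, k=3):
    """Pick the k terms with the highest centroid weight (ties broken alphabetically)."""
    c = centroid(docs)
    return sorted(c, key=lambda term: (-c[term], term))[:k]


# Toy data (assumed): three short documents a user found interesting.
docs = [
    ["fuzzy", "inference", "keyword"],
    ["keyword", "extraction", "fuzzy"],
    ["keyword", "weighting"],
]
keywords = top_keywords(docs, k=2)
```

With so few documents, raw frequency averages are noisy, which is the uncertainty the abstract says fuzzy inference is meant to handle.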

A Study on Web Mining System for Real-Time Monitoring of Opinion Information Based on Web 2.0 (의견정보 모니터링을 위한 웹 마이닝 시스템에 관한 연구)

  • Joo, Hae-Jong;Hong, Bong-Hwa;Jeong, Bok-Cheol
    • Journal of the Korea Society of Computer and Information / v.15 no.1 / pp.149-157 / 2010
  • As the use of the Internet has recently increased, the demand for opinion information posted on the Internet has grown. However, such information exists only on individual websites, and people who want to search for it find it inconvenient to visit each website. This paper focuses on an opinion information extraction and analysis system that uses Web mining based on statistics collected from Web contents. That is, users' opinion information scattered across several websites can be automatically extracted and analyzed. The system provides an opinion information search service that enables users to search for positive and negative opinions in real time and check their statistics. Users can also search for and monitor other opinion information in real time by entering keywords into the system. Tests showed that the proposed technologies perform better than existing ones. The capability to extract positive and negative opinion information was assessed; specifically, the system was tested on movie review sentence data and the results were analyzed.

Medical Image Automatic Annotation Using Multi-class SVM and Annotation Code Array (다중 클래스 SVM과 주석 코드 배열을 이용한 의료 영상 자동 주석 생성)

  • Park, Ki-Hee;Ko, Byoung-Chul;Nam, Jae-Yeal
    • The KIPS Transactions:PartB / v.16B no.4 / pp.281-288 / 2009
  • This paper proposes a novel algorithm for the efficient classification and annotation of medical images, especially X-ray images. Since X-ray images have a bright foreground against a dark background, they require different visual descriptors from those used for general natural images. In this paper, a Color Structure Descriptor (CSD) is extracted only from salient points found by the Harris corner detector, and an Edge Histogram Descriptor (EHD) is used as a texture feature of the image. These two feature vectors are each applied to a multi-class Support Vector Machine (SVM) to classify images into one of 20 categories. Finally, each image is assigned an Annotation Code Array based on the pre-defined hierarchical relations of categories and the priority code order, and several optimal keywords are given to the image through the Annotation Code Array. Our experiments show that our method achieves better annotation performance than other methods.

A Question Example Generation System for Multiple Choice Tests by utilizing Concept Similarity in Korean WordNet (한국어 워드넷에서의 개념 유사도를 활용한 선택형 문항 생성 시스템)

  • Kim, Young-Bum;Kim, Yu-Seop
    • The KIPS Transactions:PartA / v.15A no.2 / pp.125-134 / 2008
  • We implemented a system that can suggest example sentences for multiple choice tests, taking the level of the students into account. To build the system, we designed an automatic method for sentence generation that makes it possible to control the difficulty of questions. Proper evaluation in multiple choice tests requires a question pool of adequate size. To satisfy this requirement, a system is needed that can quickly generate numerous and varied questions and their example sentences. In this paper, we designed an automatic generation method using a linguistic resource called WordNet. For the automatic generation, we first extract keywords from the existing sentences by morphological analysis, and candidate terms with meanings similar to the keywords in the Korean WordNet space are then suggested. To suggest candidate terms, we transformed the existing Korean WordNet scheme into a new scheme for constructing the concept similarity matrix. The similarity degree between concepts ranges from 0, representing synonym relationships, to 9, representing non-connected relationships. By using this degree, we can control the difficulty of newly generated questions. We used two methods for evaluating the semantic similarity between two concepts. The first considers only the distance between the two concepts, and the second additionally considers the positions of the two concepts in the Korean WordNet space. With these methods, we can build a system that helps instructors generate, more easily, new questions and example sentences of varied content and difficulty from existing sentences.

Content-based Video Indexing and Retrieval System using MPEG-7 Standard (MPEG-7 표준에 따른 내용기반 비디오 검색 시스템)

  • 김형준;김회율
    • Journal of Broadcast Engineering / v.9 no.2 / pp.151-163 / 2004
  • In this paper, we propose a content-based video indexing and retrieval system using the MPEG-7 standard to retrieve and manage videos efficiently. The proposed system consists of a video indexing module for a video DB and a video retrieval module that allows various query methods in a web environment. The video indexing module stores metadata, such as manually typed keywords, automatically recognized character names, and MPEG-7 visual descriptors extracted by the indexing module, in a DB on the server side. A user can access the retrieval module via the web and retrieve desired videos through various query methods such as keywords, faces, example, and sketch. For this retrieval system, we propose ATC (Adaptive Twin Comparison) as a cut detection method for efficient video indexing and QBME (Query By Modified Example) as an improved content-based query method for the convenience of users. Experimental results show that the proposed ATC method detects cuts well and that the proposed QBME method is more convenient than existing query methods such as QBE (Query By Example) and QBS (Query By Sketch).

Metadata extraction using AI and advanced metadata research for web services (AI를 활용한 메타데이터 추출 및 웹서비스용 메타데이터 고도화 연구)

  • Sung Hwan Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.499-503 / 2024
  • Broadcasting programs are provided to various media such as Internet replay, OTT, and IPTV services as well as through the broadcasters' own channels. In this case, it is very important to provide search keywords that represent the characteristics of the content well. Broadcasters mainly enter the main keywords manually during the production and archive processes. This method yields too few keywords to secure core metadata, and it also shows limitations when content is recommended and used in other media services. This study supports securing a large amount of metadata by utilizing closed caption data pre-archived through the DTV closed captioning server developed at EBS. First, core metadata was automatically extracted by applying Google's natural language AI technology. Next, as the core of the research, a method is proposed for finding core metadata that reflects priorities and content characteristics. To obtain differentiated metadata weights, importance was classified by applying the TF-IDF calculation method. The experiment successfully yielded weight data. The string metadata obtained in this study, when combined with future studies on string similarity measurement, can become the basis for securing sophisticated content recommendation metadata for content services provided to other media.
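
The TF-IDF weighting mentioned in the abstract can be sketched as follows. This is a textbook variant (term frequency normalized by document length, natural-log inverse document frequency), not necessarily the exact formula used in the study; the function name `tf_idf` and the toy caption tokens are assumptions, and each document is assumed to be non-empty.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter()                      # number of documents containing each term
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights


# Toy data (assumed): keyword candidates extracted from three programs' captions.
captions = [
    ["science", "experiment", "science", "kids"],
    ["history", "kids", "documentary"],
    ["science", "kids", "quiz"],
]
weights = tf_idf(captions)
best_for_first = max(weights[0], key=weights[0].get)
```

A term that occurs in every program, such as "kids" here, gets weight 0, while a term specific to one program is ranked highest for it.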

Robot Design Trend Analysis using the Interactive Mapping Method (인터랙티브 매핑 기법을 활용한 로봇 디자인 트랜드 분석)

  • Seo, Jong-Hwan;Byeon, Jae-Hyeong;Kim, Myeong-Seok
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2007.05a / pp.164-167 / 2007
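  • Mapping images onto a 2D plane has often been used as a way to understand design trends early in the design process. This type of analysis is also needed in the robot design process. This study focuses on the development and use of an interactive map as a more advanced method for robot design analysis. First, as base data for the mapping, a heuristic evaluation and a user survey were conducted on robot design samples. The results were converted to a linear scale, and on that basis an interactive mapping tool was developed using Macromedia Flash. The interactive map developed in this study can present 105 maps, each composed of two factors drawn from 6 keywords that represent objective characteristics of robot design and 9 subjective preferences by user type. As the designer freely selects two different factors, an image map with the two selected factors as its axes is automatically constructed and displayed. By conducting an actual case study with this interactive map, the study showed that designers can obtain more varied findings and intuitive insights. This method is considered more objective and efficient than conventional direct 2D mapping.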
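
The figure of 105 maps is consistent with choosing 2 axes out of 6 + 9 = 15 factors, since C(15, 2) = 105. The short check below enumerates those pairs; this reading of the count is an assumption, and the factor names are placeholders because the abstract does not list them.

```python
from itertools import combinations

# Placeholder names (assumed): the abstract gives only the counts, 6 and 9.
objective = [f"keyword_{i}" for i in range(1, 7)]        # 6 objective keywords
subjective = [f"preference_{i}" for i in range(1, 10)]   # 9 subjective preferences

# Every unordered pair of distinct factors defines the two axes of one map.
axis_pairs = list(combinations(objective + subjective, 2))
map_count = len(axis_pairs)
```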

An Entity-centric Integrated Search System Using URI (URI를 이용한 개체 중심적 통합 검색 시스템)

  • Jung, Han-Min;Lee, Mi-Kyoung;Sung, Won-Kyung
    • Journal of KIISE:Software and Applications / v.35 no.7 / pp.405-416 / 2008
  • To overcome the limitations of keyword-based integrated search, this study presents an entity-centric integrated search method using a URI scheme. Our system generates entity pages by analyzing the user's keyword and the instances matched with it, selecting the optimal entity type, and calling unit services simultaneously. Topic information extracted from articles is propagated to persons, institutions, and locations by reasoning, in order to provide topic-centric information. Through comparative experiments based on search results and usability tests, we showed that this approach is superior to the keyword-based integrated search offered by CiteSeer and Google Scholar.

A Study on the Development of Ontology based on the Jewelry Brand Information (귀금속.보석 상품정보 온톨로지 구축에 관한 연구)

  • Lee, Ki-Young
    • Journal of the Korea Society of Computer and Information / v.13 no.7 / pp.247-256 / 2008
  • This research develops a product retrieval system that simplifies communication by applying intelligent agent technology based on an automatically created domain ontology, as a solution to the problems of e-commerce systems that search web documents with a simple keyword. To build a standardized ontology, representative terms are extracted based on the classification information of the international product classification code (UNSPSC) and jewelry websites, and are applied to an analogy-relationship thesaurus. Intelligent agent technology is applied at the retrieval stage to help users collect information efficiently, by designing and developing an e-commerce system supported by the semantic web. Moreover, a user profile is designed to personalize the search environment, providing a personalized retrieval agent and a retrieval environment with an inference function, so that information can be collected quickly and searched accurately.
