• Title/Summary/Keyword: Metadata extraction

A Case Study on Metadata Management for User Access to Data Warehouse - Suggestions about metadata management using K-bank case - (사용자의 데이터 웨어하우스 접근과 활용을 위한 메타데이터 관리 사례 - K 은행 사례를 통한 메타데이터 관리의 시사점 -)

  • Kim, Gi-Un
    • Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.225-233 / 2007
  • This paper uses a taxonomy of three metadata schemas (extraction metadata, warehouse metadata, and user access metadata) to investigate how to manage metadata, and which metadata to manage, in a data warehouse. In particular, it focuses on two kinds of metadata (warehouse metadata and user access metadata) and studies a case of metadata management in a real business setting.

Automatic Extraction of Metadata Information for Library Collections

  • Yang, Gi-Chul;Park, Jeong-Ran
    • International Journal of Advanced Culture Technology / v.6 no.2 / pp.117-122 / 2018
  • As evidenced by rapidly growing digital repositories and web resources, automatic metadata generation is becoming ever more critical, especially considering the costly and complex operation of manual metadata creation. Automatic metadata generation also lends itself to consistent metadata application. In this sense, metadata quality and interoperability can be enhanced by utilizing a mechanism for automatic metadata generation. In this article, a mechanism for automatic metadata extraction called ExMETA is introduced to alleviate issues of inconsistent metadata application and semantic interoperability across ever-growing digital collections. The conceptual graph, a formal language that represents the meaning of natural language sentences, is used in ExMETA as a mediation mechanism that enhances metadata quality by resolving semantic ambiguities caused by isolating a metadata element and its definition from the relevant context. Hence, automatic metadata generation using ExMETA can be a good way of enhancing metadata quality and semantic interoperability.

Extracting and Validating Metadata in Electronic Records (전자기록물의 메타데이터 추출 및 비교 검증 기술 연구)

  • Choi, Joo Ho;Lee, Jae Young
    • Journal of Korean Society of Archives and Records Management / v.12 no.1 / pp.7-32 / 2012
  • When migrating electronic records, it is also important to validate the required metadata in the records and to verify it against the metadata in the documents. This paper presents a method and implements a tool that extracts data from files in various formats and uses it to validate the metadata associated with those files in electronic records. Unlike other metadata extraction tools, especially those developed in foreign countries, the tool takes into account the standard document form used by the Korean government and extracts metadata from the content of files. The tool compares the extracted data with the encapsulated metadata for validation.
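
The comparison step described in this abstract could look roughly like the following. This is a minimal sketch and not the authors' tool: the function names, the field names, the status labels, and the whitespace and case normalization are all assumptions made for illustration.

```python
def normalize(value):
    """Collapse whitespace and ignore case so trivial differences do not count (assumed rule)."""
    return " ".join(str(value).split()).casefold()


def validate_metadata(extracted, encapsulated, required):
    """Compare values extracted from file content with the encapsulated metadata.

    Returns one status per required field (hypothetical labels):
    'missing' (not in the encapsulated metadata), 'unverified' (nothing
    extracted to compare against), 'match', or 'mismatch'.
    """
    report = {}
    for field in required:
        if field not in encapsulated:
            report[field] = "missing"
        elif field not in extracted:
            report[field] = "unverified"
        elif normalize(extracted[field]) == normalize(encapsulated[field]):
            report[field] = "match"
        else:
            report[field] = "mismatch"
    return report


# Illustrative values only; not taken from the paper.
extracted = {"title": "Budget Plan 2012", "author": "Records Office"}
encapsulated = {"title": "budget  plan 2012", "author": "Archive Office"}
report = validate_metadata(extracted, encapsulated, ["title", "author", "date"])
```

Separating "missing" from "mismatch" is a design choice of this sketch; it lets a migration report distinguish absent required metadata from metadata that contradicts the file content.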

Recognizing Emotional Content of Emails as a byproduct of Natural Language Processing-based Metadata Extraction (이메일에 포함된 감성정보 관련 메타데이터 추출에 관한 연구)

  • Paik, Woo-Jin
    • Journal of the Korean Society for information Management / v.23 no.2 / pp.167-183 / 2006
  • This paper describes a metadata extraction technique based on natural language processing (NLP) that extracts personalized information from email communications between financial analysts and their clients. Personalization means connecting users with content in a personally meaningful way to create, grow, and retain online relationships. It often results in the creation of user profiles that store individuals' preferences regarding goods or services offered by various e-commerce merchants. We developed an automatic metadata extraction system designed to process textual data such as emails, discussion group postings, or chat group transcriptions. The focus of this paper is the recognition, as metadata, of emotional content such as mood and urgency embedded in business communications.

Program Development for Automatic Extraction and Transformation of Standard Metadata of Geo-spatial Data (공간정보 표준 메타데이터 추출 및 변환 프로그램 개발)

  • Han, Sun-Mook;Lee, Ki-Won
    • Korean Journal of Remote Sensing / v.26 no.5 / pp.549-559 / 2010
  • Metadata is a crucial factor in building and operating geo-spatial information systems. International and domestic standardization organizations have therefore developed and distributed geo-based standard metadata to meet public demand. However, because metadata is composed of complicated elements and requires XML storage and management, individual organizations that implement and operate practical application systems tend to define and use their own metadata specifications. In this study, a metadata extraction program was developed that extracts metadata elements directly from geo-based file formats and processes them into XML, so that standard metadata such as ISO/TC 19115, TTAS.KO-10.0139, and TTAS.IS-19115 can be used easily. Furthermore, geo-based image sets are handled with the additional metadata of ISO/TC 19115-2. Metadata transformation is also needed because definitions are inconsistent or non-corresponding among standard metadata; the program therefore implements transformation modules for interoperable use between standard metadata specifications. The program handles widely used data formats, but it can be extended to other formats and other metadata specifications, and this kind of development is expected to increase the availability of standard metadata.
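
The step of processing extracted elements into XML can be illustrated with Python's standard `xml.etree.ElementTree` module. This is a sketch under stated assumptions: the element names (`title`, `format`, `westBound`) and the root tag are hypothetical placeholders, not the element names of ISO/TC 19115 or the TTA standards, and the paper's own program is not reproduced here.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(elements, root_tag="metadata"):
    """Serialize a flat {element name: value} mapping to an XML string.

    The tag names are illustrative only; a real program would follow the
    element structure of the target metadata standard.
    """
    root = ET.Element(root_tag)
    for name, value in elements.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical values, as if read from a geo-based image file header.
extracted = {"title": "Sample scene", "format": "GeoTIFF", "westBound": 126.5}
xml_text = metadata_to_xml(extracted)
```

A transformation module of the kind the abstract mentions could then be approximated by renaming keys with a mapping table before serialization, although the actual correspondences between standards are not given in the source.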

Metadata Processing Technique for Similar Image Search of Mobile Platform

  • Seo, Jung-Hee
    • Journal of information and communication convergence engineering / v.19 no.1 / pp.36-41 / 2021
  • Text-based image retrieval is not only cumbersome, as it requires the user to enter keywords manually, but is also limited in its semantic handling of keywords. Content-based image retrieval, by contrast, enables visual processing by a computer and addresses the problems of text retrieval more fundamentally. Vision applications such as the extraction and mapping of image characteristics require processing a large amount of data in a mobile environment, which makes efficient power consumption difficult. Hence, an effective image retrieval method for mobile platforms is proposed herein. To give inserted keywords a visual meaning, keywords are extracted from the exchangeable image file format (Exif) metadata of images retrieved through content-based similar image retrieval and are then added automatically to images captured on mobile devices, which improves the efficiency of image retrieval. Additionally, users can manually add or modify keywords in the image metadata.

Automatic Generation of Bibliographic Metadata with Reference Information for Academic Journals (학술논문 내에서 참고문헌 정보가 포함된 서지 메타데이터 자동 생성 연구)

  • Jeong, Seonki;Shin, Hyeonho;Ji, Seon-Yeong;Choi, Sungphil
    • Journal of the Korean Society for Library and Information Science / v.56 no.3 / pp.241-264 / 2022
  • Bibliographic metadata can help researchers make effective use of the publications they need and grasp academic trends in their own fields. Manual creation of the metadata is costly and time-consuming, yet it is nontrivial to automate metadata construction effectively with rule-based methods because article forms and styles vary widely across publishers and academic societies. Therefore, this study proposes a two-step extraction process based on rules and deep neural networks for generating bibliographic metadata of scientific articles. The extraction target areas in articles were identified using a deep neural network-based model, and the details in those areas were then analyzed and subdivided into the relevant metadata elements. The proposed model also includes a model for generating reference summary information, which separates the end of the text from the starting point of the references, extracts individual references with an essential rule set, and identifies all the bibliographic items in each reference with a deep neural network. In addition, to confirm the feasibility of a model that generates the bibliographic information of academic papers without pre- and post-processing, we conducted an in-depth comparative experiment with various settings and configurations. In the experiment, the method proposed in this paper showed higher performance.

Metadata extraction using AI and advanced metadata research for web services (AI를 활용한 메타데이터 추출 및 웹서비스용 메타데이터 고도화 연구)

  • Sung Hwan Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.499-503 / 2024
  • Broadcasting programs are provided to various media such as Internet replay, OTT, and IPTV services, as well as through the broadcaster's own channels. In this case, it is very important to provide search keywords that represent the characteristics of the content well. Broadcasters mainly enter the main keywords manually during the production and archiving processes. This method yields too little core metadata and also reveals limitations in recommending and using content in other media services. This study supports securing a large amount of metadata by utilizing closed caption data pre-archived through the DTV closed captioning server developed at EBS. First, core metadata was extracted automatically by applying Google's natural language AI technology. Next, as the core contribution of the research, a method is proposed for finding core metadata that reflects priorities and content characteristics. To obtain differentiated metadata weights, importance was classified by applying the TF-IDF calculation method. The experiment yielded successful weight data. The string metadata obtained in this study, when combined with future string similarity measurement studies, becomes the basis for securing sophisticated content recommendation metadata from content services provided to other media.
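
The TF-IDF weighting mentioned in this abstract can be sketched as follows. The abstract does not state which TF-IDF variant was used, so this example assumes the common form, term frequency within a document multiplied by log(N / document frequency); the function name and the caption keywords are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / document length) * log(N / df),
    where N is the number of documents and df is the number of documents
    that contain the term.
    """
    n_docs = len(documents)
    # Document frequency: count each term once per document.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical keywords taken from the captions of three programs.
captions = [
    ["science", "experiment", "science", "students"],
    ["history", "students", "museum"],
    ["science", "space", "students"],
]
weights = tf_idf(captions)
top_keyword = max(weights[0], key=weights[0].get)
```

With this variant, a keyword that appears in every program (here `students`) receives a weight of zero, which is how the weighting separates program-specific keywords from generic ones.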

Comparison of Performance Factors for Automatic Classification of Records Utilizing Metadata (메타데이터를 활용한 기록물 자동분류 성능 요소 비교)

  • Young Bum Gim;Woo Kwon Chang
    • Journal of the Korean Society for information Management / v.40 no.3 / pp.99-118 / 2023
  • The objective of this study is to identify performance factors in the automatic classification of records by utilizing metadata that contains the contextual information of records. For this study, we collected 97,064 records of original textual information from Korean central administrative agencies in 2022. Various classification algorithms, data selection methods, and feature extraction techniques are applied and compared with the intent to discern the optimal performance-inducing technique. The study results demonstrated that among classification algorithms, Random Forest displayed higher performance, and among feature extraction techniques, the TF method proved to be the most effective. The minimum data quantity of unit tasks had a minimal influence on performance, and the addition of features positively affected performance, while their removal had a discernible negative impact.

A Study on Extraction of Metadata Elements Based on ISAD Rules for Official Document (ISAD에 기반한 공문서 메타데이터 요소 설정에 관한 연구)

  • 남궁황
    • Journal of the Korean Society for information Management / v.21 no.1 / pp.231-251 / 2004
  • This study aims to collect and manage metadata at the creation stage in order to effectively manage and use official documents, which are typical, ordinary records. To do so, data elements are extracted by analyzing the structure of the official document format. Metadata elements that reflect the creation background, the publisher's intention, and the characteristics of official documents are then selected by evaluating and comparing the extracted elements with the data elements defined in the ISAD rules. The result can serve as a draft for constructing a standardized metadata structure for records in Korea.