Title/Summary/Keyword: machine-readable

84 search results

A Study on the Interoperability of INDECS and Metadata (INDECS와 기존 메타데이터간의 상호운용성에 관한 연구)

  • 윤세진;오경묵;황상규
    • Proceedings of the Korean Society for Information Management Conference / 2001.08a / pp.131-134 / 2001
  • This study compares the data elements of INDECS (Interoperability of Data in E-Commerce Systems), an e-commerce metadata scheme that has been adopted as the metadata for DOI and is being actively studied, with those of the traditional metadata formats MARC21 (MAchine Readable Cataloging) and Dublin Core, and investigates ways to provide interoperability between these different metadata schemes.

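The element-level comparison this study describes is essentially a metadata crosswalk. Below is a minimal Python sketch under stated assumptions: the Dublin Core to MARC21 pairs follow the well-known Library of Congress crosswalk (e.g., Title ↔ 245 $a), while the INDECS-style labels are hypothetical placeholders, since the paper's actual element tables are not reproduced here.

```python
# A toy crosswalk: Dublin Core element -> (MARC21 field, INDECS-style label).
# The DC/MARC pairs follow the LC crosswalk; the INDECS labels are hypothetical.
CROSSWALK = {
    "title":   ("245 $a", "resource.name"),
    "creator": ("100 $a", "party.name"),
    "date":    ("260 $c", "event.time"),
    "subject": ("650 $a", "resource.category"),
}

def dc_to_marc(record: dict) -> dict:
    """Map a Dublin Core record onto MARC21-style field tags."""
    return {CROSSWALK[k][0]: v for k, v in record.items() if k in CROSSWALK}

print(dc_to_marc({"title": "INDECS interoperability", "date": "2001"}))
# {'245 $a': 'INDECS interoperability', '260 $c': '2001'}
```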

Developing a Multi-purpose Ecotoxicity Database Model and Web-based Searching System for Ecological Risk Assessment of EDCs in Korea (웹 기반 EDCs 생태 독성 자료베이스 모델 및 시스템 개발)

  • Kwon, Bareum;Lee, Hunjoo
    • Journal of Environmental Health Sciences / v.43 no.5 / pp.412-421 / 2017
  • Objectives: To establish a system for the integrated risk assessment of EDCs in Korea, an infrastructure for providing toxicity data on ecological media is needed. Some systems provide soil ecotoxicity databases along with aquatic ecotoxicity information, but a well-structured ecotoxicity database system is still lacking. Methods: Aquatic and soil ecotoxicological information was collected by a toxicologist based on a human-readable data (HRD) format that we provided for gathering ecotoxicity data. Anomalies in these data were removed according to database normalization theory, and the data were then cleaned and encoded to establish a machine-readable data (MRD) ecotoxicity database system. Results: We developed a multi-purpose ecotoxicity database model focusing on EDCs, ecological species, and toxic effects, and constructed a web-based data searching system to retrieve, extract, and download the data with greater availability. Conclusions: The results of our study will contribute to decision-making as a tool for efficient ecological risk assessment of EDCs in Korea.
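
A minimal sketch of the HRD-to-MRD normalization step the Methods section describes, assuming a simplified relational schema; the table layout, column names, and the sample endpoint values are illustrative, not the study's actual database model.

```python
import sqlite3

# One flat "human-readable" row is split into normalized tables
# (chemicals, species, endpoints), removing the redundancy anomalies
# that normalization theory targets. All names/values are illustrative.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE chemical (id INTEGER PRIMARY KEY, cas_no TEXT UNIQUE, name TEXT);
CREATE TABLE species  (id INTEGER PRIMARY KEY, sci_name TEXT UNIQUE, medium TEXT);
CREATE TABLE endpoint (id INTEGER PRIMARY KEY,
                       chemical_id INTEGER REFERENCES chemical(id),
                       species_id  INTEGER REFERENCES species(id),
                       effect TEXT, value_mg_l REAL);
""")

hrd_row = {"cas_no": "80-05-7", "name": "Bisphenol A",
           "sci_name": "Daphnia magna", "medium": "aquatic",
           "effect": "EC50", "value_mg_l": 10.2}   # illustrative value

chem_id = con.execute("INSERT INTO chemical (cas_no, name) VALUES (?, ?)",
                      (hrd_row["cas_no"], hrd_row["name"])).lastrowid
sp_id = con.execute("INSERT INTO species (sci_name, medium) VALUES (?, ?)",
                    (hrd_row["sci_name"], hrd_row["medium"])).lastrowid
con.execute("INSERT INTO endpoint (chemical_id, species_id, effect, value_mg_l) "
            "VALUES (?, ?, ?, ?)",
            (chem_id, sp_id, hrd_row["effect"], hrd_row["value_mg_l"]))

print(con.execute("""SELECT c.name, s.sci_name, e.effect, e.value_mg_l
                     FROM endpoint e
                     JOIN chemical c ON c.id = e.chemical_id
                     JOIN species  s ON s.id = e.species_id""").fetchall())
```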

Interpretation of Noun Sequence using Semantic Information Extracted from Machine Readable Dictionary and Corpus (기계가독형사전과 코퍼스에서 추출한 의미정보를 이용한 명사열의 의미해석)

  • 이경순;김도완;김길창;최기선
    • Korean Journal of Cognitive Science / v.12 no.1_2 / pp.11-24 / 2001
  • The interpretation of a noun sequence is to find the semantic relations between the nouns in the sequence, which requires semantic knowledge about words and the relations between words. In this paper, we propose a method to interpret the semantic relation between nouns in a noun sequence. We extract semantic information from a machine readable dictionary (MRD) and a corpus using regular expressions, and interpret the semantic relation of the noun sequence based on the extracted information, together with verb subcategorization information. Previous research used semantic knowledge extracted only from an MRD, whereas our method uses an MRD, a corpus, and subcategorization information. Experimental results show that our method improves the accuracy rate by 40.30% and the coverage rate by 12.73% over previous work.

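A small sketch of the kind of pattern-based extraction the paper describes, assuming English-style MRD glosses for readability (the study works with a Korean MRD and corpus); the patterns and dictionary entries are invented for illustration.

```python
import re

# Toy MRD glosses: definitions whose surface patterns encode relations.
mrd = {
    "sparrow": "a small bird of the finch family",
    "wing":    "a part of a bird used for flying",
}

# Regular expressions mapping gloss patterns to semantic relations.
PATTERNS = [
    (re.compile(r"\ba kind of (\w+)"), "is-a"),
    (re.compile(r"\ba part of (?:a |the )?(\w+)"), "part-of"),
    (re.compile(r"\ba small (\w+)"), "is-a"),
]

def extract_relations(word, gloss):
    """Yield (headword, relation, related-word) triples found in a gloss."""
    for pat, rel in PATTERNS:
        m = pat.search(gloss)
        if m:
            yield (word, rel, m.group(1))

for w, g in mrd.items():
    print(list(extract_relations(w, g)))
# [('sparrow', 'is-a', 'bird')]
# [('wing', 'part-of', 'bird')]
```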

Extracting Korean-English Parallel Sentences from Wikipedia (위키피디아로부터 한국어-영어 병렬 문장 추출)

  • Kim, Sung-Hyun;Yang, Seon;Ko, Youngjoong
    • Journal of KIISE:Software and Applications / v.41 no.8 / pp.580-585 / 2014
  • This paper conducts a variety of experiments on extracting Korean-English parallel sentences from Wikipedia data, drawing on methods previously proposed for other language pairs. We use two approaches: the first uses translation probabilities extracted from existing resources such as the Sejong parallel corpus, and the second uses dictionaries such as a Wiki dictionary built from Wikipedia titles together with machine readable dictionaries (MRDs). Experimental results show a significant improvement for the system using Wikipedia data over one using only the existing resources, achieving an outstanding performance of 57.6% F1-score. We additionally conduct experiments using a topic model; although this yields a relatively lower performance, an F1-score of 51.6%, it appears worthy of further study.
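
A minimal sketch of the second, dictionary-based approach, assuming a toy Korean-English lexicon; a real system would combine this with the Sejong-derived translation probabilities the paper mentions.

```python
# Toy Korean -> English lexicon standing in for Wiki-title/MRD dictionaries.
lexicon = {"위키피디아": "wikipedia", "문장": "sentence", "추출": "extraction"}

def dict_score(ko_tokens, en_tokens):
    """Fraction of Korean tokens whose dictionary translation appears
    in the English sentence: a crude parallel-sentence alignment score."""
    en = {t.lower() for t in en_tokens}
    hits = sum(1 for t in ko_tokens if lexicon.get(t) in en)
    return hits / max(len(ko_tokens), 1)

ko = ["위키피디아", "문장", "추출"]
en = ["Extraction", "of", "sentence", "pairs", "from", "Wikipedia"]
print(dict_score(ko, en))  # 1.0 for this toy pair -> accept as parallel
```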

A Study on Converting Bibliographic Data of Public Libraries Expressed in KORMARC into BIBFRAME

  • Kim, Joo-Yong;Shin, Pan-Seop
    • Journal of the Korea Society of Computer and Information / v.26 no.11 / pp.139-147 / 2021
  • BIBFRAME, which is attracting attention in the library world as an alternative to the MAchine Readable Cataloging (MARC) format, presents a new bibliographic data model for the open web environment while maintaining compatibility with existing data. To convert KORMARC (the Korean data model of MARC) records into BIBFRAME, we extract 25 key fields by analyzing the 5,000 most recent bibliographic records from Nowon-gu Library in Seoul. The extracted core fields are classified into three types according to their compatibility with MARC 21, and conversion rules are defined for each type. In addition, we implement an open-source-based converter to perform the KORMARC-to-BIBFRAME conversion. As a basic study of this conversion, this work is meaningful in that it analyzes the latest KORMARC records actually in use, defines conversion rules, and carries out a BIBFRAME conversion.
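
A minimal sketch of rule-per-field conversion in the spirit of the converter described above; the tags follow MARC 21 conventions (245 = title statement, 100 = main personal name), but the rule table and the abbreviated BIBFRAME property names are illustrative assumptions, not the paper's 25 extracted fields or its actual rules.

```python
# One conversion rule per KORMARC tag: subfields dict -> (property, value).
RULES = {
    "245": lambda sf: ("bf:title", sf.get("a", "")),
    "100": lambda sf: ("bf:contribution", sf.get("a", "")),
    "260": lambda sf: ("bf:provisionActivity", sf.get("b", "")),
}

def convert(kormarc_record):
    """Apply per-tag rules; tags without a rule are left unconverted."""
    triples = []
    for tag, subfields in kormarc_record:
        rule = RULES.get(tag)
        if rule:
            triples.append(rule(subfields))
    return triples

record = [("245", {"a": "한국 서지 데이터"}), ("100", {"a": "Kim, Joo-Yong"})]
print(convert(record))
# [('bf:title', '한국 서지 데이터'), ('bf:contribution', 'Kim, Joo-Yong')]
```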

A Java Virtual Machine for Sensor Networks (센서 네트워크를 위한 자바 가상 기계)

  • Kim, Seong-Woo;Lee, Jong-Min;Lee, Jung-Hwa;Shin, Jin-Ho
    • Journal of Institute of Control, Robotics and Systems / v.14 no.1 / pp.13-20 / 2008
  • A sensor network consists of a large number of sensor nodes distributed in the environment being sensed and controlled. The resource-constrained sensor nodes tend to have varied and heterogeneous architectures, so it is important to make their software environment platform-independent and reprogrammable. In this paper, we present BeeVM, a Java operating system designed for sensor networks. BeeVM offers a platform-independent Java programming environment with an efficiently executable file format and a set of class APIs for basic operating functions, sensing, and wireless networking. BeeVM's high-level native interface and layered network subsystem allow complex programs for sensor networks to be short and readable. Our platform has been ported to two currently popular hardware platforms, and we show its effectiveness through the evaluation of a simple application.
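
The core idea behind such a VM's platform independence, one bytecode interpreted identically on heterogeneous hardware, can be shown with a toy stack-machine loop; the Python sketch below uses opcodes invented for illustration and has no relation to BeeVM's actual file format or instruction set.

```python
# A toy stack-machine interpreter: the same bytecode list runs on any
# host that has this loop, which is the essence of a portable node VM.
PUSH, ADD, PRINT, HALT = range(4)

def run(code):
    stack, pc = [], 0
    while True:
        op = code[pc]; pc += 1
        if op == PUSH:                 # push the following literal
            stack.append(code[pc]); pc += 1
        elif op == ADD:                # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == PRINT:              # stand-in for a sensing/IO syscall
            print(stack.pop())
        elif op == HALT:
            return

run([PUSH, 2, PUSH, 3, ADD, PRINT, HALT])  # prints 5
```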

A Study of Ontology-based Cataloguing System Using OWL (OWL을 이용한 온톨로지 기반의 목록시스템 설계 연구)

  • 이현실;한성국
    • Journal of the Korean Society for Information Management / v.21 no.2 / pp.249-267 / 2004
  • Although MARC can define detailed cataloguing data, it has a complex structure and framework for representing bibliographic information. On account of these idiosyncratic features, XML DTD or RDF/S, which support only a simple hierarchy of conceptual vocabularies, cannot capture the MARC formalism effectively. This study implements a bibliographic ontology by abstracting the conceptual relationships between the bibliographic vocabularies of MARC. The ontology is formalized in OWL, which can represent logical relations between conceptual elements and specify cardinality and property-value restrictions. The bibliographic ontology developed in this study will provide metadata for cataloguing data and resolve compatibility problems between cataloguing systems, and it can also contribute to the development of next-generation bibliographic information systems using Semantic Web services.
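
A minimal sketch of the kind of OWL constraint the abstract refers to, written with the rdflib Python library: a cardinality restriction stating that a Book has exactly one title. The bib: namespace and the class and property names are illustrative, not the ontology defined in the study.

```python
from rdflib import BNode, Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS, XSD

BIB = Namespace("http://example.org/bib#")  # illustrative namespace
g = Graph()
g.bind("bib", BIB)

# A class and a datatype property, roughly "a title field on a book".
g.add((BIB.Book, RDF.type, OWL.Class))
g.add((BIB.title, RDF.type, OWL.DatatypeProperty))
g.add((BIB.title, RDFS.domain, BIB.Book))
g.add((BIB.title, RDFS.range, XSD.string))

# An owl:Restriction carrying the cardinality constraint: exactly 1 title.
r = BNode()
g.add((r, RDF.type, OWL.Restriction))
g.add((r, OWL.onProperty, BIB.title))
g.add((r, OWL.cardinality, Literal(1, datatype=XSD.nonNegativeInteger)))
g.add((BIB.Book, RDFS.subClassOf, r))

print(g.serialize(format="turtle"))
```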

Implementation of Nondeterministic Compiler Using Monad (모나드를 이용한 비결정적 컴파일러 구현)

  • Byun, Sugwoo
    • Journal of the Korea Society of Computer and Information / v.19 no.2 / pp.151-159 / 2014
  • We discuss the implementation of a compiler for an imperative programming language, using monads in Haskell. The compiler involves a recursive-descent parser that performs nondeterministic parsing, in which backtracking occurs to try other rules when the application of a production rule fails to parse the input string. Haskell has strong facilities for parsing: its algebraic data types represent abstract syntax trees in a natural way, and code written with monadic parsing is so concise that it is highly readable and significantly smaller than in other languages. We also deal with the runtime environment of the assembler and with code generation targeting a Stack-Assembly language based on a stack machine.
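
The nondeterministic, backtracking parsing described above is what Haskell's list monad gives a parser almost for free. As a rough cross-language analogue, this sketch implements the classic "list of successes" style with Python generators: a parser yields every (value, remaining input) pair, failed alternatives simply yield nothing, and `seq` plays the role of monadic bind.

```python
# "List of successes" parsing: each parser maps an input string to all
# possible (value, remaining_input) pairs, so alternatives are explored
# nondeterministically and dead ends just produce no results.
def char(c):
    def p(s):
        if s.startswith(c):
            yield c, s[1:]
    return p

def alt(p, q):            # nondeterministic choice: try both rules
    def r(s):
        yield from p(s)
        yield from q(s)
    return r

def seq(p, q):            # sequencing, the analogue of monadic bind
    def r(s):
        for v1, rest1 in p(s):
            for v2, rest2 in q(rest1):
                yield (v1, v2), rest2
    return r

ab_or_ac = alt(seq(char("a"), char("b")), seq(char("a"), char("c")))
print(list(ab_or_ac("ac")))  # [(('a', 'c'), '')] -- backtracked past "ab"
```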

Detection of Malicious PDF based on Document Structure Features and Stream Objects

  • Kang, Ah Reum;Jeong, Young-Seob;Kim, Se Lyeong;Kim, Jonghyun;Woo, Jiyoung;Choi, Sunoh
    • Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.85-93 / 2018
  • In recent years, a growing number of attacks have distributed document-based malicious code by exploiting vulnerabilities in document files. Because document-type malware is not an executable file itself, it can easily bypass existing security programs, so a detection model is needed. In this study, we extract the main features from the document structure and from the JavaScript contained in stream objects. In addition, when JavaScript is present, we extract keywords that occur frequently in malicious code, such as function names, reserved words, and readable strings in the script. We then train a machine learning model that distinguishes normal from malicious documents. To make the detector hard to bypass, we aim for good performance with a black-box-type algorithm. For the experiment, a much larger set of documents than in previous studies is analyzed. Experimental results show a 98.9% detection rate across three different algorithms; SVM, a black-box-type algorithm that is difficult to evade through obfuscation, shows much higher performance than in previous studies.
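
A minimal sketch of the classification step, assuming keyword-frequency features feeding an SVM via scikit-learn; the JavaScript tokens and labels below are invented stand-ins for the structural and stream-object features the paper actually extracts.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Token streams extracted from PDF stream objects (invented examples).
docs = [
    "eval unescape String.fromCharCode heap spray",   # malicious-looking JS
    "function getPageCount return length",            # benign-looking JS
    "unescape eval shellcode payload",
    "var title = document.title",
]
labels = [1, 0, 1, 0]  # 1 = malicious, 0 = normal

# Keyword-frequency vectors -> SVM with an RBF kernel (a black-box model).
model = make_pipeline(CountVectorizer(), SVC(kernel="rbf"))
model.fit(docs, labels)
print(model.predict(["eval unescape payload"]))  # expect [1]
```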

An Algorithm for Predicting the Relationship between Lemmas and Corpus Size

  • Yang, Dan-Hee;Gomez, Pascual Cantos;Song, Man-Suk
    • ETRI Journal / v.22 no.2 / pp.20-31 / 2000
  • Much research on natural language processing (NLP), computational linguistics, and lexicography has relied on linguistic corpora. In recent years, many organizations around the world have been constructing their own large corpora to achieve corpus representativeness and/or linguistic comprehensiveness. However, there is no reliable guideline as to how large a machine readable corpus should be to support practical NLP software and/or complete dictionaries for human and computational use. To shed some new light on this issue, we reveal the flaws of several previous attempts to predict corpus size, especially those using pure regression or curve-fitting methods. To overcome these flaws, we contrive a new mathematical tool, a piecewise curve-fitting algorithm, and then suggest how to determine the tolerance error of the algorithm for good prediction using a specific corpus. Finally, we show experimentally that the algorithm is valid, accurate, and very reliable. We are confident that this study can contribute to solving some inherent problems of corpus linguistics, such as corpus predictability, compiling methodology, corpus representativeness, and linguistic comprehensiveness.

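The abstract does not spell out the algorithm, so the sketch below makes two explicit assumptions: a Heaps-law-style lemma-growth model V(N) = k·N^β (fitted in log-log space), standing in for the paper's actual formulation, and a greedy rule that grows each piece until the relative fit error exceeds a tolerance, echoing the paper's tolerance-error notion. All numbers are illustrative.

```python
import numpy as np

def fit_piece(N, V):
    """Fit log V = log k + beta * log N (linearized growth model)."""
    beta, logk = np.polyfit(np.log(N), np.log(V), 1)
    return np.exp(logk), beta

def piecewise_fit(sizes, lemmas, tol=0.05):
    """Greedy piecewise fit: extend each piece until the maximum
    relative error of the fitted curve exceeds the tolerance."""
    pieces, start = [], 0
    while start < len(sizes) - 1:
        end, params = start + 2, None
        while end <= len(sizes):
            k, b = fit_piece(sizes[start:end], lemmas[start:end])
            err = np.max(np.abs(k * sizes[start:end] ** b
                                / lemmas[start:end] - 1))
            if err > tol and params is not None:
                break                      # fit degraded: close this piece
            params, end = (k, b), end + 1
        pieces.append((sizes[start], sizes[end - 2], *params))
        start = end - 2                    # adjacent pieces share a boundary
    return pieces

# Illustrative corpus sizes (tokens) and lemma counts, not real data.
sizes  = np.array([1e4, 5e4, 1e5, 5e5, 1e6, 5e6])
lemmas = np.array([2.1e3, 6.0e3, 9.1e3, 2.4e4, 3.5e4, 8.0e4])
print(piecewise_fit(sizes, lemmas))
```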