• Title/Summary/Keyword: database schema

Manufacturing Digital Map Ver. 2.0 with Increased Visual Information (시각적 정보력이 향상된 수치지도 Ver. 2.0 제작)

  • Park Kyeong Sik;Lee Jae Kee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.23 no.3 / pp.221-231 / 2005
  • Although Digital Map Ver. 2.0 is well suited to GIS, it lost much of the visual informativeness of its predecessor, and the ease of producing paper maps from it declined. Addressing these problems, this research aims to produce a Digital Map Ver. 2.0 that satisfies the requirements of GIS, with a geometric and logical data structure, while retaining the informative power of Ver. 1.0. For the study, the topographic codes, colors, and code priority orders of the paper relief map, Digital Map Ver. 1.0, and Digital Map Ver. 2.0 were analyzed. For topographic features with diverse expressions, the portrayal of Digital Map Ver. 2.0 was changed to fit the regulations of map portrayal. For topographic code priority, the rule is to arrange codes in the same order as they appear on the ground, but a special code was introduced for cases where this locational order changes. In line with the aim of this study, the regulations of map portrayal are observed for elements tied to subjective perception, such as color, and data construction is given priority whenever the portrayal of a topographic feature and the schema of the GIS database conflict.

Storage and Retrieval of XML Documents Without Redundant Path Information (경로정보의 중복을 제거한 XML 문서의 저장 및 질의처리 기법)

  • Lee Hiye-Ja;Jeong Byeong-Soo;Kim Dae-Ho;Lee Young-Koo
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.663-672 / 2005
  • This paper proposes an approach that removes the redundancy of path information and uses an inverted index as an efficient way to store a large volume of XML documents and retrieve the desired information from them. An XML document is decomposed into nodes based on its tree structure and stored in relational tables according to node type, together with the path information from the root to each node. Existing methods that use path information store data for all element paths, which degrades retrieval performance as the data volume grows. Our approach stores data only for leaf-element paths, excluding internal element paths. Because the inverted index is built from leaf-element paths only, the posting lists per keyword are shorter than those of the existing methods. For the storage and retrieval of XML data, our approach requires neither the XML schema information of the documents nor any extension of the relational database. We demonstrate that our approach outperforms the existing approaches within the scope of our experiments.
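The leaf-path idea lends itself to a compact illustration. The following is a minimal sketch, assuming an in-memory SQLite store and whitespace tokenization (the paper's actual table layout and tokenizer are not reproduced here): only root-to-leaf paths are stored, and the inverted index maps keywords to those leaf nodes, keeping posting lists short.

```python
# Minimal sketch: store only leaf-element paths (not internal paths)
# in relational tables, with an inverted index from keywords to leaves.
# Table and function names are illustrative, not taken from the paper.
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leaf_node (id INTEGER PRIMARY KEY, path TEXT, value TEXT)")
conn.execute("CREATE TABLE inverted_index (keyword TEXT, node_id INTEGER)")

def store_leaves(xml_text):
    """Walk the tree, keeping only root-to-leaf paths."""
    root = ET.fromstring(xml_text)
    def walk(elem, path):
        children = list(elem)
        if not children:  # leaf element: store its path and text
            cur = conn.execute(
                "INSERT INTO leaf_node (path, value) VALUES (?, ?)",
                (path, elem.text or ""))
            for word in (elem.text or "").split():
                conn.execute("INSERT INTO inverted_index VALUES (?, ?)",
                             (word.lower(), cur.lastrowid))
        for child in children:
            walk(child, f"{path}/{child.tag}")
    walk(root, f"/{root.tag}")

store_leaves("<book><title>XML storage</title><author>Lee</author></book>")
# Keyword lookup touches only leaf paths, so posting lists stay small.
for row in conn.execute(
        "SELECT l.path, l.value FROM inverted_index i "
        "JOIN leaf_node l ON l.id = i.node_id WHERE i.keyword = ?",
        ("storage",)):
    print(row)  # ('/book/title', 'XML storage')
```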

A 3-Layered Information Integration System based on MDRs and Ontology (MDR과 온톨로지를 결합한 3계층 정보 통합 시스템)

  • Baik, Doo-Kwon;Choi, Yo-Han;Park, Sung-Kong;Lee, Jeong-Oog;Jeong, Dong-Won
    • The KIPS Transactions:PartD / v.10D no.2 / pp.247-260 / 2003
  • To share and standardize information, especially in database environments, a Metadata Registry (MDR) can be used to integrate heterogeneous databases within a particular domain. However, because data element representations differ between organizations, global information integration is not easy, and users searching for integrated information on the Web have limited means of obtaining schema information for the underlying source databases. To solve these problems, this paper presents a 3-Layered Information Integration System (LI2S) based on MDRs and an ontology. The purpose of the proposed architecture is to define an information integration model that combines the nature of the MDR standard specification with the ability of an ontology to express concepts and relations. Adopting agent technology in the proposed model plays a key role in supporting a hierarchical and independent information integration architecture. The ontology serves as a semantic network from which concepts are extracted from the user query and relationships between MDRs are established for each data element. An MDR and a knowledge base are used to resolve the discrepancies in data element representation between MDRs. Based on this architectural concept, LI2S was designed and implemented.
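As a rough illustration of the layering, the sketch below shows how an ontology of shared concepts can reconcile differently named data elements registered in two MDRs; every name and mapping here is hypothetical and not taken from LI2S, which additionally employs agents and a knowledge base.

```python
# Hypothetical sketch: an ontology layer maps a user's term to a shared
# concept, then to the element name each MDR registers for that concept,
# so one query can be rewritten for heterogeneous sources.
from dataclasses import dataclass, field

@dataclass
class MDR:
    """One organization's metadata registry: concept -> local element name."""
    name: str
    elements: dict = field(default_factory=dict)

# Ontology layer: shared concepts with synonyms used for query parsing.
ONTOLOGY = {"birth_date": {"birthday", "date of birth", "dob"}}

mdr_a = MDR("hospital", {"birth_date": "PATIENT_BIRTH_DT"})
mdr_b = MDR("insurer",  {"birth_date": "dob_yyyymmdd"})

def resolve(user_term, registries):
    """Map a user's term to a concept, then to each MDR's element."""
    concept = next((c for c, syns in ONTOLOGY.items()
                    if user_term in syns or user_term == c), None)
    if concept is None:
        return {}
    return {r.name: r.elements.get(concept) for r in registries}

print(resolve("dob", [mdr_a, mdr_b]))
# {'hospital': 'PATIENT_BIRTH_DT', 'insurer': 'dob_yyyymmdd'}
```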

Expanded Workflow Development for OSINT(Open Source Intelligence)-based Profiling with Timeline (공개정보 기반 타임라인 프로파일링을 위한 확장된 워크플로우 개발)

  • Kwon, Heewon;Jin, Seoyoung;Sim, Minsun;Kwon, Hyemin;Lee, Insoo;Lee, Seunghoon;Kim, Myuhngjoo
    • Journal of Digital Convergence / v.19 no.3 / pp.187-194 / 2021
  • OSINT (Open Source Intelligence), which is rapidly increasing on the surface web in various forms, can also be used for criminal investigations through profiling. This technique has become quite common in foreign investigative agencies such as those of the United States. In Korea, on the other hand, it is not widely used, and the quantity and quality of the information acquired vary greatly with the experience and knowledge of the investigator. Unlike Bazzell's well-known model, we designed a Korean-style OSINT-based profiling technique that considers the Korean web environment and provides timeline information, focusing on an improved workflow. A database schema that improves the efficiency of profiling is also presented. Using this, we can obtain search results that guarantee a certain level of quantity and quality, and the workflow can also serve as a standard training course. To increase the effectiveness and efficiency of criminal investigations using this technique, the legal basis needs to be strengthened and automation technologies introduced.
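Since the paper's own schema is not reproduced in the abstract, the following is a hypothetical sketch of what a timeline-oriented profiling schema could look like: each finding records its source and an event time, and an index supports chronological retrieval per target. All table and column names are illustrative.

```python
# Hypothetical timeline-oriented profiling schema (illustrative only).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE target (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE finding (
    id         INTEGER PRIMARY KEY,
    target_id  INTEGER REFERENCES target(id),
    source_url TEXT,        -- where on the surface web it was found
    category   TEXT,        -- e.g. account, address, employment
    content    TEXT,
    event_time TEXT         -- ISO-8601; drives the timeline view
);
CREATE INDEX idx_timeline ON finding (target_id, event_time);
""")

conn.execute("INSERT INTO target (id, name) VALUES (1, 'subject A')")
conn.execute(
    "INSERT INTO finding (target_id, source_url, category, content, event_time)"
    " VALUES (1, 'https://example.org/post', 'account', 'handle seen', '2020-05-01')")

# Timeline query: all findings for a target in chronological order.
for row in conn.execute(
        "SELECT event_time, category, content FROM finding "
        "WHERE target_id = 1 ORDER BY event_time"):
    print(row)
```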

Comparative Analysis and Implications of Command and Control(C2)-related Information Exchange Models (지휘통제 관련 정보교환모델 비교분석 및 시사점)

  • Kim, Kunyoung;Park, Gyudong;Sohn, Mye
    • Journal of Internet Computing and Services / v.23 no.6 / pp.59-69 / 2022
  • For effective battlefield situation awareness and command decision-making, seamless information exchange between systems is essential. However, since each system was developed independently for its own purposes, interoperability between systems must be ensured for information to be exchanged effectively. In the case of the Korean military, semantic interoperability is pursued by using a common message format for data exchange. However, simply standardizing the data exchange format cannot sufficiently guarantee interoperability between systems. Currently, the U.S. and NATO are developing and using information exchange models to achieve semantic interoperability beyond a guaranteed data exchange format. An information exchange model is a common vocabulary or reference model used to ensure that systems exchange information at the level of content and meaning. The information exchange models developed and used in the United States initially focused on exchanging information directly related to the battlefield situation but have developed into a universal form that can be used by all government departments and related organizations. NATO, on the other hand, focused on strictly expressing the concepts necessary to carry out joint military operations among member countries, and the scope of its models was limited to concepts related to command and control. In this paper, the background, purpose, and characteristics of the information exchange models developed and used in the United States and NATO are identified and comparatively analyzed. Through this, we intend to present implications for the future development of a Korean information exchange model.

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and we now live in the age of Big Data. SNS data satisfy the conditions of Big Data in the amount of data (volume), the speed of data input and output (velocity), and the variety of data types (variety). If the trend of an issue can be discovered in SNS Big Data, this information can serve as an important new source of value creation, because it covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides four functions: (1) it provides the topic keyword set that corresponds to the daily ranking; (2) it visualizes the daily time-series graph of a topic over a month; (3) it conveys the importance of a topic through a treemap based on a score system and frequency; and (4) it visualizes the daily time-series graph of keywords retrieved by keyword search. The present study analyzes Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined, unstructured data. In addition, such analysis requires current big data technology, such as the Hadoop distributed system or NoSQL (an alternative to relational databases), to process a large amount of real-time data rapidly. We built TITS on Hadoop to optimize big data processing, because Hadoop is designed to scale from single-node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables; its most important goals are data accessibility and data processing performance. In the age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine data easily and clearly. TITS therefore uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to data; the interaction with data is easy and useful for managing a real-time data stream with smooth animation. In addition, TITS uses Bootstrap, a collection of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS graphical user interface (GUI) is designed with these libraries and can detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS), and confirm its utility for storytelling and time-series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The study's experiments used nearly 150 million tweets collected in Korea during March 2013.
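As a minimal illustration of the topic-modeling step, the sketch below runs LDA over a few tweet-like texts with scikit-learn; the abstract does not state the paper's exact library, and real use would first apply Korean stop-word removal and noun extraction as described above.

```python
# Minimal sketch of topic extraction with LDA, assuming scikit-learn
# as a stand-in for whatever topic-modeling library the paper used.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    "election debate candidates policy",
    "candidates election vote policy debate",
    "baseball game score pitcher",
    "pitcher strikes out baseball game",
]

vec = CountVectorizer()
X = vec.fit_transform(tweets)                        # document-term counts
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Daily-ranking style output: top keywords per discovered topic.
terms = vec.get_feature_names_out()
for t, dist in enumerate(lda.components_):
    top = [terms[i] for i in dist.argsort()[::-1][:3]]
    print(f"topic {t}: {top}")
```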

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are used in many processes, from system inspection and process optimization to customized user services. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling clients' business. Therefore, a separate log data processing system is needed to gather, store, categorize, and analyze the log data generated while processing clients' business. However, in existing computing environments it is difficult both to provide flexibly expandable storage for a massive amount of unstructured log data and to run the considerable number of functions needed to categorize and analyze the stored data. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for unstructured log data that are difficult to process with the existing infrastructure's analysis tools and management systems. The proposed system uses an IaaS (Infrastructure as a Service) cloud environment to provide flexible expansion of computing resources, such as storage space and memory, under conditions such as extended storage or a rapid increase in log data. Moreover, to overcome the processing limits of existing analysis tools when real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating blocks of the aggregated log data, the proposed system offers automatic restore functions so that it continues to operate after recovering from a malfunction. Finally, by establishing a distributed database using NoSQL-based MongoDB, the proposed system provides methods for effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are ill suited to unstructured log data, and their strict schemas prevent node expansion when rapidly growing data must be distributed across nodes. NoSQL does not provide the complex computations that relational databases offer, but it can easily expand through node dispersion when the amount of data increases rapidly; it is a non-relational database whose structure suits unstructured data. NoSQL data models are usually classified as key-value, column-oriented, or document-oriented. Of these, the proposed system uses MongoDB, a representative document-oriented store with a free schema structure. MongoDB is adopted because its flexible schema makes it easy to process unstructured log data, it facilitates node expansion when the amount of data grows rapidly, and it provides an auto-sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over each bank's entire client business process are sent to the cloud server, the log collector module collects and classifies the data according to log type and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis performed by the MongoDB module, the Hadoop-based analysis module, and the MySQL module, per analysis time and type of aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and served in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation against a log data processing system that uses only MySQL, measuring log insertion and query performance, demonstrates the proposed system's superiority. Moreover, an optimal chunk size is identified through MongoDB insert-performance evaluations over various chunk sizes.
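To illustrate the schema-free storage the abstract describes, the following pymongo sketch inserts log documents of different shapes into one collection and aggregates them per unit time, the kind of result the log graph generator would plot; the connection string and field names are illustrative, not the paper's.

```python
# Minimal sketch of schema-free log storage with pymongo, assuming a
# local MongoDB instance; document shapes and names are illustrative.
from datetime import datetime
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
logs = client["bank"]["logs"]

# Documents with different shapes coexist in one collection: no fixed
# schema is declared up front, unlike a relational table.
logs.insert_many([
    {"ts": datetime(2013, 6, 1, 9, 0), "type": "login", "user": "u1"},
    {"ts": datetime(2013, 6, 1, 9, 5), "type": "transfer",
     "user": "u1", "amount": 50000, "to": "u2"},
])

# Per-unit-time aggregation: count log entries by hour and type.
pipeline = [
    {"$group": {"_id": {"hour": {"$hour": "$ts"}, "type": "$type"},
                "count": {"$sum": 1}}},
]
for row in logs.aggregate(pipeline):
    print(row)
```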