• Title/Abstract/Keywords: Heterogeneous Data Integration

167 search results (processing time: 0.033 seconds)

Data hub system based on SQL/XMDR message using Wrapper for distributed data interoperability (분산 데이터 상호운용을 위한 SQL/XMDR 메시지 기반의 Wrapper를 이용한 데이터 허브 시스템)

  • Moon, Seok-Jae;Jung, Gye-Dong;Choi, Young-Keun
    • Journal of the Korea Institute of Information and Communication Engineering, Vol. 11, No. 11, pp. 2047-2058, 2007
  • In an enterprise business environment that is geographically and spatially distributed, it can be difficult to eliminate the redundancy that arises when data sources are filtered and integrated under standard rules and metadata, and to provide integrated data through a single view. In particular, it is of paramount concern to exchange data from heterogeneous systems and applications regardless of type and format, and to keep the integrated information continuously and accurately synchronized. This paper therefore proposes a data hub system based on SQL/XMDR messages to overcome the semantic interoperability problems that occur when legacy systems exchange or join data. The system uses the message mapping technique of a query transformation system to keep cooperating data consistent as it is modified in real time across the cooperating legacy systems, and it improves the clarity and availability of data by providing a single interface for data retrieval.

Microarray Data Sharing System (마이크로어레이 데이터 공유 시스템)

  • Yoon, Jee-Hee;Hong, Dong-Wan;Lee, Jong-Keun
    • The Journal of the Korea Contents Association, Vol. 9, No. 8, pp. 18-31, 2009
  • Improvements in the reliability and reproducibility of microarray data have recently increased the demand for data sharing and utilization among laboratories. However, house-keeping and publicly released microarray experimental data can hardly be accessed and used, because they are stored in heterogeneous formats that depend on the experimental methods and microarray platforms. In this paper, we propose a microarray sharing method that can easily retrieve and integrate microarray data from different experiment platforms, data formats, normalization methods, and analysis methods. Our system is based on web-service technology. The biologists at each site can search a UDDI (Universal Description, Discovery, and Integration) registry and download microarray data in a common data structure that follows the standard format recommended by the MGED (Microarray Gene Expression Databases) society. The common data structure defined in this paper consists of IDF (Investigation Design Format), ADF (Array Design Format), SDRF (Sample and Relationship Format), and EDF (Expression Data Format). These components serve as templates for integrating microarray data of various structures and can be stored in standard formats such as MAGE-ML, MAGE-TAB, and XML Schema. In addition, our system provides advanced tools, an automatic microarray data submitter and a file manager, for manipulating local microarray data efficiently.

Development of Integrated Retrieval System of the Biology Sequence Database Using Web Service (웹 서비스를 이용한 바이오 서열 정보 데이터베이스 및 통합 검색 시스템 개발)

  • Lee, Su-Jung;Yong, Hwan-Seung
    • The KIPS Transactions:PartD, Vol. 11D, No. 4, pp. 755-764, 2004
  • Recently, the rapid development of biotechnology has brought an explosion of biological data and of hosts serving biological data. Moreover, these data are highly distributed and heterogeneous, reflecting the distribution and heterogeneity of the molecular biology research community. As a consequence, the integration and interoperability of molecular biology databases are issues of considerable importance. Up to now, however, most integrated systems, such as link-based and data-warehouse-based systems, have had difficulty keeping their data up to date when the schema or data of a data source changes. For this reason, integrated systems using web service technology, which allow biological data to be fully exploited, have been proposed. In this paper, we build an integrated system for bio sequence information based on web service technology. The developed system lets users traverse disparate data resources and obtain data in formats such as BSML, GenBank, and FASTA. It also offers better retrieval performance because the retrieval modules for the external databases run in parallel.
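
The parallel retrieval mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two retrieval functions, their return values, and the name `integrated_search` are hypothetical stand-ins for modules that would query external sequence databases over web services.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical retrieval modules; real ones would call external databases.
def search_source_a(keyword):
    return [f"A:{keyword}:1", f"A:{keyword}:2"]


def search_source_b(keyword):
    return [f"B:{keyword}:1"]


def integrated_search(keyword, modules):
    """Run every retrieval module concurrently and merge results in module order."""
    with ThreadPoolExecutor(max_workers=len(modules)) as pool:
        # Submit all modules first so they run at the same time.
        futures = [pool.submit(module, keyword) for module in modules]
        merged = []
        for future in futures:
            # result() blocks until that module has finished.
            merged.extend(future.result())
    return merged


results = integrated_search("BRCA1", [search_source_a, search_source_b])
print(results)
```

With this arrangement the total waiting time is bounded by the slowest source rather than by the sum of all sources, which is the usual motivation for running such modules in parallel.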

The Characteristics of Coastal Zone Management Methods in U.S.A -Focus on Zoning & Integrated Methods of Different Kind Data- (미국 연안구역(Coastal Zone) 관리수단의 특성 -조닝방식과 이종 데이터 간 통합방법을 중심으로-)

  • Oh, Ji-Hoon;Lee, Seok-Hwan;Lee, Hee-Won
    • Journal of the Korea Academia-Industrial cooperation Society, Vol. 11, No. 9, pp. 3590-3598, 2010
  • For efficient management of coastal zones at the local level, it is necessary to collect coastal zone data, prepare objective analysis methods, and build a scientific and technical support system. This study analyzes coastal zoning methods and methods for integrating heterogeneous data through a case study of coastal zones in the U.S.A. The coastal zone management methods in the U.S.A. have the following characteristics: concrete indices and methods for establishing coastal zones that respond to local values and land use; related data analysis methods that support spatial decision making; and the establishment and administration of a bureau for constructing and integrating coastal zone spatial information. Based on the common values of the coastal zoning methods and heterogeneous data integration methods in the U.S.A., this study suggests technical implications for building domestic coastal zone management at the local level.

Development of an Organism-specific Protein Interaction Database with Supplementary Data from the Web Sources (다양한 웹 데이터를 이용한 특정 유기체의 단백질 상호작용 데이터베이스 개발)

  • Hwang, Doo-Sung
    • The KIPS Transactions:PartD, Vol. 9D, No. 6, pp. 1091-1096, 2002
  • This paper presents the development of a protein interaction database. The developed system is characterized as follows. First, the system maintains not only interaction data collected by experiment but also the genomic information of the protein data. Second, the system can extract details on interacting proteins through the developed wrappers. Third, the system uses wrappers to extract biologically meaningful data from various web sources and integrate them into a relational database. The system has a layered, modular architecture that introduces a wrapper-mediator approach in order to resolve the syntactic and semantic heterogeneity among multiple data sources. Currently the system has wrapped the relevant data for about 40% of about 11,500 proteins on average from various accessible sources. The wrapper-mediator approach makes protein interaction data comprehensive and useful by supporting data interoperability and integration. The database under development will be useful for mining further knowledge and for analysis of human life in proteomics studies.
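
As a rough illustration of the wrapper-mediator approach named in this abstract, the sketch below assumes two hypothetical sources with different record layouts. The class `Interaction`, the wrapper functions, the source names, and the sample protein identifiers are all invented for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Common schema that every wrapper maps its source records into."""
    protein_a: str
    protein_b: str
    source: str


def wrap_tab_source(text):
    """Hypothetical wrapper for a source listing one 'P1<TAB>P2' pair per line."""
    records = []
    for line in text.strip().splitlines():
        a, b = line.split("\t")
        records.append(Interaction(a, b, "tab_source"))
    return records


def wrap_dict_source(rows):
    """Hypothetical wrapper for a source that uses 'bait' and 'prey' keys."""
    return [Interaction(r["bait"], r["prey"], "dict_source") for r in rows]


def mediate(*batches):
    """Mediator: merge wrapped records, treating (A, B) and (B, A) as one pair."""
    merged = {}
    for batch in batches:
        for item in batch:
            key = frozenset((item.protein_a, item.protein_b))
            merged.setdefault(key, []).append(item.source)
    return merged


merged = mediate(
    wrap_tab_source("YFG1\tYFG2\nYFG3\tYFG4"),
    wrap_dict_source([{"bait": "YFG2", "prey": "YFG1"}]),
)
for pair, sources in merged.items():
    print(sorted(pair), sources)
```

The wrappers absorb syntactic differences between sources, while the mediator handles one simple semantic difference (pair ordering); a real system would also have to reconcile identifiers and naming across sources.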

A Method for Extraction and Loading of Massive Traffic Data using Commercial Tools (상용 도구를 이용한 대용량 교통 데이터의 추출 및 적재 방안)

  • Woo, Chan-Il;Jeon, Se-Gil
    • Journal of Advanced Navigation Technology, Vol. 12, No. 1, pp. 46-53, 2008
  • The ITS (Intelligent Transport System) provides solutions to traffic problems while maximizing the safety and efficiency of road and transportation systems, by combining technologies from information and communication, electrical engineering, electronics, mechanics, and control and instrumentation with transportation systems. An integration system for massive traffic data sources must deal with several issues: the variety and amount of data available, the representational heterogeneity of the data in the different sources, and the autonomy and differing capabilities of the sources. In this paper, we describe how to extract and load heterogeneous massive traffic data from operational databases such as FTMS and ARTIS using commercial tools. We also experiment with traffic data warehouses that apply integrated quality management techniques to provide high-quality data.
Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems, Vol. 24, No. 2, pp. 221-241, 2018
  • Deep learning has recently been attracting attention. The deep learning technique applied in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and in AlphaGo is the Convolution Neural Network (CNN). A CNN divides the input image into small sections, recognizes their partial features, and combines them to recognize the whole. Deep learning technologies are expected to bring many changes to our lives, but until now their applications have been limited mainly to image recognition and natural language processing. The use of deep learning techniques for business problems is still at an early research stage. If their performance is proven, they can be applied to traditional business problems such as marketing response prediction, fraudulent transaction detection, and bankruptcy prediction. It is therefore meaningful to examine whether business problems can be solved with deep learning technologies, using the case of online shopping companies, which have big data, make customer behavior relatively easy to identify, and have high utilization value. In online shopping in particular, the competitive environment is changing rapidly and becoming more intense, so analyzing customer behavior to maximize profit is becoming increasingly important. In this study, we propose a 'CNN model of Heterogeneous Information Integration' as a way to improve the power of predicting customer behavior in online shopping enterprises.
The proposed model learns with a convolution neural network of multi-layer perceptron structure by combining structured and unstructured information. To optimize its performance, we design three architectural components, 'heterogeneous information integration', 'unstructured information vector conversion', and 'multi-layer perceptron design', evaluate the performance of each, and finalize the proposed model based on the results. The target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, and high discount shopper. To verify the usefulness of the proposed model, we conducted experiments using the actual transaction, customer, and VOC (voice of the customer) data of a specific online shopping company in Korea. The data extraction criteria select 47,947 customers who registered at least one VOC during the one month of January 2011. We use the profiles of these customers, a total of 19 months of transaction data from September 2010 to March 2012, and the VOCs posted during that month. The experiment is divided into two stages. In the first stage, we evaluate the three architectures that affect the performance of the proposed model and select optimal parameters. In the second, we evaluate the performance of the proposed model. Experimental results show that the proposed model, which combines structured and unstructured information, is superior to NBC (Naïve Bayes classification), SVM (support vector machine), and ANN (artificial neural network). It is therefore significant that the use of unstructured information contributes to predicting customer behavior, and that CNN can be applied to business problems as well as to image recognition and natural language processing problems.
Experiments confirm that CNN is more effective at understanding and interpreting the meaning of context in text VOC data. It is also significant that this empirical research, based on the actual data of an e-commerce company, shows that very meaningful information for predicting customer behavior can be extracted from VOC data written directly by customers in text form. Finally, the various experiments suggest that the proposed model provides useful information for future research on parameter selection and its performance.

A Web Services-based Client OLAP API and Its Application to Cube Browsing (웹 서비스 기반의 클라이언트 OLAP API와 큐브 브라우징에의 응용 사례)

  • Bae, Eun-Ju;Kim, Myung
    • The KIPS Transactions:PartD, Vol. 10D, No. 1, pp. 143-152, 2003
  • XML and Web Services draw a lot of attention as standard technologies for data exchange and integration among heterogeneous platforms. XML/A, which supports such technologies, is a SOAP-based XML API that facilitates data exchange between a client application and a data analysis engine through the Internet. Because the XML format is used for data exchange, XML/A is platform-independent. However, client application developers have to go through the tedious job of handling the same type of XML documents to download data from the server. Also, an XML query language is needed to extract data from the XML documents sent by the server. In this paper, we present a high-level client OLAP API, called XMLMD, that lets client application developers in the Windows environment easily use the OLAP services of XML/A. XMLMD consists of the properties and methods needed for OLAP application development. XMLMD is to XML/A what ADOMD is to OLEDB for OLAP. We also present a web OLAP cube browser developed using XMLMD. The browser displays data in various formats such as XML, HTML, Excel, and graphs.

Configuration of clustering and routing algorithms for energy efficiency by wireless sensor network in ship (선박 내 무선 센서 네트워크에서 에너지 효율을 위한 클러스터링 및 라우팅 알고리즘의 구성)

  • Kim, Mi-jin;Yu, Yun-Sik;Jang, Jong-wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2012 Fall Conference, pp. 435-438, 2012
  • Today, as ubiquitous computing technologies bring electronic space and physical space together in every field, research on wireless sensor networks that integrate sensors with wireless technology is active. Research is also under way on the Ship Area Network (SAN), which integrates wireless technology into intelligent ships. A ship requires SAN-bridge technology for integrating a variety of wired and wireless networks, interoperability between heterogeneous sensors and controllers, and SAN configuration management technology for remote control. In addition to structural safety and freight management monitoring, a ship must keep its whole surrounding environment, including the crew, safe. To design monitoring of conditions such as climate change, temperature, and pressure on various structures, this paper identifies technology trends in routing and data aggregation for energy-efficient wireless sensor networks. It also analyzes self-organizing clustering methods as a study of wireless sensor network configuration in ships.
A Study on a Multi-sensor Information Fusion Architecture for Avionics (항공전자 멀티센서 정보 융합 구조 연구)

  • Kang, Shin-Woo;Lee, Seoung-Pil;Park, Jun-Hyeon
    • Journal of Advanced Navigation Technology, Vol. 17, No. 6, pp. 777-784, 2013
  • The process of synthesizing data produced by different types of sensors into a single piece of information is studied and used on a variety of platforms under the name multi-sensor data fusion. Heterogeneous sensors have been integrated into various aircraft, and modern avionic systems manage them. As the performance of aircraft sensors increases, the integration of sensor information is increasingly required from the viewpoint of avionics. However, information fusion has not been widely studied from the viewpoint of the software that provides a pilot with fused information, derived from sensor data, in the form of symbology on a display device. The purpose of information fusion is to help pilots make decisions in performing a mission by providing the correct combat situation from the aircraft's avionics, and consequently to minimize their workload. This paper presents a software architecture that, for aircraft avionics equipped with different types of sensors, produces comprehensive information for the user from sensor data through a multi-sensor data fusion process.