• Title/Summary/Keyword: large-volume point data (대용량의 점데이터)

Search Result 130, Processing Time 0.029 seconds

Design and Implementation of an Execution-Provenance Based Simulation Data Management Framework for Computational Science Engineering Simulation Platform (계산과학공학 플랫폼을 위한 실행-이력 기반의 시뮬레이션 데이터 관리 프레임워크 설계 및 구현)

  • Ma, Jin;Lee, Sik;Cho, Kum-won;Suh, Young-kyoon
    • Journal of Internet Computing and Services / v.19 no.1 / pp.77-86 / 2018
  • For the past few years, KISTI has operated an online simulation execution platform called EDISON, which lets users run simulations with scientific applications supplied by diverse computational science and engineering disciplines. These simulations typically involve large-scale computation and accordingly produce a huge volume of output data. One critical issue on such an online platform is that many users simultaneously submit simulation requests (or jobs) with the same (or almost unchanged) input parameters or files, which places a significant burden on the platform. In other words, identical computing jobs consume computing and storage resources redundantly and at an undesirably fast pace. To curb the excessive resource usage caused by such identical simulation requests, this paper introduces a novel framework, called IceSheet, that manages simulation data efficiently based on execution metadata, that is, provenance. The IceSheet framework captures and stores the provenance of each conducted simulation. The collected provenance records are used not only to detect duplicate simulation requests but also to search existing simulation results via an open-source search engine, ElasticSearch. In particular, this paper elaborates on the core components of the IceSheet framework that support search and reuse of stored simulation results. We implemented a prototype of the proposed framework using this engine in conjunction with the online simulation execution platform, and evaluated it on real simulation execution-provenance records collected on the platform. Once the prototyped IceSheet framework is fully integrated with the platform, users can quickly search for parameter values previously entered into a given simulation program and, if results for the same input parameter values exist, receive them. We therefore expect the proposed framework to help eliminate duplicate resource consumption and significantly reduce execution time for requests that match previously executed simulations.
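The duplicate-request check described in this abstract can be illustrated with a minimal sketch. It assumes that a request is identified by its application name, parameter values, and input file contents, and that an in-memory dictionary stands in for the search engine; `provenance_key`, `ProvenanceStore`, and `submit` are hypothetical names, not part of IceSheet.

```python
import hashlib
import json


def provenance_key(app, params, input_files):
    """Build a stable fingerprint for one simulation request (illustrative only)."""
    record = {
        "app": app,
        "params": params,
        # Hash file contents so identical inputs match regardless of file name.
        "input_sha256": sorted(hashlib.sha256(data).hexdigest() for data in input_files),
    }
    # sort_keys makes the fingerprint independent of parameter ordering.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ProvenanceStore:
    """In-memory stand-in for a provenance index (an assumption of this sketch)."""

    def __init__(self):
        self._results = {}

    def lookup(self, key):
        return self._results.get(key)

    def record(self, key, result):
        self._results[key] = result


def submit(store, app, params, input_files, run_simulation):
    """Return (result, reused); run the simulation only for unseen requests."""
    key = provenance_key(app, params, input_files)
    cached = store.lookup(key)
    if cached is not None:
        return cached, True
    result = run_simulation(app, params, input_files)
    store.record(key, result)
    return result, False
```

Hashing a canonical serialization is one simple way to treat requests with reordered but identical parameters as the same job; the "almost unchanged" inputs mentioned in the abstract would need a similarity search instead, which this sketch does not attempt.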

A PageRank based Data Indexing Method for Designing Natural Language Interface to CRM Databases (분석 CRM 실무자의 자연어 질의 처리를 위한 기업 데이터베이스 구성요소 인덱싱 방법론)

  • Park, Sung-Hyuk;Hwang, Kyeong-Seo;Lee, Dong-Won
    • CRM연구 / v.2 no.2 / pp.53-70 / 2009
  • Understanding consumer behavior through the analysis of customer data is an essential part of analytic CRM. This requires users to have analytic skills in data extraction and data processing. Because users ask many kinds of questions when analyzing consumer data, they have to use a database language such as SQL. However, writing SQL statements is not easy for a firm's users, because the accuracy of the query result depends heavily on knowledge of work-site operations and of the firm's database. This paper proposes a natural-language-based database search framework that finds relevant database elements. Specifically, we describe how our TableRank method interprets the user's natural-language query and returns the appropriate relations and attributes of data records. Several experiments support that TableRank returns accurate database elements related to the user's natural-language query. We also show that a short distance among relations in the database indicates high data connectivity, which guarantees a match with a user's search query.

A Study on Efficient Executions of MPI Parallel Programs in Memory-Centric Computer Architecture

  • Lee, Je-Man;Lee, Seung-Chul;Shin, Dongha
    • Journal of the Korea Society of Computer and Information / v.25 no.1 / pp.1-11 / 2020
  • In this paper, we present a technique that executes MPI parallel programs developed for processor-centric computer architecture more efficiently on memory-centric computer architecture, without program modification. The technique improves performance by replacing the low-speed network communication of MPI library functions with high-speed communication through the fast, large shared memory that characterizes memory-centric computer architecture. The technique is implemented in two programs. The first is a modified MPI library called MC-MPI-LIB, which runs MPI parallel programs more efficiently on memory-centric computer architecture while preserving the semantics of MPI library functions. The second is a simulation program called MC-MPI-SIM, which simulates the performance of memory-centric computer architecture on processor-centric computer architecture. We developed and tested the programs in a distributed systems environment deployed on Docker-based virtualization. We analyzed the performance of several MPI parallel programs and showed that they achieved better performance on memory-centric computer architecture. In particular, we observed very high performance gains for MPI parallel programs with high communication overhead.

Design of a Holter Monitoring System with Flash Memory Card (플레쉬 메모리 카드를 이용한 홀터 심전계의 설계)

  • 송근국;이경중
    • Journal of Biomedical Engineering Research / v.19 no.3 / pp.251-260 / 1998
  • The Holter monitoring system is a widely used noninvasive diagnostic tool for ambulatory patients who may be at risk from latent life-threatening cardiac abnormalities. In this paper, we design a high-performance intelligent Holter monitoring system characterized by small size and low power consumption. The system hardware consists of a one-chip microcontroller (68HC11E9), an ECG preprocessing circuit, and a flash memory card. The ECG preprocessing circuit consists of an ECG preamplifier with gains of 250, 500, and 1000, a bandpass filter with a bandwidth of 0.05-100 Hz, and an auto-balancing circuit and a saturation-calibrating circuit that eliminate baseline wandering. The ECG signal is sampled at 240 samples/sec and converted to a digital signal. We use a linear recursive filter and a preprocessing algorithm to detect the ECG parameters: the QRS complex, Q-R-T points, ST level, HR, and QT interval. The long-term acquired ECG signals and diagnostic parameters are compressed by the MFan (Modified Fan) and delta modulation methods. To interface easily with the PC-based analyzer program, which runs on DOS and Windows, the compressed data are stored on the flash memory card in SBF (symmetric block format), which is compatible with the FFS (flash file system) format.

A Study on the 4D Traffic Condition Board based on a Mash-up Technology (Mash-up 기술을 이용한 4D Wall-Map 구성체계)

  • Kim, Joo-Hwan;Yang, Seung-Mook;Nam, Doo-Hee
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.8 no.3 / pp.27-33 / 2009
  • Content used in mashups is typically obtained from a third-party source through a public interface or API (web services). Other methods of obtaining content for mashups include Web feeds (e.g. RSS or Atom) and screen scraping. A mashup (or meshup) Web application has two parts: a new service delivered through a Web page, using its own data and data from other sources; and the blended data, made available across the Web through an API or other protocols such as HTTP, RSS, and REST. There are many types of mashups, such as consumer mashups, data mashups, and business mashups. The most common is the consumer mashup, which is aimed at the general public. Examples include Google Maps, iGuide, and RadioClouds. The 4D Wall-Map display is a data mashup, which combines similar types of media and information from multiple sources into a single representation. This technology focuses data into a single presentation and allows for collaborative action among ITS-related information sources.

The Construction of GIS-based Flood Risk Area Layer Considering River Bight (하천 만곡부를 고려한 GIS 기반 침수지역 레이어 구축)

  • Lee, Geun-Sang;Yu, Byeong-Hyeok;Park, Jin-Hyeog;Lee, Eul-Rae
    • Journal of the Korean Association of Geographic Information Studies / v.12 no.1 / pp.1-11 / 2009
  • Rapid visualization of the downstream flood area according to dam effluent during the flood season is very important in dam management. To represent the flood area efficiently based on the flood stage modeled in river channels, the overlay zone at river bights should be removed. This study applied a drainage enforcement algorithm to visualize the flood area considering river bights by coupling the Coordinate Operation System for Flood control In Multi-reservoir (COSFIM) and the Flood Wave routing model (FLDWAV). The drainage enforcement algorithm is a kind of interpolation that benefits hydrological process studies by removing spurious sinks in the terrain in the automatic drainage algorithm. This study presented a technique for mapping the flood area layer considering river bights downstream of Namgang Dam, and developed a system based on ArcObjects components to execute the process automatically. The automatic extraction system for the flood area layer could save time in flood inundation visualization work based on large-volume data. In addition, coupling the flood area layer with IKONOS satellite imagery presented realistic information for flood disaster work.

Structural Segmentation for 3-D Brain Image by Intensity Coherence Enhancement and Classification (명암도 응집성 강화 및 분류를 통한 3차원 뇌 영상 구조적 분할)

  • Kim, Min-Jeong;Lee, Joung-Min;Kim, Myoung-Hee
    • The KIPS Transactions:PartA / v.13A no.5 s.102 / pp.465-472 / 2006
  • Recently, many image segmentation methods have been proposed for extracting human organs or disease-affected areas from huge medical image datasets. However, structural segmentation is limited for regions such as the brain, which contain multiple structures with ambiguous structural borders. To address this problem, a clustering technique that classifies voxels into a finite number of clusters is often employed. Its drawback is sensitivity to noise, which stems from its voxel-by-voxel operations. Applying an image enhancement method that minimizes the influence of noise and sharpens image borders would therefore allow more robust structural segmentation. This research proposes an efficient structural segmentation method based on filtering followed by clustering, which extracts detailed structures such as white matter, gray matter, and cerebrospinal fluid from brain MR images. First, coherence-enhancing diffusion filtering is adopted to sharpen the borders between structures and to reduce the noise within them. Fuzzy c-means clustering is then applied to the enhanced images, and structural segmentation is performed by assigning to each voxel the cluster index of the structure that contains it. Compared with existing clustering methods that use Gaussian or general anisotropic diffusion filtering, the suggested method showed improved accuracy, measured by how closely it agreed with manual segmentation results. Moreover, by providing fine segmentation of border areas with reproducible results and minimal manual work, it offers efficient diagnostic support for morphological abnormalities in the brain.
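The clustering step named in this abstract can be sketched with the standard fuzzy c-means update rules. The sketch below is an assumption-laden illustration, not the authors' implementation: it works on a flat list of scalar intensities, omits the coherence-enhancing diffusion filtering entirely, and the function name `fuzzy_c_means` and its parameters are hypothetical.

```python
def fuzzy_c_means(values, centers, m=2.0, iterations=50, eps=1e-9):
    """Cluster scalar intensities with standard fuzzy c-means (illustrative only).

    values     -- list of intensities (e.g. voxel values after filtering)
    centers    -- initial cluster centers, one per expected structure
    m          -- fuzzifier (> 1); m = 2 is a common default
    iterations -- number of update rounds (must be at least 1)
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]
            memberships.append(
                [1.0 / sum((d_k / d_j) ** exponent for d_j in dists) for d_k in dists]
            )
        # Center update: membership-weighted mean with weights u_ik^m.
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights)
            )
        centers = new_centers
    # Hard labels: assign each value to its highest-membership cluster.
    labels = [max(range(len(centers)), key=lambda k: row[k]) for row in memberships]
    return centers, memberships, labels
```

For a brain MR volume, `values` would be the flattened voxel intensities of the filtered image, with three initial centers for cerebrospinal fluid, gray matter, and white matter; a practical implementation would vectorize these loops.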

Comparative Analysis on Cloud and On-Premises Environments for High-Resolution Agricultural Climate Data Processing (고해상도 농업 기후 자료 처리를 위한 클라우드와 온프레미스 비교 분석)

  • Park, Joo Hyeon;Ahn, Mun Il;Kang, Wee Soo;Shim, Kyo-Moon;Park, Eun Woo
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.4 / pp.347-357 / 2019
  • The usefulness of systems for processing and analyzing GIS-based agricultural climate data is affected by the reliability and availability of the underlying computing infrastructure, such as cloud, on-premises, and hybrid environments. Cloud technology has grown in popularity. However, reference cases accumulated over years of operational experience point out important features that make on-premises technology compatible with cloud technology. Both cloud and on-premises technologies have advantages and disadvantages in terms of operational time and cost, reliability, and security, depending on the application. In this study, we describe the characteristics of four general computing platforms (cloud, on-premises with hardware-level virtualization, on-premises with operating-system-level virtualization, and hybrid environments) and compare their advantages and disadvantages when a huge amount of GIS-based agricultural climate data is stored and processed to provide public agro-meteorological and climate information services at high spatial and temporal resolutions. We found that migrating high-resolution agricultural climate data to a public cloud would not be reasonable because of the high cost of storing a large amount of data that may be of no use in the future. Therefore, we recommend hybrid systems in which on-premises environments handle data storage and backup, which incur a major cost, and cloud environments handle data analysis, processing, and presentation, which need operational flexibility.

A Study on the Mapping of Fishing Activity using V-Pass Data - Focusing on the Southeast Sea of Korea - (선박패스(V-Pass) 자료를 활용한 어업활동 지도 제작 연구 - 남해동부해역을 중심으로 -)

  • HAN, Jae-Rim;KIM, Tae-Hoon;CHOI, Eun Yeong;CHOI, Hyun-Woo
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.1 / pp.112-125 / 2021
  • Marine spatial planning (MSP) divides marine space into nine kinds of use zones for systematic and rational management. One of them is the fishery protection zone, which is necessary for the sustainable production of fishery products, including the protection and fostering of fishing activities. This study aims to quantitatively identify the fishing activity space, one of the elements needed to designate fishery protection zones, by mapping fishing activities from V-Pass data and deriving the zones where fishing activity is concentrated. To this end, the V-Pass data were pre-processed: a dataset combining static and dynamic information was constructed, the speed of fishing vessels was calculated, fishing activity points were extracted, and data in non-fishing-activity zones were removed. Finally, using the selected V-Pass point data, a fishing activity map was produced by kernel density estimation, and the spaces where fishing activity is concentrated were analyzed. In addition, the spatial distribution of fishing activities was confirmed to differ by type of fishing vessel and by season. The pre-processing technique for large-volume V-Pass data and the mapping method for fishing activities developed in this study are expected to contribute to future studies that evaluate the spatial characteristics of fishing activities.
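The kernel density estimation step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's workflow: it uses a Gaussian kernel, assumes the points are already in projected planar coordinates rather than longitude and latitude, and the function name `gaussian_kde_grid` with its `cell` and `bandwidth` parameters is hypothetical.

```python
import math


def gaussian_kde_grid(points, x_range, y_range, cell, bandwidth):
    """Evaluate a 2-D Gaussian kernel density on a regular grid (illustrative only).

    points    -- list of (x, y) activity points in planar coordinates
    x_range   -- (x_min, x_max) extent of the grid
    y_range   -- (y_min, y_max) extent of the grid
    cell      -- cell size, in the same units as the coordinates
    bandwidth -- kernel standard deviation, in the same units
    Returns a list of rows; grid[r][c] is the density at that cell's center.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    cols = int(round((x_max - x_min) / cell))
    rows = int(round((y_max - y_min) / cell))
    # Normalization so the density integrates to 1 over the whole plane.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    grid = []
    for r in range(rows):
        cy = y_min + (r + 0.5) * cell
        row = []
        for c in range(cols):
            cx = x_min + (c + 0.5) * cell
            total = 0.0
            for px, py in points:
                d2 = (cx - px) ** 2 + (cy - py) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            row.append(norm * total)
        grid.append(row)
    return grid
```

Cells with the highest values mark where points concentrate. The cost grows with the number of points times the number of cells, so a large point dataset would normally call for a truncated kernel or a spatial index rather than this brute-force loop.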

A Study on the Demand for Cultural Ecosystem Services in Urban Forests Using Topic Modeling (토픽모델링을 활용한 도시림의 문화서비스 수요 특성 분석)

  • Kim, Jee-Young;Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture / v.50 no.4 / pp.37-52 / 2022
  • The purpose of this study is to analyze the demand for cultural ecosystem services in urban forests based on user perception and experience value, using Naver blog posts and LDA topic modeling. Bukhansan National Park was used to analyze and review the feasibility of spatial assessment. Based on the results of topic modeling of the blog posts, a review was conducted that considered the relevance of each topic to Bukhansan National Park's cultural services and its suitability as a spatial assessment case, and finally an index for the spatial assessment of urban forests' cultural services was derived. Specifically, 21 topics derived through topic analysis were interpreted, and 13 topics related to cultural ecosystem services were derived based on the MA (Millennium Ecosystem Assessment) classification system for ecosystem services. Of all documents reviewed, 72.7% contained data deemed useful for this study. The contents of the topics fell into one of the seven types of cultural services related to "mountainous recreation activities" (23.7%), "indirect use value linked to tourism and convenience facilities" (12.4%), "inspirational activities" (11.2%), "seasonal recreation activities" (6.2%), and "natural appreciation and static recreation activities" (3.7%). Next, for the 13 cultural service topics derived from the Bukhansan National Park data, the possibility of spatially assessing the characteristics of cultural ecosystem services provided by urban forests was reviewed, and a total of 8 cultural service indicators were derived. The MA's cultural service classification system, which was widely used in previous studies, is limited in that it does not reflect the actual user demand for urban forests, but this study is meaningful in that it categorizes cultural service indicators suited to domestic circumstances. In addition, the study is significant in that it presents a methodology for interpreting and deriving the demand for cultural services from a large amount of user awareness and experience data.