• Title/Summary/Keyword: Data Lake


Draft Design of DataLake Framework based on Abyss Storage Cluster (Abyss Storage Cluster 기반의 DataLake Framework의 설계)

  • Cha, ByungRae;Park, Sun;Shin, Byeong-Chun;Kim, JongWon
    • Smart Media Journal / v.7 no.1 / pp.9-15 / 2018
  • As an organization grows in size, many different types of data are generated in different systems. There is a need to improve efficiency by processing data more intelligently across those systems. As in a DataLake, we create a single domain model that accurately describes the data and can represent the most important data for the entire business. To realize the benefits of a DataLake, it is important to know how a DataLake is expected to work and which architectural components help to build a fully functional one. DataLake components have a life cycle that follows the data flow. As the data flows into a DataLake from the point of acquisition, its metadata is captured and managed along with data traceability, data lineage, and security aspects based on data sensitivity across its life cycle. For this reason, we have designed the DataLake Framework based on Abyss Storage Cluster.

Design and Verification of Connected Data Architecture Concept employing DataLake Framework over Abyss Storage Cluster (Abyss Storage Cluster 기반 DataLake Framework의 Connected Data Architecture 개념 설계 및 검증)

  • Cha, ByungRae;Cha, Yun-Seok;Park, Sun;Shin, Byeong-Chun;Kim, JongWon
    • Smart Media Journal / v.7 no.3 / pp.57-63 / 2018
  • As an organization or enterprise grows and its business environment shifts, many types of data are generated, and there is a need to improve data-processing efficiency by smarter means with a single domain model such as Data Lake. In particular, given finite natural resources and the sharing economy, creating a logical single domain model from physically partitioned multi-site data is very important for the efficient operation of computing resources. Based on the advantages of the existing Data Lake framework, we define the CDA-Concept (connected data architecture concept) and the functions of the Data Lake Framework over Abyss Storage for integrating multiple sites in various application domains and managing the data lifecycle. We also design and validate Interfaces #2 and #3 of the connected data architecture concept.

A Study on the Relationship between Cyanobacteria and Environmental Factors in Yeongcheon Lake (영천호에서 남조류 발생과 환경요인의 관련성 연구)

  • Lee, Hyeon-Mi;Shin, Ra-Young;Lee, Jung-Ho;Park, Jong-geun
    • Journal of Korean Society on Water Environment / v.35 no.4 / pp.352-361 / 2019
  • The purpose of this study is to analyze the characteristics of Yeongcheon Lake and the correlations among them in order to reduce the occurrence of harmful cyanobacteria. We investigated the water quality and phytoplankton of the lake from May to November 2017. Correlation and data mining analyses were performed to analyze the relationship between the two factors. The water temperature was lowest at the point where inflow from Imha Lake enters Yeongcheon Lake and highest at the point where outflow to Angye Lake occurs. The pH was also highest at the outflow point, whereas DO was highest at the midpoint between the inflow and outflow. The main cyanobacteria that emerged during the study period were Oscillatoria limosa, Microcystis aeruginosa and Aphanizomenon flos-aquae. The correlation analysis showed that water temperature, inflow, COD loading, and TOC loading at the inflow point of Yeongcheon Lake were related to the harmful cyanobacteria. The data mining analysis indicated that TP loading and harmful cyanobacteria at the inflow point influenced the harmful cyanobacteria at the outflow point of Yeongcheon Lake. When the TP loading at the inflow site was less than 39.0 kg/day, the amount of harmful cyanobacteria was expected to stay below 10,000 cells/mL.

Behavior of Water Quality in Freshwater Lake of Tide Reclaimed Area Using SWMM and WASP5 Models (SWMM과 WASP5모형을 이용한 간척지 담수호의 수질거동 특성 조사)

  • 김선주;김성준;이석호;이준우
    • Magazine of the Korean Society of Agricultural Engineers / v.44 no.2 / pp.148-160 / 2002
  • Lake water quality assessment information is useful to anyone involved in lake management, from lakeshore owners to lake associations. It describes lake water quality, which can improve how lake resources are managed and how current conditions are measured. It also provides a knowledge base that can be used to protect and restore lakes. SWMM was applied to simulate the discharge and pollutant loads from the Boryeong watershed, and WASP5 was applied to analyze the changes in water quality in the Boryeong freshwater lake. In each model, the most suitable parameters were determined through sensitivity analysis, and default values were used for some parameters. Discharge simulated by SWMM matched the measured discharge with an accuracy of 88.6%. In the water quality simulation of the Boryeong freshwater lake, T-N and T-P exceeded the criteria, and controlling pollutant loads in the main stream proved to be the most effective measure. An integrated water quality management system was developed to make it more convenient to operate SWMM and WASP5 and to acquire data.

A Leading Study of Data Lake Platform based on Big Data to support Business Intelligence (Business Intelligence를 지원하기 위한 Big Data 기반 Data Lake 플랫폼의 선행 연구)

  • Lee, Sang-Beom
    • Proceedings of the Korean Society of Computer Information Conference / 2018.01a / pp.31-34 / 2018
  • We live in the digital era, and the characteristics of customers in this era are constantly changing. Understanding business requirements and converting them to technical requirements is therefore essential, and this requires understanding the data model behind the business layout. Moreover, BI (Business Intelligence) is central to transforming enterprises so that they minimize losses and maximize profits. In this paper, we describe a leading study of the state of desktop BI (software products and programming languages) on the front-end side and of a Data Lake platform based on Big Data, built through data modeling, on the back-end side to support business intelligence.


Draft Design of AI Services through Concept Extension of Connected Data Architecture (Connected Data Architecture 개념의 확장을 통한 AI 서비스 초안 설계)

  • Cha, ByungRae;Park, Sun;Oh, Su-Yeol;Kim, JongWon
    • Smart Media Journal / v.7 no.4 / pp.30-36 / 2018
  • A single domain model such as the DataLake framework is in the spotlight because it can improve data efficiency and process data more intelligently in big data environments, where large-scale business systems generate huge amounts of data. In particular, efficient operation of network, storage, and computing resources in a logical single domain model is very important for processing physically partitioned multi-site data. Based on the advantages of the Data Lake framework, we define and extend the concept of Connected Data Architecture and the functions of the DataLake framework for integrating multiple sites in various domains and managing the lifecycle of data. We also propose a design for a CDA-based AI service and utilization scenarios in various application domains.

Estimation of Water Quality of Geumgang Lake by Diversion of Geumgang Lake Flow into Saemangeum Lake (금강호물의 새만금호 도입에 따른 금강호 수질변화 분석)

  • Eom, Myung-Chul;Lee, Jae-Myun
    • Journal of Korean Society on Water Environment / v.22 no.6 / pp.1045-1051 / 2006
  • The Geumgang canal is planned to connect Geumgang lake with Saemangeum lake in order to accelerate desalinization and dilute polluted water, improving water quality in Saemangeum lake. The purpose of this study is to evaluate how diverting flow from Geumgang lake to Saemangeum lake affects water quality in Geumgang lake. The WASP5 model was used to estimate the water quality of Geumgang lake. Model calibration and verification were carried out with water quality data for 2001 and 2002. Water quality concentrations in Geumgang lake were simulated for 4 scenarios, which considered whether or not the Geumgang canal is built. The simulations showed little impact on water quality in Geumgang lake, even though a small part of the Geumgang lake flow was diverted to Saemangeum lake. Because the Geumgang canal is planned to divert water that would be discharged into the sea through sluice gates if the canal were not built, the diversion is expected to cause little change.

Lake Water Quality Modelling Considering Rainfall-Runoff Pollution Loads (강우유출오염부하를 고려한 호수수질모델링)

  • Cho, Jae-Heon;Kang, Sung-Hyo
    • Journal of Environmental Impact Assessment / v.18 no.2 / pp.59-67 / 2009
  • Lake Youngrang in Sokcho City is eutrophic. Jangcheon is the largest inflow source to the lake. Major pollutant sources are stormwater runoff from resort areas and various land uses in the Jangcheon watershed. A storm sewer on the southern end of the lake is also an important pollution source. In this study, water quality modelling for Lake Youngrang was carried out considering the rainfall-runoff pollution loads from the watershed. The rainfall-runoff curves and the rainfall-runoff pollutant load curves were derived from rainfall-runoff survey data for the most recent 4 years. The rainfall-runoff pollution loads and flow from the Jangcheon watershed and the storm sewer were estimated using the two kinds of curves, and they were used as the flow and boundary data of the WASP model. The WASP model was calibrated with the measured water quality data for 2005 and 2006. Non-point pollution control measures such as a wet pond and an infiltration trench were considered as alternatives for water quality management of the lake. The predicted water quality was compared with that under the present condition, and the improvement in lake water quality was analyzed.

Development and Application of Freshwater Lake Water Quality Management System (FLAQUM) through the Linkage of Watershed and Freshwater Lake (유역과 담수호를 연계한 담수호 수질관리 시스템 개발 및 적용)

  • 김선주;김성준;김필식
    • Magazine of the Korean Society of Agricultural Engineers / v.44 no.6 / pp.124-136 / 2002
  • A freshwater lake water quality management system (FLAQUM) was developed to help regional managers manage the water quality of a rural basin. FLAQUM, an integrated user interface system written in Visual Basic, includes three subsystems: a database management system, a basin pollutant load simulation model using the SWMM model, and a freshwater lake water quality simulation model using the WASP5 model. The pollutant load simulation model was applied to simulate the discharge and pollutant loading from the watershed, and the freshwater lake water quality model was applied to analyze the changes in water quality with respect to watershed pollutant loads; this model could be used in planning the control of watershed pollutant sources for water quality management. The database management system was constructed for all input and output data processing, and it can be used to analyze statistical characteristics of the collected data. Results are displayed as both graphs and text for the convenience of the user. The results of applying FLAQUM to the Boryeong freshwater lake showed that the lake was in a eutrophic condition. The major contribution of pollution comes from tributaries No.1 and No.4, which have a large number of livestock farms. Therefore, water quality management must focus on appropriate management of livestock farming in these two branches.

Hydrological Variability of Lake Chad using Satellite Gravimetry, Altimetry and Global Hydrological Models

  • Buma, Willibroad Gabila;Seo, Jae Young;Lee, Sang-IL
    • Proceedings of the Korea Water Resources Association Conference / 2015.05a / pp.467-467 / 2015
  • Sustainable water resource management requires the assessment of hydrological variability in response to climate fluctuations and anthropogenic activities. Determining quantitative estimates of water balance and total basin discharge is of utmost importance for understanding the variations within a basin. Hard-to-reach areas with little infrastructure, coupled with lengthy administrative procedures, make in-situ data collection and water management processes very difficult and unreliable. In this study, we examined the hydrological behavior of Lake Chad, whose extent and extreme climatic and environmental conditions make it difficult to collect field observations. Datasets from space-borne observations and global hydrological models covering a 10-year period (January 2003 to December 2013) were analyzed. Terrestrial water storage (TWS) data retrieved from the Gravity Recovery and Climate Experiment (GRACE), lake level variations from satellite altimetry, and water fluxes and soil moisture from the Global Land Data Assimilation System (GLDAS) were used for this study. Furthermore, we combined altimetry lake volume with TWS over the lake drainage basin to estimate groundwater and soil moisture variations. This will be validated with groundwater estimates from WaterGAP Global Hydrology Model (WGHM) outputs. As expected, TWS showed variation patterns similar to those of the lake water level. The TWS in the basin area is governed by the lake's surface water. As expected, rainfall from GLDAS precedes GRACE TWS with a phase lag of about 1 month. Estimation of the volume changes in groundwater and soil moisture content, derived by combining altimetric lake volume with TWS over the drainage basin, is ongoing. The results will be compared with WGHM groundwater estimates.
