[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.5626/KTCP.2018.24.1.24

Development of Big Data System for Energy Big Data

Song, Mingoo (Seoul Nat'l Univ.)

Publication Information

KIISE Transactions on Computing Practices, v.24, no.1, 2018, pp. 24-32

Abstract

This paper proposes a Big Data system for energy data that is aggregated in real time from industrial and public sources. The system is built on Hadoop, and the Spark framework, which supports in-memory distributed computing, is applied alongside it for data processing. We focus on heat energy data for district heating and describe methods for storing, managing, processing, and analyzing the aggregated data in real time while taking the properties of energy input and output into account. At present, incoming data is stored and managed according to a relational database schema designed within the system, and the stored data is processed and analyzed according to set objectives. As examples, the paper presents the proposed system together with a number of heat demand plants for district heating, which serve as industrial sources of heat energy data gathered in real time.

Keywords

Big Data; Big Data system; energy; district heating; Spark; Hadoop

