http://dx.doi.org/10.9728/dcs.2017.18.1.149

Design of a Platform for Collecting and Analyzing Agricultural Big Data  

Nguyen, Van-Quyet (Department of Electronics and Computer Engineering, Chonnam National University)
Nguyen, Sinh Ngoc (Department of Electronics and Computer Engineering, Chonnam National University)
Kim, Kyungbaek (Department of Electronics and Computer Engineering, Chonnam National University)
Publication Information
Journal of Digital Contents Society / v.18, no.1, 2017, pp. 149-158
Abstract
Big data present exciting opportunities and challenges for economic development. In the agriculture sector, for instance, combining various kinds of agricultural data (e.g., weather data and soil data) and then analyzing them delivers valuable information to farmers and agribusinesses. However, massive amounts of agricultural data are generated every minute by many kinds of devices and services, such as sensors and agricultural web markets. This raises big data challenges in data collection, data storage, and data analysis. Although some systems have been proposed to address these challenges, they remain restricted in the type of data, the type of storage, or the size of data they can handle. In this paper, we propose a novel design of a platform for collecting and analyzing agricultural big data. The proposed platform supports (1) multiple methods of collecting data from various data sources using Flume and MapReduce; (2) multiple choices of data storage, including HDFS, HBase, and Hive; and (3) big data analysis modules built on Spark and Hadoop.
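As a rough illustration of the MapReduce-style processing the abstract refers to, the following sketch averages readings per region and metric in plain Python. It is not the platform's implementation: the record layout, the sample values, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `analyze`) are all assumptions made for this example, and a real deployment would run the equivalent logic on Hadoop or Spark rather than in a single process.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records: (source, region, metric, value).
# These values are invented for illustration and are not data from the paper.
RECORDS = [
    ("sensor", "north", "soil_moisture", 0.30),
    ("sensor", "north", "soil_moisture", 0.34),
    ("weather_api", "south", "temperature_c", 21.0),
    ("weather_api", "south", "temperature_c", 23.0),
    ("sensor", "south", "soil_moisture", 0.25),
]


def map_phase(records):
    """Emit one ((region, metric), value) pair per input record."""
    for _source, region, metric, value in records:
        yield (region, metric), value


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce each group of values to its mean."""
    return {key: mean(values) for key, values in groups.items()}


def analyze(records):
    """Run the map, shuffle, and reduce steps in sequence."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    for key, avg in sorted(analyze(RECORDS).items()):
        print(key, round(avg, 3))
```

Keeping the map, shuffle, and reduce steps as separate functions mirrors how the model splits work across a cluster: the map and reduce functions hold the application logic, while the grouping step is what a framework would handle.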
Keywords
Agricultural Big Data Platform; Distributed Systems; Collecting; Analyzing; Storage