• Title/Summary/Keyword: Big data storage

Big Data Security and Privacy: A Taxonomy with Some HPC and Blockchain Perspectives

  • Alsulbi, Khalil;Khemakhem, Maher;Basuhail, Abdullah;Eassa, Fathy;Jambi, Kamal Mansur;Almarhabi, Khalid
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.43-55 / 2021
  • The amount of Big Data generated from multiple sources is continuously increasing. Traditional storage methods lack the capacity for such massive amounts of data. Consequently, most organizations have shifted to the use of cloud storage as an alternative option to store Big Data. Despite the significant developments in cloud storage, it still faces many challenges, such as privacy and security concerns. This paper discusses Big Data, its challenges, and different classifications of security and privacy challenges. Furthermore, it proposes a new classification of Big Data security and privacy challenges and offers some perspectives to provide solutions to these challenges.

Implementation and Performance Analysis of Efficient Big Data Processing System Through Dynamic Configuration of Edge Server Computing and Storage Modules (BigCrawler: 엣지 서버 컴퓨팅·스토리지 모듈의 동적 구성을 통한 효율적인 빅데이터 처리 시스템 구현 및 성능 분석)

  • Kim, Yongyeon;Jeon, Jaeho;Kang, Sungjoo
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.6 / pp.259-266 / 2021
  • Edge computing enables real-time big data processing by performing computation close to the physical location of the user or data source. However, in an edge computing environment, temporary service requirements or changes in physical resources in the field can degrade big data processing performance. In this paper, we propose BigCrawler, a system that dynamically configures the computing module and storage module according to the big data collection status and computing resource usage in the edge computing environment. We also analyze how the characteristics of big data processing workloads vary with the arrangement of the computing and storage modules.

Design of a Platform for Collecting and Analyzing Agricultural Big Data (농업 빅데이터 수집 및 분석을 위한 플랫폼 설계)

  • Nguyen, Van-Quyet;Nguyen, Sinh Ngoc;Kim, Kyungbaek
    • Journal of Digital Contents Society / v.18 no.1 / pp.149-158 / 2017
  • Big data present exciting opportunities and challenges in economic development. For instance, in the agriculture sector, combining various agricultural data (e.g., weather data, soil data, etc.) and then analyzing them delivers valuable information to farmers and agribusinesses. However, massive amounts of agricultural data are generated every minute by many kinds of devices and services, such as sensors and agricultural web markets. This raises big data challenges in data collection, data storage, and data analysis. Although some systems have been proposed to address these challenges, they remain restricted in the type of data, the type of storage, or the size of data they can handle. In this paper, we propose a novel design of a platform for collecting and analyzing agricultural big data. The proposed platform supports (1) multiple methods of collecting data from various data sources using Flume and MapReduce; (2) multiple choices of data storage including HDFS, HBase, and Hive; and (3) big data analysis modules with Spark and Hadoop.

Scalable Blockchain Storage Model Based on DHT and IPFS

  • Chen, Lu;Zhang, Xin;Sun, Zhixin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.7 / pp.2286-2304 / 2022
  • Blockchain is a distributed ledger that combines technologies such as cryptography, consensus mechanisms, peer-to-peer transmission, and time stamping. The rapid development of blockchain has attracted attention from all walks of life, but storage scalability issues have hindered its application. This paper proposes a scalable blockchain storage model based on a Distributed Hash Table (DHT) and the InterPlanetary File System (IPFS). It introduces the current research status of scalable blockchain storage models, as well as the basic principles of DHT and IPFS. The model construction and workflow are explained in detail, as are the DHT network construction mechanism, block heat identification mechanism, new node initialization mechanism, and block data read and write mechanism. Experimental results show that this model can reduce the storage burden of nodes, and at the same time, the blockchain network can accommodate more local blocks under the same block height.

Big data platform for health monitoring systems of multiple bridges

  • Wang, Manya;Ding, Youliang;Wan, Chunfeng;Zhao, Hanwei
    • Structural Monitoring and Maintenance / v.7 no.4 / pp.345-365 / 2020
  • At present, many machine learning and data mining methods are used for analyzing and predicting structural response characteristics. However, a platform that combines big data analysis methods with online and offline analysis modules has not yet been used in actual projects. This work develops a multifunctional Hadoop-Spark big data platform for monitoring bridges and evaluating their serviceability based on a structural health monitoring system. It provides rapid processing, analysis, and storage of collected health monitoring data. The platform contains offline computing and online analysis modules in a Hadoop-Spark environment. Hadoop provides the overall framework and storage subsystem for the platform, while Spark is used for online computing. Finally, the computational performance of the platform is verified through several actual analysis tasks. Experiments show that the Hadoop-Spark big data platform has good fault tolerance, scalability, and online analysis performance. It can meet the daily analysis requirements of 5 s per run for one bridge and 40 s per run for 100 bridges.

Development of scalable big data storage system using network computing technology (네트워크 컴퓨팅 기술을 활용한 확장 가능형 빅데이터 스토리지 시스템 개발)

  • Park, Jung Kyu;Park, Eun Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.11 / pp.1330-1336 / 2019
  • With the start of the Fourth Industrial Revolution era, a wide variety of devices now run on the cloud. These devices continuously generate many types of data, including large amounts of multimedia data. Handling this requires a large amount of storage, along with big data technology to process the stored data and obtain accurate information. NAS (Network Attached Storage) or SAN (Storage Area Network) technology is typically used to build high-speed, high-capacity storage in a network-based environment. In this paper, we propose a method to construct a mass storage device using Network-DAS, an extension of DAS (Direct Attached Storage). Benchmark experiments were performed to verify the scalability of the storage system with 76 HDDs. Experimental results show that the proposed high-performance mass storage system is scalable and reliable.

Fishery R&D Big Data Platform and Metadata Management Strategy (수산과학 빅데이터 플랫폼 구축과 메타 데이터 관리방안)

  • Kim, Jae-Sung;Choi, Youngjin;Han, Myeong-Soo;Hwang, Jae-Dong;Cho, Wan-Sup
    • The Journal of Bigdata / v.4 no.2 / pp.93-103 / 2019
  • In this paper, we introduce a big data platform and a metadata management technique for fishery science R&D information. The platform collects and integrates various types of fisheries science R&D information, and we suggest how to build it in the form of a data lake. In addition to the existing data collected and accumulated in the field of fisheries science, we propose that the platform collect unstructured big data such as satellite image data, research reports, and research data in order to support diverse analyses. Furthermore, collecting and managing metadata during data extraction, preprocessing, and storage makes systematic management of fisheries science big data possible. By establishing metadata in a standard form alongside the construction of the platform, this work suggests a systematic and continuous big data management method that spans the data lifecycle of collection, storage, utilization, and distribution.

The Analyzing Risk Factor of Big Data : Big Data Processing Perspective (빅데이터 처리 프로세스에 따른 빅데이터 위험요인 분석)

  • Lee, Ji-Eun;Kim, Chang-Jae;Lee, Nam-Yong
    • Journal of Information Technology Services / v.13 no.2 / pp.185-194 / 2014
  • Recently, as the practical value of big data has been recognized, a growing number of companies and organizations are generating benefit and profit by applying it. However, specific and theoretical studies of the risk factors that may arise when big data is introduced have not been conducted. Accordingly, this study extracts the possible risk factors of introducing big data from literature reviews and classifies them by stage of big data processing: data collection, data storage, data analysis, visualization of analysis results, and application. The risk factors are then prioritized by degree of risk, based on a survey of experts. This study should help organizations avoid risks at each stage of big data processing and prepare for them in order of severity.

Implement of MapReduce-based Big Data Processing Scheme for Reducing Big Data Processing Delay Time and Store Data (빅데이터 처리시간 감소와 저장 효율성이 향상을 위한 맵리듀스 기반 빅데이터 처리 기법 구현)

  • Lee, Hyeopgeon;Kim, Young-Woon;Kim, Ki-Young
    • Journal of the Korea Convergence Society / v.9 no.10 / pp.13-19 / 2018
  • MapReduce, a core technology of Hadoop, is most commonly used to process big data stored in the Hadoop distributed file system. However, existing MapReduce-based big data processing techniques divide and store files in blocks of a size predefined in the Hadoop distributed file system, which wastes substantial infrastructure resources. Therefore, in this paper, we propose an efficient MapReduce-based big data processing scheme. The proposed method improves the storage efficiency of a big data infrastructure by converting and compressing the data in advance into a format suitable for processing by MapReduce. In addition, the proposed method addresses the data processing delay that arises when an implementation focuses on storage efficiency.

Evaluation of the Relationship between Meteorological, Agricultural and In-situ Big Data Droughts (기상학적 가뭄, 농업 가뭄 및 빅데이터 현장가뭄간의 상관성 평가)

  • LEE, Ji-Wan;JANG, Sun-Sook;AHN, So-Ra;PARK, Ki-Wook;KIM, Seong-Joon
    • Journal of the Korean Association of Geographic Information Studies / v.19 no.1 / pp.64-79 / 2016
  • The purpose of this study is to find the relationship between precipitation deficit, the 12-month SPI (standardized precipitation index), agricultural reservoir water storage deficit, and agricultural drought-related big data, and to evaluate the usefulness of big data for agricultural risk management. For the long-term drought from January 2014 to September 2015, each dataset was collected and analyzed on a monthly and provincial basis. The minimum SPI-12 and the maximum reservoir water storage deficit relative to a normal year occurred at the same times: July 2014, and August and September 2015. The maximum frequency of big data occurred in June and July of 2014, and in March and June to September of 2015. The big data peak preceded the maximum reservoir water storage deficit by 1 month in 2014 and by 2 months in 2015. The occurrence of big data was sensitive to the spring drought from March, the late Jangma of June, the dry Jangma of July, and the rainfall deficit of September 2015. The big data were closely related to both the meteorological drought and the agricultural drought. Because the big data reflect drought as it is perceived in the field, they proved to be a useful indicator for agricultural risk management.