• Title/Abstract/Keyword: Big data storage

205 search results

Asymmetric data storage management scheme to ensure the safety of big data in multi-cloud environments based on deep learning (딥러닝 기반의 다중 클라우드 환경에서 빅 데이터의 안전성을 보장하기 위한 비대칭 데이터 저장 관리 기법)

  • Jeong, Yoon-Su
    • Journal of Digital Convergence / v.19 no.3 / pp.211-216 / 2021
  • The volume of information produced by heterogeneous devices in distributed cloud environments is growing steadily, driven by high-speed networks and high-capacity multimedia data. However, how to minimize information errors in the big data exchanged among heterogeneous devices remains an open research problem. In this paper, we propose a deep learning-based asymmetric storage management technique that minimizes network bandwidth and data errors for information sent and received in cloud environments. The proposed technique applies deep learning to optimize load balance after asymmetrically hashing the big data generated by each device. It tolerates errors in the big data collected from each device while preserving data connectivity by grouping the big data into clusters. In particular, because it uses a loss function that takes similar values extracted between big data items as seeds, the technique minimizes information errors when storing and managing big data asymmetrically.
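
As a rough illustration of the storage side of this idea, the following Python sketch places data keys on nodes of unequal capacity via hashing and groups related keys into clusters. The node names, weights, and prefix-based grouping are invented for illustration; the paper's deep-learning load optimizer and loss function are not reproduced here.

```python
import hashlib
from collections import defaultdict

# Hypothetical node weights: an "asymmetric" layout where nodes
# receive load in proportion to their capacity.
NODE_WEIGHTS = {"node-a": 4, "node-b": 2, "node-c": 1}

def asymmetric_placement(key: str) -> str:
    """Map a data key to a node, weighted by node capacity."""
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    # Expand each node into weight-many slots so bigger nodes
    # receive proportionally more keys.
    slots = [n for n, w in sorted(NODE_WEIGHTS.items()) for _ in range(w)]
    return slots[digest % len(slots)]

def cluster_by_prefix(keys, prefix_len=6):
    """Toy stand-in for similarity clustering: group keys that share
    a short prefix, preserving the 'connectivity' of related records."""
    clusters = defaultdict(list)
    for k in keys:
        clusters[k[:prefix_len]].append(k)
    return clusters

if __name__ == "__main__":
    keys = ["sensor-001", "sensor-002", "camera-101", "camera-102"]
    for group, members in cluster_by_prefix(keys).items():
        # All members of a cluster are placed by the same hash policy.
        print(group, [(m, asymmetric_placement(m)) for m in members])
```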

An Efficient Information Retrieval System for Unstructured Data Using Inverted Index

  • Abdullah Iftikhar;Muhammad Irfan Khan;Kulsoom Iftikhar
    • International Journal of Computer Science & Network Security / v.24 no.7 / pp.31-44 / 2024
  • An inverted index pairs keywords with the posting lists used to index documents. In the modern age, pervasive use of technology has increased data volume at a very high rate, and big data is a major concern of researchers; efficient document indexing over big data has become a significant challenge. Organizations and web search engines have limited resources such as space and storage, which are crucial for the data management of an information retrieval system, so such systems need to be highly efficient. This research introduces an inverted indexing technique to minimize retrieval delay in an information retrieval system. The inverted index is illustrated, its issues are discussed, and they are resolved by implementing a scalable inverted index, which is then compared with the naïve inverted index. The inverted lists are stored in primary storage rather than auxiliary memory. An efficient information retrieval architecture is proposed, particularly for unstructured data that has no predefined structural format.
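
The core data structure is easy to sketch. The following Python example (not from the paper) builds a small in-memory inverted index mapping each keyword to a sorted posting list of document IDs and intersects posting lists to answer AND queries.

```python
import re
from collections import defaultdict

def build_inverted_index(docs):
    """Build keyword -> sorted posting list (document IDs)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, *terms):
    """AND query: intersect the posting lists of all terms."""
    postings = [set(index.get(t, ())) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []

if __name__ == "__main__":
    docs = {
        1: "big data storage systems",
        2: "inverted index for unstructured data",
        3: "efficient retrieval with an inverted index",
    }
    idx = build_inverted_index(docs)
    print(idx["inverted"])                  # -> [2, 3]
    print(search(idx, "inverted", "data"))  # -> [2]
```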

A Study on the NAS Storage-based Data Distributed Processing System Algorithm (NAS 스토리지 기반의 데이터 분산처리 시스템 알고리즘에 관한 연구)

  • Jang, Jae-Myung;Kang, Hee-beom;Jeong, Nahk-ju;Jung, Hoe-kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.643-645 / 2015
  • With the development of storage, storage is now actively used everywhere in real life, from automobiles to the aviation field. Recently, big data has been stored across multiple data stores, and research on distributed data processing for handling such data has been actively conducted. However, bottlenecks arise and processing speed slows down when many data requests arrive at the same time. In this paper, we consider big data environments that store and process large amounts of data, and we propose a weight-based, manageable data-distribution processing system algorithm that handles data more efficiently as the number of data requests grows.
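
The abstract does not spell out the weighting rule, but a weight-based dispatcher can be sketched as follows: each hypothetical NAS node gets a capacity weight, and each incoming request goes to the node with the lowest ratio of active requests to weight.

```python
# A minimal sketch of weight-based request distribution across NAS
# nodes, assuming invented capacity weights; the paper's exact
# weighting rule is not specified in the abstract.
NODES = {"nas-1": 3, "nas-2": 2, "nas-3": 1}  # capacity weights
active = {n: 0 for n in NODES}                # in-flight requests

def pick_node():
    """Weighted least-loaded: lowest active/weight ratio wins."""
    return min(NODES, key=lambda n: active[n] / NODES[n])

def dispatch(request_id):
    node = pick_node()
    active[node] += 1
    return node

if __name__ == "__main__":
    for rid in range(6):
        print(rid, "->", dispatch(rid))
```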


Dynamic Data Migration in Hybrid Main Memories for In-Memory Big Data Storage

  • Mai, Hai Thanh;Park, Kyoung Hyun;Lee, Hun Soon;Kim, Chang Soo;Lee, Miyoung;Hur, Sung Jin
    • ETRI Journal / v.36 no.6 / pp.988-998 / 2014
  • For memory-based big data storage, using hybrid memories consisting of both dynamic random-access memory (DRAM) and non-volatile random-access memories (NVRAMs) is a promising approach. DRAM supports low access time but consumes much energy, whereas NVRAMs have high access time but do not need energy to retain data. In this paper, we propose a new data migration method that can dynamically move data pages into the most appropriate memories to exploit their strengths and alleviate their weaknesses. We predict the access frequency values of the data pages and then comprehensively measure the gains and costs of each placement choice based on these predicted values. Next, we compute the potential benefits of all choices for each candidate page to make page migration decisions. Extensive experiments show that our method improves access response time over existing methods by as much as a factor of four, with similar rates of energy consumption.
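
The gain/cost reasoning can be illustrated with a toy model. In the sketch below, the latency and energy constants, the exponentially weighted frequency predictor, and the migration cost are all invented placeholders; the paper's actual cost model is more comprehensive.

```python
# Benefit-driven page migration between DRAM and NVRAM, sketched
# with made-up relative constants.
DRAM_LATENCY, NVRAM_LATENCY = 1.0, 4.0   # relative access times
DRAM_ENERGY, NVRAM_ENERGY = 3.0, 1.0     # relative energy per access
MIGRATION_COST = 50.0                    # one-time cost of moving a page

def predict_freq(history):
    """Toy predictor: exponentially weighted recent access counts."""
    freq, alpha = 0.0, 0.5
    for count in history:
        freq = alpha * count + (1 - alpha) * freq
    return freq

def migration_benefit(history, in_dram):
    """Net gain of moving the page to the other memory."""
    f = predict_freq(history)
    if in_dram:   # DRAM -> NVRAM saves energy, costs latency
        gain = f * (DRAM_ENERGY - NVRAM_ENERGY)
        loss = f * (NVRAM_LATENCY - DRAM_LATENCY)
    else:         # NVRAM -> DRAM saves latency, costs energy
        gain = f * (NVRAM_LATENCY - DRAM_LATENCY)
        loss = f * (DRAM_ENERGY - NVRAM_ENERGY)
    return gain - loss - MIGRATION_COST

if __name__ == "__main__":
    hot_page = [10, 40, 80]   # rising access counts
    cold_page = [8, 2, 0]     # cooling down
    print(migration_benefit(hot_page, in_dram=False))  # positive: migrate
    print(migration_benefit(cold_page, in_dram=True))  # negative: stay put
```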

Image Deduplication Based on Hashing and Clustering in Cloud Storage

  • Chen, Lu;Xiang, Feng;Sun, Zhixin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.4 / pp.1448-1463 / 2021
  • With the continuous development of cloud storage, plenty of redundant data exists in cloud storage, especially multimedia data such as images and videos. Data deduplication is a data reduction technology that significantly reduces storage requirements and increases bandwidth efficiency. To ensure data security, users typically encrypt data before uploading it; however, there is a contradiction between data encryption and deduplication. Existing deduplication methods for regular files cannot be applied to images because images must be matched by visual content. In this paper, we propose a secure image deduplication scheme based on hashing and clustering, which incorporates a novel perceptual hash algorithm based on Local Binary Patterns (LBP). In this scheme, the hash value of an image serves as its fingerprint for deduplication, and the image is transmitted in encrypted form. Images are clustered to reduce the time complexity of deduplication. The proposed scheme can ensure the security of images and improve deduplication accuracy, and a comparison with other image deduplication schemes demonstrates that our scheme has somewhat better performance.
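
A minimal sketch of the fingerprinting step, assuming a toy grayscale matrix in place of a real image: each interior pixel is compared with its eight neighbors (the Local Binary Pattern idea), the resulting bits form the fingerprint, and two images count as near-duplicates when the Hamming distance between fingerprints is small. The paper's encryption and clustering steps are omitted.

```python
# LBP-style perceptual hashing for deduplication on a toy grayscale
# matrix; production code would use a real image library.

def lbp_hash(img):
    """For each interior pixel, emit 8 bits comparing its neighbors
    to the center; concatenate into one binary fingerprint."""
    bits = []
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            c = img[y][x]
            for dy, dx in [(-1,-1),(-1,0),(-1,1),(0,1),
                           (1,1),(1,0),(1,-1),(0,-1)]:
                bits.append(1 if img[y+dy][x+dx] >= c else 0)
    return bits

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def is_duplicate(img_a, img_b, threshold=0.1):
    """Near-duplicate if the fingerprints differ in under 10% of bits."""
    ha, hb = lbp_hash(img_a), lbp_hash(img_b)
    return hamming(ha, hb) / len(ha) < threshold

if __name__ == "__main__":
    img = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
    noisy = [[11, 20, 29], [40, 52, 60], [69, 80, 91]]
    print(is_duplicate(img, noisy))  # True: fingerprints match closely
```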

Big Data Meets Telcos: A Proactive Caching Perspective

  • Bastug, Ejder;Bennis, Mehdi;Zeydan, Engin;Kader, Manhal Abdel;Karatepe, Ilyas Alper;Er, Ahmet Salih;Debbah, Merouane
    • Journal of Communications and Networks / v.17 no.6 / pp.549-557 / 2015
  • Mobile cellular networks are becoming increasingly complex to manage, while classical deployment/optimization techniques and current solutions (i.e., cell densification, acquiring more spectrum, etc.) are cost-ineffective and thus seen as stopgaps. This calls for the development of novel approaches that leverage recent advances in storage/memory, context-awareness, and edge/cloud computing, and fall into the framework of big data. However, big data is itself yet another complex phenomenon to handle and comes with its notorious four V's: velocity, veracity, volume, and variety. In this work, we address these issues in the optimization of 5G wireless networks via the notion of proactive caching at the base stations. In particular, we investigate the gains of proactive caching in terms of backhaul offloading and request satisfaction while tackling the large amount of available data for content popularity estimation. To estimate content popularity, we first collect users' mobile traffic data from several base stations of a Turkish telecom operator over time intervals of hours. An analysis is then carried out locally on a big data platform, and the gains of proactive caching at the base stations are investigated via numerical simulations. It turns out that several gains are possible depending on the level of available information and the storage size. For instance, with 10% of content ratings and 15.4 Gbytes of storage size (87% of the total catalog size), proactive caching achieves 100% request satisfaction and offloads 98% of the backhaul when considering 16 base stations.
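
The caching policy itself reduces to a knapsack-style selection. The sketch below, with invented popularity counts and content sizes, greedily caches the most-requested contents that fit within the storage budget and reports the fraction of requests served from the cache, i.e., the traffic offloaded from the backhaul.

```python
# Popularity-based proactive caching at a base station; the paper
# estimates popularity from real operator traffic on a big data
# platform, whereas the numbers here are illustrative.

def proactive_cache(popularity, sizes, capacity):
    """Greedily cache the most-requested contents that fit."""
    cache, used = set(), 0.0
    for content in sorted(popularity, key=popularity.get, reverse=True):
        if used + sizes[content] <= capacity:
            cache.add(content)
            used += sizes[content]
    return cache

def satisfaction(requests, cache):
    """Fraction of requests served from cache (offloads the backhaul)."""
    hits = sum(1 for r in requests if r in cache)
    return hits / len(requests)

if __name__ == "__main__":
    popularity = {"video-a": 120, "video-b": 45, "song-c": 30, "doc-d": 5}
    sizes = {"video-a": 8.0, "video-b": 6.0, "song-c": 0.5, "doc-d": 0.1}
    cache = proactive_cache(popularity, sizes, capacity=9.0)
    requests = ["video-a"] * 12 + ["video-b"] * 4 + ["song-c"] * 3 + ["doc-d"]
    print(cache, satisfaction(requests, cache))  # 16 of 20 requests hit
```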

A study on development method for practical use of Big Data related to recommendation to financial item (금융 상품 추천에 관련된 빅 데이터 활용을 위한 개발 방법)

  • Kim, Seok-Soo
    • Journal of the Korea Society of Computer and Information / v.19 no.8 / pp.73-81 / 2014
  • This study proposes a development method for practical big data use, comprising a data storage layer, a data processing layer, a data analysis layer, and a visualization layer. The data stored, processed, and analyzed at each phase can be visualized: after data are processed through Hadoop, the results are visualized with Mahout. Through this process, we can capture several features of customers and choose which financial item to recommend at the right time. This study introduces the background and problems of big data and discusses a development method and a case study showing how big data creates new business opportunities, using a financial item recommendation case.
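
The processing layer of such a pipeline is conventionally written as a map and a reduce step. The following Python mapper/reducer pair for Hadoop Streaming is a hypothetical sketch that counts customer-item interactions; the comma-separated input layout is assumed, and in the paper's pipeline these counts would feed Mahout for recommendation.

```python
# Hadoop Streaming sketch: count how often each customer interacts
# with each financial item.
#   map step:    python mr_items.py map   (file name is hypothetical)
#   reduce step: python mr_items.py
import sys

def mapper():
    for line in sys.stdin:
        fields = line.rstrip("\n").split(",")
        if len(fields) >= 2:                 # customer_id, item_id, ...
            print(f"{fields[0]}:{fields[1]}\t1")

def reducer():
    # Hadoop sorts mapper output by key, so equal keys arrive adjacently.
    current, total = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if key != current:
            if current is not None:
                print(f"{current}\t{total}")  # emit (customer:item, count)
            current, total = key, 0
        total += int(value)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```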

Modeling of Value Chain for Big Data (빅데이터를 위한 가치사슬 설계)

  • Lee, Sangwon;Park, Sungbum;Lee, Jumin;Ahn, Hyunsup;Choi, Yong Goo
    • Proceedings of the Korean Society of Computer Information Conference / 2015.01a / pp.277-278 / 2015
  • The volume sub-challenge of big data requires novel approaches, often referred to as big data technologies and methodologies. Data is generated constantly in an ever-growing number of places and by an ever-growing number of actors, while a large proportion of potentially re-usable data resides in silos within institutions or companies. These approaches are needed when conventional database technologies cannot be applied to storage and computing problems; indeed, the issue of big data has been called the next frontier in computing. In this paper, we investigate factors for designing an organizational value chain for big data.


An Efficient Implementation of Mobile Raspberry Pi Hadoop Clusters for Robust and Augmented Computing Performance

  • Srinivasan, Kathiravan;Chang, Chuan-Yu;Huang, Chao-Hsi;Chang, Min-Hao;Sharma, Anant;Ankur, Avinash
    • Journal of Information Processing Systems / v.14 no.4 / pp.989-1009 / 2018
  • Rapid advances in science and technology, with the exponential development of smart mobile devices, workstations, supercomputers, smart gadgets, and network servers, have been witnessed over the past few years. The sudden increase in the Internet population and the manifold growth in Internet speeds have occasioned the generation of an enormous amount of data, now termed 'big data'. Given this scenario, storing data on local servers or a personal computer is an issue that can be resolved by utilizing cloud computing, and several cloud computing service providers are now available to address big data issues. This paper establishes a framework that builds Hadoop clusters on the new single-board computer (SBC) Mobile Raspberry Pi; these clusters offer facilities for storage as well as computing. Regular data centers not only require large amounts of energy to operate but also need cooling equipment and occupy prime real estate. These energy consumption and physical space constraints can be addressed by employing Mobile Raspberry Pi Hadoop clusters, which provide a cost-effective, low-power, high-speed solution along with micro-data-center support for big data. Hadoop provides the modules required for the distributed processing of big data by deploying map-reduce programming approaches. In this work, the performance of SBC clusters and a single computer were compared. The experimental data show that the SBC clusters outperform a single computer by around 20%, and the cluster processing speed for large volumes of data can be enhanced by increasing the number of SBC nodes. Data storage is accomplished using the Hadoop Distributed File System (HDFS), which offers more flexibility and greater scalability than a single computer system.
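
The reported scaling behavior (more SBC nodes, faster processing) follows the usual parallel-speedup arithmetic. The sketch below applies Amdahl's law with an assumed 95% parallelizable fraction; the figure is illustrative and not taken from the paper.

```python
# Back-of-the-envelope speedup estimate for an N-node map-reduce
# style workload: the serial fraction bounds the achievable gain.

def speedup(parallel_fraction, nodes):
    """Amdahl's law: 1 / ((1 - p) + p / N)."""
    return 1.0 / ((1 - parallel_fraction) + parallel_fraction / nodes)

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, "nodes ->", round(speedup(0.95, n), 2), "x")
```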

A Survey of Homomorphic Encryption for Outsourced Big Data Computation

  • Fun, Tan Soo;Samsudin, Azman
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3826-3851 / 2016
  • With traditional data storage solutions becoming too expensive and cumbersome to support big data processing, enterprises are now starting to outsource their data requirements to third parties, such as cloud service providers. However, this outsourcing initiative introduces a number of security and privacy concerns. In this paper, homomorphic encryption is suggested as a mechanism to protect the confidentiality and privacy of outsourced data while at the same time allowing third parties to perform computation on encrypted data. This paper also discusses the challenges of protecting big data processing and highlights its differences from traditional data protection. Existing works on homomorphic encryption are technically reviewed and compared in terms of their encryption scheme, homomorphism classification, algorithm design, noise management, and security assumptions. Finally, this paper discusses the current implementations, challenges, and future directions toward a practical homomorphic encryption scheme for securing outsourced big data computation.
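
The key property is easy to demonstrate with a toy example. Textbook RSA is multiplicatively homomorphic, so a third party can multiply ciphertexts without ever seeing the plaintexts; the parameters below are classroom-sized and insecure, and practical outsourced computation would use the modern schemes surveyed in the paper.

```python
# Multiplicative homomorphism of textbook RSA: Enc(a) * Enc(b)
# decrypts to a * b (mod n). Toy parameters, no padding, insecure.

p, q = 61, 53
n = p * q                 # 3233, the public modulus
phi = (p - 1) * (q - 1)   # 3120
e, d = 17, 2753           # public exponent and its inverse mod phi

def enc(m): return pow(m, e, n)
def dec(c): return pow(c, d, n)

if __name__ == "__main__":
    a, b = 7, 9
    c = (enc(a) * enc(b)) % n      # the third party multiplies ciphertexts
    assert dec(c) == (a * b) % n   # ...without ever seeing a or b
    print(dec(c))                  # -> 63
```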