Optimization of Data Placement using Principal Component Analysis based Pareto-optimal method for Multi-Cloud Storage Environment

  • Latha, V.L. Padma (Department of CSE, SVCE Tirupati, JNTUA University)
  • Reddy, N. Sudhakar (Department of CSE, SVCE)
  • Babu, A. Suresh (Department of CSE, JNTUA University)
  • Received : 2021.12.05
  • Published : 2021.12.30

Abstract

In the big data era, data has taken on new significance as storage requirements have grown rapidly from terabytes to petabytes. As cloud computing expands and gains wider acceptance, many businesses and institutions are choosing to store their requests and data in the cloud. Because cloud storage presents a nearly unlimited pool of storage resources, it makes data storage and access scalable and readily available. Most of these users, however, favour a single cloud because of its simplicity and low storage cost in the short term. Storing data in a single cloud raises concerns such as vendor lock-in, privacy leakage, and unavailability. Multi-cloud storage, which relies on geographically dispersed cloud storage providers, can mitigate these risks. One of the key challenges in such a system is placing user data in a way that is both cost-effective and highly available. This study presents a multi-cloud storage architecture. It then defines a multi-objective optimization problem that minimises total cost and maximises data availability at the same time. The problem is solved with a technique based on the non-dominated sorting genetic algorithm II (NSGA-II), which yields a set of non-dominated solutions known as the Pareto-optimal set. For consumers who cannot choose directly from the Pareto-optimal set, a method based on Principal Component Analysis (PCA) is presented to select the best solution. Extensive experiments based on a variety of real-world cloud storage scenarios show that the proposed method performs as expected.
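
To make the two stages in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each candidate placement is summarised by a hypothetical (cost, availability) pair. A brute-force non-dominated filter stands in for NSGA-II, which would normally search for such candidates. The PCA step uses one common weighting scheme (variance-weighted squared loadings on min-max scaled objectives); the abstract does not state how the paper applies PCA, so this choice is an assumption. All function names and sample values are illustrative.

```python
import math


def pareto_front(points):
    """Keep the non-dominated (cost, availability) pairs.

    Cost is minimised and availability is maximised. This brute-force
    filter is a stand-in for NSGA-II, which is not implemented here.
    """
    front = []
    for i, (cost_i, avail_i) in enumerate(points):
        dominated = any(
            cost_j <= cost_i and avail_j >= avail_i
            and (cost_j < cost_i or avail_j > avail_i)
            for j, (cost_j, avail_j) in enumerate(points) if j != i
        )
        if not dominated:
            front.append((cost_i, avail_i))
    return front


def _scale(values, benefit):
    """Min-max scale to [0, 1] so that larger always means better."""
    lo, hi = min(values), max(values)
    if hi == lo:  # no spread: every point is equally good on this objective
        return [1.0] * len(values)
    if benefit:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def pca_weights(xs, ys):
    """Weights for two objectives from a 2x2 covariance eigen-decomposition.

    Assumed scheme: each weight is the sum over principal components of
    (explained variance ratio) * (squared loading of that objective).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) ** 2 for x in xs) / n                     # var(x)
    d = sum((y - my) ** 2 for y in ys) / n                     # var(y)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n   # cov(x, y)
    trace = a + d
    if trace == 0:  # degenerate case: no variance at all
        return 0.5, 0.5
    if b == 0:      # covariance matrix is already diagonal
        return a / trace, d / trace
    root = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    weights = [0.0, 0.0]
    for lam in (trace / 2 + root, trace / 2 - root):  # the two eigenvalues
        vec = (b, lam - a)                            # unnormalised eigenvector
        norm_sq = vec[0] ** 2 + vec[1] ** 2
        for k in range(2):
            weights[k] += (lam / trace) * vec[k] ** 2 / norm_sq
    return weights[0], weights[1]


def select_by_pca(front):
    """Pick the front member with the highest PCA-weighted score."""
    cost_score = _scale([c for c, _ in front], benefit=False)
    avail_score = _scale([a for _, a in front], benefit=True)
    w_cost, w_avail = pca_weights(cost_score, avail_score)
    scores = [w_cost * c + w_avail * a
              for c, a in zip(cost_score, avail_score)]
    return front[scores.index(max(scores))]


# Hypothetical candidate placements: (monthly cost, availability).
candidates = [(10.0, 0.990), (12.0, 0.995), (15.0, 0.999),
              (14.0, 0.993), (20.0, 0.9995), (11.0, 0.985)]
front = pareto_front(candidates)
best = select_by_pca(front)
print("Pareto front:", front)
print("Selected placement:", best)
```

In this sketch a placement is dropped only when another placement is at least as good on both objectives and strictly better on one. The scoring step then reduces the remaining trade-off to a single number, so a consumer without explicit preferences still receives one recommendation.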
