• Title/Summary/Keyword: Big data Problem

Big Data Analysis Using Principal Component Analysis (주성분 분석을 이용한 빅데이터 분석)

  • Lee, Seung-Joo
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.6 / pp.592-599 / 2015
  • In a big data environment, we need a new approach to analysis, because the characteristics of big data (volume, variety, and velocity) make it possible to analyze the entire data set when making inferences about a population. Traditional statistical methods, however, focus on small data, namely random samples drawn from the population, so classical statistical analyses are not well suited to big data. To address this problem, we propose an approach to efficient big data analysis based on principal component analysis, a popular method in multivariate statistics. To verify the performance of our approach, we carry out a range of simulation studies.
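
As a rough illustration of principal component analysis itself (not the procedure proposed in the paper, which the abstract does not detail), the sketch below estimates the first principal component of a small data set using power iteration on the sample covariance matrix. The function name `first_principal_component`, its parameters, and the toy data are assumptions made for this example.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit direction, explained-variance ratio) of the first
    principal component. Hypothetical helper for illustration only."""
    n = len(rows)
    d = len(rows[0])

    # Center each column so the covariance is computed around the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize; the vector converges to the dominant eigenvector.
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


if __name__ == "__main__":
    # Toy data lying exactly on the line y = 2x (an assumption for the demo).
    data = [[float(x), 2.0 * x] for x in range(10)]
    direction, ratio = first_principal_component(data)
    print(direction, ratio)
```

On the toy data, all variance lies along one direction, so the explained-variance ratio should come out close to 1.0. A real big data setting would call for a distributed or incremental implementation rather than this in-memory version.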

A Study on the Problems of AI-based Security Control (AI 기반 보안관제의 문제점 고찰)

  • Ahn, Jung-Hyun;Choi, Young-Ryul;Baik, Nam-Kyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.452-454 / 2021
  • The security control market currently operates on the basis of AI technology. AI is used to process the large volumes of logs and big data exchanged between security equipment, and to ease constraints on time and personnel. However, problems still arise when AI is applied. The security control market faces many problems beyond those introduced here; this paper attempts to deal with five. We consider problems that arise in applying AI technology to security control environments, such as 'AI model selection', 'AI standardization problem', 'Big data accuracy', 'Security Control Big Data Accuracy and AI Reliability', 'responsibility material problem', and 'lack of AI validity.'

Anonymity Personal Information Secure Method in Big Data environment (빅데이터 환경에서 개인정보 익명화를 통한 보호 방안)

  • Hong, Sunghyuck;Park, Sang-Hee
    • Journal of Convergence for Information Technology / v.8 no.1 / pp.179-185 / 2018
  • Big Data is no longer merely an icon of a future revolution; it is becoming established as one of the ways to deal with problems facing humanity. The use of Big Data and the protection of personal information are in tension. When more weight is given to using Big Data, someone's privacy is inevitably invaded; when more weight is given to keeping personal information safe, only low-level research using Big Data can be carried out for public purposes. In this study, we examine the problems of personal information infringement and propose a method for anonymizing collected Big Data so that it can be used while personal information is protected. This addresses the problem of personal information infringement while still allowing Big Data to be used.

Design of a Platform for Collecting and Analyzing Agricultural Big Data (농업 빅데이터 수집 및 분석을 위한 플랫폼 설계)

  • Nguyen, Van-Quyet;Nguyen, Sinh Ngoc;Kim, Kyungbaek
    • Journal of Digital Contents Society / v.18 no.1 / pp.149-158 / 2017
  • Big data have been presenting us with exciting opportunities and challenges in economic development. For instance, in the agriculture sector, combining various agricultural data (e.g., weather data, soil data, etc.) and then analyzing them delivers valuable and helpful information to farmers and agribusinesses. However, massive amounts of agricultural data are generated every minute by many kinds of devices and services, such as sensors and agricultural web markets. This leads to the challenges of the big data problem, including data collection, data storage, and data analysis. Although some systems have been proposed to address this problem, they are still restricted in the type of data, the type of storage, or the size of data they can handle. In this paper, we propose a novel design of a platform for collecting and analyzing agricultural big data. The proposed platform supports (1) multiple methods of collecting data from various data sources using Flume and MapReduce; (2) multiple choices of data storage including HDFS, HBase, and Hive; and (3) big data analysis modules with Spark and Hadoop.

A Study of AI Education Program Based on Big Data: Case Study of the General Education High School (빅데이터 기반 인공지능 교육프로그램 연구: 일반계 고등학교 사례를 중심으로)

  • Ye-Hee, Jeong;Hyoungbum, Kim;Ki Rak, Park;Sang-Mi, Yoo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.1 / pp.83-92 / 2023
  • The purpose of this research is to develop a creative education program that uses big data-based AI education for general education high schools, and to investigate its effectiveness. To this end, we developed a creative education program using artificial intelligence based on big data for first-year general high school students, ran on-site classes at schools, and had the program validated by experts. To measure the students' creative problem-solving ability and class satisfaction, a creative problem-solving ability test was conducted before and after the program, and a class satisfaction test was conducted after the program. The results of this study are as follows. First, according to an independent-samples t-test, the AI education program based on big data was statistically effective in improving the creative problem-solving ability of first-year high school students in 'problem discovery and analysis', 'idea generation', 'execution plan', 'conviction and communication', and 'innovation tendency', with the exceptions of 'execution' and 'the difference between pre- and post-scores of male student and female student'. Second, in the satisfaction survey conducted after the classes, the averages for 'Satisfaction', 'Interest', 'Participation', and 'Persistence' ranged from 3.56 to 3.92, and the overall average was 3.78. The AI education program based on big data developed in this research was therefore found to have an instructional effect.

Trends in Hardware Acceleration Techniques for Fully Homomorphic Encryption Operations (완전동형암호 연산 가속 하드웨어 기술 동향)

  • Park, S.C.;Kim, H.W.;Oh, Y.R.;Na, J.C.
    • Electronics and Telecommunications Trends / v.36 no.6 / pp.1-12 / 2021
  • As demand for big data and big data-based artificial intelligence (AI) technology increases, so does the need for privacy preservation for sensitive information contained in big data and for high-speed encryption-based AI computation systems. Fully homomorphic encryption (FHE) is a representative encryption technology that preserves the privacy of sensitive data. FHE is being actively investigated primarily because it does not require encrypted data to be decrypted anywhere in the data flow: data can be stored, transmitted, combined, and processed in an encrypted state. Moreover, FHE is based on hard lattice problems that are believed to resist attacks even by quantum computers because of their high computational complexity. FHE therefore offers a high security level and is receiving considerable attention as a next-generation encryption technology. However, although FHE can compute on encrypted data, its slow computation speed, a consequence of its high computational complexity, is an obstacle to practical use. To address this problem, hardware technology that accelerates FHE operations is receiving extensive research attention. This article examines research trends in hardware technology focused on accelerating the operations of representative FHE schemes. In addition, the detailed structures of hardware that accelerates FHE operations are described.

A Big Data Preprocessing using Statistical Text Mining (통계적 텍스트 마이닝을 이용한 빅 데이터 전처리)

  • Jun, Sunghae
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.5 / pp.470-476 / 2015
  • Big data has been used in diverse areas. Computer science and sociology, for example, approach big data with different concerns, but both analyze big data and draw implications from the results. Meaningful analysis of big data, and meaningful interpretation of its results, are therefore needed in most areas. Statistics and machine learning provide various methods for big data analysis. In this paper, we study a process for big data analysis and propose an efficient methodology covering the entire process, from collecting big data to drawing implications from the analysis results. In addition, because patent documents have the characteristics of big data, we propose an approach that applies big data analysis to patent data and uses the results to build R&D strategy. To illustrate how our proposed methodology can be used on a real problem, we perform a case study using applied and registered patent documents retrieved from patent databases around the world.

Five Forces Model of Computational Power: A Comprehensive Measure Method

  • Wu, Meixi;Guo, Liang;Yang, Xiaotong;Xie, Lina;Wang, Shaopeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.7 / pp.2239-2256 / 2022
  • In this paper, a model is proposed to evaluate computational power comprehensively. The five forces model of computational power solves the problem that the measurement units of different indexes are not unified in the process of computational power evaluation. It combines the bidirectional projection method with the TOPSIS method. The authors present the model as more scientific and effective in evaluating the overall state of computational power. Lastly, an example shows the validity and practicability of the model.
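
To make the TOPSIS step concrete, here is a minimal sketch of plain TOPSIS only; it does not include the bidirectional projection method or the paper's five forces model, neither of which the abstract specifies. Vector normalization is what removes the dependence on measurement units. The function name `topsis`, the weights, and the sample criteria are assumptions made for this example.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives with plain TOPSIS.

    matrix:  rows are alternatives, columns are criteria (any units).
    weights: one weight per criterion (assumed to sum to 1).
    benefit: True if larger is better for that criterion, else False.
    Returns closeness scores in [0, 1]; higher is better.
    """
    n_criteria = len(matrix[0])

    # Vector normalization makes columns unit-free and comparable.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in matrix
    ]

    # Ideal and anti-ideal points, criterion by criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, ideal)))
        d_worst = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, anti)))
        total = d_best + d_worst
        # Guard: if every alternative is identical, treat them as tied.
        scores.append(d_worst / total if total > 0 else 0.5)
    return scores


if __name__ == "__main__":
    # Hypothetical criteria: throughput (benefit) and power draw (cost).
    systems = [[3.0, 1.0], [1.0, 3.0], [2.0, 2.0]]
    print(topsis(systems, weights=[0.5, 0.5], benefit=[True, False]))
```

In this toy input the first system is best on both criteria and the second is worst on both, so they land at the two ends of the score range, with the third in between.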

The suggestion of new big data platform for the strengthening of privacy and enabled of big data (개인정보 보안강화 및 빅데이터 활성화를 위한 새로운 빅데이터 플랫폼 제시)

  • Song, Min-Gu
    • Journal of Digital Convergence / v.14 no.12 / pp.155-164 / 2016
  • In this paper, we investigate and analyze big data platforms published in Korea and abroad. The analysis revealed a personal information security problem on each platform. In particular, there was a vulnerability in the encryption of personal information stored in HBase, the representative NoSQL DB commonly used in big data platforms. However, data encryption and decryption add load to the system. We therefore propose an encryption method for HBase, encryption and decryption systems, and ways of applying the personal information management system (PMIS) at each step and in the big data platform so as to reduce the network communication load. We also propose a new big data platform that reflects these proposals. The proposed platform is expected to contribute to the wider use of Big Data by providing both personal information security and system performance efficiency.

k-NN Join Based on LSH in Big Data Environment

  • Ji, Jiaqi;Chung, Yeongjee
    • Journal of information and communication convergence engineering / v.16 no.2 / pp.99-105 / 2018
  • k-Nearest neighbor join (k-NN Join) is a computationally intensive algorithm that is designed to find k-nearest neighbors from a dataset S for every object in another dataset R. Most related studies on k-NN Join are based on single-computer operations. As the data dimensions and data volume increase, running the k-NN Join algorithm on a single computer cannot generate results quickly. To solve this scalability problem, we introduce the locality-sensitive hashing (LSH) k-NN Join algorithm implemented in Spark, an approach for high-dimensional big data. LSH is used to map similar data onto the same bucket, which can reduce the data search scope. In order to achieve parallel implementation of the algorithm on multiple computers, the Spark framework is used to accelerate the computation of distances between objects in a cluster. Results show that our proposed approach is fast and accurate for high-dimensional and big data.
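
The bucketing idea described in this abstract can be sketched on a single machine as follows. This is not the paper's Spark implementation: it is a simplified illustration that assumes a random-projection hash for Euclidean distance, and the names `make_hash` and `lsh_knn_join` and the parameters `num_projections` and `bucket_width` are hypothetical. Because only points in the same bucket are compared, the result is approximate and may contain fewer than k neighbors.

```python
import math
import random
from collections import defaultdict


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def make_hash(dim, num_projections, bucket_width, seed=0):
    """Build a random-projection LSH function: nearby points tend to
    fall into the same bucket. Hypothetical helper for illustration."""
    rng = random.Random(seed)
    projections = [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, bucket_width))
        for _ in range(num_projections)
    ]

    def h(point):
        return tuple(
            math.floor((sum(a * x for a, x in zip(vec, point)) + offset) / bucket_width)
            for vec, offset in projections
        )

    return h


def lsh_knn_join(R, S, k, num_projections=2, bucket_width=4.0, seed=0):
    """For each point in R, return indices of up to k near neighbors in S,
    searching only the bucket that the point hashes to."""
    h = make_hash(len(S[0]), num_projections, bucket_width, seed)

    # Index S by bucket so each query scans only a small candidate set.
    buckets = defaultdict(list)
    for idx, s in enumerate(S):
        buckets[h(s)].append(idx)

    result = []
    for r in R:
        candidates = buckets.get(h(r), [])
        ranked = sorted(candidates, key=lambda i: euclidean(r, S[i]))
        result.append(ranked[:k])
    return result


if __name__ == "__main__":
    S = [[0.0, 0.0], [0.5, 0.2], [10.0, 10.0], [10.3, 9.8]]
    R = [[0.1, 0.1], [10.1, 10.0]]
    print(lsh_knn_join(R, S, k=2))
```

In a distributed setting, the bucket key would be a natural partitioning key, so that distance computations for different buckets can run in parallel; production LSH schemes also typically use several hash tables to reduce missed neighbors.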