• Title/Summary/Keyword: 빅데이터 분석방법 (big data analysis methods)

Big Data Analysis Using Principal Component Analysis (주성분 분석을 이용한 빅데이터 분석)

  • Lee, Seung-Joo
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.6 / pp.592-599 / 2015
  • In a big data environment, we need a new approach to big data analysis, because the characteristics of big data, such as volume, variety, and velocity, make it possible to analyze the entire data set when making inferences about a population. Traditional statistical methods, however, focused on small data, namely random samples extracted from the population. Classical statistical analyses are therefore not well suited to big data analysis. To solve this problem, we propose an approach to efficient big data analysis. In this paper, we consider big data analysis using principal component analysis, a popular method in multivariate statistics. To verify the performance of our approach, we carry out diverse simulation studies.
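
The abstract above names principal component analysis but does not describe the procedure used in the paper. As a rough illustration of the general technique only, the sketch below finds the leading principal component by power iteration on the sample covariance matrix. The function name `first_principal_component`, the iteration count, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, explained variance) of the leading component.

    Hypothetical helper for illustration; not the method of the cited paper.
    """
    n = len(rows)
    d = len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (divides by n - 1).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    # The uniform start vector fails if it is exactly orthogonal to the
    # leading eigenvector; a random start would avoid that corner case.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The Rayleigh quotient v^T C v is the variance along the direction v.
    variance = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, variance


# Toy data lying exactly on the line y = 2x (assumed for the example).
data = [(float(x), 2.0 * x) for x in range(5)]
direction, variance = first_principal_component(data)
print(direction, variance)
```

For data sets with many variables, a library routine based on the singular value decomposition would normally be preferred over this hand-written loop.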

Development of Virtual Fusion Methodology for Analysis Via Mobility Bigdata (모빌리티 빅데이터 가상결합 분석방법론 연구)

  • Bumchul Cho;Kihun Kwon;Deokbae An
    • The Journal of Bigdata / v.7 no.2 / pp.75-90 / 2022
  • Complex and sophisticated transportation analysis has recently become necessary because of changes in the socioeconomic environment and the development of big data technology. In particular, the revision of three laws, including the Personal Information Protection Act, makes it possible to combine various types of mobility data. However, strengthened personal information protection makes the use of mobility big data inefficient. In this paper, we propose "virtual fusion methodology via mobility big data", a methodology for indirect fusion of mobility big data such as mobile data and transportation card data, in order to resolve legal restrictions and enable a range of transportation analyses. For verification, we applied the methodology to regional bus passengers in the Seoul capital area and Cheongju city. The methodology can analyze passenger behavioral patterns using the MCGM (Mobility Comprehensive Genetic Map), a graph of position and time built from mobile data. Consequently, the MCGM, which is the result of indirect data fusion, makes it possible to analyze various transportation problems.

A Big Data Preprocessing using Statistical Text Mining (통계적 텍스트 마이닝을 이용한 빅 데이터 전처리)

  • Jun, Sunghae
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.5 / pp.470-476 / 2015
  • Big data has been used in diverse areas. For example, computer science and sociology approach big data from different perspectives, but both analyze big data and draw implications from the results. Meaningful analysis and interpretation of big data are therefore needed in most areas. Statistics and machine learning provide various methods for big data analysis. In this paper, we study a process for big data analysis and propose an efficient methodology covering the entire process, from collecting big data to drawing implications from the analysis results. In addition, because patent documents have the characteristics of big data, we propose an approach that applies big data analysis to patent data and uses the results to build R&D strategy. To illustrate how the proposed methodology can be used on a real problem, we perform a case study using applied and registered patent documents retrieved from patent databases around the world.

Big data and statistics (빅데이터와 통계학)

  • Kim, Yongdai;Cho, Kwang Hyun
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.959-974 / 2013
  • We investigate the roles of statistics and statisticians in the big data era. The definition and application areas of big data are reviewed, and the statistical characteristics of big data and their meanings are discussed. Various statistical methodologies applicable to big data analysis are illustrated, and two real big data projects are explained.

A Review of Research on Big Data Security (빅데이터 보안 분야의 연구동향 분석)

  • Park, Seokyee;Hwang, K.T.
    • Informatization Policy / v.23 no.1 / pp.3-19 / 2016
  • The purpose of this study is to analyze the existing literature and to suggest future research directions in the area of big data security. The study identifies 62 research articles and analyzes their publication year, publication media, general research approach, specific research method, and research topic. According to the results, big data security research is at its initial stage, in which non-empirical studies and research dealing with technical issues are dominant. From the research topic perspective, the area also shows signs of an initial research stage: the proportion of macro studies dealing with overall issues is far higher than that of micro studies covering specific implementation methods and sectoral issues. Promising topics for future research include an overarching framework for big data security, big data security methods for different industries, and government policies on big data security. Currently, the big data security area does not have sufficient research results. In the future, studies covering various topics in big data security from multiple perspectives are anticipated.

A Big Data Learning for Patent Analysis (특허분석을 위한 빅 데이터학습)

  • Jun, Sunghae
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.5 / pp.406-411 / 2013
  • Big data issues have been considered in diverse fields, and big data learning is required in all areas, including engineering and social science. Statistics and machine learning algorithms are representative tools for big data learning. In this paper, we study learning tools for big data and propose an efficient methodology for big data learning, from legacy data to practical application. We apply our big data learning to patent analysis, because patents are a form of big data, and we use the patent analysis results for technology forecasting. To illustrate how the proposed methodology can be applied in a real domain, we retrieve patents related to big data from patent databases around the world. Using the retrieved patent data, we perform a case study with text mining preprocessing and multiple linear regression.

Conditions and potentials of Korean history research based on 'big data' analysis: the beginning of 'digital history' ('빅데이터' 분석 기반 한국사 연구의 현황과 가능성: 디지털 역사학의 시작)

  • Lee, Sangkuk
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1007-1023 / 2016
  • This paper explores the conditions and potential of newly designed and tested big data analysis methodologies applied to Korean history. To advance them, we need to pay more attention to quantitative analysis methodologies than to pre-existing qualitative analysis. To meet this new challenge, I propose 'digital history' methods that draw on associated disciplines such as linguistics, computer science, data science, statistics, and visualization techniques. As one example, I apply interdisciplinary convergence approaches to the principle and mechanism of elite reproduction during the Korean medieval age. I propose how to compensate for a lack of historical material by applying a semi-supervised learning method, how to create a database using text-mining techniques, how to analyze quantitative data with statistical methods, and how to present analytical outcomes with intuitive visualization.

Current Status of Educational Big Data Research (교육 빅데이터 관련 연구 동향)

  • Lee, Eun-young;Park, Do-oung;Choi, In-ong
    • Proceedings of the Korean Society of Computer Information Conference / 2014.07a / pp.175-176 / 2014
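  • This paper explores the concept, value, processing technologies, and analysis methods of educational big data. Educational big data, which can be defined as 'data at the national, regional, school, teacher, and student levels produced through the inputs, processes, and outputs of online and offline teaching and learning activities', can be processed efficiently with distributed computing technologies typified by Hadoop. The analysis methods mainly used to derive meaningful and useful results from large-scale educational data are educational data mining, learning analytics, and visual data analytics. Educational data mining tends to analyze data broadly across the student, teacher, and school levels, whereas learning analytics tends to focus more on analysis of student-level data, and visual data analytics focuses on presenting analysis results effectively rather than on the analysis itself.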

Keyword Data Analysis Using Bayesian Conjugate Prior Distribution (베이지안 공액 사전분포를 이용한 키워드 데이터 분석)

  • Jun, Sunghae
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.1-8 / 2020
  • The use of text data in big data analytics has increased, and much research on methods for text data analysis has followed. In this paper, we study Bayesian learning based on conjugate priors for analyzing keyword data extracted from big text data. Bayesian statistics provides a learning process that updates parameters when new data are added to existing data. This process is efficient in a big data environment, because a large amount of data is created and added over time on a big data platform. To show the performance and applicability of the proposed method, we carry out a case study analyzing keyword data from real patent documents.
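
The abstract above describes updating parameters as new data arrive but does not state which likelihood or prior the paper uses. The sketch below illustrates the general idea of a conjugate update with a Gamma prior on a Poisson rate, a common model for count data such as keyword frequencies. The choice of the Gamma-Poisson pair, the class name `GammaPrior`, and the example counts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaPrior:
    """Gamma(shape, rate) belief about a Poisson rate (hypothetical helper)."""

    shape: float
    rate: float

    def update(self, counts):
        """Return the posterior after observing a batch of Poisson counts.

        Conjugacy means the posterior is again a Gamma distribution:
        shape grows by the total count, rate grows by the number of
        observations.
        """
        counts = list(counts)
        return GammaPrior(self.shape + sum(counts), self.rate + len(counts))

    @property
    def mean(self):
        """Posterior mean of the Poisson rate."""
        return self.shape / self.rate


# Assumed example: keyword occurrence counts arriving in two batches.
prior = GammaPrior(shape=1.0, rate=1.0)
after_first = prior.update([3, 5, 4])
after_second = after_first.update([6, 2])
print(after_second, after_second.mean)
```

Because each posterior serves as the prior for the next batch, updating in two steps gives the same result as a single update on all five counts, which is why this style of learning suits data that accumulate over time.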

An Analysis of Twenties Fashion Trend based on Big data (빅데이터를 기반으로 한 20대의 패션 트렌드 분석)

  • Yang, Yoon-jung;Um, Byung-yong;Hong, Sung-yub;Yu, Donghui
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.121-123 / 2014
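  • With the recent emergence of big data, its use is becoming very broad. Administrative and traffic control services based on big data are already in use, and many more services built on big data will become available. In this paper, we define big data such as sales data from stores frequently visited by people in their twenties and the search and view rankings of online shopping malls, propose a method for analyzing fashion trends among people in their twenties using these data, and suggest optimal product display methods, with the aim of raising sales rates.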
