• Title/Summary/Keyword: big data pattern analysis (빅 데이터 패턴 분석)

Search Result 195, Processing Time 0.026 seconds

Development of integrated management solution through log analysis based on Big Data (빅데이터기반의 로그분석을 통한 통합 관리 솔루션 개발)

  • Kang, Sun-Kyoung; Lee, Hyun-Chang; Shin, Seong-Yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.541-542 / 2017
  • In this paper, we aim to develop an integrated management solution that is easy to operate and unifies complex, heterogeneous cloud environments. Its advantage is that users and administrators can conveniently resolve problems, because the solution collects and analyzes structured and unstructured log data on a big data basis and provides integrated real-time monitoring. Hypervisor log pattern analysis technology should make it possible to manage existing complex and varied cloud environments more efficiently.


SNA Pattern Analysis on the Public Software Industry based on Open API Big Data from Korea Public Procurement Service (조달청 OPEN API 빅데이터를 활용한 공공 소프트웨어 산업의 SNA 패턴 분석)

  • KIM, Sojung lucia; Shim, Seon-Young; Seo, Yong-Won
    • Informatization Policy / v.24 no.3 / pp.42-66 / 2017
  • This study investigated the ecological change of the public software industry by comparing the structure of the industry network before and after the regulation restricting large-company participation in the public software market was applied. For this purpose, we used big data on the software market from the Korea Public Procurement Service and applied SNA (Social Network Analysis), a methodology that has recently been used actively in the social sciences. We also highlighted the contribution of open public data. By analyzing order and contract data of the public software industry for three years, from 2013 to 2015, we found two main things. First, a power-law distribution persisted in the public software industry regardless of the external impact of the regulation. Second, despite this power-law distribution, the structure of the industry changed ecologically from year to year. We present the implications of these findings and discuss the advantage of open public data as the original motivation for this study.

A Development Method of Framework for Collecting, Extracting, and Classifying Social Contents

  • Cho, Eun-Sook
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.163-170 / 2021
  • As big data is used in more industries, the big data market is expanding from hardware to infrastructure software to service software. In particular, it is expanding into a huge platform market that provides applications for holistic and intuitive visualization, covering the interpretation of big data's meaning, understandability, and analysis results. Demand for extracting and analyzing big data from social media such as SNS is very active among individuals as well as companies. However, despite this high demand for collecting and analyzing social media data for user trend analysis and marketing, little research addresses the difficulty of dynamic interlocking and the complexity of building and operating software platforms, both of which stem from the heterogeneity of social media service interfaces. In this paper, we propose a method for developing a framework that handles the process from collection through extraction to classification of social media data. The proposed framework solves the problem of heterogeneous social media data collection channels through the adapter pattern, and improves the accuracy of social topic extraction and classification through semantic association-based extraction techniques and topic association-based classification techniques.
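
The adapter pattern named in this abstract can be illustrated with a minimal Python sketch. It is not the paper's framework: the two client classes, their payload shapes, and every name below are hypothetical, chosen only to show how heterogeneous collection channels can be hidden behind one common interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass
class Post:
    """Common record that every adapter produces."""
    source: str
    author: str
    text: str


class SocialSource(Protocol):
    """Interface the collector depends on."""
    def fetch(self) -> List[Post]: ...


class FeedAClient:
    """Hypothetical client that returns dicts with 'user' and 'body' keys."""
    def timeline(self) -> List[Dict[str, str]]:
        return [{"user": "kim", "body": "big data framework"}]


class FeedBClient:
    """Hypothetical client that returns (author, message) tuples."""
    def recent(self) -> List[Tuple[str, str]]:
        return [("lee", "topic extraction")]


class FeedAAdapter:
    """Adapts FeedAClient to the SocialSource interface."""
    def __init__(self, client: FeedAClient) -> None:
        self._client = client

    def fetch(self) -> List[Post]:
        return [Post("feed_a", d["user"], d["body"]) for d in self._client.timeline()]


class FeedBAdapter:
    """Adapts FeedBClient to the SocialSource interface."""
    def __init__(self, client: FeedBClient) -> None:
        self._client = client

    def fetch(self) -> List[Post]:
        return [Post("feed_b", author, msg) for author, msg in self._client.recent()]


def collect(sources: List[SocialSource]) -> List[Post]:
    """Collect posts from every source without knowing its native format."""
    posts: List[Post] = []
    for source in sources:
        posts.extend(source.fetch())
    return posts
```

Calling `collect([FeedAAdapter(FeedAClient()), FeedBAdapter(FeedBClient())])` yields a single list of `Post` records, so later extraction and classification steps never see the channel-specific formats.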

A dimensional reduction method in cluster analysis for multidimensional data: principal component analysis and factor analysis comparison (다차원 데이터의 군집분석을 위한 차원축소 방법: 주성분분석 및 요인분석 비교)

  • Hong, Jun-Ho; Oh, Min-Ji; Cho, Yong-Been; Lee, Kyung-Hee; Cho, Wan-Sup
    • The Journal of Bigdata / v.5 no.2 / pp.135-143 / 2020
  • This paper proposes a pre-processing method and a dimensional reduction method for shopping cart analysis, where many variables are correlated, when segmenting consumer types in agri-food consumer panel data. Cluster analysis is widely used to divide observations into several clusters in multivariate data. However, clustering after dimensional reduction may be more effective when several variables are related. In this paper, food consumption survey data from 1,987 households were clustered using the K-means method, and 17 variables were re-selected for dividing the data into clusters. Principal component analysis and factor analysis were compared as remedies for multicollinearity and as ways to reduce dimensions for clustering. In this study, both principal component analysis and factor analysis reduced the dataset to two dimensions. Principal component analysis divided the dataset into three clusters, but the differences among the clusters' characteristics did not appear clearly. Under factor analysis, by contrast, the clusters' consumption-pattern characteristics were well distinguished.

Smartphone Usage Data Collection Application and Management Program for Big Data Analysis (빅데이터 분석을 위한 스마트폰 사용 데이터 수집 앱 및 관리 프로그램)

  • Jo, Seong-Min; Oh, Seung-Hyeon; Ahn, Ji-Woo; Lee, Myung-Suk
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.225-228 / 2021
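  • This study aims to develop a smartphone usage app and an administrator web site for various analyses related to smartphone addiction. As the research method, we develop an app that can collect various data, such as smartphone usage time, usage volume, apps used, and the number of screen unlocks, designed so that the variables found to be important in previous studies, 'number of times the screen was turned on' and 'actual usage time - perceived usage time', can be analyzed. The administrator web site will serve as the place where the collected data are stored and analyzed. Data collected by the app are sent to a server, and the management program, which provides visualization and analysis functions, is used for smartphone addiction research. In the future, we will recruit users who consent to the data collection and its purpose of use, collect data, and carry out various analyses related to smartphone usage patterns, data mining, and addiction. We expect this to enable more accurate and effective diagnosis of smartphone addiction and, further, to contribute to research on treatments for smartphone addiction.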


PPFP(Push and Pop Frequent Pattern Mining): A Novel Frequent Pattern Mining Method for Bigdata Frequent Pattern Mining (PPFP(Push and Pop Frequent Pattern Mining): 빅데이터 패턴 분석을 위한 새로운 빈발 패턴 마이닝 방법)

  • Lee, Jung-Hun; Min, Youn-A
    • KIPS Transactions on Software and Data Engineering / v.5 no.12 / pp.623-634 / 2016
  • Most existing frequent pattern mining methods focus on time efficiency and rely heavily on primary memory. However, in the era of big data, the size of real-world databases to be mined is increasing exponentially, so primary memory is no longer sufficient for mining frequent patterns from large real-world data sets. Some studies have proposed disk-based frequent pattern mining methods to improve scalability, but their processing is very time-consuming compared to memory-based methods. In this paper, we present PPFP, a novel disk-based approach for mining frequent itemsets from big data, which reduces the main memory size bottleneck. The PPFP algorithm is based on the FP-growth method, one of the most popular and efficient frequent pattern mining approaches. Mining with PPFP consists of two steps. (1) Constructing an IFP-tree: after constructing an FP-tree, we assign an index number to each node in the FP-tree with a novel index numbering method, and then store the indexed FP-tree (IFP-tree) on disk as an IFP-table. (2) Mining frequent patterns with PPFP: we mine frequent patterns by expanding patterns using a stack-based PUSH-POP method (the PPFP method). With this new approach, which uses only a very small amount of memory for the recursive and time-consuming operations in the mining process, we improved the scalability and time efficiency of frequent pattern mining. The reported test results demonstrate these improvements.
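
The idea of replacing recursion with explicit PUSH and POP operations can be sketched in Python. This is not PPFP itself: the abstract does not specify the IFP-tree index numbering or the on-disk IFP-table, so the sketch swaps in a plain in-memory, depth-first itemset enumeration over transaction-id sets. The function name and data layout are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def mine_frequent_itemsets(
    transactions: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Enumerate frequent itemsets depth-first with an explicit stack."""
    # Map each item to the ids of the transactions that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    # Keep frequent single items in a fixed order so each itemset is built once.
    frequent_items = sorted(i for i, t in tidsets.items() if len(t) >= min_support)

    result: Dict[FrozenSet[str], int] = {}
    # Stack entry: (pattern so far, its transaction ids, index of next item to try).
    stack: List[Tuple[Tuple[str, ...], Set[int], int]] = []
    for idx in range(len(frequent_items) - 1, -1, -1):
        item = frequent_items[idx]
        stack.append(((item,), tidsets[item], idx + 1))  # PUSH

    while stack:
        pattern, tids, start = stack.pop()  # POP
        result[frozenset(pattern)] = len(tids)
        # Expand the pattern only with items that come later in the order.
        for idx in range(len(frequent_items) - 1, start - 1, -1):
            item = frequent_items[idx]
            new_tids = tids & tidsets[item]
            if len(new_tids) >= min_support:
                stack.append((pattern + (item,), new_tids, idx + 1))  # PUSH
    return result
```

Because the pending work lives in an ordinary list rather than on the call stack, the same loop structure could in principle read its patterns from disk-resident tables, which is the direction the abstract describes.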

A MapReduce-Based Workflow BIG-Log Clustering Technique (맵리듀스기반 워크플로우 빅-로그 클러스터링 기법)

  • Jin, Min-Hyuck; Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.20 no.1 / pp.87-96 / 2019
  • In this paper, we propose a MapReduce-supported clustering technique for collecting and classifying distributed workflow enactment event logs as a preprocessing tool. In particular, we call the distributed workflow enactment event logs Workflow BIG-Logs, because they satisfy and fit well the 5V properties of BIG-Data: Volume, Velocity, Variety, Veracity, and Value. The clustering technique we develop in this paper is intentionally devised for the preprocessing phase of a specific workflow process mining and analysis algorithm based on the Workflow BIG-Logs. In other words, it uses the MapReduce framework as a Workflow BIG-Logs processing platform, it supports the IEEE XES standard data format, and it is ultimately dedicated to the preprocessing phase of the ${\rho}$-Algorithm, a typical workflow process mining algorithm based on structured information control nets. More precisely, the Workflow BIG-Logs can be classified into two types, activity-based clustering patterns and performer-based clustering patterns, and we try to implement an activity-based clustering pattern algorithm on the MapReduce framework. Finally, we try to verify the proposed clustering technique by carrying out an experimental study on the workflow enactment event log dataset released by the BPI Challenges.
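
The activity-based and performer-based clustering described in this abstract can be sketched as a map, shuffle, and reduce sequence in plain Python. This is only an illustration of the grouping idea, not the paper's implementation: it does not use a MapReduce cluster or parse IEEE XES files, and the event keys `case`, `activity`, and `performer` are assumed field names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Dict[str, str]


def map_phase(event: Event, key_field: str) -> Tuple[str, Event]:
    """Map step: emit the clustering key together with the event."""
    return event[key_field], event


def shuffle(pairs: Iterable[Tuple[str, Event]]) -> Dict[str, List[Event]]:
    """Shuffle step: group emitted events by key."""
    groups: Dict[str, List[Event]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, events: List[Event]) -> Tuple[str, List[str]]:
    """Reduce step: summarize each cluster by its sorted case ids."""
    return key, sorted(e["case"] for e in events)


def cluster_logs(events: List[Event], key_field: str = "activity") -> Dict[str, List[str]]:
    """Cluster events by activity (default) or by another field such as performer."""
    grouped = shuffle(map_phase(e, key_field) for e in events)
    return dict(reduce_phase(k, v) for k, v in grouped.items())
```

Passing `key_field="performer"` switches the same pipeline to the performer-based clustering pattern.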

The Effect of Data 3 on the Utilization of Medical Big Data for Early Detection of Dementia (데이터 3법이 치매 조기 예측을 위한 의료 빅데이터 활용에 미치는 영향 연구)

  • Kim, Hyejin
    • Journal of Digital Convergence / v.18 no.5 / pp.305-315 / 2020
  • As the incidence and prevalence of dementia increase with our aging population, so does the burden on society, which calls for special emphasis on the need for early diagnosis. Efforts are therefore being made to prevent dementia and detect it early, but with current diagnostic measures these efforts appear futile. As a solution, it is crucial to integrate and standardize healthcare big data and the analysis of each index. To increase the use of large databases, the Korea National Assembly passed the Data 3 Act, which focuses on open access to and sharing of databases, but follow-up legislation is needed for safer utilization. In this study, we identified a number of foreign policies through a review of prior research on the topic, leading to specific enforcement ordinances tailored to the Data 3 Act for safe access to and utilization of databases. We also aimed to establish a secure process for data collection and disposal, as well as governance at the national level, to ensure safe utilization of healthcare big data.

A Study on the Comparison Analysis of Travel Agencies using Social Big Data (소셜 빅 데이터를 이용한 여행사 비교 분석에 관한 연구)

  • Song, Eun-Jee; Kong, Hyou-Soon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.771-772 / 2015
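  • As a way to analyze and investigate in real time what customers say on social media, a system called buzz monitoring is used to automatically search for and collect various information on the web. In this paper, we propose an analysis model that uses big data on social media to enable more accurate and efficient collection and analysis of information about travel agencies, and we apply it to a comparative analysis of actual domestic travel agencies. We first analyze awareness, image, and preference for each travel agency, and then, along with an analysis of tourism-related products and services, we analyze consumers' lifestyle patterns, such as the purpose of travel and travel companions, as a consumer analysis. In addition, examining trends among travel-agency-related influencers on Twitter, we could categorize them into people who had used the travel agency, media outlets providing related news, and users interested in events.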


A Study on the Machine Learning Model for Product Faulty Prediction in Internet of Things Environment (사물인터넷 환경에서 제품 불량 예측을 위한 기계 학습 모델에 관한 연구)

  • Ku, Jin-Hee
    • Journal of Convergence for Information Technology / v.7 no.1 / pp.55-60 / 2017
  • To provide intelligent services without human intervention in the Internet of Things environment, it is necessary to analyze the big data generated by IoT devices, learn the normal pattern, and predict abnormal symptoms such as faults or malfunctions based on the learned normal pattern. The purpose of this study is to implement a machine learning model that can predict product failure by analyzing big data generated by the various devices in the production process. The machine learning model uses the big data analysis tool R because the analysis must be based on a large volume of existing data. The data collected in the production process include information about product faults, so a supervised learning model is used. As a result of the study, I classified the variables and variable conditions affecting product failure and proposed a prediction model for product failure based on a decision tree. In addition, the conformity and performance evaluation of the model using the ROC curve showed that its predictive power was significantly high.
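
The ROC-based evaluation mentioned in this abstract is commonly summarized by the area under the curve (AUC). The study itself used R; the Python sketch below is only an illustration of how AUC can be computed from fault labels and model scores, using the standard equivalence between AUC and the fraction of (faulty, normal) pairs that the model ranks correctly. The function name and the sample values in the usage note are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for a faulty product, 0 for a normal one.
    scores: model output, where a higher score means more likely faulty.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # faulty item ranked above normal item
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

For example, `roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])` returns 0.75, because three of the four faulty/normal pairs are ranked correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.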