• Title/Summary/Keyword: bigdata analysis


Fintech Industry Invigoration by the De-identification and Linkage Reform of Personal Information (개인정보 비식별 조치와 결합 개선을 통한 핀테크 시장 활성화)

  • Oh, Won-Gyeom;Park, Dea-woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2016.10a
    • /
    • pp.340-343
    • /
    • 2016
  • In June 2016, the Korean government published a guideline on the de-identification of personal information, prepared jointly by the relevant ministries. The guideline aims to invigorate the Korean big data industry while protecting personal information under current law. However, any unreasonable method or process in the guideline can become an obstacle to big data analysis. This article reviews the guideline to identify defects in its methods and processes for de-identification evaluation, de-identification support, and data linkage, and proposes solutions to address them. Finally, it discusses how these solutions can invigorate the fintech industry.
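To make "de-identification evaluation" concrete, the following is a minimal sketch of one widely used check, k-anonymity. The abstract does not say which measure the guideline prescribes, so the choice of k-anonymity, the function name, the column names, and the sample records are all assumptions made for illustration.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values for every quasi-identifier column."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


# Hypothetical de-identified financial records (not from the article).
records = [
    {"age_band": "30-39", "region": "Seoul", "balance": 120},
    {"age_band": "30-39", "region": "Seoul", "balance": 80},
    {"age_band": "40-49", "region": "Busan", "balance": 300},
    {"age_band": "40-49", "region": "Busan", "balance": 150},
    {"age_band": "40-49", "region": "Busan", "balance": 90},
]

k = k_anonymity(records, ["age_band", "region"])
print(k)  # 2: the smallest group (30-39, Seoul) holds two records
```

A data set would pass such a check only if `k` meets whatever threshold the evaluator requires; that threshold is a policy decision and is not stated in the abstract.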


Big Data Governance Model for Smart Water Management (스마트 물관리를 위한 빅데이터 거버넌스 모델)

  • Choi, Young-Hwan;Cho, Wan-Sup;Lee, Kyung-Hee
    • The Journal of Bigdata
    • /
    • v.3 no.2
    • /
    • pp.1-10
    • /
    • 2018
  • In the field of smart water management, there is an increasing demand for strengthening competitiveness through big data analysis. As a result, the systematic management (governance) of big data is becoming an important issue. Big data governance is a systematic approach to evaluating, directing, and monitoring data management, including data quality assurance, privacy protection, data lifetime management, and the clarification of data ownership and management rights. Without big data governance, low-quality data may be used for critical decisions, leading to serious problems. In addition, mishandled personal data can make Big Brother concerns a reality, and IT costs can skyrocket when data lifetime management is neglected. Even if these technical problems are fixed, the benefits of big data will not be sustained unless there are organizations and personnel dedicated to and responsible for data-related issues. In this paper, we propose a method of building data governance for big-data-based smart water data management.

A Study on Predictive Modeling of Public Data: Survival of Fried Chicken Restaurants in Seoul (서울 치킨집 폐업 예측 모형 개발 연구)

  • Bang, Junah;Son, Kwangmin;Lee, So Jung Ashley;Lee, Hyeongeun;Jo, Subin
    • The Journal of Bigdata
    • /
    • v.3 no.2
    • /
    • pp.35-49
    • /
    • 2018
  • It may seem surprising that fried chicken, often regarded as American soul food, has one of its biggest markets in South Korea. Yet South Korea has more fried chicken restaurants than McDonald's has outlets worldwide [4]. Needless to say, not all of these businesses survive in such a small country. In this study, we propose a predictive model that could support the decision of whether to open a store. We extracted all fried chicken restaurants registered with the Korean Ministry of the Interior and Safety and collected a number of features that appear relevant to a store's closure. After comparing the results of different algorithms, we conclude that FDA (Flexible Discriminant Analysis) best predicts a store's survival. Although the neural network showed the highest prediction rate, FDA showed better-balanced performance in terms of sensitivity and specificity.
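The contrast between the highest prediction rate and balanced sensitivity and specificity can be illustrated with a small sketch. The labels and both sets of predictions below are invented for illustration and are not the paper's results; the function names are likewise hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels: 1 = store closed, 0 = store survived.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # misses one of the two closures
model_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # catches both, with two false alarms

for name, pred in (("A", model_a), ("B", model_b)):
    sens, spec = sensitivity_specificity(y_true, pred)
    print(name, accuracy(y_true, pred), sens, spec)
```

In this toy data, model A has the higher accuracy but detects only half of the closures, whereas model B trades some accuracy for full sensitivity. Which trade-off is preferable depends on the cost of each kind of error.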

Design and Implementation of Deep Learning Models for Predicting Energy Usage by Device per Household (가구당 기기별 에너지 사용량 예측을 위한 딥러닝 모델의 설계 및 구현)

  • Lee, JuHui;Lee, KangYoon
    • The Journal of Bigdata
    • /
    • v.6 no.1
    • /
    • pp.127-132
    • /
    • 2021
  • Korea is both a resource-poor country and an energy-consuming country. In addition, its use of and dependence on electricity are very high, and more than 20% of total energy use is consumed in buildings. With deep learning and machine learning research active, various algorithms are being applied to energy efficiency, and building energy management systems (BEMS) are increasingly being introduced for efficient energy management. In this paper, we construct a database of energy usage by device per household, collected directly using smart plugs. We also implement algorithms that analyze and predict the collected data using RNN and LSTM models. In the future, this data can be applied to analyzing power consumption patterns beyond predicting energy consumption. This can help improve energy efficiency and is expected to support effective management of power usage through prediction of future data.
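Sequence models such as RNNs and LSTMs are usually trained on fixed-length input windows paired with a later target value. The sketch below shows that preprocessing step only; the function name, window length, and readings are assumptions, and the abstract does not describe the authors' actual pipeline.

```python
def make_windows(series, window, horizon=1):
    """Split a 1-D series into (input window, target) pairs.

    Each input holds `window` consecutive readings; the target is the
    reading `horizon` steps after the end of that window.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = series[start:start + window]
        y = series[start + window + horizon - 1]
        samples.append((x, y))
    return samples


# Hypothetical hourly smart-plug readings for one device, in kWh.
usage = [0.5, 0.7, 0.6, 0.9, 1.1, 1.0]
pairs = make_windows(usage, window=3)

for x, y in pairs:
    print(x, "->", y)
```

Each pair would then be fed to the model as one training example. A series shorter than `window + horizon` yields no samples.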

Prevent and Track the Spread of Highly Pathogenic Avian Influenza Virus using Big Data (빅데이터를 활용한 HPAI Virus 확산 예방 및 추적)

  • Choi, Dae-Woo;Lee, Won-Been;Song, Yu-Han;Kang, Tae-Hun;Han, Ye-Ji
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.145-153
    • /
    • 2020
  • This study was funded by the government (Ministry of Agriculture, Food and Rural Affairs) in 2018 with support from the Agricultural, Food, and Rural Affairs Agency (318069-03-HD040), and is based on research on artificial intelligence-based HPAI spread analysis and patterning. Highly Pathogenic Avian Influenza (HPAI) enters the country from abroad via migratory birds, but exactly how it spreads to farms is unclear. Vehicles are assumed to be the main cause of the spread, but this has not been confirmed. Because the vehicles that visit farms most frequently, such as livestock and feed transport vehicles, travel between farms and facilities, it is necessary to analyze the relationship between vehicles and facilities at the farms where outbreaks occur. In this paper, based on Korea Animal Health Integrated System (KAHIS) data provided by the Animal and Plant Quarantine Agency, we seek to confirm the main cause of HPAI virus transmission between vehicles and facilities.
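One simple way to relate vehicles, farms, and facilities is to count how many distinct vehicles visited each pair of sites. The sketch below assumes visit records of the form (vehicle, site); the record layout, identifiers, and function name are hypothetical and do not reflect the actual KAHIS schema or the authors' method.

```python
from collections import defaultdict
from itertools import combinations


def shared_vehicle_links(visits):
    """Count, for every pair of sites, the distinct vehicles that visited both.

    `visits` is an iterable of (vehicle_id, site_id) tuples.
    """
    sites_by_vehicle = defaultdict(set)
    for vehicle, site in visits:
        sites_by_vehicle[vehicle].add(site)

    links = defaultdict(int)
    for sites in sites_by_vehicle.values():
        # Sorting makes each pair appear in one canonical order.
        for a, b in combinations(sorted(sites), 2):
            links[(a, b)] += 1
    return dict(links)


# Hypothetical visit log (not real KAHIS data).
visits = [
    ("feed_truck_1", "farm_A"),
    ("feed_truck_1", "feed_mill"),
    ("feed_truck_1", "farm_B"),
    ("livestock_truck_7", "farm_A"),
    ("livestock_truck_7", "slaughterhouse"),
    ("livestock_truck_9", "farm_A"),
    ("livestock_truck_9", "feed_mill"),
]

links = shared_vehicle_links(visits)
print(links[("farm_A", "feed_mill")])  # 2
```

Pairs with many shared vehicles are candidates for closer inspection. The counts alone do not establish transmission, since they ignore visit order and timing.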

Demand Prediction of Furniture Component Order Using Deep Learning Techniques (딥러닝 기법을 활용한 가구 부자재 주문 수요예측)

  • Kim, Jae-Sung;Yang, Yeo-Jin;Oh, Min-Ji;Lee, Sung-Woong;Kwon, Sun-dong;Cho, Wan-Sup
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.111-120
    • /
    • 2020
  • Despite the recent economic contraction caused by the COVID-19 pandemic, interest in the residential environment is growing as more people stay at home due to the increase in telecommuting, thereby increasing demand for remodeling. In addition, the government's real estate policy is expected to have a visible impact on sales in the interior and furniture industries as it shifts from regulation to the expansion of housing supply. Accurate demand forecasting is directly related to inventory management, and a good forecast can reduce the logistics and inventory costs of overproduction by eliminating unnecessary inventory. However, predicting demand accurately is difficult because external factors such as constantly changing economic trends, market trends, and social issues must be taken into account. In this study, an LSTM model and a 1D-CNN model are compared using AI-based time series analysis to produce reliable forecasts for manufacturers of furniture components.

A Securities Company's Customer Churn Prediction Model and Causal Inference with SHAP Value (증권 금융 상품 거래 고객의 이탈 예측 및 원인 추론)

  • Na, Kwangtek;Lee, Jinyoung;Kim, Eunchan;Lee, Hyochan
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.215-229
    • /
    • 2020
  • Interest in machine learning is growing in all industries, but it is difficult to apply to real-world tasks because its models are hard to explain. This paper presents a case of developing a customer churn prediction model for a securities company, together with the results of an attempt to build an explainable machine learning model using the SHAP Value methodology. In this study, a total of six customer churn models are compared, and the causes of customer churn are inferred by classifying and analyzing SHAP Values and types of customer asset change. The results could serve as a basis for comprehensive decision-making: marketing managers could use churn predictions whose causes can be inferred in their actual customer marketing, and establish a targeted marketing strategy for each customer.
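SHAP builds on Shapley values from cooperative game theory, which split a prediction among the features that produced it. The sketch below computes exact Shapley values by enumerating feature subsets for a toy scoring function. The feature names and the scoring rule are invented for illustration, and this brute-force enumeration is not how the SHAP library or the paper's models work internally.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley value of each feature for a set-valued function.

    `value_fn` takes a frozenset of present features and returns a score.
    The cost grows exponentially with the number of features.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def churn_score(present):
    """Hypothetical churn score with one interaction term."""
    score = 0.10  # baseline score with no features present
    if "asset_drop" in present:
        score += 0.30
    if "low_logins" in present:
        score += 0.20
    if "asset_drop" in present and "low_logins" in present:
        score += 0.10  # interaction, shared equally between the two features
    return score


features = ["asset_drop", "low_logins", "age"]
phi = shapley_values(churn_score, features)
print({name: round(value, 2) for name, value in phi.items()})
```

The attributions sum to the difference between the score with all features present and the baseline, and a feature the score never uses (`age` here) receives zero.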

Prediction of Agricultural Purchases Using Structured and Unstructured Data: Focusing on Paprika (정형 및 비정형 데이터를 이용한 농산물 구매량 예측: 파프리카를 중심으로)

  • Somakhamixay Oui;Kyung-Hee Lee;HyungChul Rah;Eun-Seon Choi;Wan-Sup Cho
    • The Journal of Bigdata
    • /
    • v.6 no.2
    • /
    • pp.169-179
    • /
    • 2021
  • Consumers' food consumption behavior is likely to be affected not only by structured data such as consumer panel data but also by unstructured data such as mass media and social media. In this study, a deep learning-based consumption prediction model is built and validated on a fused data set linking structured and unstructured data related to food consumption. The results showed that model accuracy improved when structured and unstructured data were combined, and that unstructured data improved model predictability. Using the SHAP technique to identify variable importance, we found that variables related to blog and video data ranked among the most important and were positively correlated with the amount of paprika purchased. In addition, the experimental results confirmed that the machine learning model showed higher accuracy than the deep learning model and could be an efficient alternative to existing time series analysis modeling.

A Study on the Applicability of Safety Performance Indicators using the Density-Based Ship Domain (밀도기반 선박 도메인을 이용한 안전 성능 지표 활용성 연구)

  • Yeong-Jae Han;Sunghyun Sim;Hyerim Bae
    • The Journal of Bigdata
    • /
    • v.7 no.1
    • /
    • pp.89-97
    • /
    • 2022
  • Various efforts are needed to prevent accidents because ship collisions can cause serious harm, including economic losses and casualties. Research on accident prevention is therefore being actively conducted, and in this study, new leading indicators for preventing ship collision accidents are proposed. In previous studies, the risk of collision was expressed in terms of the distance between ships in a specific sea area, but this has the disadvantage that a new model must be developed before it can be applied to other sea areas. In this study, the density-based ship domain DESD (Density-based Empirical Ship Domain), which reflects the environment and operating characteristics of the sea area, was defined using AIS (Automatic Identification System) data, which is ship operation information. Deep clustering is applied to the two-dimensional DESDs created for each sea area to cluster seas with similar operating environments. By analyzing the relationship between the clustered sea areas and ship collision accidents, statistical tests showed that the occurrence of accidents varies with the characteristics of each sea area, demonstrating that DESD can be used as a leading indicator of accidents.
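The idea of a two-dimensional, density-based domain can be sketched by binning the positions of nearby ships relative to an own ship into a grid. This is only a rough illustration under strong assumptions: positions are already in metres on a flat plane, heading is ignored, and the function name, cell size, and coordinates are hypothetical. It is not the authors' DESD definition.

```python
from math import floor


def relative_density(own_positions, neighbor_positions, cell=100.0, half_cells=2):
    """Count neighbouring ships per grid cell, centred on the own ship.

    `own_positions[i]` is the own ship's (x, y) at time step i, and
    `neighbor_positions[i]` lists the (x, y) of other ships at that step.
    The grid has (2 * half_cells + 1) cells per side; the centre cell
    covers offsets in [-cell / 2, cell / 2).
    """
    size = 2 * half_cells + 1
    grid = [[0] * size for _ in range(size)]
    for (ox, oy), neighbors in zip(own_positions, neighbor_positions):
        for nx, ny in neighbors:
            col = floor((nx - ox + cell / 2) / cell) + half_cells
            row = floor((ny - oy + cell / 2) / cell) + half_cells
            if 0 <= row < size and 0 <= col < size:
                grid[row][col] += 1  # neighbours outside the grid are ignored
    return grid


# Hypothetical snapshot: one own ship at the origin and four neighbours.
own = [(0.0, 0.0)]
neighbors = [[(150.0, 0.0), (-120.0, 30.0), (0.0, 210.0), (600.0, 0.0)]]
grid = relative_density(own, neighbors)

for row in grid:
    print(row)
```

Accumulated over many time steps, cells near the centre that stay empty suggest the space ships keep clear around themselves in that sea area.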

Prediction of OPS(On-base Plus Slugging) in KBO League (한국프로야구에서 장타율과 출루율(OPS) 예측 연구)

  • Dong Yun Shin;Jinho Kim
    • The Journal of Bigdata
    • /
    • v.7 no.1
    • /
    • pp.49-61
    • /
    • 2022
  • In sports, data analysis plays a growing role in team management, including team strategy planning and marketing. In the KBO (Korea Baseball Organization) league in particular, at the end of each season teams make plans for recruiting and developing players and devise strategies for the next year, such as free-agent (FA) signings and trades. For these reasons, it is very important to predict players' performance for the next year. This study is limited to batters and examines how to predict whether a player's performance will improve the following year. OPS (On-base Plus Slugging), which is easy to calculate and closely related to team scoring, was used as the reference statistic for rises and falls. Forty years of regular-season data from 1982 to 2021 were used, and 11 machine learning classification models were tested. In predicting the rise and fall of OPS, RBF SVM, Neural Net, Gaussian Process, and AdaBoost were more accurate than the other classification models, and age did not significantly affect accuracy.
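OPS is the sum of on-base percentage and slugging percentage, both of which follow from standard batting counts. The sketch below computes OPS with the usual formulas and derives a rise-or-fall label between two seasons. The function names and the season lines are invented for illustration, and the labeling rule is an assumption about how such a target could be built, not the paper's exact procedure.

```python
def ops(ab, h, doubles, triples, hr, bb, hbp, sf):
    """On-base Plus Slugging from season batting counts.

    OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    SLG = total bases / AB
    """
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    obp_denominator = ab + bb + hbp + sf
    obp = (h + bb + hbp) / obp_denominator if obp_denominator else 0.0
    slg = total_bases / ab if ab else 0.0
    return obp + slg


def ops_rose(current_ops, next_ops):
    """Binary target: True if OPS rose in the following season."""
    return next_ops > current_ops


# Hypothetical season lines for one batter (not real KBO records).
this_year = ops(ab=500, h=150, doubles=30, triples=2, hr=20, bb=60, hbp=5, sf=5)
next_year = ops(ab=480, h=130, doubles=25, triples=1, hr=15, bb=50, hbp=4, sf=6)

print(round(this_year, 3), round(next_year, 3), ops_rose(this_year, next_year))
```

A classifier would then be trained to predict the `ops_rose` label from the current season's statistics.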