Fishery R&D Big Data Platform and Metadata Management Strategy (수산과학 빅데이터 플랫폼 구축과 메타 데이터 관리방안)

  • Kim, Jae-Sung;Choi, Youngjin;Han, Myeong-Soo;Hwang, Jae-Dong;Cho, Wan-Sup
    • The Journal of Bigdata, v.4 no.2, pp.93-103, 2019
  • In this paper, we introduce a big data platform and a metadata management technique for fishery science R&D information. The platform collects and integrates various types of fisheries science R&D information, and we suggest building it in the form of a data lake. In addition to the existing data collected and accumulated in the field of fisheries science, we propose collecting unstructured big data such as satellite image data, research reports, and research data so that the platform supports diverse analyses. Furthermore, collecting and managing metadata during data extraction, preprocessing, and storage makes systematic management of fisheries science big data possible. By establishing metadata in a standard form alongside the construction of the platform, this work suggests a systematic and continuous big data management method throughout the data lifecycle, including data collection, storage, utilization, and distribution.

A Study on Big Data Anti-Money Laundering Systems Design through A Bank's Case Analysis (A 은행 사례 분석을 통한 빅데이터 기반 자금세탁방지 시스템 설계)

  • Kim, Sang-Wan;Hahm, Yu-Kun
    • The Journal of Bigdata, v.1 no.1, pp.85-94, 2016
  • Traditional Anti-Money Laundering (AML) software applications monitor bank customer transactions on a daily basis, using customer historical information and account profile data to provide a "whole picture" to bank management. With the advent of Big Data, these applications could benefit from the size, variety, and speed of unstructured data, which have not been used in AML applications before. This study analyzes the weaknesses of a bank's current AML systems and proposes an AML system that takes advantage of Big Data. For example, early warning of AML risk can be improved by exposing identities and uncovering hidden relationships through predictive and entity analytics on real-time and outside data such as SNS data.

A Study on Sensor Data Analysis and Product Defect Improvement for Smart Factory (스마트 팩토리를 위한 센서 데이터 분석과 제품 불량 개선 연구)

  • Hwang, Sewong;Kim, Jonghyuk;Hwangbo, Hyunwoo
    • The Journal of Bigdata, v.3 no.1, pp.95-103, 2018
  • In recent years, as ICT has developed, many people in the manufacturing field have sought to increase efficiency by analyzing the data generated during the manufacturing process. In this study, we propose a data mining-based manufacturing process using a decision tree algorithm (CHAID) as part of a smart factory. We used 432 sensor data from an actual manufacturing plant, collected over about 5 months, to find the variables that differ significantly between the stable process period with a low defect rate and the unstable process period with a high defect rate. We set a stable value range for each selected final variable to determine whether it actually affects defect rate improvement. In addition, we measured the improvement in the defect rate by adjusting the process set-point so that the sensor did not deviate from the stable value range during a 14-day process. Through this, we expect to provide empirical guidelines for improving the defect rate by utilizing and analyzing the process sensor data generated in the manufacturing industry.
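
The variable screening described above can be illustrated with the chi-square test that CHAID relies on when choosing split variables. The following is a minimal sketch, not the authors' procedure: it shows only the chi-square screening step (full CHAID also merges categories and adjusts p-values), and the sensor names and counts are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Chi-square critical value for 1 degree of freedom at alpha = 0.05
CRITICAL_DF1_05 = 3.841

# Hypothetical counts: rows = sensor reading low/high,
# columns = stable period / unstable period
sensor_tables = {
    "temperature": [[40, 10], [10, 40]],
    "pressure": [[26, 24], [24, 26]],
}

# Score every sensor and keep those that separate the two periods
scores = {name: chi_square(table) for name, table in sensor_tables.items()}
selected = [name for name, score in scores.items() if score > CRITICAL_DF1_05]
```

In this toy data only the hypothetical temperature sensor exceeds the critical value, so it would be the candidate for defining a stable value range.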

A Direction of Politic Support for Infectious Disease in Busan Using Time-series Clustering: Focusing on COVID-19 Cases (시계열 군집을 활용한 부산시 감염병 지원 정책 방향: COVID-19 사례를 중심으로)

  • Kwun, Hyeon-Ho;Kim, Do-Hee;Park, Chan-Ho;Lee, Eun-Ju;Cho, KiHaing;Bae, Hye-Rim
    • The Journal of Bigdata, v.5 no.1, pp.125-138, 2020
  • After the spread of COVID-19 in 2020, the country's Crisis Alert Level rose to the highest level, Level 4. In response to the COVID-19 pandemic, the government and cities took on responsibility for the citizens of each province. The government provided health assistance first and then stepped forward to supply the resources citizens needed. Busan City likewise proposed a policy response to prepare and implement COVID-19 support for each district. With its high proportion of self-employed business owners, many lost their basic income, and the effect varied by industry. In this paper, to help avoid a crisis in such an epidemic, we propose a clustering analysis to guide policy support for Busan City. By analyzing and clustering patterns across districts and sectors, we aim to provide reference material for determining the direction of support and guiding a preemptive response in the event of a similar epidemic.

An Implementation of Federated Learning based on Blockchain (블록체인 기반의 연합학습 구현)

  • Park, June Beom;Park, Jong Sou
    • The Journal of Bigdata, v.5 no.1, pp.89-96, 2020
  • Deep learning using artificial neural networks has recently been researched and developed in various fields such as image recognition, big data, and data analysis. Federated learning emerged to address data privacy invasion and the growing cost and time required for training. It offers the benefits of a distributed processing system while solving problems of existing deep learning, but problems remain with the server-client system and with the motivation for providing training data. We therefore replaced the role of the server in federated learning with a blockchain system and conducted research to solve the privacy and security problems associated with federated learning. In addition, we implemented a blockchain-based system that motivates users by paying compensation for the data they provide and that requires lower maintenance costs while maintaining the same accuracy as existing learning. In this paper, we present experimental results to show the validity of the blockchain-based system and compare the results of existing federated learning with those of blockchain-based federated learning. Finally, as future work, we conclude the paper by presenting solutions to security problems and applicable business fields.
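
To make the idea concrete, the sketch below combines federated averaging with a hash-chained log of each round's global weight. It is a toy stand-in under stated assumptions, not the authors' system: the one-parameter model, the client datasets, and the helper names (`local_update`, `fed_avg`, `append_block`) are hypothetical, and the hash chain only imitates the tamper-evident ordering of a blockchain without consensus or compensation logic.

```python
import hashlib
import json

def local_update(w, data, lr=0.1, epochs=20):
    """One client's training: gradient descent on y = w * x with squared error."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def fed_avg(updates):
    """Average client weights, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total

def append_block(chain, payload):
    """Append a record whose hash covers the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    chain.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
    })

# Hypothetical private datasets, all generated from y = 2x;
# raw data never leaves a client, only the trained weight does
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (3.0, 6.0)],
]

global_w, chain = 0.0, []
for round_no in range(5):
    updates = [(local_update(global_w, data), len(data)) for data in clients]
    global_w = fed_avg(updates)
    append_block(chain, {"round": round_no, "weight": global_w})
```

Because each block's hash depends on the previous one, altering an earlier round's recorded weight would invalidate every later hash, which is the property that lets a ledger take over the bookkeeping role of a central server.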

Clustering Foursquare Users' Collective Activities: A Case of Seoul (포스퀘어 사용자의 집단적 활동 군집화: 서울시 사례)

  • Seo, Il-Jung;Cho, Jae-Hee
    • The Journal of Bigdata, v.5 no.1, pp.55-63, 2020
  • This study proposes an approach to clustering the collective activities of location-based social network users, using check-in data from Foursquare users in Seoul. To cluster the collective activities, we generated sequential rules of the activities using sequential rule mining and then constructed activity networks based on the rules. We analyzed the activity networks to identify network structure and hub activities, and clustered the activities within the networks. Unlike previous studies that analyzed the activity transition patterns of location-based social network users, this study focuses on analyzing the structure and clusters of successive activities. The hubs and clusters of activities found with the proposed approach can be used for location-based services and marketing. They could also be used in the public sector, for example in infection prevention and urban policy.
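
The pipeline from sequential rules to an activity network can be sketched as follows. This is a simplified illustration rather than the authors' method: it only counts consecutive activity pairs instead of running a full sequential rule mining algorithm, and the check-in sequences, thresholds, and function names are hypothetical.

```python
from collections import Counter, defaultdict

def mine_rules(sequences, min_support=2, min_confidence=0.5):
    """Keep rules a -> b (b directly follows a) that pass support and confidence."""
    pair_counts = Counter()
    antecedent_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            antecedent_counts[a] += 1
    rules = {}
    for (a, b), count in pair_counts.items():
        confidence = count / antecedent_counts[a]
        if count >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules

def build_network(rules):
    """Directed activity network: one edge per surviving rule."""
    graph = defaultdict(set)
    for a, b in rules:
        graph[a].add(b)
    return graph

def hub_activity(graph):
    """Activity with the highest total degree (in-degree plus out-degree)."""
    degree = Counter()
    for a, targets in graph.items():
        degree[a] += len(targets)
        for b in targets:
            degree[b] += 1
    return degree.most_common(1)[0][0]

# Hypothetical check-in sequences, one per user
sequences = [
    ["cafe", "office", "restaurant", "bar"],
    ["cafe", "office", "restaurant"],
    ["gym", "restaurant", "bar"],
    ["cafe", "office", "gym", "restaurant"],
]

rules = mine_rules(sequences)
network = build_network(rules)
hub = hub_activity(network)
```

Clustering would then operate on `network`, for example with a community detection method, which is omitted here.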

Prediction of Highy Pathogenic Avian Influenza(HPAI) Diffusion Path Using LSTM (LSTM을 활용한 고위험성 조류인플루엔자(HPAI) 확산 경로 예측)

  • Choi, Dae-Woo;Lee, Won-Been;Song, Yu-Han;Kang, Tae-Hun;Han, Ye-Ji
    • The Journal of Bigdata, v.5 no.1, pp.1-9, 2020
  • This study was conducted in 2018 with funding from the government (Ministry of Agriculture, Food and Rural Affairs) and with support from the Agricultural, Food, and Rural Affairs Agency (318069-03-HD040), and is based on artificial intelligence-based HPAI spread analysis and patterning. A model that has recently been actively used in time series analysis and text mining is the LSTM (Long Short-Term Memory) model, which uses a deep learning structure. The LSTM model emerged to resolve the Long-Term Dependency Problem that occurs during the Backpropagation Through Time (BPTT) process of RNNs. LSTM models handle forecasting with variable sequence data very well and are still widely used. In this study, we used the Call Detail Record (CDR) data provided by KT to identify the migration paths of people who are expected to be closely related to the virus. We present the results of predicting movement paths by training the LSTM model on the paths of these people. The results of this study could be used to predict the route of HPAI propagation, to select routes or areas on which to focus quarantine, and to reduce the spread of HPAI.

A Study on Big Data Maturity Assessment Framework for Corporate Data Strategy and Investment (기업 데이터 전략과 투자를 위한 빅데이터 성숙도 평가 프레임워크 실증 연구)

  • Kim, Okki;Park, Jung;Cho, Wan-Sup
    • The Journal of Bigdata, v.6 no.1, pp.13-22, 2021
  • The purpose of this study is to develop and demonstrate a framework for evaluating big data maturity so that companies can establish effective data strategies and invest efficiently. By addressing the shortcomings of the evaluations developed so far, we developed a framework that evaluates the maturity of a company's big data in an integrated process. As a result, four evaluation areas ('Vision and Strategy', 'Management', 'Analysis', and 'Utilization'), the assessment items for each area, their detailed content, and the criteria for each stage were derived. The framework was verified through a survey of entrepreneurs, and the big data maturity level of domestic companies was confirmed. As future research directions, we propose developing detailed assessment factors according to the characteristics of each industry, developing a data utilization framework according to the assessment results, and improving validity and reliability by adjusting the verification targets.

Development of Smart City IoT Data Quality Indicators and Prioritization Focusing on Structured Sensing Data (스마트시티 IoT 품질 지표 개발 및 우선순위 도출)

  • Yang, Hyun-Mo;Han, Kyu-Bo;Lee, Jung Hoon
    • The Journal of Bigdata, v.6 no.1, pp.161-178, 2021
  • The importance of 'Big Data' is increasing to the point that it is likened to '21st century crude oil'. For smart city IoT data, attention should be paid to quality control because the quality of the data is tied to the quality of public services. However, the data quality indicators presented by ISO/IEC and by domestic and foreign organizations are limited to the 'User' perspective. To address this limitation, the study derives supplier-centric indicators and their priorities. After deriving 3 categories and 13 indicators for supplier-oriented smart city IoT data quality evaluation, we derived the priorities of the indicator categories and the data quality indicators through AHP analysis and investigated the feasibility of each indicator. The study can contribute to improving sensor data quality by presenting the basic requirements that data should meet to the individuals or companies performing the task. Furthermore, data quality control can be performed according to the indicator priorities to make quality control tasks more efficient.
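
The AHP prioritization step can be illustrated with a small worked example. The sketch below derives priority weights from a pairwise comparison matrix using the row geometric mean approximation and checks Saaty's consistency ratio. The three category names and the judgment values are hypothetical and are not the categories or survey results from the paper.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights via the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(matrix, weights):
    """Saaty's consistency ratio; values below about 0.1 are usually accepted."""
    n = len(matrix)
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # Saaty's random index for small n
    return consistency_index / random_index[n]

# Hypothetical categories and pairwise judgments on Saaty's 1-9 scale;
# pairwise[i][j] says how much more important category i is than category j
categories = ["accuracy", "timeliness", "completeness"]
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
ranking = sorted(zip(categories, weights), key=lambda pair: pair[1], reverse=True)
```

With these judgments the first hypothetical category receives the largest weight, and the low consistency ratio indicates the pairwise judgments do not contradict each other.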

Risk Prediction and Analysis of Building Fires -Based on Property Damage and Occurrence of Fires- (건물별 화재 위험도 예측 및 분석: 재산 피해액과 화재 발생 여부를 바탕으로)

  • Lee, Ina;Oh, Hyung-Rok;Lee, Zoonky
    • The Journal of Bigdata, v.6 no.1, pp.133-144, 2021
  • This paper derives the fire risk of buildings in Seoul by predicting property damage and the occurrence of fires. This study differs from prior research in that it uses variables covering not only a building's characteristics but also its administrative area and the accessibility of nearby fire-fighting facilities. We use Ensemble Voting techniques to merge different machine learning algorithms to predict property damage and fire occurrence, and we extract feature importance to produce a fire risk score. The established model was used to predict fire risk for 300 buildings in Seoul. Buildings at fire risk Level 1 housed a large number of households and had many factors that could increase the size of a fire, including a lack of nearby fire-fighting facilities and a distant 119 Safety Center. In contrast, Level 5 buildings had a large number of buildings and businesses, but the responsible 119 Safety Center was located closest to the building and could respond properly to a fire.