• Title/Summary/Keyword: Big data based modeling


Big Data Analytics of Construction Safety Incidents Using Text Mining (텍스트 마이닝을 활용한 건설안전사고 빅데이터 분석)

  • Jeong Uk Seo;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence / v.27 no.3 / pp.581-590 / 2024
  • This study extracts key topics through text mining of incident records (incident history, post-incident measures, preventive measures) from construction safety accident case data available on the public data portal. It also seeks to provide fundamental insights for establishing disaster prevention manuals by identifying correlations between these topics. After pre-processing the input data, we used an LDA-based topic modeling technique to derive the main topics. We obtained five topics related to incident history, and four topics each related to post-incident measures and preventive measures. Although no dominant patterns emerged from the topic pattern analysis, the study is significant in that it provides quantitative information on the follow-up actions associated with each incident history. It thereby suggests practical implications for establishing a preventive decision-making system that links accident history to subsequent measures for recurrence prevention.

Bankruptcy Prediction Modeling Using Qualitative Information Based on Big Data Analytics (빅데이터 기반의 정성 정보를 활용한 부도 예측 모형 구축)

  • Jo, Nam-ok;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.33-56 / 2016
  • Many researchers have focused on developing bankruptcy prediction models with enhanced performance, using statistical methods such as multiple discriminant analysis (MDA) and logit analysis, or artificial intelligence techniques such as artificial neural networks (ANN), decision trees, and support vector machines (SVM). Most bankruptcy prediction models in academic studies have used financial ratios as the main input variables. The bankruptcy of a firm is associated with both its financial state and the external economic situation. However, the inclusion of qualitative information, such as the economic atmosphere, has not been actively discussed, even though relying only on financial ratios has drawbacks. Accounting information, such as financial ratios, is based on past data, and it is usually determined one year before bankruptcy. Thus, a time lag exists between the point of closing financial statements and the point of credit evaluation. In addition, financial ratios do not contain environmental factors, such as external economic situations. Therefore, financial ratios alone may be insufficient for constructing a bankruptcy prediction model, because they essentially reflect past internal accounting information while neglecting recent information. Qualitative information must therefore be added to the conventional bankruptcy prediction model to supplement accounting information. Due to the lack of an analytic mechanism for obtaining and processing qualitative information from various information sources, previous studies have made only limited use of qualitative information. Recently, however, big data analytics, such as text mining techniques, have been drawing much attention in academia and industry, with an increasing amount of unstructured text data available on the web. A few previous studies have sought to adopt big data analytics in business prediction modeling.
Nevertheless, the use of qualitative information on the web for business prediction modeling is still at an early stage and restricted to limited applications, such as stock prediction and movie revenue prediction. Thus, it is necessary to apply big data analytics techniques, such as text mining, to various business prediction problems, including credit risk evaluation. Analytic methods are required for processing qualitative information in unstructured text form because such data are complex to manage and process. This study proposes a bankruptcy prediction model for Korean small- and medium-sized construction firms using both quantitative information, such as financial ratios, and qualitative information acquired from economic news articles. The performance of the proposed method depends on how well qualitative information is transformed into quantitative information suitable for incorporation into the bankruptcy prediction model. We employ big data analytics techniques, especially text mining, as a mechanism for processing qualitative information. A sentiment index is derived at the industry level from a large amount of text data to quantify the external economic atmosphere represented in the media. The proposed method involves keyword-based sentiment analysis using a domain-specific sentiment lexicon to extract sentiment from economic news articles. The generated sentiment lexicon is designed to represent sentiment for the construction business by considering the relationship between an occurring term and the actual economic condition of the industry, rather than the inherent semantics of the term. The experimental results showed that incorporating qualitative information based on big data analytics into the traditional bankruptcy prediction model based on accounting information is effective for enhancing predictive performance.
The sentiment variable extracted from economic news articles had an impact on corporate bankruptcy. In particular, a negative sentiment variable improved the accuracy of corporate bankruptcy prediction because the corporate bankruptcy of construction firms is sensitive to poor economic conditions. The bankruptcy prediction model using qualitative information based on big data analytics contributes to the field, in that it reflects not only relatively recent information but also environmental factors, such as external economic conditions.
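The keyword-based sentiment step described in this abstract can be illustrated with a minimal Python sketch. The lexicon entries, the tokenizer, and the index formula (positive minus negative hits, divided by total hits) are assumptions made for illustration; the abstract does not give the paper's actual lexicon or scoring rule.

```python
import re

# Hypothetical domain-specific lexicon: term -> polarity (+1 or -1).
# In the study, polarity reflects the industry's economic condition,
# not the general meaning of the word; these entries are made up.
LEXICON = {
    "recovery": 1, "growth": 1, "orders": 1,
    "slump": -1, "default": -1, "unsold": -1,
}

def sentiment_counts(article, lexicon=LEXICON):
    """Count positive and negative lexicon hits in one article."""
    tokens = re.findall(r"[a-z]+", article.lower())
    pos = sum(1 for t in tokens if lexicon.get(t, 0) > 0)
    neg = sum(1 for t in tokens if lexicon.get(t, 0) < 0)
    return pos, neg

def sentiment_index(articles, lexicon=LEXICON):
    """Aggregate index in [-1, 1] over a set of articles (assumed formula)."""
    pos = neg = 0
    for article in articles:
        p, n = sentiment_counts(article, lexicon)
        pos += p
        neg += n
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total
```

An index computed this way per industry and period could then be added as one more input variable alongside the financial ratios.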

Big Data Based Urban Transportation Analysis for Smart Cities - Machine Learning Based Traffic Prediction by Using Urban Environment Data - (도시 빅데이터를 활용한 스마트시티의 교통 예측 모델 - 환경 데이터와의 상관관계 기계 학습을 통한 예측 모델의 구축 및 검증 -)

  • Jang, Sun-Young;Shin, Dong-Youn
    • Journal of KIBIM / v.8 no.3 / pp.12-19 / 2018
  • This research explores the implications of machine learning and urban big data as a means of building a flexible smart-city transportation network that responds to changes in the urban context. It addresses the problem that existing bus headway models have difficulty responding to urban situations in real time. Using urban big data on weather, traffic, and bus status together with a machine learning prototyping tool, the research presents a flexible headway model that predicts bus delays, and analyzes the results. The prototype model is built from real-time bus data gathered through public data portals and real-time application programming interfaces (APIs) provided by the government. These data are the basic resources for organizing interval pattern models of bus operations in terms of traffic environment factors (road speeds, station conditions, weather, and real-time information on operating buses). The prototype was implemented in a machine learning tool (RapidMiner Studio), and several tests of bus delay prediction were conducted under specific circumstances. Finally, the potential of such transportation systems to improve urban efficiency and citizens' convenience by responding to urban conditions is discussed.

Probabilistic Modeling of Photovoltaic Power Systems with Big Learning Data Sets (대용량 학습 데이터를 갖는 태양광 발전 시스템의 확률론적 모델링)

  • Cho, Hyun Cheol;Jung, Young Jin
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.5 / pp.412-417 / 2013
  • Analytical modeling of photovoltaic power systems has received significant attention in recent years because it is easy to apply to the prediction of system dynamics and to fault detection and diagnosis in advanced engineering technologies. This paper presents a novel probabilistic modeling approach for such power systems with a big data sequence. First, we express the input/output function of a photovoltaic power system, in which solar irradiation and ambient temperature are the input variables and electric power is the output variable. Based on this functional relationship, the conditional probability of these three random variables (irradiation, temperature, and electric power) is defined mathematically, and it is estimated from the ratio between the number of all sample data and the number of cases related to the two input variables, which is particularly efficient for a big data sequence of photovoltaic power systems. Lastly, we predict the output values from the probabilistic model of the photovoltaic power system using expectation theory. Two case studies are carried out to test the reliability of the proposed modeling methodology.
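A counting-based conditional probability estimate followed by an expectation, as outlined in this abstract, might look like the sketch below. It assumes the inputs have already been discretized into bins and uses the standard frequency estimate (cases with a given power value divided by all cases sharing the same input pair); the paper's exact estimator and binning are not stated in the abstract, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

def fit_conditional(samples):
    """Estimate P(power | irradiation_bin, temperature_bin) by counting.

    samples: iterable of (irradiation_bin, temperature_bin, power) tuples.
    Returns {(g, t): {power: probability}}.
    """
    counts = defaultdict(Counter)
    for g, t, p in samples:
        counts[(g, t)][p] += 1
    model = {}
    for key, counter in counts.items():
        n = sum(counter.values())  # number of cases with this input pair
        model[key] = {p: c / n for p, c in counter.items()}
    return model

def expected_power(model, g, t):
    """Predict output as the conditional expectation E[power | g, t]."""
    dist = model[(g, t)]
    return sum(p * prob for p, prob in dist.items())
```

Because fitting only requires one pass of counting, the cost grows linearly with the number of samples, which is consistent with the abstract's remark about efficiency on large data sequences.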

Federated Learning-based Route Choice Modeling for Preserving Driver's Privacy in Transportation Big Data Application (교통 빅데이터 활용 시 개인 정보 보호를 위한 연합학습 기반의 경로 선택 모델링)

  • Jisup Shim
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.6 / pp.157-167 / 2023
  • The use of big data for transportation often involves data that include personal information, such as a driver's driving routes and coordinates. This study explores the creation of a route choice prediction model from a large dataset of mobile navigation app records using federated learning, a privacy-focused method that relies on distributed computing on individual devices. The study established preprocessing and analysis methods for driver data that can be used in route choice modeling, and compared the performance and characteristics of widely used learning methods with those of federated learning. The federated learning model did not significantly outperform previous models, but there was no substantial difference in prediction accuracy. In conclusion, federated learning-based prediction models can be used appropriately in privacy-sensitive areas that do not require relatively high predictive accuracy, such as a driver's preferred route choice.
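The general idea of federated learning mentioned in this abstract can be sketched with federated averaging (FedAvg): each device trains on its own data and only model weights are sent to the server. The logistic regression model, learning rate, and function names below are assumptions chosen for illustration; the abstract does not say which model or aggregation rule the study used.

```python
import math

def local_update(weights, data, lr=0.1, epochs=5):
    """One client's local training (logistic regression, gradient descent).

    data: list of (features, label) pairs that stay on the device.
    """
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            pred = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                grad[i] += (pred - y) * xi
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w

def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def federated_round(global_weights, clients):
    """One round: clients train locally, server aggregates the weights."""
    updates = [local_update(global_weights, data) for data in clients]
    sizes = [len(data) for data in clients]
    return federated_average(updates, sizes)
```

Only the weight vectors cross the network in this scheme; the raw route records in `clients` are never pooled, which is the property that makes the approach attractive for driver data.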

COVID-19 News Analysis Using News Big Data : Focusing on Topic Modeling Analysis (뉴스 빅데이터를 활용한 코로나19 언론보도 분석 :토픽모델링 분석을 중심으로)

  • Kim, Tae-Jong
    • The Journal of the Korea Contents Association / v.20 no.5 / pp.457-466 / 2020
  • The purpose of this study is to identify the main social agenda formed through the media and how it changes, using news big data on the recent spread of COVID-19, and to suggest directions for future reporting. To this end, 47,816 news items reported from December 31, 2019 to March 11, 2020 were divided into four periods based on the four-stage crisis warning system for infectious diseases, and a total of 20 topics were derived. Based on the results of the topic modeling analysis, this study proposes the following. First, it is necessary to refrain from provocative expressions such as "anxiety" and "fear" and to use neutral and objective reporting terms. Second, more in-depth and contextual news production is required, moving beyond simple event reporting. Third, it is necessary to prepare detailed crisis communication manuals for each situation related to infectious diseases. Fourth, reports are needed that focus on citizen-led efforts to overcome the crisis. This research has academic significance as the first paper to analyze news big data on COVID-19 using topic modeling analysis, and policy significance in that it can serve as a basis for developing national crisis communication policy.

Forthcoming Big Data in Smart Cities: Experiment for Machine Learning Based Happiness Estimation in Seoul City (빅데이터를 이용한 서울시 행복지수 분석 및 예측을 위한 실험 및 고찰)

  • Shin, Dongyoun;Song, Yu-Mi
    • Journal of KIBIM / v.7 no.1 / pp.28-35 / 2017
  • Cities are complex systems composed of diverse activities. These activities have complex relationships that create diverse urban phenomena. Big data is an emerging technology for understanding such complex networks. This research aims to understand these relationships by analysing diverse city indexes. A total of 28 indexes were collected for the 25 districts of Seoul and analysed to find weighted correlations. Using the correlation values defined for certain years, the study attempts to predict the missing index value, "happiness", of each district in other years. The results show an overall prediction accuracy of 70.25%. However, the method may not be sufficient for practical use, since its accuracy varies with the years used for learning.

Study of Mental Disorder Schizophrenia, based on Big Data

  • Hye-Sun Lee
    • International Journal of Advanced Culture Technology / v.11 no.4 / pp.279-285 / 2023
  • This study provides academic implications by examining trends in domestic research on schizophrenia and psychosocial therapy. For the analysis, 65 papers were collected, and text mining with the R program and social network analysis were used. The results are as follows. First, the collected data were visualized through keyword analysis using the word cloud method. Second, keyword frequency analysis showed that keywords such as intervention, schizophrenia, research, patients, program, effect, society, mind, ability, and function had the highest frequencies. Third, the LDA (latent Dirichlet allocation) topic modeling results were classified into 3 topics, described by the keywords: patient, subjects, psychosocial intervention, efficacy of interventions. Fourth, the social network analysis yielded connectivity, closeness centrality, and betweenness centrality. In conclusion, this study is significant in that it identified research trends in schizophrenia and psychosocial therapy through text mining and social network analysis, presented the results through visualization, and thereby provided basic rehabilitation data for schizophrenia and psychosocial therapy using big data methods.

The Impact of Exploration and Exploitation Activities and Market Agility on the Relationship between Big Data Analytics Capability and Firms' Performance (빅 데이터 분석능력과 기업 성과 간의 관계에서 혁신 및 개선 활동과 시장 민첩성의 영향)

  • Jung, He-Kyung;Boo, Jeman
    • Journal of Korean Society of Industrial and Systems Engineering / v.45 no.3 / pp.150-162 / 2022
  • This study investigated the impact of the latest developments in big data analytics capability (BDAC) on firm performance. BDAC has the power to innovate existing management practices. Nevertheless, its impact on firm performance has not yet been fully elucidated. BDAC relates to the flexibility of infrastructure as well as the skills of management and the firm's personnel. Most studies have explored the phenomenon from a theoretical perspective or based on factors such as organizational characteristics. This study extends previous research by proposing and testing a model that examines whether organizational exploration, exploitation, and market agility mediate the relationship between BDAC and firm performance. The proposed model was tested using survey data collected from employees with more than 10 years of service at 250 companies. The results, analyzed through structural equation modeling, show that strong BDAC can help improve firm performance. An organization's ability to analyze big data affects its exploration and exploitation, thereby affecting market agility and, consequently, firm performance. These results also confirm the powerful mediating role of exploration, exploitation, and market agility in improving insights into big data utilization and improving firm performance.

Systemic Analysis of Research Activities and Trends Related to Artificial Intelligence(A.I.) Technology Based on Latent Dirichlet Allocation (LDA) Model (Latent Dirichlet Allocation (LDA) 모델 기반의 인공지능(A.I.) 기술 관련 연구 활동 및 동향 분석)

  • Chung, Myoung Sug;Lee, Joo Yeoun
    • Journal of Korea Society of Industrial Information Systems / v.23 no.3 / pp.87-95 / 2018
  • With the recent technological development of artificial intelligence, the related market is expanding rapidly. In the artificial intelligence field, which is at an early stage but still expanding, it is important to reduce uncertainty about research directions and investment areas. Therefore, this study examined technology trends using text mining and topic modeling, two big data analysis methods, and identified trends in core technologies and their future growth potential. We hope that the results of this study will give researchers an understanding of artificial intelligence technology trends and new implications for future research directions.