• Title/Summary/Keyword: Data analysis & prediction


The Comparison Among Prediction Methods of Water Demand And Analysis of Data on Water Services Using Data Mining Techniques (데이터마이닝 기법을 활용한 상수 이용현황 분석 및 단기 물 수요예측 방법 비교)

  • Ahn, Jihoon;Kim, Jinhwa
    • The Journal of Bigdata
    • /
    • v.1 no.1
    • /
    • pp.9-17
    • /
    • 2016
  • This study identifies major features of water supply and introduces important factors in water services, based on data mining analysis of water quantity and water pressure measured by sensors. It also proposes more accurate methods for short-term water demand prediction using multiple regression analysis and neural networks. A small block of a county was selected for data collection and testing; the area also includes business water demand from public offices and hospitals. Real stream data from sensors in this area were collected. Of the 2,728 data sets collected, 2,632 were used for modelling and 96 for testing. The results show that the neural network outperforms multiple regression analysis in prediction performance.


MPIL: Market prediction through image learning of unstructured and structured data (비정형, 정형 데이터의 이미지 학습을 활용한 시장예측)

  • Lee, Yoon Seon;Lee, Ju Hong;Choi, Bum Ghi;Song, Jae Won
    • Smart Media Journal
    • /
    • v.10 no.2
    • /
    • pp.16-21
    • /
    • 2021
  • Financial time series analysis plays an important economic and social role in modern society and affects global development, but heavy noise and uncertainty make financial time series prediction a difficult research topic. In this paper, we propose a market prediction method (MPIL) that converts unstructured and structured data into images. For market prediction, MPIL analyzes n days of SNS and news data (unstructured data), converts the market data (structured data) into an image with the GADF algorithm, and performs ultra-short-term market prediction of the price on day n+1 through image learning. MPIL achieves an average accuracy of 56%, which is higher than the 50% average accuracy of an existing market forecasting model that predicts the market with LSTM using sentiment analysis.
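The GADF step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard Gramian Angular Difference Field definition (rescale the series to [-1, 1], take the arccos of each value, and fill the image with the sine of the pairwise angle differences). The function name `gadf` and the sample prices are hypothetical, and the abstract does not state the image size or preprocessing the authors used.

```python
import math


def gadf(series):
    """Return the Gramian Angular Difference Field of a 1-D series as a list of rows.

    Assumption: standard GADF definition, not necessarily the authors' exact pipeline.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every point.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against floating-point overshoot at the boundaries.
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    # Polar encoding: each value becomes an angle.
    phi = [math.acos(s) for s in scaled]
    # Image entry (i, j) is sin(phi_i - phi_j).
    return [[math.sin(a - b) for b in phi] for a in phi]


# Hypothetical closing prices over four days.
image = gadf([101.0, 103.5, 102.0, 106.0])
```

The resulting matrix has a zero diagonal and is antisymmetric, so the temporal ordering of the series is preserved in the image that a downstream image model would consume.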

Bankruptcy Prediction Modeling Using Qualitative Information Based on Big Data Analytics (빅데이터 기반의 정성 정보를 활용한 부도 예측 모형 구축)

  • Jo, Nam-ok;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.2
    • /
    • pp.33-56
    • /
    • 2016
  • Many researchers have focused on developing bankruptcy prediction models using modeling techniques, such as statistical methods including multiple discriminant analysis (MDA) and logit analysis or artificial intelligence techniques including artificial neural networks (ANN), decision trees, and support vector machines (SVM), to secure enhanced performance. Most of the bankruptcy prediction models in academic studies have used financial ratios as main input variables. The bankruptcy of firms is associated with the firms' financial states and the external economic situation. However, the inclusion of qualitative information, such as the economic atmosphere, has not been actively discussed despite the fact that exploiting only financial ratios has some drawbacks. Accounting information, such as financial ratios, is based on past data, and it is usually determined one year before bankruptcy. Thus, a time lag exists between the point of closing financial statements and the point of credit evaluation. In addition, financial ratios do not contain environmental factors, such as external economic situations. Therefore, using only financial ratios may be insufficient in constructing a bankruptcy prediction model, because they essentially reflect past corporate internal accounting information while neglecting recent information. Thus, qualitative information must be added to the conventional bankruptcy prediction model to supplement accounting information. Due to the lack of an analytic mechanism for obtaining and processing qualitative information from various information sources, previous studies have made only limited use of qualitative information. However, recently, big data analytics, such as text mining techniques, have been drawing much attention in academia and industry, with an increasing amount of unstructured text data available on the web. A few previous studies have sought to adopt big data analytics in business prediction modeling. 
Nevertheless, the use of qualitative information on the web for business prediction modeling is still deemed to be at an early stage, restricted to limited applications, such as stock prediction and movie revenue prediction. Thus, it is necessary to apply big data analytics techniques, such as text mining, to various business prediction problems, including credit risk evaluation. Analytic methods are required for processing qualitative information represented in unstructured text form due to the complexity of managing and processing unstructured text data. This study proposes a bankruptcy prediction model for Korean small- and medium-sized construction firms using both quantitative information, such as financial ratios, and qualitative information acquired from economic news articles. The performance of the proposed method depends on how well information types are transformed from qualitative into quantitative information that is suitable for incorporating into the bankruptcy prediction model. We employ big data analytics techniques, especially text mining, as a mechanism for processing qualitative information. The sentiment index is provided at the industry level by extracting from a large amount of text data to quantify the external economic atmosphere represented in the media. The proposed method involves keyword-based sentiment analysis using a domain-specific sentiment lexicon to extract sentiment from economic news articles. The generated sentiment lexicon is designed to represent sentiment for the construction business by considering the relationship between the occurring term and the actual situation with respect to the economic condition of the industry rather than the inherent semantics of the term. The experimental results showed that incorporating qualitative information based on big data analytics into the traditional bankruptcy prediction model based on accounting information is effective for enhancing the predictive performance. 
The sentiment variable extracted from economic news articles had an impact on corporate bankruptcy. In particular, a negative sentiment variable improved the accuracy of corporate bankruptcy prediction because the corporate bankruptcy of construction firms is sensitive to poor economic conditions. The bankruptcy prediction model using qualitative information based on big data analytics contributes to the field, in that it reflects not only relatively recent information but also environmental factors, such as external economic conditions.
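The keyword-based sentiment analysis described in this abstract can be sketched roughly as follows. This is an illustration only: the lexicon entries, their scores, the tokenization, and the function name `sentiment_index` are all assumptions, since the abstract does not list the authors' lexicon or scoring formula.

```python
# Hypothetical domain-specific lexicon: scores reflect what a term tends to
# signal for the construction industry, not its general-language meaning.
LEXICON = {
    "downturn": -1.0,
    "default": -1.0,
    "unsold": -1.0,
    "recovery": 1.0,
    "orders": 1.0,
}


def sentiment_index(articles, lexicon):
    """Average lexicon score over all matched terms; 0.0 when nothing matches."""
    total, hits = 0.0, 0
    for text in articles:
        for token in text.lower().split():
            # Strip surrounding punctuation so "recovery," matches "recovery".
            token = token.strip(".,;:!?\"'()")
            if token in lexicon:
                total += lexicon[token]
                hits += 1
    return total / hits if hits else 0.0


# Hypothetical news snippets for one period.
news = ["Housing downturn deepens.", "Signs of recovery, but default risk remains."]
index = sentiment_index(news, LEXICON)
```

An index computed this way per period could then be appended to the financial ratios as one additional input variable of a bankruptcy prediction model.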

Bigdata Prediction Support Service for Citizen Data Scientists (시민 데이터과학자를 위한 빅데이터 예측 지원 서비스)

  • Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.2
    • /
    • pp.151-159
    • /
    • 2019
  • With the arrival of the big data era, the foundation of the fourth industrial revolution, most related industries are developing solutions focused on data storage, statistical analysis, and visualization. However, for big data technology to spread, predictive analysis technologies using artificial intelligence must be developed. At present, these advanced technologies can be used only by a few experts, known as data scientists. For big data-related industries to develop, non-experts, called citizen data scientists, should be able to access the big data analysis process easily and at low cost, because they have insight into their own data. In this paper, we propose a system for analyzing big data and building business models with the support of an easy-to-use analysis system that requires no knowledge of high-level data science. We also define the necessary components and environment for the prediction analysis system and present the overall service plan.

Machine Learning Methodology for Management of Shipbuilding Master Data

  • Jeong, Ju Hyeon;Woo, Jong Hun;Park, JungGoo
    • International Journal of Naval Architecture and Ocean Engineering
    • /
    • v.12 no.1
    • /
    • pp.428-439
    • /
    • 2020
  • The continuous development of information and communication technologies has resulted in an exponential increase in data. Consequently, technologies related to data analysis are growing in importance. The shipbuilding industry has high production uncertainty and variability, which has created an urgent need for data analysis techniques, such as machine learning. In particular, the industry cannot effectively respond to changes in the production-related standard time information systems, such as the basic cycle time and lead time. Improvement measures are necessary to enable the industry to respond swiftly to changes in the production environment. In this study, the lead times for fabrication, assembly of ship block, spool fabrication and painting were predicted using machine learning technology to propose a new management method for the process lead time using a master data system for the time element in the production data. Data preprocessing was performed in various ways using R and Python, which are open source programming languages, and process variables were selected considering their relationships with the lead time through correlation analysis and analysis of variables. Various machine learning, deep learning, and ensemble learning algorithms were applied to create the lead time prediction models. In addition, the applicability of the proposed machine learning methodology to standard work hour prediction was verified by evaluating the prediction models using the evaluation criteria, such as the Mean Absolute Percentage Error (MAPE) and Root Mean Squared Logarithmic Error (RMSLE).
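The two evaluation criteria named in this abstract have standard definitions, sketched below. The function names and the sample lead times are hypothetical, and the abstract does not say whether the authors report MAPE as a percentage or a fraction; a percentage is assumed here.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmsle(actual, predicted):
    """Root Mean Squared Logarithmic Error. Values must be greater than -1."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical lead times in days: observed vs. predicted.
observed = [12.0, 20.0, 8.0]
forecast = [10.0, 22.0, 9.0]
scores = (mape(observed, forecast), rmsle(observed, forecast))
```

MAPE weighs every relative error equally, while RMSLE works on log-scaled values and therefore penalizes under-prediction more than over-prediction of the same absolute size.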

Prediction of High Level Ozone Concentration in Seoul by Using Multivariate Statistical Analyses (다변량 통계분석을 이용한 서울시 고농도 오존의 예측에 관한 연구)

  • 허정숙;김동술
    • Journal of Korean Society for Atmospheric Environment
    • /
    • v.9 no.3
    • /
    • pp.207-215
    • /
    • 1993
  • In order to statistically predict $O_3$ levels in Seoul, the study used TMS (telemetered air monitoring system) data from the Department of Environment, collected at 20 sites in 1989 and 1990. The data at each site were characterized by 6 major criteria pollutants ($SO_2$, TSP, CO, $NO_2$, THC, and $O_3$) and 2 meteorological parameters, wind speed and wind direction. To select proper variables and to determine each pollutant's behavior, univariate statistical analyses were performed extensively at the outset, and then applied statistical techniques such as cluster analysis, regression analysis, and an expert system were examined intensively. For the initial study of high-level $O_3$ prediction, the raw data set at each site was separated into 2 groups based on a 60 ppb $O_3$ level. A hierarchical cluster analysis was applied to divide the groups based on the 60 ppb $O_3$ level into small classes. Each class at each site has its own pattern. Next, multiple regression was applied repeatedly to each class to determine an $O_3$ prediction submodel and to identify outliers in each class based on a certain level of standardized residual. Thus, a prediction submodel for each homogeneous class could be obtained. The study was extended to model $O_3$ prediction on both an on-time basis and a 1-hr-ahead basis. Finally, an expert system was used to build a unified classification rule based on examples of the homogeneous classes for all sites. Thus, a concept for a high-level $O_3$ prediction model was developed for use in $O_3$ alert systems.


A Study on the Prediction Index for Chart Success of Digital Music Contents based on Analysis of Social Data (소셜 데이터 분석을 통한 음원 흥행 예측 지표 연구)

  • Kim, Ga-Yeon;Kim, Myoung-Jun
    • Journal of Digital Contents Society
    • /
    • v.19 no.6
    • /
    • pp.1105-1114
    • /
    • 2018
  • The growth rate of the domestic digital music contents market has been remarkable recently. Accordingly, the necessity of prediction for chart success of digital music contents has grown. This paper proposes prediction indexes for chart success of digital music contents through analysis of correlation between social data such as Internet news, SNS and entry rankings in Melon's weekly music charts. We collected a total of 10 social data items for each male and female artist, and executed cluster analysis. Through this, we found meaningful prediction indexes for chart success of digital music contents for each male and female artist.

An Accurate Stock Price Forecasting with Ensemble Learning Based on Sentiment of News (뉴스 감성 앙상블 학습을 통한 주가 예측기의 성능 향상)

  • Kim, Ha-Eun;Park, Young-Wook;Yoo, Si-eun;Jeong, Seong-Woo;Yoo, Joonhyuk
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.17 no.1
    • /
    • pp.51-58
    • /
    • 2022
  • Stock price forecasting has long been studied because it can provide stability to the national economy and large profits to investors. Recently, many studies have proposed stock price prediction models using various input data such as macroeconomic indicators and sentiment analysis. However, since each study was conducted individually, it is difficult to compare the methods objectively, and studies on their impact on stock price prediction are still insufficient. In this paper, the effect of the commonly used input data on the stock price is evaluated through the error rate between the value predicted by a deep learning model and the actual stock price. In addition, unlike most papers on sentiment analysis, sentiment analysis was conducted on the news body, and a method of complementing the results of each sentiment analysis through three sentiment analysis models is proposed. In experiments predicting Microsoft's adjusted closing price, the results of sentiment analysis were found to be the most important factor in stock price prediction. In particular, when all input data are used, the error rate of the ensembled sentiment analysis model is reduced by 58% compared to the baseline.

Mean Streamline Analysis for Performance Prediction of Cross-Flow Fans

  • Kim, Jae-Won;Oh, Hyoung-Woo
    • Journal of Mechanical Science and Technology
    • /
    • v.18 no.8
    • /
    • pp.1428-1434
    • /
    • 2004
  • This paper presents the mean streamline analysis using the empirical loss correlations for performance prediction of cross-flow fans. Comparison of overall performance predictions with test data of a cross-flow fan system with a simplified vortex wall scroll casing and with the published experimental characteristics for a cross-flow fan has been carried out to demonstrate the accuracy of the proposed method. Predicted performance curves by the present mean streamline analysis agree well with experimental data for two different cross-flow fans over the normal operating conditions. The prediction method presented herein can be used efficiently as a tool for the preliminary design and performance analysis of general-purpose cross-flow fans.

An Exploratory Study for Decreasing Error of Prediction Value of Recommended System on User Based

  • Lee, Hee-Choon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.17 no.1
    • /
    • pp.77-86
    • /
    • 2006
  • This study investigates the error of the predicted value in a recommender system in relation to associated variables. To decrease the error of a user-based collaborative recommender system, this research explored how the response pairs used for prediction are affected by raters' demographic variables, Pearson's coefficient, and sparsity. The result is a comparative analysis between the existing prediction error and the conditioned one.
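For readers unfamiliar with the setting, a user-based collaborative prediction weighted by Pearson's coefficient can be sketched as below. This is the textbook mean-centred formulation, assumed here for illustration; the abstract does not give the exact predictor the author used, and the names `pearson` and `predict` and the sample ratings are hypothetical.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; 0.0 if undefined."""
    common = [item for item in ratings_a if item in ratings_b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)) * math.sqrt(
        sum((ratings_b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(target, others, item):
    """Predict target's rating for item from other users' mean-centred ratings."""
    target_mean = sum(target.values()) / len(target)
    num = den = 0.0
    for other in others:
        if item not in other:
            continue  # sparsity: this neighbour never rated the item
        weight = pearson(target, other)
        other_mean = sum(other.values()) / len(other)
        num += weight * (other[item] - other_mean)
        den += abs(weight)
    # Fall back to the target's own mean when no neighbour is informative.
    return target_mean + num / den if den else target_mean


# Hypothetical ratings keyed by item id.
alice = {"a": 1, "b": 2, "c": 3}
bob = {"a": 2, "b": 4, "c": 6, "d": 8}
estimate = predict(alice, [bob], "d")
```

The prediction error studied in the abstract would then be the difference between such an estimate and the rating the user actually gave, which is why the number of co-rated items (sparsity) and the size of the correlation matter.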
