• Title/Abstract/Keywords: Data analysis & prediction

A Novel Data Prediction Model using Data Weights and Neural Network based on R for Meaning Analysis between Data

  • 정세훈;김종찬;심춘보
    • 한국멀티미디어학회논문지 / Vol. 18, No. 4 / pp.524-532 / 2015
  • All data created in the Big Data era potentially contains meaning and correlations. A wide variety of data is created and stored every day across all sectors of society, and research on analyzing and understanding the meaning between data is actively under way. In particular, the accuracy of meaning prediction and the data imbalance problem are important issues in the data analysis field. In this paper, we propose a data prediction model based on data weights and a neural network, implemented in R, for meaning analysis between data. The proposed model is composed of a classification model and an analysis model. The classification model applies weights based on the normal distribution and selects the optimal independent variables through multiple regression analysis. The analysis model increases the prediction accuracy of the output variable through a neural network. In the performance evaluation, the proposed model achieved a prediction accuracy of 87.475% on the original (raw) data, confirming its superiority.

Analysis Model Evaluation based on IoT Data and Machine Learning Algorithm for Prediction of Acer Mono Sap Liquid Water

  • Lee, Han Sung;Jung, Se Hoon
    • 한국멀티미디어학회논문지 / Vol. 23, No. 10 / pp.1286-1295 / 2020
  • Predicting the amount of Acer mono sap that can be collected has become increasingly difficult because of droughts and cold waves caused by recent climate change, and few studies have addressed the prediction of its collection volume. This study therefore proposes a Big Data prediction system based on meteorological information for the collection of Acer mono sap. The proposed system analyzes the collected data and provides managers with a statistical chart of predicted values for the climate factors that affect the amount of sap collected, enabling more efficient work. It was designed on Hadoop for data collection, processing, and analysis. The study also analyzes and proposes an optimal prediction model for the climate conditions that influence the collection volume by applying a multiple regression analysis model based on Hadoop and Mahout.

Evaluating Distress Prediction Models for Food Service Franchise Industry

  • 김시중
    • 유통과학연구 / Vol. 17, No. 11 / pp.73-79 / 2019
  • Purpose: This study compares the predictive power of distress prediction models built with discriminant analysis and logit analysis for the food service franchise industry in Korea. Research design, data and methodology: Forty-six food service franchise companies with high sales volume in 2017 were selected as the sample. Fourteen financial ratios were calculated from the 2017 statements of financial position and income statements of the forty-six companies and analyzed by t-test, and seven statistically significant independent variables were chosen. The distress prediction models were then built by logit analysis and multiple discriminant analysis. Results: Differences in the mean values of the fourteen financial ratios were tested by t-test to extract the variables that separate top-performing companies from failing ones. The univariate test showed that the variables that differentiate top-performing from failing food service franchise companies are income to stockholders' equity, operating income to sales, current ratio, net income to assets, cash flows from operating activities, growth rate of operating income, and total assets turnover. The statistical significance of these seven financial ratios was also confirmed by logit analysis and discriminant analysis. Conclusions: The prediction accuracy was 84.8% for the discriminant analysis model and 89.1% for the logit analysis model, indicating that logit analysis predicts distress better than discriminant analysis. Compared with previous studies, in which the accuracy of discriminant and logit analysis ranges from 75% to 85%, the accuracy in this study (84.8% for discriminant analysis and 89.1% for logit analysis) is in line with, or slightly above, that range and is considered high.
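The two accuracy figures can be checked against the sample size. The following is a minimal sketch, assuming accuracy was computed over all 46 sample companies, which the abstract does not state explicitly. The counts 39 and 41 are inferred from the reported percentages and are not given in the abstract; the function name is illustrative.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Share of correctly classified cases, as a percentage rounded to one decimal."""
    return round(100.0 * correct / total, 1)

N_COMPANIES = 46  # sample size stated in the abstract

# Hypothetical counts, chosen only because they reproduce the reported percentages.
discriminant_correct = 39
logit_correct = 41

discriminant_acc = accuracy_percent(discriminant_correct, N_COMPANIES)
logit_acc = accuracy_percent(logit_correct, N_COMPANIES)
print(discriminant_acc, logit_acc)
```

Under that assumption, the gap between the two models corresponds to two companies out of 46.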

A Comparative Study between Stock Price Prediction Models Using Sentiment Analysis and Machine Learning Based on SNS and News Articles

  • 김동영;박제원;최재현
    • 한국IT서비스학회지 / Vol. 13, No. 3 / pp.221-233 / 2014
  • As public interest in the stock market has grown with economic development, many studies have tried to predict fluctuations in stock prices. Recently, many studies have applied scientific and technological forecasting methods, and the data used have become more diverse. In this paper, we propose stock price prediction models that use sentiment analysis and machine learning on news articles and SNS data to improve the accuracy of stock price prediction. The proposed models are generated through a four-step process: data collection, sentiment dictionary construction, sentiment analysis, and machine learning. News articles were collected from economy-related newspapers, and SNS data were collected from Twitter. The sentiment dictionary was built from the collected news articles and used for sentiment analysis. In the machine learning phase, we generated prediction models by applying various classification techniques to the data produced by sentiment analysis. We then conducted 10-fold cross-validation to measure their performance. The experimental results showed accuracy above 80% in a number of cases and F1 scores close to 0.8. This is a significant improvement over conventional studies using opinion mining or data mining techniques.
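The evaluation step described above can be illustrated with a minimal sketch of 10-fold cross-validation reporting accuracy and F1, using only the Python standard library. The function names, the threshold-rule "classifier", and the toy data are all illustrative assumptions; the abstract does not specify the authors' classifiers, features, or tooling.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1 score for a binary labelling (F1 is 0.0 if undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

def cross_validate(features, labels, train_fn, k=10):
    """Train on k-1 folds, score on the held-out fold, and average over k folds."""
    fold_scores = []
    for held_out in k_fold_indices(len(labels), k):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        model = train_fn([features[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        preds = [model(features[i]) for i in held_out]
        fold_scores.append(accuracy_and_f1([labels[i] for i in held_out], preds))
    mean_acc = sum(a for a, _ in fold_scores) / k
    mean_f1 = sum(f for _, f in fold_scores) / k
    return mean_acc, mean_f1

# Toy stand-in for a classifier: predict "up" (1) when the sentiment score is positive.
def train_threshold_rule(train_x, train_y):
    return lambda score: 1 if score > 0 else 0

# Made-up sentiment scores and price directions, for illustration only.
sentiment_scores = [(-1) ** i * (i % 5 + 1) for i in range(50)]
directions = [1 if s > 0 else 0 for s in sentiment_scores]
mean_acc, mean_f1 = cross_validate(sentiment_scores, directions, train_threshold_rule)
```

Each sample is held out exactly once, so the averaged scores reflect performance on data the model did not see during training.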

A Neural Network Model for Bankruptcy Prediction - Domestic KSE Listed Bankrupted Companies after the Foreign Exchange Crisis in 1997

  • 정유석;이현수;채영일;서영호
    • 한국품질경영학회:학술대회논문집 / 한국품질경영학회 2004년도 품질경영모델을 통한 가치 창출 / pp.655-673 / 2004
  • This paper analyzes the bankruptcy prediction power of three models: Multivariate Discriminant Analysis (MDA), Logit Analysis, and Neural Network. To improve the prediction accuracy and validity of the models, the research data were limited to listed manufacturing companies that went bankrupt after the crisis. To ensure meaningful bankruptcy prediction, the training data and testing data were not drawn from the same period. The results show that the neural network model predicts more accurately than the logit analysis and MDA models when each model is trained on the training data and then run on the testing data.

Analyzing Customer Management Data by Data Mining: Case Study on Churn Prediction Models for Insurance Company in Korea

  • Cho, Mee-Hye;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / Vol. 19, No. 4 / pp.1007-1018 / 2008
  • The purpose of this case study is to demonstrate database-marketing management. First, we explore the original variables in the insurance customer data, modify them where necessary, and carry out variable selection before analysis. Then, we develop churn prediction models using logistic regression, neural network, and SVM analysis, and compare these three data mining models in terms of misclassification rate.

TANFIS Classifier Integrated Efficacious Assistance System for Heart Disease Prediction using CNN-MDRP

  • Bhaskaru, O.;Sreedevi, M.
    • International Journal of Computer Science & Network Security / Vol. 22, No. 10 / pp.171-176 / 2022
  • A dramatic rise in the number of people dying from heart disease has prompted efforts to identify it sooner using efficient approaches. A variety of variables, including hereditary factors, contribute to the condition. Current approaches use automated diagnostic systems that fail to attain a high level of accuracy because they include irrelevant dataset information. This paper presents an effective neural network with convolutional layers for classifying highly class-imbalanced clinical data. Traditional approaches rely on massive amounts of data rather than precise predictions. Data must be selected carefully to achieve earlier prediction, and partially complete data is a setback for analysis. Feature extraction is also a major challenge in classification and prediction, since more data increases the training time of traditional machine learning classifiers. The work integrates the CNN-MDRP classifier (convolutional neural network (CNN)-based efficient multimodal disease risk prediction) with TANFIS (tuned adaptive neuro-fuzzy inference system) for earlier and more accurate prediction. Data cleaning is performed by transforming partial data in the dataset into informative data. The recommended TANFIS tuning parameters are then improved using a Laplace Gaussian mutation-based grasshopper and moth flame optimization approach (LGM2G). The proposed approach yields a prediction accuracy of 98.40 percent when compared with current algorithms.

Investigation of Analysis Effects of ASCAT Data Assimilation within KIAPS-LETKF System

  • 조영순;임수정;권인혁;한현준
    • 대기 / Vol. 28, No. 3 / pp.263-272 / 2018
  • High-resolution ocean surface wind vectors produced by a scatterometer were assimilated within the Local Ensemble Transform Kalman Filter (LETKF) at the Korea Institute of Atmospheric Prediction Systems (KIAPS). Wind data from the Advanced Scatterometer (ASCAT) on Metop-A/B were processed in the KIAPS Package for Observation Processing (KPOP), and a module capable of processing surface wind observations was implemented in the LETKF system. To evaluate the performance improvement due to ASCAT observations, the LETKF data assimilation cycle was run for approximately 20 days from June through July 2017, when Typhoon Nepartak was present. We found that the ASCAT wind vectors have a clear and beneficial effect on the data assimilation cycle: they reduced analysis errors of wind, temperature, and humidity, as well as analysis errors of lower-troposphere wind. Furthermore, with the assimilation of ASCAT wind observations, the initial condition of the model described the typhoon structure more accurately and improved the typhoon track prediction skill. We therefore expect the analysis field of LETKF to improve when scatterometer wind observations are added.

Defect Type Prediction Method in Manufacturing Process Using Data Mining Technique

  • 변성규;강창욱;심성보
    • 산업경영시스템학회지 / Vol. 27, No. 2 / pp.10-16 / 2004
  • Data mining is the exploration and analysis, by automatic or semiautomatic means, of large quantities of data in order to discover meaningful patterns and rules. This paper uses a data mining technique to predict defect types in a manufacturing process. The purpose of this paper is to model the recognition of defect type patterns and the prediction of each defect type before it occurs in the manufacturing process. The proposed model consists of data handling, defect type analysis, and defect type prediction stages. The performance measurement shows that the proposed model has higher prediction accuracy than a logistic regression model.

A Study on the Reliability of Observational Settlement Analysis Using Data Mining

  • 우철웅;장병욱
    • 한국농공학회지 / Vol. 45, No. 6 / pp.183-193 / 2003
  • Most construction works on soft ground adopt instrumentation to manage the settlement and stability of the embankment. Rapid progress in information technology and digital data acquisition for soft ground instrumentation has led to a fast-growing amount of data. Although valuable information about the behaviour of soft ground may be hidden in the data, most of it is used only for managing settlement and stability. One of the critical issues in soft ground instrumentation is long-term settlement prediction, for which several observational settlement analysis methods are used. However, the reliability of their results remains unclear. Such knowledge could be discovered from a large volume of experience with observational settlement analysis. In this article, we present a database for storing settlement records and a data mining procedure. A large volume of knowledge about observational settlement prediction was collected from the database by applying a filtering algorithm and a knowledge discovery algorithm. Statistical analysis revealed that the reliability of observational settlement analysis depends on stay duration and the estimated degree of consolidation.