• Title/Summary/Keyword: Prediction performance

Search Result 5,462, Processing Time 0.033 seconds

A Machine Learning-based Total Production Time Prediction Method for Customized-Manufacturing Companies (주문생산 기업을 위한 기계학습 기반 총생산시간 예측 기법)

  • Park, Do-Myung;Choi, HyungRim;Park, Byung-Kwon
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.177-190
    • /
    • 2021
  • Due to the development of the fourth industrial revolution technology, efforts are being made to improve areas that humans cannot handle by utilizing artificial intelligence techniques such as machine learning. Although on-demand production companies also want to reduce corporate risks such as delays in delivery by predicting total production time for orders, they are having difficulty predicting this because the total production time is all different for each order. The Theory of Constraints (TOC) theory was developed to find the least efficient areas to increase order throughput and reduce order total cost, but failed to provide a forecast of total production time. Order production varies from order to order due to various customer needs, so the total production time of individual orders can be measured postmortem, but it is difficult to predict in advance. The total measured production time of existing orders is also different, which has limitations that cannot be used as standard time. As a result, experienced managers rely on persimmons rather than on the use of the system, while inexperienced managers use simple management indicators (e.g., 60 days total production time for raw materials, 90 days total production time for steel plates, etc.). Too fast work instructions based on imperfections or indicators cause congestion, which leads to productivity degradation, and too late leads to increased production costs or failure to meet delivery dates due to emergency processing. Failure to meet the deadline will result in compensation for delayed compensation or adversely affect business and collection sectors. In this study, to address these problems, an entity that operates an order production system seeks to find a machine learning model that estimates the total production time of new orders. It uses orders, production, and process performance for materials used for machine learning. We compared and analyzed OLS, GLM Gamma, Extra Trees, and Random Forest algorithms as the best algorithms for estimating total production time and present the results.

Landslide Susceptibility Mapping Using Deep Neural Network and Convolutional Neural Network (Deep Neural Network와 Convolutional Neural Network 모델을 이용한 산사태 취약성 매핑)

  • Gong, Sung-Hyun;Baek, Won-Kyung;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.6_2
    • /
    • pp.1723-1735
    • /
    • 2022
  • Landslides are one of the most prevalent natural disasters, threating both humans and property. Also landslides can cause damage at the national level, so effective prediction and prevention are essential. Research to produce a landslide susceptibility map with high accuracy is steadily being conducted, and various models have been applied to landslide susceptibility analysis. Pixel-based machine learning models such as frequency ratio models, logistic regression models, ensembles models, and Artificial Neural Networks have been mainly applied. Recent studies have shown that the kernel-based convolutional neural network (CNN) technique is effective and that the spatial characteristics of input data have a significant effect on the accuracy of landslide susceptibility mapping. For this reason, the purpose of this study is to analyze landslide vulnerability using a pixel-based deep neural network model and a patch-based convolutional neural network model. The research area was set up in Gangwon-do, including Inje, Gangneung, and Pyeongchang, where landslides occurred frequently and damaged. Landslide-related factors include slope, curvature, stream power index (SPI), topographic wetness index (TWI), topographic position index (TPI), timber diameter, timber age, lithology, land use, soil depth, soil parent material, lineament density, fault density, normalized difference vegetation index (NDVI) and normalized difference water index (NDWI) were used. Landslide-related factors were built into a spatial database through data preprocessing, and landslide susceptibility map was predicted using deep neural network (DNN) and CNN models. The model and landslide susceptibility map were verified through average precision (AP) and root mean square errors (RMSE), and as a result of the verification, the patch-based CNN model showed 3.4% improved performance compared to the pixel-based DNN model. The results of this study can be used to predict landslides and are expected to serve as a scientific basis for establishing land use policies and landslide management policies.

Comparative assessment and uncertainty analysis of ensemble-based hydrologic data assimilation using airGRdatassim (airGRdatassim을 이용한 앙상블 기반 수문자료동화 기법의 비교 및 불확실성 평가)

  • Lee, Garim;Lee, Songhee;Kim, Bomi;Woo, Dong Kook;Noh, Seong Jin
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.10
    • /
    • pp.761-774
    • /
    • 2022
  • Accurate hydrologic prediction is essential to analyze the effects of drought, flood, and climate change on flow rates, water quality, and ecosystems. Disentangling the uncertainty of the hydrological model is one of the important issues in hydrology and water resources research. Hydrologic data assimilation (DA), a technique that updates the status or parameters of a hydrological model to produce the most likely estimates of the initial conditions of the model, is one of the ways to minimize uncertainty in hydrological simulations and improve predictive accuracy. In this study, the two ensemble-based sequential DA techniques, ensemble Kalman filter, and particle filter are comparatively analyzed for the daily discharge simulation at the Yongdam catchment using airGRdatassim. The results showed that the values of Kling-Gupta efficiency (KGE) were improved from 0.799 in the open loop simulation to 0.826 in the ensemble Kalman filter and to 0.933 in the particle filter. In addition, we analyzed the effects of hyper-parameters related to the data assimilation methods such as precipitation and potential evaporation forcing error parameters and selection of perturbed and updated states. For the case of forcing error conditions, the particle filter was superior to the ensemble in terms of the KGE index. The size of the optimal forcing noise was relatively smaller in the particle filter compared to the ensemble Kalman filter. In addition, with more state variables included in the updating step, performance of data assimilation improved, implicating that adequate selection of updating states can be considered as a hyper-parameter. The simulation experiments in this study implied that DA hyper-parameters needed to be carefully optimized to exploit the potential of DA methods.

A study on the derivation and evaluation of flow duration curve (FDC) using deep learning with a long short-term memory (LSTM) networks and soil water assessment tool (SWAT) (LSTM Networks 딥러닝 기법과 SWAT을 이용한 유량지속곡선 도출 및 평가)

  • Choi, Jung-Ryel;An, Sung-Wook;Choi, Jin-Young;Kim, Byung-Sik
    • Journal of Korea Water Resources Association
    • /
    • v.54 no.spc1
    • /
    • pp.1107-1118
    • /
    • 2021
  • Climate change brought on by global warming increased the frequency of flood and drought on the Korean Peninsula, along with the casualties and physical damage resulting therefrom. Preparation and response to these water disasters requires national-level planning for water resource management. In addition, watershed-level management of water resources requires flow duration curves (FDC) derived from continuous data based on long-term observations. Traditionally, in water resource studies, physical rainfall-runoff models are widely used to generate duration curves. However, a number of recent studies explored the use of data-based deep learning techniques for runoff prediction. Physical models produce hydraulically and hydrologically reliable results. However, these models require a high level of understanding and may also take longer to operate. On the other hand, data-based deep-learning techniques offer the benefit if less input data requirement and shorter operation time. However, the relationship between input and output data is processed in a black box, making it impossible to consider hydraulic and hydrological characteristics. This study chose one from each category. For the physical model, this study calculated long-term data without missing data using parameter calibration of the Soil Water Assessment Tool (SWAT), a physical model tested for its applicability in Korea and other countries. The data was used as training data for the Long Short-Term Memory (LSTM) data-based deep learning technique. An anlysis of the time-series data fond that, during the calibration period (2017-18), the Nash-Sutcliffe Efficiency (NSE) and the determinanation coefficient for fit comparison were high at 0.04 and 0.03, respectively, indicating that the SWAT results are superior to the LSTM results. In addition, the annual time-series data from the models were sorted in the descending order, and the resulting flow duration curves were compared with the duration curves based on the observed flow, and the NSE for the SWAT and the LSTM models were 0.95 and 0.91, respectively, and the determination coefficients were 0.96 and 0.92, respectively. The findings indicate that both models yield good performance. Even though the LSTM requires improved simulation accuracy in the low flow sections, the LSTM appears to be widely applicable to calculating flow duration curves for large basins that require longer time for model development and operation due to vast data input, and non-measured basins with insufficient input data.

Development of Deep-Learning-Based Models for Predicting Groundwater Levels in the Middle-Jeju Watershed, Jeju Island (딥러닝 기법을 이용한 제주도 중제주수역 지하수위 예측 모델개발)

  • Park, Jaesung;Jeong, Jiho;Jeong, Jina;Kim, Ki-Hong;Shin, Jaehyeon;Lee, Dongyeop;Jeong, Saebom
    • The Journal of Engineering Geology
    • /
    • v.32 no.4
    • /
    • pp.697-723
    • /
    • 2022
  • Data-driven models to predict groundwater levels 30 days in advance were developed for 12 groundwater monitoring stations in the middle-Jeju watershed, Jeju Island. Stacked long short-term memory (stacked-LSTM), a deep learning technique suitable for time series forecasting, was used for model development. Daily time series data from 2001 to 2022 for precipitation, groundwater usage amount, and groundwater level were considered. Various models were proposed that used different combinations of the input data types and varying lengths of previous time series data for each input variable. A general procedure for deep-learning-based model development is suggested based on consideration of the comparative validation results of the tested models. A model using precipitation, groundwater usage amount, and previous groundwater level data as input variables outperformed any model neglecting one or more of these data categories. Using extended sequences of these past data improved the predictions, possibly owing to the long delay time between precipitation and groundwater recharge, which results from the deep groundwater level in Jeju Island. However, limiting the range of considered groundwater usage data that significantly affected the groundwater level fluctuation (rather than using all the groundwater usage data) improved the performance of the predictive model. The developed models can predict the future groundwater level based on the current amount of precipitation and groundwater use. Therefore, the models provide information on the soundness of the aquifer system, which will help to prepare management plans to maintain appropriate groundwater quantities.

Prediction of Acer pictum subsp. mono Distribution using Bioclimatic Predictor Based on SSP Scenario Detailed Data (SSP 시나리오 상세화 자료 기반 생태기후지수를 활용한 고로쇠나무 분포 예측)

  • Kim, Whee-Moon;Kim, Chaeyoung;Cho, Jaepil;Hur, Jina;Song, Wonkyong
    • Ecology and Resilient Infrastructure
    • /
    • v.9 no.3
    • /
    • pp.163-173
    • /
    • 2022
  • Climate change is a key factor that greatly influences changes in the biological seasons and geographical distribution of species. In the ecological field, the BioClimatic predictor (BioClim), which is most related to the physiological characteristics of organisms, is used for vulnerability assessment. However, BioClim values are not provided other than the future period climate average values for each GCM for the Shared Socio-economic Pathways (SSPs) scenario. In this study, BioClim data suitable for domestic conditions was produced using 1 km resolution SSPs scenario detailed data produced by Rural Development Administration, and based on the data, a species distribution model was applied to mainly grow in southern, Gyeongsangbuk-do, Gangwon-do and humid regions. Appropriate habitat distributions were predicted every 30 years for the base years (1981 - 2010) and future years (2011 - 2100) of the Acer pictum subsp. mono. Acer pictum subsp. mono appearance data were collected from a total of 819 points through the national natural environment survey data. In order to improve the performance of the MaxEnt model, the parameters of the model (LQH-1.5) were optimized, and 7 detailed biolicm indices and 5 topographical indices were applied to the MaxEnt model. Drainage, Annual Precipitation (Bio12), and Slope significantly contributed to the distribution of Acer pictum subsp. mono in Korea. As a result of reflecting the growth characteristics that favor moist and fertile soil, the influence of climatic factors was not significant. Accordingly, in the base year, the suitable habitat for a high level of Acer pictum subsp. mono is 3.41% of the area of Korea, and in the near future (2011 - 2040) and far future (2071 - 2100), SSP1-2.6 accounts for 0.01% and 0.02%, gradually decreasing. However, in SSP5-8.5, it was 0.01% and 0.72%, respectively, showing a tendency to decrease in the near future compared to the base year, but to gradually increase toward the far future. This study confirms the future distribution of vegetation that is more easily adapted to climate change, and has significance as a basic study that can be used for future forest restoration of climate change-adapted species.

Prediction of Key Variables Affecting NBA Playoffs Advancement: Focusing on 3 Points and Turnover Features (미국 프로농구(NBA)의 플레이오프 진출에 영향을 미치는 주요 변수 예측: 3점과 턴오버 속성을 중심으로)

  • An, Sehwan;Kim, Youngmin
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.263-286
    • /
    • 2022
  • This study acquires NBA statistical information for a total of 32 years from 1990 to 2022 using web crawling, observes variables of interest through exploratory data analysis, and generates related derived variables. Unused variables were removed through a purification process on the input data, and correlation analysis, t-test, and ANOVA were performed on the remaining variables. For the variable of interest, the difference in the mean between the groups that advanced to the playoffs and did not advance to the playoffs was tested, and then to compensate for this, the average difference between the three groups (higher/middle/lower) based on ranking was reconfirmed. Of the input data, only this year's season data was used as a test set, and 5-fold cross-validation was performed by dividing the training set and the validation set for model training. The overfitting problem was solved by comparing the cross-validation result and the final analysis result using the test set to confirm that there was no difference in the performance matrix. Because the quality level of the raw data is high and the statistical assumptions are satisfied, most of the models showed good results despite the small data set. This study not only predicts NBA game results or classifies whether or not to advance to the playoffs using machine learning, but also examines whether the variables of interest are included in the major variables with high importance by understanding the importance of input attribute. Through the visualization of SHAP value, it was possible to overcome the limitation that could not be interpreted only with the result of feature importance, and to compensate for the lack of consistency in the importance calculation in the process of entering/removing variables. It was found that a number of variables related to three points and errors classified as subjects of interest in this study were included in the major variables affecting advancing to the playoffs in the NBA. Although this study is similar in that it includes topics such as match results, playoffs, and championship predictions, which have been dealt with in the existing sports data analysis field, and comparatively analyzed several machine learning models for analysis, there is a difference in that the interest features are set in advance and statistically verified, so that it is compared with the machine learning analysis result. Also, it was differentiated from existing studies by presenting explanatory visualization results using SHAP, one of the XAI models.

State of Health and State of Charge Estimation of Li-ion Battery for Construction Equipment based on Dual Extended Kalman Filter (이중확장칼만필터(DEKF)를 기반한 건설장비용 리튬이온전지의 State of Charge(SOC) 및 State of Health(SOH) 추정)

  • Hong-Ryun Jung;Jun Ho Kim;Seung Woo Kim;Jong Hoon Kim;Eun Jin Kang;Jeong Woo Yun
    • Journal of the Microelectronics and Packaging Society
    • /
    • v.31 no.1
    • /
    • pp.16-22
    • /
    • 2024
  • Along with the high interest in electric vehicles and new renewable energy, there is a growing demand to apply lithium-ion batteries in the construction equipment industry. The capacity of heavy construction equipment that performs various tasks at construction sites is rapidly decreasing. Therefore, it is essential to accurately predict the state of batteries such as SOC (State of Charge) and SOH (State of Health). In this paper, the errors between actual electrochemical measurement data and estimated data were compared using the Dual Extended Kalman Filter (DEKF) algorithm that can estimate SOC and SOH at the same time. The prediction of battery charge state was analyzed by measuring OCV at SOC 5% intervals under 0.2C-rate conditions after the battery cell was fully charged, and the degradation state of the battery was predicted after 50 cycles of aging tests under various C-rate (0.2, 0.3, 0.5, 1.0, 1.5C rate) conditions. It was confirmed that the SOC and SOH estimation errors using DEKF tended to increase as the C-rate increased. It was confirmed that the SOC estimation using DEKF showed less than 6% at 0.2, 0.5, and 1C-rate. In addition, it was confirmed that the SOH estimation results showed good performance within the maximum error of 1.0% and 1.3% at 0.2 and 0.3C-rate, respectively. Also, it was confirmed that the estimation error also increased from 1.5% to 2% as the C-rate increased from 0.5 to 1.5C-rate. However, this result shows that all SOH estimation results using DEKF were excellent within about 2%.

Content-based Recommendation Based on Social Network for Personalized News Services (개인화된 뉴스 서비스를 위한 소셜 네트워크 기반의 콘텐츠 추천기법)

  • Hong, Myung-Duk;Oh, Kyeong-Jin;Ga, Myung-Hyun;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.57-71
    • /
    • 2013
  • Over a billion people in the world generate new news minute by minute. People forecasts some news but most news are from unexpected events such as natural disasters, accidents, crimes. People spend much time to watch a huge amount of news delivered from many media because they want to understand what is happening now, to predict what might happen in the near future, and to share and discuss on the news. People make better daily decisions through watching and obtaining useful information from news they saw. However, it is difficult that people choose news suitable to them and obtain useful information from the news because there are so many news media such as portal sites, broadcasters, and most news articles consist of gossipy news and breaking news. User interest changes over time and many people have no interest in outdated news. From this fact, applying users' recent interest to personalized news service is also required in news service. It means that personalized news service should dynamically manage user profiles. In this paper, a content-based news recommendation system is proposed to provide the personalized news service. For a personalized service, user's personal information is requisitely required. Social network service is used to extract user information for personalization service. The proposed system constructs dynamic user profile based on recent user information of Facebook, which is one of social network services. User information contains personal information, recent articles, and Facebook Page information. Facebook Pages are used for businesses, organizations and brands to share their contents and connect with people. Facebook users can add Facebook Page to specify their interest in the Page. The proposed system uses this Page information to create user profile, and to match user preferences to news topics. However, some Pages are not directly matched to news topic because Page deals with individual objects and do not provide topic information suitable to news. Freebase, which is a large collaborative database of well-known people, places, things, is used to match Page to news topic by using hierarchy information of its objects. By using recent Page information and articles of Facebook users, the proposed systems can own dynamic user profile. The generated user profile is used to measure user preferences on news. To generate news profile, news category predefined by news media is used and keywords of news articles are extracted after analysis of news contents including title, category, and scripts. TF-IDF technique, which reflects how important a word is to a document in a corpus, is used to identify keywords of each news article. For user profile and news profile, same format is used to efficiently measure similarity between user preferences and news. The proposed system calculates all similarity values between user profiles and news profiles. Existing methods of similarity calculation in vector space model do not cover synonym, hypernym and hyponym because they only handle given words in vector space model. The proposed system applies WordNet to similarity calculation to overcome the limitation. Top-N news articles, which have high similarity value for a target user, are recommended to the user. To evaluate the proposed news recommendation system, user profiles are generated using Facebook account with participants consent, and we implement a Web crawler to extract news information from PBS, which is non-profit public broadcasting television network in the United States, and construct news profiles. We compare the performance of the proposed method with that of benchmark algorithms. One is a traditional method based on TF-IDF. Another is 6Sub-Vectors method that divides the points to get keywords into six parts. Experimental results demonstrate that the proposed system provide useful news to users by applying user's social network information and WordNet functions, in terms of prediction error of recommended news.

A Study on Intelligent Value Chain Network System based on Firms' Information (기업정보 기반 지능형 밸류체인 네트워크 시스템에 관한 연구)

  • Sung, Tae-Eung;Kim, Kang-Hoe;Moon, Young-Su;Lee, Ho-Shin
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.67-88
    • /
    • 2018
  • Until recently, as we recognize the significance of sustainable growth and competitiveness of small-and-medium sized enterprises (SMEs), governmental support for tangible resources such as R&D, manpower, funds, etc. has been mainly provided. However, it is also true that the inefficiency of support systems such as underestimated or redundant support has been raised because there exist conflicting policies in terms of appropriateness, effectiveness and efficiency of business support. From the perspective of the government or a company, we believe that due to limited resources of SMEs technology development and capacity enhancement through collaboration with external sources is the basis for creating competitive advantage for companies, and also emphasize value creation activities for it. This is why value chain network analysis is necessary in order to analyze inter-company deal relationships from a series of value chains and visualize results through establishing knowledge ecosystems at the corporate level. There exist Technology Opportunity Discovery (TOD) system that provides information on relevant products or technology status of companies with patents through retrievals over patent, product, or company name, CRETOP and KISLINE which both allow to view company (financial) information and credit information, but there exists no online system that provides a list of similar (competitive) companies based on the analysis of value chain network or information on potential clients or demanders that can have business deals in future. Therefore, we focus on the "Value Chain Network System (VCNS)", a support partner for planning the corporate business strategy developed and managed by KISTI, and investigate the types of embedded network-based analysis modules, databases (D/Bs) to support them, and how to utilize the system efficiently. Further we explore the function of network visualization in intelligent value chain analysis system which becomes the core information to understand industrial structure ystem and to develop a company's new product development. In order for a company to have the competitive superiority over other companies, it is necessary to identify who are the competitors with patents or products currently being produced, and searching for similar companies or competitors by each type of industry is the key to securing competitiveness in the commercialization of the target company. In addition, transaction information, which becomes business activity between companies, plays an important role in providing information regarding potential customers when both parties enter similar fields together. Identifying a competitor at the enterprise or industry level by using a network map based on such inter-company sales information can be implemented as a core module of value chain analysis. The Value Chain Network System (VCNS) combines the concepts of value chain and industrial structure analysis with corporate information simply collected to date, so that it can grasp not only the market competition situation of individual companies but also the value chain relationship of a specific industry. Especially, it can be useful as an information analysis tool at the corporate level such as identification of industry structure, identification of competitor trends, analysis of competitors, locating suppliers (sellers) and demanders (buyers), industry trends by item, finding promising items, finding new entrants, finding core companies and items by value chain, and recognizing the patents with corresponding companies, etc. In addition, based on the objectivity and reliability of the analysis results from transaction deals information and financial data, it is expected that value chain network system will be utilized for various purposes such as information support for business evaluation, R&D decision support and mid-term or short-term demand forecasting, in particular to more than 15,000 member companies in Korea, employees in R&D service sectors government-funded research institutes and public organizations. In order to strengthen business competitiveness of companies, technology, patent and market information have been provided so far mainly by government agencies and private research-and-development service companies. This service has been presented in frames of patent analysis (mainly for rating, quantitative analysis) or market analysis (for market prediction and demand forecasting based on market reports). However, there was a limitation to solving the lack of information, which is one of the difficulties that firms in Korea often face in the stage of commercialization. In particular, it is much more difficult to obtain information about competitors and potential candidates. In this study, the real-time value chain analysis and visualization service module based on the proposed network map and the data in hands is compared with the expected market share, estimated sales volume, contact information (which implies potential suppliers for raw material / parts, and potential demanders for complete products / modules). In future research, we intend to carry out the in-depth research for further investigating the indices of competitive factors through participation of research subjects and newly developing competitive indices for competitors or substitute items, and to additively promoting with data mining techniques and algorithms for improving the performance of VCNS.