• Title/Summary/Keyword: Prediction Model

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong; Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • With recent advances in deep learning, research on unstructured data analysis has been actively conducted and has shown remarkable results in fields such as classification, summarization, and generation. Among text analysis tasks, text classification is the most widely used in academia and industry. Text classification includes binary classification, which assigns one label out of two classes; multi-class classification, which assigns one label out of several classes; and multi-label classification, which assigns multiple labels out of several classes. Multi-label classification requires a different training method from binary and multi-class classification because each instance can carry multiple labels. In addition, because the number of labels to be predicted grows with the number of labels and classes, prediction becomes harder and performance is difficult to improve. To overcome these limitations, research on label embedding is being actively conducted. Label embedding (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed labels, and (iii) restores the predicted labels to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, because these techniques consider only linear relationships between labels or compress the labels by random transformation, they cannot capture non-linear relationships between labels and therefore cannot create a latent label space that sufficiently retains the information of the original labels.
Recently, there have been increasing attempts to improve performance by applying deep learning to label embedding. A representative approach is label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding loses a large amount of information when compressing a high-dimensional label space with a very large number of classes into a low-dimensional latent label space. This loss can be traced to the vanishing gradient problem that occurs during backpropagation. To solve this problem, the skip connection was devised: by adding the input of a layer to its output, it keeps gradients from vanishing during backpropagation, so efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies that apply them to autoencoders or to the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. The proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using these, we conducted an experiment that predicts the compressed keyword vector in the latent label space from the paper abstract and evaluates multi-label classification by restoring the predicted keyword vector to the original label space. As a result, multi-label classification based on the proposed methodology far outperformed traditional multi-label classification methods in accuracy, precision, recall, and F1 score.
This indicates that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance across domain characteristics and the number of dimensions of the latent label space.
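
The three label-embedding steps described in this abstract (compress, predict, restore) can be illustrated with a minimal linear sketch in the style of PLST, which relies on a singular value decomposition of the label matrix. This is not the skip-connection autoencoder proposed in the paper. The toy label matrix, the function names, and the 0.5 rounding threshold are assumptions made for illustration, and step (ii) is skipped by decoding the true latent codes directly.

```python
import numpy as np

def fit_label_embedding(Y, k):
    """Learn a k-dimensional linear label embedding from a binary label matrix Y (n x L)."""
    mean = Y.mean(axis=0)
    # Right singular vectors of the centered label matrix span the principal label directions.
    _, _, vt = np.linalg.svd(Y - mean, full_matrices=False)
    return mean, vt[:k]  # projection matrix of shape (k, L)

def encode(Y, mean, P):
    """Step (i): compress labels into the low-dimensional latent label space."""
    return (Y - mean) @ P.T

def decode(Z, mean, P, threshold=0.5):
    """Step (iii): restore latent codes to the original label space and round to 0/1."""
    return ((Z @ P + mean) >= threshold).astype(int)

# Hypothetical toy data: 6 documents, 4 labels.
# Label 1 always co-occurs with label 0, and label 3 with label 2,
# so two latent dimensions are enough to describe all four labels.
Y = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
])

mean, P = fit_label_embedding(Y, k=2)
Z = encode(Y, mean, P)           # in step (ii) a model would be trained to predict Z from text
Y_restored = decode(Z, mean, P)  # here the true codes are decoded, so the labels come back intact
```

Because a linear projection like this can only exploit linear co-occurrence between labels, it also shows the limitation the abstract attributes to PLST-type methods.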

Development of Stand Yield Table Based on Current Growth Characteristics of Chamaecyparis obtusa Stands (현실임분 생장특성에 의한 편백 임분수확표 개발)

  • Jung, Su Young; Lee, Kwang Soo; Lee, Ho Sang; Ji Bae, Eun; Park, Jun Hyung; Ko, Chi-Ung
    • Journal of Korean Society of Forest Science / v.109 no.4 / pp.477-483 / 2020
  • We constructed a stand yield table for Chamaecyparis obtusa based on data from actual forest stands. The previous stand yield table had a number of disadvantages because it did not adequately reflect actual forest conditions. In the present study, we used data from more than 200 sampling plots in Chamaecyparis obtusa stands. The analysis included the estimation, recovery, and prediction of the distribution of diameter at breast height (DBH), a key step in preparing stand yield tables. The DBH distribution was modeled with a Weibull function, and the site index (base age: 30 years), the standard for assessing forest productivity, was derived using the Chapman-Richards formula. Several candidate estimation formulas for the stand yield table were compared by fitness index, and the optimal formula was chosen. The analysis shows that the site index of the Chamaecyparis obtusa stands ranges from 10 to 18. The estimated stand volume of each sample plot had an accuracy of 62%. In the residual analysis, the residuals were evenly distributed around zero, which indicates that the results are useful in the field. Comparing the table constructed in this study with the existing stand yield table, we found that our table gave comparatively higher growth values. This is probably because the existing table was based on a small amount of survey data that did not properly reflect actual stands. We hope that the stand yield table for Chamaecyparis obtusa, a representative species of the southern regions, will be widely used for forest management. As these forests stabilize and growth progresses, we plan to construct an additional yield table applicable to more developed stands.
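
For readers unfamiliar with the two functions named in this abstract, the sketch below shows their common textbook forms: a three-parameter Weibull distribution for DBH and a Chapman-Richards height curve anchored to the site index at base age 30. The parameter values and function names are hypothetical; they are not the coefficients fitted in the study, and the study's exact parameterization may differ.

```python
import math

def weibull_pdf(x, a, b, c):
    """Three-parameter Weibull density for x > a (a = location, b = scale, c = shape)."""
    if x <= a:
        return 0.0
    z = (x - a) / b
    return (c / b) * z ** (c - 1) * math.exp(-(z ** c))

def weibull_cdf(x, a, b, c):
    """Share of trees with DBH below x under the same Weibull distribution."""
    if x <= a:
        return 0.0
    return 1.0 - math.exp(-(((x - a) / b) ** c))

def dominant_height(age, site_index, b, c, base_age=30):
    """Chapman-Richards guide curve scaled so that height equals site_index at base_age."""
    return site_index * ((1 - math.exp(-b * age)) / (1 - math.exp(-b * base_age))) ** c

# Hypothetical Weibull parameters for a DBH distribution in cm.
a, b, c = 6.0, 14.0, 2.5
share_20_to_22 = weibull_cdf(22, a, b, c) - weibull_cdf(20, a, b, c)  # share of stems in the 20-22 cm class

# Hypothetical growth parameters; site index 14 lies inside the 10 to 18 range reported above.
h30 = dominant_height(30, site_index=14, b=0.05, c=1.3)  # equals the site index by construction
h40 = dominant_height(40, site_index=14, b=0.05, c=1.3)
```

Class shares such as `share_20_to_22`, multiplied by stems per hectare, give the per-class stem counts that a yield table typically builds on.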

A Machine Learning-based Total Production Time Prediction Method for Customized-Manufacturing Companies (주문생산 기업을 위한 기계학습 기반 총생산시간 예측 기법)

  • Park, Do-Myung; Choi, HyungRim; Park, Byung-Kwon
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.177-190 / 2021
  • With the development of fourth industrial revolution technologies, efforts are being made to use artificial intelligence techniques such as machine learning to improve areas that humans cannot handle. Make-to-order (customized-manufacturing) companies also want to reduce business risks such as delivery delays by predicting the total production time of orders, but they have difficulty doing so because the total production time differs for every order. The Theory of Constraints (TOC) was developed to find the least efficient areas in order to increase order throughput and reduce total order cost, but it does not provide a forecast of total production time. Because production varies from order to order to meet diverse customer needs, the total production time of an individual order can be measured after the fact but is difficult to predict in advance. The measured total production times of past orders also differ from one another, so they cannot be used as standard times. As a result, experienced managers rely on intuition rather than on the system, while inexperienced managers use simple management indicators (e.g., 60 days total production time for raw materials, 90 days total production time for steel plates). Work orders issued too early on the basis of imperfect intuition or indicators cause congestion, which degrades productivity, and work orders issued too late lead to increased production costs from emergency processing or to missed delivery dates. Missing the delivery date results in compensation for the delay or adversely affects business and payment collection. To address these problems, this study seeks a machine learning model that estimates the total production time of new orders for a company operating a make-to-order production system. Order, production, and process performance data were used as the training data.
We compared OLS, GLM Gamma, Extra Trees, and Random Forest as candidate algorithms for estimating total production time and present the results.

Prediction Study on Major Movement Paths of Otters in the Ansim-wetland Using EN-Simulator (EN-Simulator를 활용한 안심습지 일원 수달의 주요 이동경로 예측 연구)

  • Shin, Gee-Hoon; Seo, Bo-Yong; Rho, Paikho; Kim, Ji-Young; Han, Sung-Yong
    • Journal of Environmental Impact Assessment / v.30 no.1 / pp.13-23 / 2021
  • In this study, we performed a Random Walker analysis to predict the major movement paths of otters. The simulation covered a final range of 7.5 km radius centered on the Ansim wetland in Daegu City, and a field survey was used to verify the model. The number of virtual otters was set to 1,000, the number of moving steps was set to 1,000 steps per grid, and simulations were performed on a total of 841 grids. As a result of the analysis, an average of 147.6 individuals arrived at the boundary points under the condition of a 50 m interval. In the simulation verification, 8 points (13.1%) were found in the area where the movement probability was very high, and 9 points (14.8%) were found in the area where the movement probability was high. On the other hand, in areas with low movement probability, there were 8 points (13.1%) in low areas and 4 points (6.6%) in very low areas. The verification showed that in areas with high simulated values, the probability of actual otter appearance was particularly high. In addition, examining the relationship between otter appearance points and unit area for each movement-probability class showed that 6.8 traces per unit area were found in the area with the highest movement probability, whereas the value was 0.1 points in areas with low movement probability. In river areas where the simulated use of major movement paths was above the normal level, 23 traces (63.9%) were found, which indicates that otters actually inhabit the areas along the simulated major movement paths. The EN-Simulator analysis can predict how spatial properties affect the likelihood of major movement path selection, and the analytical values can be used to make use of additional habitats within the major movement paths.
The results can also serve as basic data, for example to identify roadkill risk areas in advance and prevent roadkill.
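
The simulation setup described in this abstract (many virtual walkers released from a grid cell, each moving a fixed number of steps, with arrivals at the boundary counted) can be sketched as a plain random walk. This only illustrates the general random-walker idea and is not the EN-Simulator. The unweighted four-direction moves, the function name, and the 29 x 29 layout (chosen only because 29 x 29 = 841) are assumptions, whereas the study relates movement likelihood to spatial properties.

```python
import random

def count_boundary_arrivals(start, grid_size, n_walkers=1000, n_steps=1000, seed=0):
    """Release n_walkers from `start` and count how many step onto the grid edge within n_steps."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    arrivals = 0
    for _ in range(n_walkers):
        x, y = start
        for _ in range(n_steps):
            dx, dy = rng.choice(moves)  # unweighted move: every direction is equally likely
            x, y = x + dx, y + dy
            if x in (0, grid_size - 1) or y in (0, grid_size - 1):
                arrivals += 1           # the walker reached a boundary cell, so stop tracking it
                break
    return arrivals

# Walkers released from the center of a hypothetical 29 x 29 grid.
arrivals = count_boundary_arrivals(start=(14, 14), grid_size=29)
```

Replacing `rng.choice(moves)` with a weighted choice based on land cover or distance to water would be one way to make spatial properties influence the paths.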

Numerical analysis of morphological changes by opening gates of Sejong Weir (보 개방에 의한 하도의 지형변화 과정 수치모의 분석(세종보를 중심으로))

  • Jang, Chang-Lae; Baek, Tae Hyo; Kang, Taeun; Ock, Giyoung
    • Journal of Korea Water Resources Association / v.54 no.8 / pp.629-641 / 2021
  • In this study, a two-dimensional numerical model (Nays2DH) was applied to analyze how the river channel bed changes morphologically in response to changes in flood discharge after fully opening the Sejong weir, which was constructed upstream of the Geum River. For this, numerical simulations were performed under assumed flow conditions: a non-uniform flow (NF), an unsteady flow with a single flood event (SF), and a continuous flood event (CF). For the SF and CF cases, a normalized hydrograph was calculated from real flood events, rescaled by the peak flow discharge of each scenario, and applied as the flow discharge at the upstream boundary. To quantitatively evaluate the morphological changes, we analyzed the changes over time in the bed deformation and the bed relief index (BRI), and we compared aerial photographs of the study area with the numerical simulation results. In the NF simulations, as the steady flow discharge increased, the ratio of lower width to depth decreased and the speed of bar migration increased. The BRI initially increased, but its rate of change decreased with time. In addition, the BRI increased as the steady flow discharge increased. In the SF case, the speed of bar migration decreased with the change in flow discharge. A time lag was also observed in the morphological response to the peak flood discharge. In other words, in the SF case, the change in the channel bed shows a phase lag with respect to the hydraulic condition. In the CF simulations, the speed of bar migration, which depends on the peak flood discharge, decreased exponentially despite the repeated floods, and, as in the SF case, a phase lag was observed.
The BRI increased with time, but its rate of increase was modest despite the continuous peak flooding. Through this study, the morphological changes arising from the hydrological characteristics of the river were analyzed numerically, and the methodology suggests that quantitative prediction of river bed change according to flow characteristics can be applied in the field.

Observation of Ice Gradient in Cheonji, Baekdu Mountain Using Modified U-Net from Landsat -5/-7/-8 Images (Landsat 위성 영상으로부터 Modified U-Net을 이용한 백두산 천지 얼음변화도 관측)

  • Lee, Eu-Ru; Lee, Ha-Seong; Park, Sun-Cheon; Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1691-1707 / 2022
  • Cheonji Lake, the caldera lake of Baekdu Mountain on the border between the Korean Peninsula and China, freezes and melts seasonally. There is a magma chamber beneath Cheonji, and variations in the magma chamber cause volcanic precursors such as changes in the temperature and water pressure of hot spring water. Consequently, there is an abnormal region in Cheonji where ice melts earlier than in other areas, freezes late even during the freezing period, and has a high water surface temperature. The abnormal region is a discharge area for hot spring water, and its ice gradient may be used to monitor volcanic activity. However, due to geographical, political, and spatial constraints, periodic observation of the abnormal region of Cheonji is limited. In this study, the degree of ice change in the abnormal region was quantified using Landsat -5/-7/-8 optical satellite images and a Modified U-Net regression model. The Visible and Near Infrared (VNIR) bands of 83 Landsat images that include the abnormal region, acquired from January 22, 1985 to December 8, 2020, were used. Using the relative spectral reflectance of water and ice in the VNIR bands, unique data were generated for quantitative monitoring of ice variability. To preserve as much information as possible from the visible and near-infrared bands, the ice gradient was estimated by applying the data to a U-Net with two encoders, achieving good prediction accuracy with a Root Mean Square Error (RMSE) of 140 and a correlation value of 0.9968. Because the ice change value can be obtained with high precision from Landsat images using the Modified U-Net, the approach may be used in the future as one method for monitoring Baekdu Mountain's volcanic activity, and a more specific volcano monitoring system can be built.
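
The two accuracy figures quoted in this abstract, RMSE and a correlation value, are assumed here to be the usual root mean square error and the Pearson correlation coefficient. A minimal sketch of both metrics follows; the sample values are made up and unrelated to the study's data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical observed and predicted ice-gradient values.
observed = [0.0, 100.0, 200.0, 300.0]
predicted = [10.0, 90.0, 210.0, 290.0]

error = rmse(observed, predicted)             # every prediction is off by 10, so the RMSE is 10
correlation = pearson_r(observed, predicted)  # close to 1 because the two series move together
```

RMSE is expressed in the units of the predicted quantity, so a value such as 140 can only be judged against the value range of the target, while the correlation is unitless.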

A Study on Intelligent Skin Image Identification From Social media big data

  • Kim, Hyung-Hoon; Cho, Jeong-Ran
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.191-203 / 2022
  • In this paper, we developed a system that intelligently identifies skin image data from big data collected from the social media platform Instagram and extracts standardized skin sample data for skin condition diagnosis and management. The proposed system consists of a big data collection and analysis stage, a skin image analysis stage, a training data preparation stage, an artificial neural network training stage, and a skin image identification stage. In the big data collection and analysis stage, big data is collected from Instagram, and image information for skin condition diagnosis and management is stored as the analysis result. In the skin image analysis stage, the evaluation and analysis results of the skin image are obtained using traditional image processing techniques. In the training data preparation stage, the training data are prepared by extracting skin sample data from the skin image analysis results. In the artificial neural network training stage, an artificial neural network, AnnSampleSkin, that intelligently predicts the skin image type is built and trained on these data. In the skin image identification stage, skin samples are extracted from images collected from social media, and the image type predictions of the trained AnnSampleSkin network are integrated to identify the final skin image type. The proposed skin image identification method shows a high identification accuracy of about 92% or more and can provide standardized skin sample image big data. The extracted skin sample set is expected to serve as standardized skin image data that is efficient and useful for diagnosing and managing skin conditions.

Laboratory chamber test for prediction of hazardous ground conditions ahead of a TBM tunnel face using electrical resistivity survey (전기비저항 탐사 기반 TBM 터널 굴진면 전방 위험 지반 예측을 위한 실내 토조실험 연구)

  • Lee, JunHo; Kang, Minkyu; Lee, Hyobum; Choi, Hangseok
    • Journal of Korean Tunnelling and Underground Space Association / v.23 no.6 / pp.451-468 / 2021
  • Predicting hazardous ground conditions ahead of a TBM (Tunnel Boring Machine) tunnel face is essential for efficient and stable TBM advance. Although there have been several studies on the electrical resistivity survey method for TBM tunnelling, sufficient experimental data that account for TBM advance have not yet been established. Therefore, in this study, laboratory-scale model experiments simulating TBM excavation were carried out to analyze the applicability of an electrical resistivity survey for predicting hazardous ground conditions ahead of a TBM tunnel face. The trend of electrical resistivity during TBM advance was experimentally evaluated under various hazardous ground conditions (fault zone, seawater intruded zone, soil-to-rock transition zone, and rock-to-soil transition zone) ahead of the tunnel face. In the experiments, a scaled-down rock ground was built from granite blocks to simulate rock TBM tunnelling. The experimental data show that the electrical resistivity tends to decrease as the tunnel approaches the fault zone. While the seawater intruded zone follows a similar trend to the fault zone, the resistivity value of the seawater intruded zone decreased significantly compared to that of the fault zone. In the soil-to-rock transition zone, the electrical resistivity increases as the TBM approaches the rock, which has relatively high electrical resistivity. Conversely, in the rock-to-soil transition zone, the opposite trend was observed: the electrical resistivity decreases as the tunnel face approaches the soil, which has relatively low electrical resistivity. The experimental results indicate that hazardous ground conditions (fault zone, seawater intruded zone, soil-to-rock transition zone, and rock-to-soil transition zone) can be efficiently predicted by using an electrical resistivity survey during TBM tunnelling.

Development of 3D Impulse Calculation Technique for Falling Down of Trees (수목 도복의 3D 충격량 산출 기법 개발)

  • Kim, Chae-Won; Kim, Choong-Sik
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.2 / pp.1-11 / 2023
  • This study aimed to develop a technique for quantitatively and three-dimensionally predicting the potential failure zone and the impulse that may occur when trees fall down. The main outcomes are as follows. First, formulas for the potential failure zone and the impulse were established in order to quantify the risks generated when trees fall. The potential failure zone was estimated by multiplying the tree height by 1.5, reflecting the likelihood of trees falling down and slipping. With regard to the lean of a tree, a range of 360° centered on the root collar was set for trees that grow upright, and a range of 180° from the inclined direction was set for trees that grow inclined. The angular momentum was calculated from the rotational motion about the root collar when a tree falls, and the impulse was calculated by converting it into linear momentum. Second, a program to calculate the potential failure zone and impulse was developed using Rhino3D and Grasshopper. Three-dimensional models of the topography, buildings, and trees were created in Rhino3D and connected to Grasshopper to construct the spatial information. In the risk calculation stage, the algorithm was programmed from the calculation formulas. The calculation considered tree growth information such as height, inclination, and weight, as well as the surrounding environment, including adjacent trees, damage targets, and analysis ranges. In the risk inquiry stage, the calculation results were summarized and visualized as a three-dimensional model. For instance, risk levels were shown in different colors so that dangerous trees and dangerous areas could be identified efficiently.
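
The geometric rules stated in this abstract (a reach of 1.5 times the tree height, a 360° range for upright trees, and a 180° range for inclined trees) can be sketched in plan view as follows. The study's program was built in Rhino3D and Grasshopper and works in 3D. This 2D Python sketch, its function names, and the reading of the 180° range as ±90° around the lean direction are assumptions. The impulse formula is not reproduced because the abstract does not state it.

```python
import math

REACH_FACTOR = 1.5  # tree height is multiplied by 1.5, as stated in the abstract

def failure_zone_area(height, inclined):
    """Plan-view area of the potential failure zone: full circle if upright, half circle if inclined."""
    radius = REACH_FACTOR * height
    sweep_deg = 180.0 if inclined else 360.0
    return math.pi * radius ** 2 * sweep_deg / 360.0

def in_failure_zone(tree_xy, height, target_xy, lean_deg=None):
    """Check whether a target lies inside the zone; lean_deg=None means the tree grows upright."""
    dx = target_xy[0] - tree_xy[0]
    dy = target_xy[1] - tree_xy[1]
    if math.hypot(dx, dy) > REACH_FACTOR * height:
        return False   # beyond the 1.5 x height reach
    if lean_deg is None:
        return True    # upright tree: any direction within reach counts
    bearing = math.degrees(math.atan2(dy, dx))
    diff = (bearing - lean_deg + 180.0) % 360.0 - 180.0  # signed angle to the lean direction
    return abs(diff) <= 90.0  # assumed: the 180° range is centered on the lean direction

# Hypothetical 10 m tree at the origin, with a bench 12 m to the east.
upright_area = failure_zone_area(10.0, inclined=False)
inclined_area = failure_zone_area(10.0, inclined=True)
bench_at_risk = in_failure_zone((0.0, 0.0), 10.0, (12.0, 0.0))
```

A check like `in_failure_zone` could be run for every damage target around each tree to shortlist the tree and target pairs that need the more detailed impulse calculation.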

Success Factor in the K-Pop Music Industry: focusing on the mediated effect of Internet Memes (대중음악 흥행 요인에 대한 연구: 인터넷 밈(Internet Meme)의 매개효과를 중심으로)

  • YuJeong Sim; Minsoo Shin
    • Journal of Service Research and Studies / v.13 no.1 / pp.48-62 / 2023
  • As seen in the recent K-pop craze, the size and influence of the Korean music industry are growing. At least 6,000 songs are released each year in the Korean music market, but few can be called successful. Many studies and attempts are being made to identify the factors that make a song a hit. Commercial factors such as media exposure and promotion, as well as the quality of the music, play an important role in the commercial success of music. Recently, there have been many marketing campaigns using Internet memes in the pop music industry. Internet memes are activities or trends that spread among people as cultural units in various forms, such as images and videos. Depending on the Internet environment and the characteristics of digital communication, content is expanded and reproduced in the form of various memes, which draws a greater response from consumers. Previously, Internet memes arose naturally, but artists who are aware of their marketing effects have recently begun to use them as an element of marketing. In this paper, the mediated effect of Internet memes in relation to the success factors of popular music was analyzed, and a prediction model reflecting them was proposed. The analysis showed that the same factors had mediated effects through the 'cover effect' and the 'challenge effect'. Among the internal success factors, mediated effects were found for 'Singer Recognition' and the genres 'POP, Dance, Ballad, Trot and Electronica'; among the external success factors, mediated effects were found for 'Planning Company Capacity', 'The Number of Music Broadcasting Programs', and 'The Number of News Articles'. Predictive models reflecting the cover effect and the challenge effect showed F1-scores of 0.6889 and 0.7692, respectively.
This study is meaningful in that it collected and analyzed actual chart data, presented commercial directions that can be used in practice, and found that there are many success factors of popular music as well as mediating effects of Internet memes.
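
The F1-scores reported in this abstract are assumed to follow the standard definition, the harmonic mean of precision and recall. A minimal sketch with hypothetical confusion counts (not the study's data) shows how the score is obtained for a hit or non-hit prediction model.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return precision, recall, f1

# Hypothetical counts: 8 hits predicted correctly, 2 false alarms, 2 missed hits.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
```

Because F1 ignores true negatives, it is a common choice when successful songs are rare compared with unsuccessful ones.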