• Title/Summary/Keyword: Prediction#4

Evaluation of the quality of Italian Ryegrass Silages by Near Infrared Spectroscopy (근적외선 분광법을 이용한 이탈리안 라이그라스 사일리지의 품질 평가)

  • Park, Hyung-Soo;Lee, Sang-Hoon;Choi, Ki-Choon;Lim, Young-Chul;Kim, Jong-Gun;Jo, Kyu-Chea;Choi, Gi-Jun
    • Journal of The Korean Society of Grassland and Forage Science / v.32 no.3 / pp.301-308 / 2012
  • Near infrared reflectance spectroscopy (NIRS) is increasingly used as a rapid and accurate method of evaluating the chemical composition of forages. This study was carried out to explore the accuracy of NIRS for predicting the chemical parameters of Italian ryegrass silages. A population of 267 Italian ryegrass silages representing a wide range of chemical parameters and fermentative characteristics was used in this investigation. Silage samples were scanned in intact fresh condition at 2 nm intervals over the wavelength range 680~2,500 nm, and the optical data were recorded as log 1/Reflectance (log 1/R). The spectral data were regressed against a range of chemical parameters using partial least squares (PLS) multivariate analysis in conjunction with spectral math treatments to reduce the effect of extraneous noise. The optimum calibrations were selected on the basis of the highest coefficient of determination in cross validation ($R^2$) and the lowest standard error of cross validation (SECV). The results showed that NIRS predicted the chemical parameters with a very high degree of accuracy. The $R^2$ and SECV were 0.98 (SECV 1.27%) for moisture, 0.88 (SECV 1.26%) for ADF, 0.84 (SECV 2.0%), 0.93 (SECV 0.96%) for CP and 0.78 (SECV 0.56), 0.81 (SECV 0.31%), 0.88 (SECV 1.26%) and 0.82 (SECV 4.46) for pH, lactic acid, TDN and RFV on a dry matter basis (%), respectively. These results show that NIRS can be used to predict the chemical composition and fermentation quality of Italian ryegrass silages as a routine analysis method for feeding value evaluation and farmer advice.
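The two selection criteria named in this abstract, $R^2$ in cross validation and SECV, can be illustrated with a small sketch. It rests on several assumptions that do not come from the paper: a one-variable least-squares line stands in for the PLS calibration, cross validation is leave-one-out, SECV is taken as the root mean square of the cross-validation residuals, and the sample values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor (stand-in for PLS)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def secv(ys, preds):
    """Standard error of cross validation, assumed here to be the RMS residual."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))


def r2_cv(ys, preds):
    """Coefficient of determination computed from cross-validated predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: one spectral feature and reference moisture values (%).
absorbance = [0.42, 0.47, 0.51, 0.56, 0.60, 0.66]
moisture = [61.0, 64.5, 66.8, 70.1, 72.9, 76.0]

cv_preds = loo_predictions(absorbance, moisture)
print("SECV:", secv(moisture, cv_preds))
print("R2 (cross validation):", r2_cv(moisture, cv_preds))
```

A calibration would then be preferred when it combines a higher cross-validated $R^2$ with a lower SECV, which is the selection rule the abstract describes.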

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce provides customers with many advantageous purchase opportunities. In this situation, customers who do not have enough knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect users' preferences and provide recommendation lists to users. Thus, the product recommender system in online shopping stores is known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect users' preferences disappoint users and waste their time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting users' preferences precisely. The research data were collected from a real-world online shopping store that deals in products from famous art galleries and museums in Korea. The data initially contained 5,759 transaction records, of which 3,167 remained after deleting null data. We transform the categorical variables into dummy variables and exclude outliers. The proposed model consists of two steps. The first step predicts customers who have a high likelihood of purchasing products in the online shopping store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. We apply these data mining techniques using SAS E-Miner software. We partition the dataset into two sets, modeling and validation, for the logistic regression and decision trees. We also partition the dataset into three sets, training, test, and validation, for the artificial neural network model. The validation dataset is the same for all experiments.
Then we combine the results of each predictor using multi-model ensemble techniques such as bagging and bumping. Bagging is the abbreviation of "Bootstrap Aggregation"; it combines the outputs of several machine learning techniques to raise the performance and stability of prediction or classification. This technique is a special form of the averaging method. Bumping is the abbreviation of "Bootstrap Umbrella of Model Parameter," and it considers only the model with the lowest error value. The results show that bumping outperforms bagging and the other predictors except for the "Poster" product group. For the "Poster" product group, the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support associations at 5%, the maximum number of items in an association at 4, and the minimum confidence for rule generation at 10%. We also exclude extracted association rules with a lift value below 1. We finally obtain fifteen association rules after excluding duplicate rules. Among the fifteen association rules, eleven contain associations between products in the "Office Supplies" product group, one includes the association between the "Office Supplies" and "Fashion" product groups, and the other three contain associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we construct the prototype system using ASP, JavaScript, and Microsoft Access.
In addition, we survey user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The survey participants are 173 persons who use MSN Messenger, Daum Café, and P2P services. We evaluate user satisfaction using a five-point Likert scale. We also perform a paired-sample t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level. This means that users were significantly more satisfied with the recommended product list. The results also show that the proposed system may be useful in real-world online shopping stores.
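The rule filtering in the second step of this abstract (minimum support 5%, minimum confidence 10%, lift of at least 1) can be sketched as follows. The thresholds come from the abstract; the function names, the tiny transaction list, and the product names are hypothetical, and the sketch uses the usual textbook definitions of support, confidence, and lift rather than whatever SAS E-Miner computes internally.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    n_a = sum(1 for t in transactions if a <= t)          # baskets containing A
    n_c = sum(1 for t in transactions if c <= t)          # baskets containing C
    n_ac = sum(1 for t in transactions if (a | c) <= t)   # baskets containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def keep_rule(support, confidence, lift,
              min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds stated in the abstract."""
    return support >= min_support and confidence >= min_confidence and lift >= min_lift


# Hypothetical baskets; each transaction is a set of product names.
baskets = [
    {"pen", "notebook"},
    {"pen", "notebook", "scarf"},
    {"pen"},
    {"scarf"},
]

s, c, l = rule_metrics(baskets, {"pen"}, {"notebook"})
print(s, c, l, keep_rule(s, c, l))
```

A lift above 1 means the two items are bought together more often than independence would predict, which is why rules below that value are discarded.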

Study(V) on Development of Charts and Equations Predicting Allowable Compressive Bearing Capacity for Prebored PHC Piles Socketed into Weathered Rock through Sandy Soil Layers - Analysis of Results and Data by Parametric Numerical Analysis - (사질토를 지나 풍화암에 소켓된 매입 PHC말뚝에서 지반의 허용압축지지력 산정도표 및 산정공식 개발에 관한 연속 연구(V) - 매개변수 수치해석 자료 분석 -)

  • Park, Mincheol;Kwon, Oh-Kyun;Kim, Chae Min;Yun, Do Kyun;Choi, Yongkyu
    • Journal of the Korean Geotechnical Society / v.35 no.10 / pp.47-66 / 2019
  • A parametric numerical analysis according to the diameter, length, and N values of soil was conducted for PHC piles socketed into weathered rock through sandy soil layers. In the numerical analysis, the Mohr-Coulomb model was applied to the PHC pile and soils, and the contact surfaces among the pile, soil, and cement paste were modeled as interfaces with a virtual thickness. The parametric numerical analyses for 10 pile diameters were executed to obtain the load-settlement relationship and the axial load distribution according to N values. Load-settlement curves were obtained for each load component: total load, total skin friction, skin friction of the sandy soil layer, skin friction of the weathered rock layer, and end bearing resistance of the weathered rock. Analysis of various load levels from the load-settlement curves showed that the settlements corresponding to the inflection point of each curve were about 5~7% of each pile diameter, and they were conservatively estimated as 5% of each pile diameter. The load at the inflection point was defined as the mobilized bearing capacity ($Q_m$) and was used in the analyses of pile bearing capacity. The SRF was above 70% on average, irrespective of the diameter and embedment length of the pile and the N value of the sandy soil layer. Also, the skin frictional resistance of the sandy soil layers was above 80% of the total skin frictional resistance on average. These results can be used in calculating the bearing capacity of prebored PHC piles and in developing a bearing capacity prediction method and chart for prebored PHC piles socketed into weathered rock through sandy soil layers.
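The settlement criterion in this abstract, reading $Q_m$ off the load-settlement curve at a settlement of 5% of the pile diameter, can be sketched as below. The linear interpolation between curve points, the function names, and the example curve are assumptions made for illustration and are not taken from the paper.

```python
def load_at_settlement(settlements_mm, loads_kn, target_mm):
    """Linearly interpolate the load at a target settlement on a load-settlement curve."""
    points = list(zip(settlements_mm, loads_kn))
    for (s0, q0), (s1, q1) in zip(points, points[1:]):
        if s0 <= target_mm <= s1:
            if s1 == s0:
                return q0
            frac = (target_mm - s0) / (s1 - s0)
            return q0 + frac * (q1 - q0)
    raise ValueError("target settlement lies outside the curve")


def mobilized_capacity(diameter_mm, settlements_mm, loads_kn, ratio=0.05):
    """Load at a settlement of ratio * diameter (5% of diameter by default)."""
    return load_at_settlement(settlements_mm, loads_kn, ratio * diameter_mm)


# Hypothetical curve for a 500 mm pile: settlement (mm) against load (kN).
settlements = [0.0, 10.0, 20.0, 30.0, 40.0]
loads = [0.0, 1000.0, 1800.0, 2200.0, 2400.0]

print(mobilized_capacity(500.0, settlements, loads))
```

For a 500 mm pile the criterion corresponds to a settlement of 25 mm, so the sketch interpolates between the 20 mm and 30 mm points of the hypothetical curve.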

Analyzing the Characteristics of Atmospheric Stability from Radiosonde Observations in the Southern Coastal Region of the Korean Peninsula during the Summer of 2019 (라디오존데 고층관측자료를 활용한 한반도 남해안 지역의 2019년도 여름철 대기 안정도 특성 분석)

  • Shin, Seungsook;Hwang, Sung-Eun;Lee, Young-Tae;Kim, Byung-Taek;Kim, Ki-Hoon
    • Journal of the Korean earth science society / v.42 no.5 / pp.496-503 / 2021
  • By analyzing the characteristics of atmospheric stability in the southern coastal region of the Korean Peninsula in the summer of 2019, a quantitative threshold of atmospheric instability indices was derived for predicting rainfall events in the Korean Peninsula. For this analysis, we used data from all 243 radiosonde intensive observations recorded at the Boseong Standard Weather Observatory (BSWO) in the summer of 2019. To analyze the atmospheric stability of rain events and mesoscale atmospheric phenomena, convective available potential energy (CAPE) and storm relative helicity (SRH) were calculated and compared. In particular, SRH analysis was divided into four levels based on the depth of the atmosphere (0-1, 0-3, 0-6, and 0-10 km). The rain events were categorized into three cases: that of no rain, that of 12 h before the rain, and that of rain. The results showed that SRH was more suitable than CAPE for the prediction of the rainfall events in Boseong during the summer of 2019, and that the rainfall events occurred when the 0-6 km SRH was 150 m$^2$ s$^{-2}$ or more, which is the same standard as that for a possible weak tornado. In addition, the results of the atmospheric stability analysis during the Changma, which is the rainy period in the Korean Peninsula during the summer and typhoon seasons, showed that the 0-6 km SRH was larger than the mean value of the 0-10 km SRH, whereas SRH generally increased as the depth of the atmosphere increased. Therefore, it can be said that the 0-6 km SRH was more effective in determining the rainfall events caused by typhoons in Boseong in the summer of 2019.
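As a rough illustration of how a layer SRH value could be obtained from a sounding and compared with the 150 m$^2$ s$^{-2}$ threshold reported in this abstract, the sketch below uses the common discrete layer-sum form of SRH. The abstract does not state how SRH or the storm motion was computed, so the formula choice, the storm-motion vector, and the wind profile are all assumptions; the profile is assumed to already span exactly the layer of interest (for example 0-6 km).

```python
def storm_relative_helicity(u, v, storm_u, storm_v):
    """Discrete SRH (m^2 s^-2) from wind components (m/s) at successive levels.

    Uses the layer sum
        SRH = sum_k [(u[k+1] - cx) * (v[k] - cy) - (u[k] - cx) * (v[k+1] - cy)]
    where (cx, cy) is the assumed storm motion.
    """
    srh = 0.0
    for k in range(len(u) - 1):
        srh += ((u[k + 1] - storm_u) * (v[k] - storm_v)
                - (u[k] - storm_u) * (v[k + 1] - storm_v))
    return srh


def meets_rain_threshold(srh, threshold=150.0):
    """True when SRH is at or above the threshold quoted in the abstract."""
    return srh >= threshold


# Hypothetical veering wind profile from the surface to the top of the layer.
u_wind = [0.0, 5.0, 10.0, 15.0]   # eastward component, m/s
v_wind = [10.0, 10.0, 5.0, 0.0]   # northward component, m/s
storm_motion = (5.0, 0.0)         # assumed storm motion, m/s

srh_value = storm_relative_helicity(u_wind, v_wind, *storm_motion)
print(srh_value, meets_rain_threshold(srh_value))
```

Winds that veer (turn clockwise) with height yield positive SRH in this convention, which is the situation the threshold refers to.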

Analysis of Ice Velocity Variations of Nansen Ice Shelf, East Antarctica, from 2000 to 2017 Using Landsat Multispectral Image Matching (Landsat 다중분광 영상정합을 이용한 동남극 난센 빙붕의 2000-2017년 흐름속도 변화 분석)

  • Han, Hyangsun;Lee, Choon-Ki
    • Korean Journal of Remote Sensing / v.34 no.6_2 / pp.1165-1178 / 2018
  • The collapse of an Antarctic ice shelf and changes in its flow velocity have the potential to reduce the restraining stress on the seaward flow of the Antarctic Ice Sheet, which can cause sea level rise. In this study, variations in ice velocity from 2000 to 2017 for the Nansen Ice Shelf in East Antarctica, which experienced a large-scale collapse in April 2016, were analyzed using Landsat-7 Enhanced Thematic Mapper Plus (ETM+) and Landsat-8 Operational Land Imager (OLI) images. To extract ice velocity, image matching based on orientation correlation was applied to image pairs of the blue, green, red, near-infrared, and panchromatic bands and the first principal component image of the Landsat multispectral data, and the results were combined. The Landsat multispectral image matching produced reliable ice velocities over an area at least 14% wider on the Nansen Ice Shelf than single-band (i.e., panchromatic) image matching. The ice velocities derived from the Landsat multispectral image matching have an error of $2.1\;m\;a^{-1}$ compared to in situ Global Positioning System (GPS) observation data. The region adjacent to the Drygalski Ice Tongue showed the fastest increase in ice velocity between 2000 and 2017. The ice velocity along the central flow line of the Nansen Ice Shelf was stable before 2010 (${\sim}228\;m\;a^{-1}$). In 2011-2012, when a rift began to develop near the ice front, the ice flow accelerated (${\sim}255\;m\;a^{-1}$), but the velocity was only about 11% faster than in 2010. Since 2014, when the massive rift had fully developed, the ice velocity of the upper region of the rift has slightly decreased (${\sim}225\;m\;a^{-1}$) and stabilized. This means that the development of the rift and the resulting collapse of the ice front had little effect on the ice velocity of the Nansen Ice Shelf.

Characteristics Analysis of Snow Particle Size Distribution in Gangwon Region according to Topography (지형에 따른 강원지역의 강설입자 크기 분포 특성 분석)

  • Bang, Wonbae;Kim, Kwonil;Yeom, Daejin;Cho, Su-jeong;Lee, Choeng-lyong;Lee, Daehyung;Ye, Bo-Young;Lee, GyuWon
    • Journal of the Korean earth science society / v.40 no.3 / pp.227-239 / 2019
  • Heavy snowfall events frequently occur in Gangwon province, and the snowfall amount varies significantly in space due to the complex terrain and topographical modulation of precipitation. Understanding the spatial characteristics of heavy snowfall and predicting it is particularly challenging during snowfall events under easterly winds. The easterly wind produces a significantly different atmospheric condition and hence different precipitation characteristics. In this study, we investigated the microphysical characteristics of snowfall on the windward and leeward sides of the Taebaek mountain range under easterly conditions. Two snowfall events under easterly winds were selected, and the snow particle size distributions (SSDs) were observed at four sites (two windward and two leeward) by PARSIVEL disdrometers. We compared the characteristic parameters of the SSDs from the leeward sites with those from the windward sites. The results show that the SSDs of the windward sites have a relatively wide distribution with many small snow particles compared to those of the leeward sites. This characteristic is clearly shown by the larger characteristic number concentration and characteristic diameter at the windward sites. The snowfall rate and ice water content of the windward sites are also larger than those of the leeward sites. The results indicate that the new generation of snowfall particles is dominant at the windward sites, which is likely due to orographic lifting. In addition, the windward sites show heavily aggregated particles at near-zero ground temperature, which is likely driven by the wet and warm conditions near the ocean.

Estimation of freeze damage risk according to developmental stage of fruit flower buds in spring (봄철 과수 꽃눈 발육 수준에 따른 저온해 위험도 산정)

  • Kim, Jin-Hee;Kim, Dae-jun;Kim, Soo-ock;Yun, Eun-jeong;Ju, Okjung;Park, Jong Sun;Shin, Yong Soon
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.1 / pp.55-64 / 2019
  • Flowering seasons can be advanced by climate change that causes abnormally warm winters. Such warm winters would increase the frequency of crop damage resulting from sudden occurrences of low temperature before and after the vegetative growth stages, e.g., the period from germination to flowering. The degree and pattern of freeze damage would differ with the development stage of each individual fruit tree, even within an orchard. A critical temperature, e.g., killing temperature, has been used to predict freeze damage under low-temperature conditions on the assumption that such damage is associated with the development stage of a fruit flower bud. However, it is challenging to apply the critical temperature to a region where spatial variation in temperature is considerably high. In the present study, a phenological model was used to estimate major bud development stages, which is useful for predicting regional risks of freeze damage. We also derived a linear function to calculate a probabilistic freeze risk in spring, which can quantitatively evaluate the risk level based solely on forecasted weather data. We calculated the dates of freeze damage occurrence and the spatial risk distribution by main production area by applying the spring freeze risk function to apple, peach, and pear crops in 2018. It was predicted that the most extensive low-temperature-associated freeze damage could have occurred on April 8. It was also found that the risk function was useful for identifying the main production areas where the greatest damage to a given crop could occur. These results suggest that freeze damage associated with low-temperature events could be reduced by providing early warnings that allow growers to respond to abnormal weather conditions on their farms.

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in various fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification with one label among two classes, multi-class classification with one label among several classes, and multi-label classification with multiple labels among several classes. In particular, multi-label classification requires a different training method from binary and multi-class classification because each instance can have multiple labels. In addition, since the number of labels to be predicted increases as the number of labels and classes increases, prediction becomes more difficult and performance is hard to improve. To overcome these limitations, research on label embedding is being actively conducted. Label embedding (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed labels, and (iii) restores the predicted labels to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, since these techniques consider only the linear relationship between labels or compress the labels by random transformation, they cannot capture the non-linear relationship between labels and therefore cannot create a latent label space that sufficiently contains the information of the original labels.
Recently, there have been increasing attempts to improve performance by applying deep learning technology to label embedding. A representative approach is label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding has a limitation in that a large amount of information is lost when a high-dimensional label space with a very large number of classes is compressed into a low-dimensional latent label space. This can be traced to the gradient loss problem that occurs during backpropagation in training. To solve this problem, the skip connection was devised: by adding the input of a layer to its output to prevent gradient loss during backpropagation, efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies using skip connections in autoencoders or in the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. In addition, the proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using these, we conducted an experiment to predict the compressed keyword vector in the latent label space from the paper abstract and to evaluate the multi-label classification by restoring the predicted keyword vector to the original label space. As a result, in terms of the accuracy, precision, recall, and F1 score used as performance indicators, multi-label classification based on the proposed methodology performed far better than traditional multi-label classification methods.
This indicates that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of the multi-label classification itself. In addition, the utility of the proposed methodology was assessed by comparing its performance according to the domain characteristics and the number of dimensions of the latent label space.
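The skip connection described in this abstract, adding a layer's input to its output, can be shown with a minimal forward pass. This is a generic residual block written in plain Python, not the architecture proposed in the paper: the ReLU activation, the square weight matrix (needed so input and output have the same length), and the example label vector are all assumptions.

```python
def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, xi) for xi in x]


def residual_block(x, weights, bias):
    """Skip connection: output = relu(W x + b) + x (W must be square)."""
    hidden = relu(dense(x, weights, bias))
    return [h + xi for h, xi in zip(hidden, x)]


# Hypothetical multi-hot label vector with three labels.
labels = [1.0, 0.0, 1.0]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zeros = [[0.0] * 3 for _ in range(3)]
no_bias = [0.0, 0.0, 0.0]

# With all-zero weights the layer contributes nothing, yet the input still
# passes through unchanged via the skip path.
print(residual_block(labels, zeros, no_bias))
print(residual_block(labels, identity, no_bias))
```

Because the input always reaches the output through the addition, the gradient has a direct path back to earlier layers, which is the property the abstract relies on to limit information loss.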

Kriging of Daily PM10 Concentration from the Air Korea Stations Nationwide and the Accuracy Assessment (베리오그램 최적화 기반의 정규크리깅을 이용한 전국 에어코리아 PM10 자료의 일평균 격자지도화 및 내삽정확도 검증)

  • Jeong, Yemin;Cho, Subin;Youn, Youjeong;Kim, Seoyeon;Kim, Geunah;Kang, Jonggu;Lee, Dalgeun;Chung, Euk;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.379-394 / 2021
  • Air pollution data in South Korea have been provided on a real-time basis by Air Korea stations since 2005. Previous studies have shown the feasibility of gridding air pollution data, but they were confined to a few cities. This paper examines the creation of nationwide gridded maps of PM10 concentration using 333 Air Korea stations with variogram optimization and ordinary kriging. The accuracy of the spatial interpolation was evaluated with various sampling schemes to avoid a too dense or too sparse distribution of the validation points. Using 114,745 matchups, a four-round blind test was conducted by extracting random validation points for each of the 365 days in 2019. The overall accuracy was stably high, with an MAE of 5.697 ㎍/m$^3$ and a CC of 0.947. Approximately 1,500 cases of high PM10 concentration also showed an MAE of about 12 ㎍/m$^3$ and a CC over 0.87, which means that the proposed method was effective and applicable to various situations. The gridded maps of daily PM10 concentration at a resolution of 0.05° also showed a reasonable spatial distribution, and they can be used as an input variable for a gridded prediction of the next day's PM10 concentration.
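The two accuracy measures reported in this abstract, mean absolute error (MAE) and correlation coefficient (CC), can be computed from matched observed and interpolated values as sketched below. CC is assumed here to be the Pearson correlation coefficient, and the PM10 values are hypothetical.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_cc(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical matchups: station PM10 and kriged PM10 at the same points.
station_pm10 = [10.0, 20.0, 30.0, 40.0]
kriged_pm10 = [12.0, 18.0, 33.0, 39.0]

print("MAE:", mean_absolute_error(station_pm10, kriged_pm10))
print("CC:", pearson_cc(station_pm10, kriged_pm10))
```

In a blind test of the kind described, the held-out stations supply the observed values and the kriged surface supplies the predictions at those locations.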

A Machine Learning-based Total Production Time Prediction Method for Customized-Manufacturing Companies (주문생산 기업을 위한 기계학습 기반 총생산시간 예측 기법)

  • Park, Do-Myung;Choi, HyungRim;Park, Byung-Kwon
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.177-190 / 2021
  • With the development of fourth industrial revolution technologies, efforts are being made to improve areas that humans cannot handle by utilizing artificial intelligence techniques such as machine learning. Make-to-order production companies also want to reduce corporate risks such as delivery delays by predicting the total production time of orders, but they have difficulty doing so because the total production time differs for every order. The Theory of Constraints (TOC) was developed to find the least efficient areas in order to increase order throughput and reduce total order cost, but it does not provide a forecast of total production time. Because production varies from order to order according to customer needs, the total production time of an individual order can be measured after the fact but is difficult to predict in advance. The measured total production times of existing orders also differ from one another, so they cannot be used as standard times. As a result, experienced managers rely on intuition rather than on the system, while inexperienced managers use simple management indicators (e.g., 60 days total production time for raw materials, 90 days total production time for steel plates, etc.). Work instructions issued too early on the basis of intuition or such indicators cause congestion, which degrades productivity, and instructions issued too late lead to increased production costs or missed delivery dates due to emergency processing. Failure to meet the deadline results in compensation for the delay or adversely affects sales and collections. In this study, to address these problems, we seek a machine learning model that estimates the total production time of new orders for a company that operates a make-to-order production system. Order, production, and process performance data are used as the data for machine learning.
We compared and analyzed the OLS, GLM Gamma, Extra Trees, and Random Forest algorithms as candidates for estimating total production time, and we present the results.