• Title/Summary/Keyword: Prediction modeling

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce gives customers many advantageous purchase opportunities. In this situation, customers who do not know enough about the products they are buying may accept product recommendations. Product recommender systems automatically reflect a user's preferences and provide a recommendation list to the user. Thus, the product recommender system in an online shopping store has become known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect a user's preferences cause disappointment and waste the user's time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting the user's preferences precisely. The research data were collected from a real-world online shopping store that sells products from famous art galleries and museums in Korea. The data initially contained 5,759 transactions, of which 3,167 remained after records with null values were deleted. We transformed the categorical variables into dummy variables and excluded outliers. The proposed model consists of two steps. The first step predicts the customers who are highly likely to purchase products in the online shopping store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict the customers who are highly likely to purchase products in each product group. We ran these data mining techniques in SAS E-Miner. We partitioned the data into modeling and validation sets for logistic regression and decision trees, and into training, test, and validation sets for the artificial neural network model. The validation set is the same for all experiments. We then combine the results of each predictor using multi-model ensemble techniques, namely bagging and bumping. Bagging is short for "Bootstrap Aggregation"; it combines the outputs of several machine learning techniques to raise the performance and stability of prediction or classification, and it is a special form of the averaging method. Bumping is short for "Bootstrap Umbrella of Model Parameter"; it considers only the model with the lowest error value. The results show that bumping outperforms bagging and the other predictors except in the "Poster" product group, where the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extracted thirty-one association rules according to the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support an association at 5%, the maximum number of items in an association at 4, and the minimum confidence for rule generation at 10%. We also excluded extracted rules with a lift value below 1. After removing duplicate rules, fifteen association rules remained. Of these, eleven rules involve associations between products within the "Office Supplies" product group, one rule involves an association between the "Office Supplies" and "Fashion" product groups, and the other three involve associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We tested the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we built the prototype system using ASP, JavaScript, and Microsoft Access. In addition, we surveyed user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The survey participants were 173 people who use MSN Messenger, Daum Café, and P2P services. We evaluated user satisfaction on a five-point Likert scale and performed a paired-sample t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level, meaning that users were significantly more satisfied with the recommended product list. The results also suggest that the proposed system may be useful in real-world online shopping stores.
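
The difference between the two ensemble rules described in the abstract above can be illustrated with a minimal sketch. It assumes a single least-squares base learner and synthetic data, whereas the study combined logistic regression, decision trees, and neural networks in SAS E-Miner; the function names (`fit_linear`, `bootstrap_models`, `bagging_predict`, `bumping_select`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept; stands in for any base learner."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predictions of a fitted linear model."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def mse(coef, X, y):
    """Mean squared error of one model on (X, y)."""
    return float(np.mean((predict_linear(coef, X) - y) ** 2))


def bootstrap_models(X, y, n_models=25, seed=0):
    """Fit one model per bootstrap resample (rows drawn with replacement)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)
        models.append(fit_linear(X[idx], y[idx]))
    return models


def bagging_predict(models, X):
    """Bagging: average the predictions of all bootstrap models."""
    return np.mean([predict_linear(m, X) for m in models], axis=0)


def bumping_select(models, X, y):
    """Bumping: keep only the bootstrap model with the lowest error on (X, y)."""
    return min(models, key=lambda m: mse(m, X, y))


# Synthetic demo data (hypothetical, not the shopping-store data).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.3, size=60)

models = bootstrap_models(X, y)
bagged = bagging_predict(models, X)
bumped = bumping_select(models, X, y)
print("bagging MSE:", float(np.mean((bagged - y) ** 2)))
print("bumping MSE:", mse(bumped, X, y))
```

Bagging returns a blended prediction, while bumping returns one concrete model, which keeps the result as interpretable as the base learner itself.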

Analysis of the Elderly Travel Characteristics and Travel Behavior with Daily Activity Schedules (the Case of Seoul, Korea) (활동 스케줄 분석을 통한 고령자의 통행특성과 통행행태에 관한 연구)

  • Seo, Sang-Eon;Jeong, Jin-Hyeok;Kim, Sun-Gwan
    • Journal of Korean Society of Transportation / v.24 no.5 s.91 / pp.89-108 / 2006
  • Korea has been entering the ageing society, with people aged over 65 accounting for more than 7% of the population since 2000. An ageing society needs transportation facilities that take elderly people's travel behavior into account. This study aims to understand elderly people's travel behavior using recent data from Korea. The activity schedule approach starts from the premise that travel outcomes are part of an activity scheduling decision. Under this approach, discrete choice models (in particular the nested logit model) are used to address the basic modeling problem of capturing decision interactions among the many choice dimensions of the immense activity schedule choice set. The day activity schedule is viewed as a set of tours and at-home activity episodes tied together by an overarching day activity pattern, and is estimated using data from the Seoul Metropolitan Area Transportation Survey, conducted in June 2002. Decisions about a specific tour in the schedule are conditioned by the choice of day activity pattern. The day activity scheduling model estimated in this study consists of tours interrelated in a day activity pattern. The day activity pattern model represents the basic decision of activity participation and priorities and places each activity in a configuration of tours and at-home episodes. Each pattern alternative is defined by the primary activity of the day, whether the primary activity occurs at home or away, and the type of tour for the primary activity. In the travel mode choice of the elderly and non-workers in particular, travel cost was found to be important for understanding interpersonal variation in mode choice behavior, whereas travel time was found to be a less important factor in choosing a travel mode. In addition, although the elderly were generally likely to choose transit, those over 75 years old preferred private modes because weakened physical health makes activities such as going up and down stairs difficult. Therefore, as Korea enters the ageing society, transportation facility planning should invest heavily in transit to improve transportation services for the elderly. Although the model has not yet been validated in before-and-after prediction studies, this study gives strong evidence of its behavioral soundness, current practicality, and potential to improve the reliability of transportation projects beyond that of the best existing systems in Korea.

Distribution Prediction of Korean Clawed Salamander (Onychodactylus koreanus) according to the Climate Change (기후변화에 따른 한국꼬리치레도롱뇽(Onychodactylus koreanus)의 분포 예측에 대한 연구)

  • Lee, Su-Yeon;Choi, Seo-yun;Bae, Yang-Seop;Suh, Jae-Hwa;Jang, Hoan-Jin;Do, Min-Seock
    • Korean Journal of Environment and Ecology / v.35 no.5 / pp.480-489 / 2021
  • Climate change poses great threats to wildlife populations by decreasing their numbers and destroying their habitats, jeopardizing biodiversity conservation. Asiatic salamander (Hynobiidae) species are particularly vulnerable to climate change because of their small home ranges and limited dispersal ability. This study therefore used one salamander species, the Korean clawed salamander (Onychodactylus koreanus), as a model species and examined its habitat characteristics and current distribution in South Korea to predict its spatial distribution under climate change. We found that altitude was the most important environmental factor for its spatial distribution and that it was densely distributed in high-altitude forest regions such as Gangwon and Gyeongsangbuk provinces. The spatial distribution range and habitat characteristics predicted by the species distribution models were sufficiently consistent with previous studies on the species. By modeling distribution changes under two climate change scenarios, we predicted that the distribution range of the Korean clawed salamander would decrease by 62.96% under the RCP4.5 scenario and by 98.52% under the RCP8.5 scenario, indicating a sharp reduction due to climate change. The model's AUC value was highest for the present (0.837), followed by RCP4.5 (0.832) and RCP8.5 (0.807). Our study provides a basic reference for implementing conservation plans for amphibians under climate change. Additional research using various analysis techniques that reflect habitat characteristics and fine-scale habitat factors across the whole life cycle of the Korean clawed salamander would help identify the major environmental factors that affect the species' decline.

Characteristics Analysis of Snow Particle Size Distribution in Gangwon Region according to Topography (지형에 따른 강원지역의 강설입자 크기 분포 특성 분석)

  • Bang, Wonbae;Kim, Kwonil;Yeom, Daejin;Cho, Su-jeong;Lee, Choeng-lyong;Lee, Daehyung;Ye, Bo-Young;Lee, GyuWon
    • Journal of the Korean Earth Science Society / v.40 no.3 / pp.227-239 / 2019
  • Heavy snowfall events occur frequently in Gangwon province, and the snowfall amount varies significantly in space because of the complex terrain and the topographical modulation of precipitation. Understanding and predicting the spatial characteristics of heavy snowfall is particularly challenging during snowfall events in easterly winds. The easterly wind produces significantly different atmospheric conditions and hence different precipitation characteristics. In this study, we investigated the microphysical characteristics of snowfall on the windward and leeward sides of the Taebaek mountain range under easterly conditions. Two snowfall events in easterly flow were selected, and the snow particle size distributions (SSDs) were observed at four sites (two windward and two leeward) with PARSIVEL disdrometers. We compared the characteristic parameters of the SSDs at the leeward sites with those at the windward sites. The results show that the SSDs at the windward sites are relatively wide, with many small snow particles, compared with those at the leeward sites. This characteristic is clearly shown by the larger characteristic number concentration and characteristic diameter at the windward sites. The snowfall rate and ice water content are also larger at the windward sites than at the leeward sites. The results indicate that the generation of new snow particles is dominant at the windward sites, likely because of orographic lifting. In addition, the windward sites show heavily aggregated particles at near-zero ground temperatures, likely driven by the wet and warm conditions near the ocean.

Predicting Forest Gross Primary Production Using Machine Learning Algorithms (머신러닝 기법의 산림 총일차생산성 예측 모델 비교)

  • Lee, Bora;Jang, Keunchang;Kim, Eunsook;Kang, Minseok;Chun, Jung-Hwa;Lim, Jong-Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.1 / pp.29-41 / 2019
  • Terrestrial gross primary production (GPP) is the largest global carbon flux, and forest ecosystems are important because they can store far more carbon than other terrestrial ecosystems. There have been several attempts to estimate GPP using mechanism-based models. However, mechanism-based models, which include biological, chemical, and physical processes, are limited by a lack of flexibility in predicting non-stationary ecological processes caused by local and global change. Mechanism-free methods are instead strongly recommended for estimating nonlinear dynamics that occur in nature, such as GPP. We therefore used mechanism-free machine learning techniques to estimate daily GPP. In this study, support vector machine (SVM), random forest (RF), and artificial neural network (ANN) models were used and compared with a traditional multiple linear regression model (LM). MODIS products and meteorological parameters from eddy covariance data from 2006 to 2013 were used to train the machine learning and LM models. The GPP predictions were compared with daily GPP from eddy covariance measurements in a deciduous forest in South Korea in 2014 and 2015. The correlation coefficient (R), root mean square error (RMSE), and mean squared error (MSE) were used to evaluate model performance. In general, the machine learning models (R = 0.85 - 0.93, MSE = 1.00 - 2.05, p < 0.001) performed better than the linear regression model (R = 0.82 - 0.92, MSE = 1.24 - 2.45, p < 0.001). These results provide insight into the high predictability and potential for wider application of mechanism-free machine learning models combined with remote sensing for predicting non-stationary ecological processes such as seasonal GPP.
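
As a small illustration of the three evaluation measures named in the abstract above (R, RMSE, and MSE), the sketch below computes them for a pair of observed and predicted series. The function name `evaluate` and the sample numbers are hypothetical and are not values from the study.

```python
import numpy as np


def evaluate(observed, predicted):
    """Return Pearson R, RMSE, and MSE for two equal-length series."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = predicted - observed
    mse = float(np.mean(residual ** 2))              # mean squared error
    rmse = float(np.sqrt(mse))                       # same units as the data
    r = float(np.corrcoef(observed, predicted)[0, 1])  # linear correlation
    return {"R": r, "RMSE": rmse, "MSE": mse}


# Made-up daily GPP values, for illustration only.
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
print(scores)
```

In this example the prediction carries a constant bias of 0.5, so R stays at 1 while MSE and RMSE are non-zero, which is one reason to report an error measure alongside the correlation.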

Prediction of Distribution Changes of Carpinus laxiflora and C. tschonoskii Based on Climate Change Scenarios Using MaxEnt Model (MaxEnt 모델링을 이용한 기후변화 시나리오에 따른 서어나무 (Carpinus laxiflora)와 개서어나무 (C. tschonoskii)의 분포변화 예측)

  • Lee, Min-Ki;Chun, Jung-Hwa;Lee, Chang-Bae
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.1 / pp.55-67 / 2021
  • Hornbeams (Carpinus spp.), which are widely distributed in South Korea, are recognized as among the most abundant species at the climax stage of temperate forests. Although the distribution and vegetation structure of the C. laxiflora community have been reported, little ecological information on C. tschonoskii is available, and little effort has been made to examine the distribution shift of these species under future climate conditions. This study was conducted to predict potential shifts in the distribution of C. laxiflora and C. tschonoskii in the 2050s and 2090s under two climate change scenarios, RCP4.5 and RCP8.5. The MaxEnt model was used to predict the spatial distribution of the two species using occurrence data derived from the 6th National Forest Inventory together with climate and topography data. The main factors for the distribution of C. laxiflora were elevation, temperature seasonality, and mean annual precipitation. The distribution of C. tschonoskii was influenced by temperature seasonality, mean annual precipitation, and mean diurnal range. The total habitat area of C. laxiflora was projected to increase by 1.05% and 1.11% under the RCP4.5 and RCP8.5 scenarios, respectively. The distributional area of C. tschonoskii was also predicted to expand under future climate conditions. These results highlight that climate change would have a considerable impact on the spatial distribution of C. laxiflora and C. tschonoskii. They also suggest that ecological information derived from climate change impact assessments can be used to develop appropriate forest management practices in response to climate change.

MDP(Markov Decision Process) Model for Prediction of Survivor Behavior based on Topographic Information (지형정보 기반 조난자 행동예측을 위한 마코프 의사결정과정 모형)

  • Jinho Son;Suhwan Kim
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.101-114 / 2023
  • In wartime, aircraft carrying out missions to strike deep in enemy territory are exposed to the risk of being shot down. Training the military flight personnel who operate high-tech weapon systems, a key combat force in modern warfare, takes a great deal of time, effort, and national budget. This study therefore addressed the path problem of predicting the route by which personnel who have made an emergency escape travel from enemy territory to a target point while avoiding obstacles, with the aim of increasing the likelihood of their safe recovery. Previous studies have treated this as a network-based problem, transforming it into a TSP, VRP, or Dijkstra-type problem and approaching it with optimization techniques. However, when the problem is approached as a network problem, it is difficult to reflect the dynamic factors and uncertainties of the battlefield environment that military flight personnel in distress will face. An MDP, which is suitable for modeling dynamic environments, was therefore applied. In addition, GIS was used to obtain topographic data, and topographic information was reflected in more detail in the design of the MDP reward structure so that the model would be more realistic than those of previous studies. Value iteration algorithms and deterministic methods were used to derive a path that lets military flight personnel in distress travel the shortest distance while making the most of topographical advantages. The model was also made more realistic by adding actual topographic information and the obstacles that personnel in distress may encounter while evading and escaping. This made it possible to predict the route by which military flight personnel would evade and escape in an actual situation. The model presented in this study can be applied to various operational situations by redesigning the reward structure. In actual situations, it will enable decision support based on scientific techniques that reflect various factors when predicting the escape route of military flight personnel in distress and conducting combat search and rescue operations.
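
The value iteration idea mentioned in the abstract above can be sketched on a toy grid. Everything here is an assumption made for illustration: the grid, the per-cell terrain costs, the reward of minus the cost of the cell entered, deterministic moves, and an undiscounted setting in which the goal is reachable from every passable cell. It is not the reward structure or terrain data used in the study.

```python
import numpy as np

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def neighbours(cost, cell):
    """Passable cells reachable in one move (np.inf marks an obstacle)."""
    rows, cols = cost.shape
    for dr, dc in MOVES:
        r, c = cell[0] + dr, cell[1] + dc
        if 0 <= r < rows and 0 <= c < cols and np.isfinite(cost[r, c]):
            yield (r, c)


def value_iteration(cost, goal, gamma=1.0, tol=1e-9, max_iter=1000):
    """Synchronous value iteration; reward = -cost of the cell entered."""
    V = np.zeros(cost.shape)
    for _ in range(max_iter):
        V_new = np.zeros_like(V)
        for cell in np.ndindex(*cost.shape):
            if cell == goal or not np.isfinite(cost[cell]):
                continue  # goal is absorbing; obstacles keep value 0
            options = [-cost[n] + gamma * V[n] for n in neighbours(cost, cell)]
            V_new[cell] = max(options) if options else V[cell]
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V


def greedy_path(cost, V, start, goal, gamma=1.0):
    """Follow the best one-step lookahead from start until the goal."""
    path, current = [start], start
    for _ in range(cost.size):  # safety cap on path length
        if current == goal:
            break
        options = list(neighbours(cost, current))
        if not options:
            break
        current = max(options, key=lambda n: -cost[n] + gamma * V[n])
        path.append(current)
    return path


# Hypothetical terrain: 1 = easy ground, 5 = hard ground, inf = obstacle.
cost = np.array([
    [1.0, 1.0, 1.0, 1.0],
    [1.0, np.inf, 5.0, 1.0],
    [1.0, 1.0, 5.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
])
start, goal = (0, 0), (3, 3)
V = value_iteration(cost, goal)
path = greedy_path(cost, V, start, goal)
print(path)
```

Changing the entries of `cost` is the toy counterpart of redesigning the reward structure: the same solver then yields a different predicted route.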