• Title/Summary/Keyword: Time prediction


Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce provides many advantageous purchase opportunities to customers. In this situation, customers who do not have enough knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect users' preferences and provide a recommendation list to users. Thus, the product recommender system in an online shopping store has become one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect users' preferences cause disappointment and wasted time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by precisely reflecting users' preferences. The research data are collected from a real-world online shopping store that sells products from famous art galleries and museums in Korea. The data initially contained 5,759 transactions; 3,167 remained after deletion of null records. We transform the categorical variables into dummy variables and exclude outliers. The proposed model consists of two steps. The first step predicts customers who have a high likelihood of purchasing products in the online shopping store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict customers with a high likelihood of purchasing products in each product group, performing these data mining techniques with SAS E-Miner software. We partition the dataset into modeling and validation sets for the logistic regression and decision trees, and into training, test, and validation sets for the artificial neural network model; the validation dataset is the same across all experiments. We then combine the results of each predictor using multi-model ensemble techniques such as bagging and bumping. Bagging, an abbreviation of "Bootstrap Aggregation," combines outputs from several machine learning techniques to raise the performance and stability of prediction or classification; it is a special form of model averaging. Bumping, an abbreviation of "Bootstrap Umbrella of Model Parameters," retains only the model with the lowest error. The results show that bumping outperforms bagging and the other predictors for every product group except "Poster," for which the artificial neural network model performs best. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the lift, support, and confidence measures, with the minimum transaction frequency to support associations set to 5%, the maximum number of items in an association set to 4, and the minimum confidence for rule generation set to 10%. We exclude extracted association rules with a lift value below 1 and, after removing duplicate rules, finally obtain fifteen association rules. Among these, eleven rules contain associations between products in the "Office Supplies" product group, one rule includes an association between the "Office Supplies" and "Fashion" product groups, and the other three rules contain associations between the "Office Supplies" and "Home Decoration" product groups.
Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we build the prototype system with ASP, JavaScript, and Microsoft Access. In addition, we survey user satisfaction with the product list recommended by the proposed system versus randomly selected product lists. The survey participants are 173 people who use MSN Messenger, Daum Café, and P2P services. We evaluate user satisfaction on a five-point Likert scale and perform a paired-sample t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level, meaning that users were significantly more satisfied with the recommended product list. The results also suggest that the proposed system may be useful in real-world online shopping stores.
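
The two ensemble ideas used in the first step, bagging and bumping, can be sketched as follows. This is a minimal illustration on synthetic data with scikit-learn rather than the SAS E-Miner workflow reported above; the data sizes, base learners, and number of bootstrap rounds are assumptions.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the purchase/no-purchase data (sizes are assumptions).
X, y = make_classification(n_samples=3167, n_features=20, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.3, random_state=0)

base_models = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(max_depth=5),
               MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)]
rng = np.random.default_rng(0)

def bagging_predict(models, n_rounds=10):
    """Bagging: average the votes of models fit on bootstrap resamples."""
    votes = np.zeros(len(X_valid))
    for _ in range(n_rounds):
        idx = rng.integers(0, len(X_train), len(X_train))          # bootstrap sample
        for m in models:
            votes += clone(m).fit(X_train[idx], y_train[idx]).predict(X_valid)
    return (votes / (n_rounds * len(models)) >= 0.5).astype(int)

def bumping_predict(models, n_rounds=10):
    """Bumping: fit on bootstrap resamples, keep only the single model with the
    lowest error on the original training data, and predict with that model."""
    best_model, best_err = None, np.inf
    for _ in range(n_rounds):
        idx = rng.integers(0, len(X_train), len(X_train))
        for m in models:
            cand = clone(m).fit(X_train[idx], y_train[idx])
            err = 1.0 - accuracy_score(y_train, cand.predict(X_train))
            if err < best_err:
                best_model, best_err = cand, err
    return best_model.predict(X_valid)

print("bagging accuracy:", accuracy_score(y_valid, bagging_predict(base_models)))
print("bumping accuracy:", accuracy_score(y_valid, bumping_predict(base_models)))
```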

Analysis of the Elderly Travel Characteristics and Travel Behavior with Daily Activity Schedules (the Case of Seoul, Korea) (활동 스케줄 분석을 통한 고령자의 통행특성과 통행행태에 관한 연구)

  • Seo, Sang-Eon;Jeong, Jin-Hyeok;Kim, Sun-Gwan
    • Journal of Korean Society of Transportation / v.24 no.5 s.91 / pp.89-108 / 2006
  • Korea has been entering an ageing society, with the population aged over 65 exceeding 7% since the year 2000. An ageing society needs transportation facilities that consider elderly people's travel behavior. This study aims to understand elderly people's travel behavior using recent data in Korea. The activity schedule approach begins with the premise that travel outcomes are part of an activity scheduling decision. For this approach, discrete choice models (especially the nested logit model) are used to address the basic modeling problem of capturing decision interactions among the many choice dimensions of the immense activity schedule choice set. The day activity schedule is viewed as a set of tours and at-home activity episodes tied together by an overarching day activity pattern, using the Seoul Metropolitan Area Transportation Survey data collected in June 2002. Decisions about a specific tour in the schedule are conditioned by the choice of day activity pattern. The day activity scheduling model estimated in this study consists of tours interrelated in a day activity pattern. The day activity pattern model represents the basic decisions of activity participation and priority and places each activity in a configuration of tours and at-home episodes. Each pattern alternative is defined by the primary activity of the day, whether the primary activity occurs at home or away, and the type of tour for the primary activity. In the travel mode choice of the elderly and non-workers in particular, travel cost was found to be important in understanding interpersonal variations in mode choice behavior, whereas travel time was found to be a less important factor. In addition, although the elderly generally tended to choose transit, private modes were preferred by those over 75 years old owing to weakened physical health, for example difficulty going up and down stairs. Therefore, as Korea enters the ageing society, transit should receive heavy investment in transportation facility planning to improve elderly transportation services. Although the model has not yet been validated in before-and-after prediction studies, this study gives strong evidence of its behavioral soundness, current practicality, and potential for improving the reliability of transportation projects beyond that of the best existing systems in Korea.
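
For reference, the choice probabilities of a two-level nested logit model, the model family used above, can be computed as in the following sketch; the utilities, nesting structure, and scale parameters are illustrative assumptions, not the values estimated in the study.

```python
import math

def nested_logit_probs(utilities, nests, lambdas):
    """Two-level nested logit: utilities maps alternative -> V, nests maps
    nest -> list of alternatives, lambdas maps nest -> scale in (0, 1]."""
    # Inclusive value (logsum) of each nest: IV_m = log(sum_j exp(V_j / lambda_m))
    iv = {m: math.log(sum(math.exp(utilities[j] / lambdas[m]) for j in alts))
          for m, alts in nests.items()}
    # Upper level: probability of choosing nest m.
    denom = sum(math.exp(lambdas[k] * iv[k]) for k in nests)
    p_nest = {m: math.exp(lambdas[m] * iv[m]) / denom for m in nests}
    # Lower level: probability of each alternative within its nest.
    probs = {}
    for m, alts in nests.items():
        within = sum(math.exp(utilities[j] / lambdas[m]) for j in alts)
        for i in alts:
            probs[i] = p_nest[m] * math.exp(utilities[i] / lambdas[m]) / within
    return probs

# Hypothetical mode-choice example with a transit nest and a private-vehicle nest.
V = {"bus": -1.2, "subway": -1.0, "car": -1.5, "taxi": -2.0}      # assumed utilities
nests = {"transit": ["bus", "subway"], "private": ["car", "taxi"]}
lambdas = {"transit": 0.6, "private": 0.8}                        # assumed nest scales
print(nested_logit_probs(V, nests, lambdas))
```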

Design of accelerated life test on temperature stress of piezoelectric sensor for monitoring high-level nuclear waste repository (고준위방사성폐기물 처분장 모니터링용 피에조센서의 온도 스트레스에 관한 가속수명시험 설계)

  • Hwang, Hyun-Joong;Park, Changhee;Hong, Chang-Ho;Kim, Jin-Seop;Cho, Gye-Chun
    • Journal of Korean Tunnelling and Underground Space Association / v.24 no.6 / pp.451-464 / 2022
  • A high-level nuclear waste repository is a deep geological disposal system exposed to complex environmental conditions such as high temperature, radiation, and groundwater because it handles spent nuclear fuel. Continuous exposure can lead to cracking and deterioration of the structure over time. At the same time, the high-level nuclear waste repository requires an ultra-long service life, so long-term structural health monitoring is essential. Various sensors such as accelerometers, earth pressure gauges, and displacement meters can be used to monitor the health of a structure, and a piezoelectric sensor is generally used in them. Therefore, it is necessary to develop a highly durable sensor based on a durability assessment of the piezoelectric sensor. This study designed an accelerated life test for durability assessment and life prediction of the piezoelectric sensor. Based on a literature review, the number of accelerated stress levels for a single stress factor and the number of samples at each level were selected. The failure modes and mechanisms of the piezoelectric sensor that can occur under the environmental conditions of the high-level waste repository were analyzed. In addition, two methods were proposed to investigate the maximum harsh condition for the temperature stress factor. The reliable operating limit of the piezoelectric sensor was derived, and a reasonable accelerated stress level was set for the accelerated life test. The suggested methods contain economical and practical ideas and can be widely used in designing accelerated life tests of piezoelectric sensors.
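
The abstract does not state which life-stress relationship the authors adopted; for temperature stress the Arrhenius model is a common choice, so the following sketch only illustrates how an acceleration factor and the corresponding test duration might be estimated. The activation energy and the use/stress temperatures are assumed values, not those of the study.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_af(t_use_c, t_stress_c, ea_ev):
    """Arrhenius acceleration factor between use and accelerated temperatures (Celsius)."""
    t_use_k, t_stress_k = t_use_c + 273.15, t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1.0 / t_use_k - 1.0 / t_stress_k))

# Assumed values: 30 degC use condition, 90 degC accelerated level, Ea = 0.7 eV.
af = arrhenius_af(t_use_c=30.0, t_stress_c=90.0, ea_ev=0.7)
target_life_hours = 10 * 365 * 24          # e.g. demonstrate a 10-year life
print(f"acceleration factor: {af:.1f}")
print(f"required test time at 90 degC: {target_life_hours / af:.0f} hours")
```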

Tracing the Drift Ice Using the Particle Tracking Method in the Arctic Ocean (북극해에서 입자추적 방법을 이용한 유빙 추적 연구)

  • Park, GwangSeob;Kim, Hyun-Cheol;Lee, Taehee;Son, Young Baek
    • Korean Journal of Remote Sensing / v.34 no.6_2 / pp.1299-1310 / 2018
  • In this study, we analyzed the distribution and movement trends of drift ice in the Arctic Ocean using in-situ observations and a particle tracking method. The in-situ drift ice movement data are ITP (Ice-Tethered Profiler) records provided by NOAA (National Oceanic and Atmospheric Administration) from 2009 to 2018, which were analyzed for location and speed in each year. Particle tracking simulates the movement of the drift ice using daily current and wind data provided by HYCOM (Hybrid Coordinate Ocean Model) and ECMWF (European Centre for Medium-Range Weather Forecasts, 2009-2017). To simulate drift ice movement throughout the Arctic Ocean, the ITP field observation data were used as input to calculate the relationship between current and wind and to carry out Lagrangian particle tracking. Particle tracking simulations were conducted as two experiments, one considering the effect of current alone and one considering the combined effects of current and wind; when both current and wind were considered, most trajectories reproduced the in-situ observations. The movement of drift ice in the Arctic Ocean was then reproduced using the wind-imposed equation, and the movement in particular years was analyzed. In 2010, a negative Arctic Ocean Index (AOI) year, particles clearly moved along the Beaufort Gyre, resulting in relatively large movements in the Beaufort Sea. In contrast, 2017 was a positive AOI year in which most particles were not affected by the gyre, resulting in relatively low speeds and distances. Around the pole, the speed of the drift ice was lower in 2017 than in 2010. Seasonally, drift ice movement in 2010 peaked in winter (0.22 m/s) and decreased toward spring (0.16 m/s), whereas in 2017 it peaked in summer (0.22 m/s) and decreased toward spring (0.13 m/s). As a result, the particle tracking method, linked with satellite data, will be appropriate for understanding long-term drift ice movement trends in place of limited field observations.
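
A minimal sketch of wind-imposed Lagrangian particle tracking of the kind described above is given below; the 2% wind factor, time step, crude degree conversion, and toy velocity fields are assumptions standing in for the relationship actually fitted from the ITP, HYCOM, and ECMWF data.

```python
import numpy as np

def track_particles(lon, lat, current_uv, wind_uv, wind_factor=0.02,
                    dt_s=86400.0, n_days=30):
    """Advect particles with u_ice = u_current + wind_factor * u_wind."""
    deg_per_m = 1.0 / 111_000.0        # rough conversion, ignores latitude scaling
    traj = [(lon.copy(), lat.copy())]
    for day in range(n_days):
        cu, cv = current_uv(lon, lat, day)           # m/s, e.g. from HYCOM
        wu, wv = wind_uv(lon, lat, day)              # m/s, e.g. from ECMWF
        u = cu + wind_factor * wu
        v = cv + wind_factor * wv
        lon = lon + u * dt_s * deg_per_m
        lat = lat + v * dt_s * deg_per_m
        traj.append((lon.copy(), lat.copy()))
    return traj

# Toy constant fields standing in for gridded reanalysis data.
current = lambda lon, lat, t: (np.full_like(lon, 0.05), np.full_like(lat, 0.02))
wind = lambda lon, lat, t: (np.full_like(lon, 5.0), np.full_like(lat, -2.0))
traj = track_particles(np.array([-150.0]), np.array([75.0]), current, wind)
print(traj[-1])   # final particle position after 30 days
```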

Study on Standardization of the Environmental Impact Evaluation Method of Extremely Low Frequency Magnetic Fields near High Voltage Overhead Transmission Lines (고압 가공송전선로의 극저주파자기장 환경영향평가 방법 표준화에 관한 연구)

  • Park, Sung-Ae;Jung, Joonsig;Choi, Taebong;Jeong, Minjoo;Kim, Bu-Kyung;Lee, Jongchun
    • Journal of Environmental Impact Assessment / v.27 no.6 / pp.658-673 / 2018
  • Social conflicts over extremely low frequency magnetic field (ELF-MF) exposure are expected to intensify due to the continued increase in electric power demand and the construction of high voltage transmission lines (HVTL). However, the current Environmental Impact Assessment (EIA) Act does not include specific guidelines for the EIA of ELF-MF. Therefore, this study standardized the EIA method through case analysis, field measurement, and expert consultation for ELF-MF near HVTL, which are the main source of exposure. The current status of ELF-MF EIA and the problems to be improved are identified, and an EIA method that can resolve them is suggested. The main finding is that the physical characteristics of ELF-MF, which are governed by distance and power load, should be considered at all stages of the EIA (survey of the current situation, prediction of impacts, preparation of a mitigation plan, and post-EIA planning). Based on this study, we also suggest a 'Measurement method for extremely low frequency magnetic field on transmission line' and a 'Table for extremely low frequency magnetic field measurement record on transmission line'. The results of this study can be applied to EIAs that minimize the damage and conflict caused by transmission line construction and derive rational measures at a time when the human health hazard of long-term ELF-MF exposure remains unclear.
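
As a rough illustration of why distance and power load dominate ELF-MF levels, the following sketch evaluates the single straight-conductor approximation B = μ0 I / (2π d). Real three-phase lines require phasor superposition of all conductors, so the current and distances are assumed values and the numbers are indicative only; this is not part of the proposed measurement method.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability (T*m/A)

def magnetic_flux_density_uT(current_a, distance_m):
    """Magnetic flux density in microtesla at a lateral distance from one conductor."""
    return MU0 * current_a / (2.0 * math.pi * distance_m) * 1e6

for d in (10, 20, 40, 80):   # distances from the line in meters (assumed)
    b = magnetic_flux_density_uT(current_a=500.0, distance_m=d)
    print(f"{d:3d} m: {b:.2f} uT")
```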

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to easily find what they seek. Users want a system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are an increasingly important technology essential to business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar users' interests and preferences. However, limitations exist. Sparsity, which occurs when user-item preference information is insufficient, is the main limitation of collaborative filtering. The evaluation values in the user-item matrix may be distorted depending on the popularity of a product, and there may be new users who have not yet rated anything. This lack of historical data for identifying consumer preferences is referred to as data sparsity, and various methods have been studied to address it. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or item characteristics are available. Another problem is that real-world rating data are mostly biased toward high scores, resulting in severe imbalance. One cause of this imbalanced distribution is purchasing bias: users who rate a product highly are the ones who buy it, while those who would rate it low are less likely to purchase it and thus do not leave negative reviews. Due to these characteristics, reviews by users who purchase products are more likely to be positive than most users' actual preferences would suggest. Because of this bias, the rating data are over-learned in the high-incidence classes, distorting the results. Applying collaborative filtering to such imbalanced data leads to poor recommendation performance due to excessive learning of the biased classes. Traditional oversampling techniques for this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning and reduces recommendation performance. In addition, most existing pre-processing methods for data imbalance are designed for binary classes. Binary-class imbalance techniques are difficult to apply to multi-class problems because they cannot model situations such as objects at cross-class boundaries or objects overlapping multiple classes. To work around this, research has been conducted on converting multi-class problems into binary-class problems. However, simplifying a multi-class problem can cause classification errors when the results of classifiers learned on the sub-problems are combined, resulting in the loss of important information about relationships beyond the selected items. Therefore, more effective methods are needed to address multi-class imbalance. We propose a collaborative filtering model that uses a CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies the distributions of minority classes so that generated data reflect their characteristics. Collaborative filtering then maximizes the performance of the recommendation system via hyperparameter tuning.
This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering while mitigating the data imbalance found in real data. Our model shows superior recommendation performance over existing oversampling techniques on sparse real-world data. SMOTE, Borderline-SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparison models, and the proposed model achieves the highest prediction accuracy on the RMSE and MAE evaluation measures. This study suggests that deep-learning-based oversampling can further improve the performance of recommendation systems trained on real data and can be used to build business recommendation systems.
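
A minimal PyTorch sketch of the conditional GAN component is shown below: the generator produces a synthetic user rating row conditioned on a class label y (for example, a minority rating class), and the discriminator judges (ratings, label) pairs. The network sizes, placeholder data, and training schedule are assumptions, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

n_items, n_classes, z_dim = 100, 5, 32

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + n_classes, 128), nn.ReLU(),
            nn.Linear(128, n_items), nn.Sigmoid())   # ratings scaled to [0, 1]
    def forward(self, z, y_onehot):
        return self.net(torch.cat([z, y_onehot], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_items + n_classes, 128), nn.LeakyReLU(0.2),
            nn.Linear(128, 1))
    def forward(self, x, y_onehot):
        return self.net(torch.cat([x, y_onehot], dim=1))

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_ratings = torch.rand(512, n_items)              # placeholder for real rating rows
real_labels = torch.randint(0, n_classes, (512,))    # placeholder class labels

for step in range(200):
    idx = torch.randint(0, 512, (64,))
    x, y = real_ratings[idx], torch.eye(n_classes)[real_labels[idx]]
    z = torch.randn(64, z_dim)
    fake = G(z, y)

    # Discriminator update: real rows -> 1, generated rows -> 0.
    d_loss = bce(D(x, y), torch.ones(64, 1)) + bce(D(fake.detach(), y), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: try to fool the discriminator.
    g_loss = bce(D(fake, y), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Generated rows for a chosen minority class can then fill the user-item matrix.
with torch.no_grad():
    cond = torch.eye(n_classes)[torch.full((10,), 4, dtype=torch.long)]
    synthetic = G(torch.randn(10, z_dim), cond)
print(synthetic.shape)
```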

Kriging of Daily PM10 Concentration from the Air Korea Stations Nationwide and the Accuracy Assessment (베리오그램 최적화 기반의 정규크리깅을 이용한 전국 에어코리아 PM10 자료의 일평균 격자지도화 및 내삽정확도 검증)

  • Jeong, Yemin;Cho, Subin;Youn, Youjeong;Kim, Seoyeon;Kim, Geunah;Kang, Jonggu;Lee, Dalgeun;Chung, Euk;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.379-394 / 2021
  • Air pollution data in South Korea have been provided in real time by Air Korea stations since 2005. Previous studies have shown the feasibility of gridding air pollution data, but they were confined to a few cities. This paper examines the creation of nationwide gridded maps of PM10 concentration from 333 Air Korea stations using variogram optimization and ordinary kriging. The accuracy of the spatial interpolation was evaluated with various sampling schemes to avoid a too dense or too sparse distribution of the validation points. Using the 114,745 matchups, a four-round blind test was conducted by extracting random validation points for each of the 365 days in 2019. The overall accuracy was stably high, with an MAE of 5.697 ㎍/m3 and a CC of 0.947. Approximately 1,500 cases of high PM10 concentration also showed an MAE of about 12 ㎍/m3 and a CC over 0.87, which means that the proposed method is effective and applicable to various situations. The gridded maps of daily PM10 concentration at a resolution of 0.05° also showed a reasonable spatial distribution, and they can be used as an input variable for a gridded prediction of tomorrow's PM10 concentration.
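
A minimal ordinary kriging sketch in the spirit of the study is shown below using PyKrige on synthetic station data; the spherical variogram model, grid extent, and fake station values are assumptions (the study optimizes the variogram over the 333 Air Korea stations at 0.05° resolution).

```python
import numpy as np
from pykrige.ok import OrdinaryKriging

rng = np.random.default_rng(0)
lon = rng.uniform(126.0, 129.5, 333)                     # fake station longitudes
lat = rng.uniform(34.0, 38.5, 333)                       # fake station latitudes
pm10 = 40 + 10 * np.sin(lon) + rng.normal(0, 5, 333)     # fake daily PM10 values

# Ordinary kriging with a spherical variogram (variogram parameters fitted by PyKrige).
ok = OrdinaryKriging(lon, lat, pm10, variogram_model="spherical",
                     verbose=False, enable_plotting=False)

grid_lon = np.arange(126.0, 129.5, 0.05)
grid_lat = np.arange(34.0, 38.5, 0.05)
z, ss = ok.execute("grid", grid_lon, grid_lat)           # kriged surface and variance

print(z.shape)   # (n_lat, n_lon) gridded PM10 map for one day
```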

A Prediction of N-value Using Artificial Neural Network (인공신경망을 이용한 N치 예측)

  • Kim, Kwang Myung;Park, Hyoung June;Goo, Tae Hun;Kim, Hyung Chan
    • The Journal of Engineering Geology / v.30 no.4 / pp.457-468 / 2020
  • Problems arising during pile design for plant construction and civil and architectural works mostly come from uncertainty in geotechnical characteristics. In particular, the N-value measured through the Standard Penetration Test (SPT) is the most important data. However, it is difficult to obtain the N-value by drilling investigation throughout the whole target area. Given constraints such as licensing, time, cost, equipment access, and residential complaints, it is impossible to obtain geotechnical characteristics through drilling investigation within a short bidding period overseas. The geotechnical characteristics at non-drilling investigation points are usually determined by the engineer's empirical judgment, which can lead to errors in pile design and quantity calculation, causing construction delays and cost increases. It would be possible to overcome this problem if the N-value could be predicted at non-drilling investigation points using a limited amount of drilling investigation data. This study was conducted to predict the N-value using an Artificial Neural Network (ANN), one of the artificial intelligence (AI) methods. An artificial neural network processes a limited amount of geotechnical data in a manner analogous to biological reasoning, providing more reliable results for the input variables. The purpose of this study is to predict the N-value at non-drilling investigation points through patterns learned by a multi-layer perceptron with the error back-propagation algorithm using minimal geotechnical data. The reliability of the values predicted by the AI method was reviewed against the measured values, and high reliability was confirmed. To address geotechnical uncertainty, we will perform a sensitivity analysis of the input variables in the next step to increase the learning effect, and the program may require some technical updates. We hope that our study will be helpful for design work in the future.
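
A minimal sketch of a multi-layer perceptron trained by back-propagation for N-value regression is given below with scikit-learn; the input features (for example, coordinates, elevation, depth) and the network size are illustrative assumptions, not the variables used in the study.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 4))            # e.g. easting, northing, elevation, depth
N = 5 + 40 * X[:, 3] + 5 * X[:, 2] + rng.normal(0, 3, 500)   # synthetic SPT N-values

X_train, X_test, y_train, y_test = train_test_split(X, N, test_size=0.2, random_state=0)

# Standardize inputs, then fit an MLP (trained with back-propagation internally).
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                                   random_state=0))
model.fit(X_train, y_train)
print("MAE on held-out borings:", mean_absolute_error(y_test, model.predict(X_test)))
```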

Extraction of Water Body Area using Micro Satellite SAR: A Case Study of the Daecheong Dam of South Korea (초소형 SAR 위성을 활용한 수체면적 추출: 대청댐 유역 대상)

  • PARK, Jongsoo;KANG, Ki-Mook;HWANG, Eui-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.4 / pp.41-54 / 2021
  • Estimating water body area by remote sensing is essential for water resource management and for the analysis and prediction of water disaster damage. Water body detection from space has mainly been performed with large satellites equipped with optical and SAR sensors. However, due to their long revisit cycles, timely use in the event of a disaster is impossible. The recent active development of micro satellites provides an opportunity to overcome the temporal resolution limitations of existing large satellites. The micro satellites currently in active operation, ICEYE of Finland and Capella of the United States, are operated as constellations for Earth observation. Constellation operation gives them a short revisit cycle and high resolution, and the onboard SAR sensors can observe regardless of weather or time of day. In this study, the operation status and characteristics of micro satellites are described, and a water area estimation technique optimized for micro SAR satellite images is applied to the Daecheong Dam basin on the Korean Peninsula. In addition, accuracy was verified against reference water extents derived from the optical Sentinel-2 satellite. The Capella satellite showed the smallest difference in area, and all three images showed high correlation with the reference. These results confirm that, despite the low NESZ of micro satellites, it is possible to estimate the water area, and they suggest that the limitations of water resource and water disaster monitoring with existing large SAR satellites can be overcome.
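
The abstract does not detail the water extraction algorithm, so the following sketch only shows a common baseline for SAR imagery: water appears dark (low backscatter), and a global Otsu threshold on the dB image separates it from land. The synthetic scene and the assumed 10 m pixel size stand in for an actual ICEYE or Capella image.

```python
import numpy as np
from skimage.filters import threshold_otsu

rng = np.random.default_rng(0)
sigma0_db = rng.normal(-8.0, 2.0, size=(512, 512))      # land backscatter (dB)
sigma0_db[180:330, 100:400] = rng.normal(-20.0, 1.5, size=(150, 300))  # dark "reservoir"

thresh = threshold_otsu(sigma0_db)          # global threshold between the two modes
water_mask = sigma0_db < thresh             # water = pixels below the threshold

pixel_area_m2 = 10.0 * 10.0                 # assumed 10 m pixel spacing
print("estimated water area (km^2):", water_mask.sum() * pixel_area_m2 / 1e6)
```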

Prediction of patent lifespan and analysis of influencing factors using machine learning (기계학습을 활용한 특허수명 예측 및 영향요인 분석)

  • Kim, Yongwoo;Kim, Min Gu;Kim, Young-Min
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.147-170 / 2022
  • Although the number of patents, one of the core outputs of technological innovation, continues to increase, the number of low-value patents has also increased enormously. Therefore, efficient evaluation of patents has become important. Estimation of patent lifespan, which represents the private value of a patent, has been studied for a long time, but in most cases it has relied on linear models. Even when machine learning methods were used, interpretation or explanation of the relationship between explanatory variables and patent lifespan was insufficient. In this study, patent lifespan (the number of renewals) is predicted based on the idea that patent lifespan represents the value of the patent. For this research, 4,033,414 patents filed between 1996 and 2017 and eventually granted were collected from the USPTO (US Patent and Trademark Office). To predict patent lifespan, we use variables that reflect the characteristics of the patent, the patent owner, and the inventor. We build four different models (Ridge Regression, Random Forest, Feed-Forward Neural Network, and Gradient Boosting Models) and perform hyperparameter tuning through 5-fold cross-validation. Then, the performance of the generated models is evaluated, and the relative importance of the predictors is presented. In addition, based on the Gradient Boosting Model, which shows excellent performance, Accumulated Local Effects plots are presented to visualize the relationship between predictors and patent lifespan. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the evaluation basis for individual patents and discuss applicability to patent evaluation systems. This study has academic significance in that it contributes cumulatively to existing patent lifespan estimation research and supplements the limitations of linearity-based studies. It is also practically meaningful in that it suggests a method for deriving the evaluation basis of individual patent value and examines its applicability to patent evaluation systems.
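
A minimal sketch of the modeling-and-explanation workflow described above is given below: a gradient boosting model predicts the number of renewals, and Kernel SHAP attributes each prediction to its features. The synthetic features and data are assumptions; the study uses USPTO patent, owner, and inventor characteristics and additionally reports ALE plots.

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(2000, 5))    # e.g. claims, citations, family size, ... (assumed)
y = np.clip(1 + 2 * X[:, 0] + X[:, 1] + rng.normal(0, 0.3, 2000), 0, 3).round()  # renewals 0-3

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
gbm = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Kernel SHAP with a small background sample; explains the first five test patents.
explainer = shap.KernelExplainer(gbm.predict, shap.sample(X_train, 50))
shap_values = explainer.shap_values(X_test[:5])

print("predicted renewals:", gbm.predict(X_test[:5]).round(2))
print("per-feature SHAP contributions:\n", np.round(shap_values, 3))
```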