• Title/Summary/Keyword: K-Nearest Neighbor 알고리즘

Search Result 204, Processing Time 0.019 seconds

Water Balance Projection Using Climate Change Scenarios in the Korean Peninsula (기후변화 시나리오를 활용한 미래 한반도 물수급 전망)

  • Kim, Cho-Rong;Kim, Young-Oh;Seo, Seung Beom;Choi, Su-Woong
    • Journal of Korea Water Resources Association
    • /
    • v.46 no.8
    • /
    • pp.807-819
    • /
    • 2013
  • This study proposes a new methodology for future water balance projection considering climate change by assigning a weight to each scenario instead of inputting future streamflows based on GCMs into a water balance model directly. K-nearest neighbor algorithm was employed to assign weights and streamflows in non-flood period (October to the following June) was selected as the criterion for assigning weights. GCM-driven precipitation was input to TANK model to simulate future streamflow scenarios and Quantile Mapping was applied to correct bias between GCM hindcast and historical data. Based on these bias-corrected streamflows, different weights were assigned to each streamflow scenarios to calculate water shortage for the projection periods; 2020s (2010~2039), 2050s (2040~2069), and 2080s (2070~2099). As a result by applying the proposed methodology to project water shortage over the Korean Peninsula, average water shortage for 2020s is projected to increase to 10~32% comparing to the basis (1967~2003). In addition, according to getting decreased in streamflows in non-flood period gradually by 2080s, average water shortage for 2080s is projected to increase up to 97% (516.5 million $m^3/yr$) as maximum comparing to the basis. While the existing research on climate change gives radical increase in future water shortage, the results projected by the weighting method shows conservative change. This study has significance in the applicability of water balance projection regarding climate change, keeping the existing framework of national water resources planning and this lessens the confusion for decision-makers in water sectors.

Response Modeling for the Marketing Promotion with Weighted Case Based Reasoning Under Imbalanced Data Distribution (불균형 데이터 환경에서 변수가중치를 적용한 사례기반추론 기반의 고객반응 예측)

  • Kim, Eunmi;Hong, Taeho
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.29-45
    • /
    • 2015
  • Response modeling is a well-known research issue for those who have tried to get more superior performance in the capability of predicting the customers' response for the marketing promotion. The response model for customers would reduce the marketing cost by identifying prospective customers from very large customer database and predicting the purchasing intention of the selected customers while the promotion which is derived from an undifferentiated marketing strategy results in unnecessary cost. In addition, the big data environment has accelerated developing the response model with data mining techniques such as CBR, neural networks and support vector machines. And CBR is one of the most major tools in business because it is known as simple and robust to apply to the response model. However, CBR is an attractive data mining technique for data mining applications in business even though it hasn't shown high performance compared to other machine learning techniques. Thus many studies have tried to improve CBR and utilized in business data mining with the enhanced algorithms or the support of other techniques such as genetic algorithm, decision tree and AHP (Analytic Process Hierarchy). Ahn and Kim(2008) utilized logit, neural networks, CBR to predict that which customers would purchase the items promoted by marketing department and tried to optimized the number of k for k-nearest neighbor with genetic algorithm for the purpose of improving the performance of the integrated model. Hong and Park(2009) noted that the integrated approach with CBR for logit, neural networks, and Support Vector Machine (SVM) showed more improved prediction ability for response of customers to marketing promotion than each data mining models such as logit, neural networks, and SVM. This paper presented an approach to predict customers' response of marketing promotion with Case Based Reasoning. The proposed model was developed by applying different weights to each feature. We deployed logit model with a database including the promotion and the purchasing data of bath soap. After that, the coefficients were used to give different weights of CBR. We analyzed the performance of proposed weighted CBR based model compared to neural networks and pure CBR based model empirically and found that the proposed weighted CBR based model showed more superior performance than pure CBR model. Imbalanced data is a common problem to build data mining model to classify a class with real data such as bankruptcy prediction, intrusion detection, fraud detection, churn management, and response modeling. Imbalanced data means that the number of instance in one class is remarkably small or large compared to the number of instance in other classes. The classification model such as response modeling has a lot of trouble to recognize the pattern from data through learning because the model tends to ignore a small number of classes while classifying a large number of classes correctly. To resolve the problem caused from imbalanced data distribution, sampling method is one of the most representative approach. The sampling method could be categorized to under sampling and over sampling. However, CBR is not sensitive to data distribution because it doesn't learn from data unlike machine learning algorithm. In this study, we investigated the robustness of our proposed model while changing the ratio of response customers and nonresponse customers to the promotion program because the response customers for the suggested promotion is always a small part of nonresponse customers in the real world. We simulated the proposed model 100 times to validate the robustness with different ratio of response customers to response customers under the imbalanced data distribution. Finally, we found that our proposed CBR based model showed superior performance than compared models under the imbalanced data sets. Our study is expected to improve the performance of response model for the promotion program with CBR under imbalanced data distribution in the real world.

Rainfall image DB construction for rainfall intensity estimation from CCTV videos: focusing on experimental data in a climatic environment chamber (CCTV 영상 기반 강우강도 산정을 위한 실환경 실험 자료 중심 적정 강우 이미지 DB 구축 방법론 개발)

  • Byun, Jongyun;Jun, Changhyun;Kim, Hyeon-Joon;Lee, Jae Joon;Park, Hunil;Lee, Jinwook
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.6
    • /
    • pp.403-417
    • /
    • 2023
  • In this research, a methodology was developed for constructing an appropriate rainfall image database for estimating rainfall intensity based on CCTV video. The database was constructed in the Large-Scale Climate Environment Chamber of the Korea Conformity Laboratories, which can control variables with high irregularity and variability in real environments. 1,728 scenarios were designed under five different experimental conditions. 36 scenarios and a total of 97,200 frames were selected. Rain streaks were extracted using the k-nearest neighbor algorithm by calculating the difference between each image and the background. To prevent overfitting, data with pixel values greater than set threshold, compared to the average pixel value for each image, were selected. The area with maximum pixel variability was determined by shifting with every 10 pixels and set as a representative area (180×180) for the original image. After re-transforming to 120×120 size as an input data for convolutional neural networks model, image augmentation was progressed under unified shooting conditions. 92% of the data showed within the 10% absolute range of PBIAS. It is clear that the final results in this study have the potential to enhance the accuracy and efficacy of existing real-world CCTV systems with transfer learning.

Incorporating Social Relationship discovered from User's Behavior into Collaborative Filtering (사용자 행동 기반의 사회적 관계를 결합한 사용자 협업적 여과 방법)

  • Thay, Setha;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.1-20
    • /
    • 2013
  • Nowadays, social network is a huge communication platform for providing people to connect with one another and to bring users together to share common interests, experiences, and their daily activities. Users spend hours per day in maintaining personal information and interacting with other people via posting, commenting, messaging, games, social events, and applications. Due to the growth of user's distributed information in social network, there is a great potential to utilize the social data to enhance the quality of recommender system. There are some researches focusing on social network analysis that investigate how social network can be used in recommendation domain. Among these researches, we are interested in taking advantages of the interaction between a user and others in social network that can be determined and known as social relationship. Furthermore, mostly user's decisions before purchasing some products depend on suggestion of people who have either the same preferences or closer relationship. For this reason, we believe that user's relationship in social network can provide an effective way to increase the quality in prediction user's interests of recommender system. Therefore, social relationship between users encountered from social network is a common factor to improve the way of predicting user's preferences in the conventional approach. Recommender system is dramatically increasing in popularity and currently being used by many e-commerce sites such as Amazon.com, Last.fm, eBay.com, etc. Collaborative filtering (CF) method is one of the essential and powerful techniques in recommender system for suggesting the appropriate items to user by learning user's preferences. CF method focuses on user data and generates automatic prediction about user's interests by gathering information from users who share similar background and preferences. Specifically, the intension of CF method is to find users who have similar preferences and to suggest target user items that were mostly preferred by those nearest neighbor users. There are two basic units that need to be considered by CF method, the user and the item. Each user needs to provide his rating value on items i.e. movies, products, books, etc to indicate their interests on those items. In addition, CF uses the user-rating matrix to find a group of users who have similar rating with target user. Then, it predicts unknown rating value for items that target user has not rated. Currently, CF has been successfully implemented in both information filtering and e-commerce applications. However, it remains some important challenges such as cold start, data sparsity, and scalability reflected on quality and accuracy of prediction. In order to overcome these challenges, many researchers have proposed various kinds of CF method such as hybrid CF, trust-based CF, social network-based CF, etc. In the purpose of improving the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method which integrates traditional CF technique with social relationship between users discovered from user's behavior in social network i.e. Facebook. We identify user's relationship from behavior of user such as posts and comments interacted with friends in Facebook. We believe that social relationship implicitly inferred from user's behavior can be likely applied to compensate the limitation of conventional approach. Therefore, we extract posts and comments of each user by using Facebook Graph API and calculate feature score among each term to obtain feature vector for computing similarity of user. Then, we combine the result with similarity value computed using traditional CF technique. Finally, our system provides a list of recommended items according to neighbor users who have the biggest total similarity value to the target user. In order to verify and evaluate our proposed method we have performed an experiment on data collected from our Movies Rating System. Prediction accuracy evaluation is conducted to demonstrate how much our algorithm gives the correctness of recommendation to user in terms of MAE. Then, the evaluation of performance is made to show the effectiveness of our method in terms of precision, recall, and F1-measure. Evaluation on coverage is also included in our experiment to see the ability of generating recommendation. The experimental results show that our proposed method outperform and more accurate in suggesting items to users with better performance. The effectiveness of user's behavior in social network particularly shows the significant improvement by up to 6% on recommendation accuracy. Moreover, experiment of recommendation performance shows that incorporating social relationship observed from user's behavior into CF is beneficial and useful to generate recommendation with 7% improvement of performance compared with benchmark methods. Finally, we confirm that interaction between users in social network is able to enhance the accuracy and give better recommendation in conventional approach.