• Title/Summary/Keyword: 군집 수 최적화 (optimizing the number of clusters)


The Adaptive Personalization Method According to Users Purchasing Index : Application to Beverage Purchasing Predictions (고객별 구매빈도에 동적으로 적응하는 개인화 시스템 : 음료수 구매 예측에의 적용)

  • Park, Yoon-Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.95-108
    • /
    • 2011
  • This is a study of a personalization method that intelligently adapts the level of clustering to each customer's purchasing index. In the e-business era, many companies gather customers' demographic and transactional information, such as age, gender, purchasing date, and product category, and use it to predict customers' preferences or purchasing patterns so that they can provide more customized services. The conventional Customer-Segmentation method provides customized services for each customer group: it clusters the whole customer set into groups based on similarity and builds a predictive model for each resulting group. This keeps the number of predictive models manageable and supplies additional data, drawn from similar customers, for customers who do not have enough data of their own to build a good predictive model. However, this method often fails to provide highly personalized services to each customer, which matters especially for VIP customers. Furthermore, it clusters customers who already have a considerable amount of data together with those who have little, which unnecessarily increases computational cost without significant performance improvement. The other conventional approach, the 1-to-1 method, provides more customized services than the Customer-Segmentation method because each predictive model is built using only the individual customer's data. It not only provides highly personalized services but also builds relatively simple, less costly models that satisfy each customer. However, the 1-to-1 method does not produce a good predictive model when a customer has only a few records: if a customer has an insufficient amount of transactional data, its performance deteriorates.
To overcome the limitations of these two conventional methods, we propose a new method, the Intelligent Customer Segmentation method, which provides adaptively personalized services according to each customer's purchasing index. The method clusters customers by purchasing index, so that predictions for customers with few purchases are based on data from more intensively clustered groups, while VIP customers, who already have a considerable amount of data, are clustered to a much lesser extent or not at all. The main idea is to apply clustering only when the number of transactional records for the target customer is below a predefined criterion data size. To find this criterion, we propose an algorithm called sliding window correlation analysis, which aims to find the transactional data size below which the performance of the 1-to-1 method drops sharply due to data sparsity. After finding this criterion size, we apply the conventional 1-to-1 method to customers who have more data than the criterion, and apply clustering to those who have less, until they can use at least the criterion amount of data for model building. We apply the two conventional methods and the proposed method to Nielsen's beverage purchasing data to predict customers' purchasing amounts and purchasing categories, using two data mining techniques (Support Vector Machine and Linear Regression) and two performance measures (MAE and RMSE). The results show that the proposed Intelligent Customer Segmentation method outperforms the conventional 1-to-1 method in many cases and matches the performance of the Customer-Segmentation method at much lower computational cost.
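The routing rule at the heart of the abstract above — train a 1-to-1 model when a customer has enough records, otherwise pool data from the customer's cluster — can be sketched as follows. This is a minimal illustration, not the paper's code: the `criterion` value (which the paper finds via sliding window correlation analysis) and the cluster assignments are assumed inputs.

```python
def route_training_data(customer_id, transactions, clusters, criterion=30):
    """Return the records to train on for one customer.

    transactions: dict mapping customer_id -> list of records
    clusters:     dict mapping customer_id -> cluster label
    criterion:    minimum record count for a reliable 1-to-1 model
                  (assumed value; found in the paper by sliding window
                  correlation analysis)
    """
    own = transactions[customer_id]
    if len(own) >= criterion:
        return own  # enough data: build an individual 1-to-1 model
    # otherwise pool the data of all customers in the same cluster
    label = clusters[customer_id]
    pooled = []
    for cid, records in transactions.items():
        if clusters[cid] == label:
            pooled.extend(records)
    return pooled
```

A customer above the criterion gets only their own records; a sparse customer gets the whole cluster's records, mirroring the adaptive behavior described above.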

Development of weekly rainfall-runoff model for drought outlooks (가뭄전망을 위한 주간 강우-유출 모형의 개발 및 적용)

  • Kang, Shinuk;Chun, Gunil;Nam, Woosung;Park, Jinhyeog
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.214-214
    • /
    • 2019
  • When drought reaches the 'severe' stage or above, hydrological analysis must be performed every week to produce drought outlooks. This requires meteorological forecast data, such as precipitation and temperature, from the Korea Meteorological Administration (KMA). The KMA currently provides monthly precipitation and mean temperature in its three-month outlook, and a 4-week precipitation total and mean temperature in its one-month outlook. However, the one-month outlook, which covers the coming four weeks, provides not weekly values but a 4-week precipitation total and mean temperature, updated weekly through WINS. Because weekly precipitation and mean temperature are difficult to obtain, the 4-week values were disaggregated into weekly values using climatological-normal daily precipitation and mean temperature. Weekly hydrological data were handled according to the ISO 8601 standard, which defines a week as Monday through Sunday and maps one-to-one onto the ordinary calendar date system. For example, 22 February 1981 is written '1981-W08-7' or '1981W087', meaning the Sunday of the 8th week of 1981. A program was developed to organize hydrological data according to this standard. Weekly potential evapotranspiration is computed by modifying the monthly potential-evapotranspiration program to run on a weekly basis; the modified part is the calculation of extraterrestrial radiation. Because the Earth orbits the Sun on an annual cycle, extraterrestrial radiation at a given latitude varies with the date, so it is computed for each day from Monday to Sunday and averaged to give a weekly representative value. The weekly extraterrestrial radiation and the maximum and minimum temperatures are then input to the Hargreaves equation to compute potential evapotranspiration. To estimate the parameters of the weekly rainfall-runoff model, including snow accumulation and melt, hydrological data from 24 stations nationwide were used. Eleven parameters, including the initial values of the abcd model and the snow module, were estimated with the SCE-UA global optimization algorithm. The estimated basin parameters were assigned to the 113 mid-sized watersheds based on a cluster analysis using soil drainage, soil depth, hydrogeology, and watershed characteristics. The developed weekly rainfall-runoff model is used for relatively short-term drought outlooks. The computed flows are natural flows; the station flows used in drought outlooks are computed by accounting for nationwide intake volumes, wastewater treatment plant discharges, and return flows.
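The ISO 8601 weekly indexing described above maps directly onto Python's standard library. The sketch below formats a date as a week label and, as an assumed illustration of the disaggregation step, splits a 4-week total into weekly values proportional to climatological-normal weights (the paper's exact disaggregation procedure is not reproduced here).

```python
from datetime import date

def iso_week_label(d):
    """Format a date as an ISO 8601 week label, e.g. '1981-W08-7'.
    ISO 8601 weeks run Monday (day 1) through Sunday (day 7)."""
    year, week, weekday = d.isocalendar()
    return f"{year}-W{week:02d}-{weekday}"

def split_4week_total(total, weekly_normal_weights):
    """Split a 4-week forecast total into four weekly values in proportion
    to climatological-normal weights (assumed form of the disaggregation)."""
    s = sum(weekly_normal_weights)
    return [total * w / s for w in weekly_normal_weights]
```

`date.isocalendar()` already implements the week-1 rule of ISO 8601 (the week containing the year's first Thursday), so no custom calendar logic is needed.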


Pollen analysis from Osong Archaeological Site, Chungbuk Province: Vegetation and Environmental Implication (충북 청주시 오송지구 유적 발굴지의 화분분석: 식생과 퇴적환경 고찰)

  • Yi, Sang-Heon;Kim, Ju-Yong
    • The Korean Journal of Quaternary Research
    • /
    • v.24 no.1
    • /
    • pp.25-33
    • /
    • 2010
  • Holocene vegetation and climate changes were inferred from pollen records from Wonpyeong Trench II-3 of the Osong archaeological site, Cheongju, Chungbuk Province, Korea. Organic matter content in the coarse-grained sediments was low throughout the succession. Although pollen abundance is not high, the dominant and principal taxa indicate vegetation changes in response to climate change in the central inland area of the Korean Peninsula. Ages were estimated indirectly by comparison with previous age-controlled pollen studies. The last-glacial conifer forests appear to have changed into open mixed conifer and deciduous broadleaved forest during the early Holocene. Warmer and more humid climate conditions during the mid-Holocene allowed hardwoods, including deciduous and evergreen broadleaved trees, and warmth-preferring pine to flourish. Subsequently, these forests were replaced by mixed conifer and deciduous broadleaved forest as climate conditions deteriorated during the late Holocene. Human activity is also detected in the uppermost part of the pollen profile by agricultural indicators such as buckwheat and large pollen grains comparable to corn. During this time, the forests in the study area were affected primarily by human disturbance rather than by the natural environment.


Development of Land Surface Model for Soyang river basin (소양강댐 유역에 대한 지표수문모형의 구축)

  • Lee, Jaehyeon;Cho, Huidae;Choi, Minha;Kim, Dongkyun
    • Journal of Korea Water Resources Association
    • /
    • v.50 no.12
    • /
    • pp.837-847
    • /
    • 2017
  • A Land Surface Model (LSM) was developed for the Soyang river basin on the Korean Peninsula to clarify the spatio-temporal variability of hydrometeorological variables. The Variable Infiltration Capacity (VIC) model was used as the LSM, with a spatial resolution of 10 km and a time resolution of 1 day. Based on daily flow data from 2007 to 2010, the model's 7 parameters were calibrated using the Isolated Particle Swarm Optimization algorithm, and the model was verified against daily flow data from 2011 to 2014. The model showed a Nash-Sutcliffe coefficient of 0.90 and a correlation coefficient of 0.95 for both the calibration and validation periods. The hydrometeorological variables estimated for the Soyang river basin reflected well the seasonal concentration of summer rainfall, the variation of shortwave and longwave radiation with temperature change, the change in surface temperature, the increase in canopy evaporation and vegetation, and the corresponding change in total evapotranspiration. The modeled soil moisture was compared with in-situ soil moisture data: the slope of the trend line relating the two was 1.087 and the correlation coefficient was 0.723 for the spring, summer, and fall seasons. The results suggest that the LSM can serve as a powerful tool for developing precise and efficient water resources plans by providing an accurate understanding of the spatio-temporal variation of hydrometeorological variables.
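The Nash-Sutcliffe coefficient reported above is a standard goodness-of-fit measure for flow simulations. A minimal implementation (not the paper's code) is:

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the model error
    sum of squares to the variance sum of the observations. 1.0 is a
    perfect fit; 0.0 means the model is no better than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    var = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / var
```

A value of 0.90, as reported for this model, means the simulation error variance is only 10% of the observed flow variance.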

A Combined Heuristic Algorithm for Preference-based Shortest Path Search (선호도 기반 최단경로 탐색을 위한 휴리스틱 융합 알고리즘)

  • Ok, Seung-Ho;Ahn, Jin-Ho;Kang, Sung-Ho;Moon, Byung-In
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.47 no.8
    • /
    • pp.74-84
    • /
    • 2010
  • In this paper, we propose a preference-based shortest path algorithm that combines Ant Colony Optimization (ACO) with the A* heuristic algorithm. In recent years, with the development of ITS (Intelligent Transportation Systems), there has been a resurgence of interest in shortest path search algorithms for car navigation systems. Most shortest path algorithms, such as Dijkstra's algorithm and A*, aim to find the shortest path by distance or time. However, the shortest path is not always the optimal path for drivers who prefer a slightly longer but more reliable or flexible route. For this reason, we propose a preference-based shortest path algorithm that uses the properties of the links of the map, with link preferences specified by the user of the car navigation system. The proposed algorithm was implemented in C, and experiments were performed on a map of 64 nodes and 118 links. The experimental results show that the proposed algorithm finds preference-based shortest paths as well as distance-shortest paths.
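A preference-weighted variant of A* of the kind described above can be sketched by scaling each link's length by a user preference factor for the link's road class. This is an assumed cost model for illustration only; the paper's actual algorithm additionally combines ACO, which is not reproduced here. Note that the heuristic must remain admissible under the scaled costs (e.g. straight-line distance times the smallest preference factor).

```python
import heapq

def preference_astar(graph, start, goal, heuristic, preference):
    """A* search where each edge cost is its length scaled by a user
    preference weight for the link's road class (hypothetical cost model).

    graph:      node -> list of (neighbor, length, road_class)
    preference: road_class -> weight (< 1.0 means preferred); default 1.0
    """
    open_heap = [(heuristic(start), 0.0, start, [start])]
    settled = {}  # node -> best g found so far
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return g, path
        if settled.get(node, float("inf")) <= g:
            continue
        settled[node] = g
        for nbr, length, cls in graph.get(node, []):
            ng = g + length * preference.get(cls, 1.0)
            heapq.heappush(open_heap, (ng + heuristic(nbr), ng, nbr, path + [nbr]))
    return None
```

With a zero heuristic this reduces to Dijkstra's algorithm on the preference-scaled costs, which makes the weighting effect easy to test on a small graph.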

Wide-area Surveillance Applicable Core Techniques on Ship Detection and Tracking Based on HF Radar Platform (광역감시망 적용을 위한 HF 레이더 기반 선박 검출 및 추적 요소 기술)

  • Cho, Chul Jin;Park, Sangwook;Lee, Younglo;Lee, Sangho;Ko, Hanseok
    • Korean Journal of Remote Sensing
    • /
    • v.34 no.2_2
    • /
    • pp.313-326
    • /
    • 2018
  • This paper introduces core techniques for ship detection and tracking based on a compact HF radar platform, which are necessary to establish a wide-area surveillance network. Currently, most HF radar sites are optimized primarily for observing sea surface radial velocities and bearings, so ship detection systems are vulnerable to error sources such as environmental noise and clutter when applied to these systems built for surface current observation. In addition, due to Korea's geographical features, only compact HF radars, which generate a non-uniform antenna response and provide no direct target information, are applicable. The ship detection and tracking techniques discussed in this paper consider these practical conditions and were evaluated on real data collected from the Yellow Sea, Korea. The proposed method is composed of two parts. In the first part, ship detection, a constant false alarm rate (CFAR) detector is applied, enhanced by a PCA subspace decomposition method that reduces noise. To merge multiple detections originating from a single target due to the Doppler effect during long CPIs, a clustering method is applied. Finally, a data association framework eliminates false detections by considering ship maneuvering over time. The evaluation results show that the proposed method produces satisfactory results within certain ranges.
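The CFAR detection step can be illustrated with a basic one-dimensional cell-averaging CFAR. The guard/training window sizes and threshold scale below are assumed illustrative values, and the paper's PCA-based noise reduction and clustering stages are omitted.

```python
def ca_cfar(power, guard=2, train=8, scale=3.0):
    """Cell-averaging CFAR on a 1-D power profile. A cell is declared a
    detection when its power exceeds `scale` times the mean of the
    surrounding training cells, excluding `guard` cells on each side,
    so the detection threshold adapts to the local noise level."""
    n = len(power)
    hits = []
    for i in range(n):
        cells = [power[j]
                 for j in range(i - guard - train, i + guard + train + 1)
                 if 0 <= j < n and abs(j - i) > guard]
        if cells and power[i] > scale * (sum(cells) / len(cells)):
            hits.append(i)
    return hits
```

Because the threshold tracks the local average, the false alarm rate stays roughly constant as the background noise and clutter level varies along the profile, which is exactly why CFAR suits the noisy surface-current radar data described above.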

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.89-106
    • /
    • 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the requirements for analysis and utilization. Because many industries lack skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. In this situation, demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and re-identification (Re-ID) has also grown rapidly. However, object detection and tracking face many difficulties that degrade performance, such as an object re-appearing after leaving the recording location, and occlusion. Accordingly, action and emotion detection models built on object detection and tracking also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and the AWS Rekognition emotion recognition service. The proposed system uses single-linkage hierarchical clustering for Re-ID together with processing methods that maximize hardware throughput. It is more accurate than a re-identification model using simple metrics, achieves near real-time processing, and prevents tracking failures due to object departure and re-appearance, occlusion, and similar causes. By continuously linking each object's action and facial emotion detection results to the same object, videos can be analyzed efficiently.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the tracking model in every frame, and applies single-linkage hierarchical clustering over the feature vectors from past frames to identify the same object when tracking fails. Through this process, an object that could not be tracked because of re-appearance or occlusion after leaving the scene can be re-identified, so the action and facial emotion detection results of the newly recognized object can be linked to those of the object that appeared in the past. To improve processing performance, we introduce a per-object Bounding Box Queue and a Feature Queue method that reduce RAM requirements while maximizing GPU memory throughput, and we introduce the IoF (Intersection over Face) algorithm, which links facial emotions recognized through AWS Rekognition with object tracking information. The academic significance of this study is that, with the proposed processing techniques, the two-stage re-identification model achieves near real-time performance even in a computationally expensive environment that also performs action and facial emotion detection, without falling back on simple metrics and sacrificing accuracy. The practical implication is that industries which require action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed system. With its high re-identification accuracy and processing performance, the system can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, experiments should be conducted on the MOT Challenge dataset, a benchmark used in many international conferences. We will also investigate the cases the IoF algorithm cannot handle in order to develop a complementary algorithm, and we plan to apply this model to datasets from various fields related to intelligent video analysis.

Estimation of GARCH Models and Performance Analysis of Volatility Trading System using Support Vector Regression (Support Vector Regression을 이용한 GARCH 모형의 추정과 투자전략의 성과분석)

  • Kim, Sun Woong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.107-122
    • /
    • 2017
  • Volatility in stock market returns is a measure of investment risk. It plays a central role in portfolio optimization, asset pricing, and risk management, as well as in most theoretical financial models. Engle (1982) presented a pioneering model of stock market volatility, Autoregressive Conditional Heteroscedasticity (ARCH), which explains the time-variant characteristics embedded in return volatility; Bollerslev (1986) generalized it as the GARCH models. Empirical studies have shown that GARCH models describe well the fat-tailed return distributions and volatility clustering observed in stock prices. The parameters of GARCH models are generally estimated by maximum likelihood estimation (MLE) based on the standard normal density. Since Black Monday in 1987, however, stock prices have become very complex and noisy, and recent studies have started to apply artificial intelligence approaches to estimating the GARCH parameters as a substitute for MLE. This paper presents an SVR-based GARCH estimation process and compares it with the MLE-based process for estimating the parameters of GARCH models, which are known to forecast stock market volatility well. The kernel functions used in the SVR estimation are linear, polynomial, and radial. We analyzed the suggested models on the KOSPI 200 Index, which comprises 200 blue-chip stocks listed on the Korea Exchange. We sampled KOSPI 200 daily closing values from 2010 to 2015, giving 1487 observations; 1187 days were used to train the suggested GARCH models and the remaining 300 days were used as testing data. First, symmetric and asymmetric GARCH models were estimated by MLE. We forecasted KOSPI 200 Index return volatility, and the MSE metric shows better results for the asymmetric GARCH models such as E-GARCH and GJR-GARCH, consistent with the documented non-normal return distribution characteristics of fat tails and leptokurtosis.
Compared with the MLE estimation process, SVR-based GARCH models outperform the MLE methodology in KOSPI 200 Index return volatility forecasting; the polynomial kernel, however, shows exceptionally low forecasting accuracy. We suggest an Intelligent Volatility Trading System (IVTS) that utilizes the forecasted volatility. The IVTS entry rules are as follows: if tomorrow's forecasted volatility increases, buy volatility today; if it decreases, sell volatility today; if the forecasted direction does not change, hold the existing position. The IVTS is assumed to buy and sell historical volatility values, which is somewhat unrealistic because historical volatility values themselves cannot be traded; our simulation results are nevertheless meaningful, since the Korea Exchange introduced a tradable volatility futures contract in November 2014. The trading systems with SVR-based GARCH models show higher returns than MLE-based GARCH in the testing period. The profitable-trade percentages of MLE-based GARCH IVTS models range from 47.5% to 50.0%, while those of SVR-based GARCH IVTS models range from 51.8% to 59.7%. MLE-based symmetric S-GARCH returns +150.2% while SVR-based symmetric S-GARCH returns +526.4%; MLE-based asymmetric E-GARCH returns -72% while SVR-based E-GARCH returns +245.6%; MLE-based asymmetric GJR-GARCH returns -98.7% while SVR-based GJR-GARCH returns +126.3%. The linear kernel shows higher trading returns than the radial kernel. The best performance of SVR-based IVTS is +526.4%, versus +150.2% for MLE-based IVTS; SVR-based GARCH IVTS also trades more frequently. This study has some limitations. Our models are based solely on SVR; other artificial intelligence models should be explored for better performance. We also do not consider trading costs such as brokerage commissions and slippage.
The IVTS trading performance is likewise unrealistic in that we use historical volatility values as the trading object. Exact forecasting of stock market volatility is essential in real trading as well as in asset pricing models, and further studies on other machine learning-based GARCH models can give better information to stock market investors.
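The GARCH(1,1) conditional-variance recursion that both estimation approaches fit can be written in a few lines. The parameter values in the usage below are illustrative only, not estimates from the paper.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance recursion of a GARCH(1,1) model:
        sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}
    Initialized at the sample variance of the returns (a common convention)."""
    n = len(returns)
    mean = sum(returns) / n
    sigma2 = [sum((r - mean) ** 2 for r in returns) / n]
    for t in range(1, n):
        sigma2.append(omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2
```

MLE chooses (omega, alpha, beta) to maximize the likelihood of the returns under this recursion; the paper's SVR approach instead learns the mapping from lagged squared returns and variances to the next variance, which is what makes the comparison above meaningful.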