• Title/Summary/Keyword: Separate Networks


Optimization of Support Vector Machines for Financial Forecasting (재무예측을 위한 Support Vector Machine의 최적화)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.241-254 / 2011
  • Financial time-series forecasting is one of the most important research issues because it is essential for the risk management of financial institutions. Researchers have therefore tried to forecast financial time series using various data mining techniques such as regression, artificial neural networks, decision trees, and k-nearest neighbor. Recently, support vector machines (SVMs) have been widely applied in this area because they do not require huge training data and carry a low risk of overfitting. However, a user must determine several design factors heuristically in order to use an SVM: the choice of kernel function, its parameters, and proper feature subset selection are major design factors. Beyond these, proper selection of an instance subset may also improve the forecasting performance of an SVM by eliminating irrelevant and distorting training instances. Nonetheless, few studies have applied instance selection to SVMs, especially in the domain of stock market prediction. Instance selection tries to choose a proper instance subset from the original training data; it may be considered a method of knowledge refinement that maintains the instance base. This study proposes a novel instance selection algorithm for SVMs that uses a genetic algorithm (GA) to optimize the instance selection process and the kernel parameters simultaneously. We call this model ISVM (SVM with Instance Selection). Experiments on stock market data are carried out using ISVM. The GA searches for optimal or near-optimal kernel parameter values and relevant instances for the SVM, so the GA setting requires two sets of codes in each chromosome: one for the kernel parameters and one for instance selection.
For the controlling parameters of the GA search, the population size is set at 50 organisms, the crossover rate at 0.7, and the mutation rate at 0.1. As the stopping condition, 50 generations are permitted. The application data consist of technical indicators and the direction of change in the daily Korea Composite Stock Price Index (KOSPI). The total number of samples is 2,218 trading days. We separate the whole data into three subsets: training, test, and hold-out sets of 1,056, 581, and 581 observations, respectively. This study compares ISVM to several comparative models including logistic regression (Logit), backpropagation neural networks (ANN), nearest neighbor (1-NN), conventional SVM (SVM), and SVM with GA-optimized kernel parameters (PSVM). The experimental results show that ISVM outperforms 1-NN by 15.32%, ANN by 6.89%, Logit and SVM by 5.34%, and PSVM by 4.82% on the hold-out data. For ISVM, only 556 of the 1,056 original training instances are used to produce this result. In addition, the two-sample test for proportions is used to examine whether ISVM significantly outperforms the comparative models. The results indicate that ISVM outperforms ANN and 1-NN at the 1% statistical significance level, and performs better than Logit, SVM, and PSVM at the 5% level.
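The chromosome design described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the first two genes encode the kernel parameters C and gamma, and the remaining genes act as a binary instance-selection mask. The population size (50), crossover rate (0.7), and mutation rate (0.1) follow the abstract; the data set, fitness function, parameter ranges, and selection scheme are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.4, random_state=0)
n = len(X_tr)

def decode(chrom):
    # first two genes in [0,1] map to C and gamma; the rest is the instance mask
    C = 10 ** (chrom[0] * 4 - 1)        # assumed range: C in [0.1, 1000]
    gamma = 10 ** (chrom[1] * 4 - 3)    # assumed range: gamma in [0.001, 10]
    mask = chrom[2:] > 0.5
    return C, gamma, mask

def fitness(chrom):
    C, gamma, mask = decode(chrom)
    # guard against degenerate instance subsets
    if mask.sum() < 10 or len(np.unique(y_tr[mask])) < 2:
        return 0.0
    clf = SVC(C=C, gamma=gamma).fit(X_tr[mask], y_tr[mask])
    return clf.score(X_va, y_va)        # validation accuracy as fitness

pop = rng.random((50, 2 + n))           # population of 50 chromosomes
for gen in range(20):                   # the paper permits up to 50 generations
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[-25:]]       # keep the fitter half
    children = []
    while len(children) < 50:
        a, b = parents[rng.integers(25, size=2)]
        cut = rng.integers(1, a.size)             # one-point crossover at rate 0.7
        child = np.concatenate([a[:cut], b[cut:]]) if rng.random() < 0.7 else a.copy()
        mut = rng.random(child.size) < 0.1        # mutation rate 0.1
        child[mut] = rng.random(mut.sum())
        children.append(child)
    pop = np.array(children)

best = pop[np.argmax([fitness(c) for c in pop])]
C, gamma, mask = decode(best)
print(f"selected {mask.sum()} of {n} instances, C={C:.3g}, gamma={gamma:.3g}")
```

Because the mask and the kernel parameters evolve in the same chromosome, instance selection and parameter tuning are optimized jointly rather than in separate passes.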

A Study on Daytime Transparent Cloud Detection through Machine Learning: Using GK-2A/AMI (기계학습을 통한 주간 반투명 구름탐지 연구: GK-2A/AMI를 이용하여)

  • Byeon, Yugyeong;Jin, Donghyun;Seong, Noh-hun;Woo, Jongho;Jeon, Uujin;Han, Kyung-Soo
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1181-1189 / 2022
  • Clouds are composed of tiny water droplets, ice crystals, or mixtures of both suspended in the atmosphere, and they cover about two-thirds of the Earth's surface. Cloud detection in satellite images is a very difficult task because clouds have reflectance characteristics similar to some ground objects or the ground surface itself. In contrast to thick clouds, which have distinct characteristics, thin transparent clouds show weak contrast against the background in satellite images and appear mixed with the ground surface. To overcome these limitations, this study conducted cloud detection focusing on transparent clouds using machine learning techniques (Random Forest [RF] and Convolutional Neural Networks [CNN]). As reference data, the Cloud Mask and Cirrus Mask from the MOD35 product of the MOderate Resolution Imaging Spectroradiometer (MODIS) were used, and the pixel ratio of the training data was configured to be about 1:1:1 for clouds, transparent clouds, and clear sky so that transparent cloud pixels were adequately represented during model training. In the qualitative comparison, both RF and CNN successfully detected various types of clouds, including transparent clouds, and RF+CNN, which combines the results of the two models, performed cloud detection well, confirming that the limitations of the individual models were improved. Quantitatively, the overall accuracy (OA) of RF was 92%, while CNN reached 94.11% and RF+CNN reached 94.29%.
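The per-pixel, three-class setup described above (cloud / transparent cloud / clear sky at roughly a 1:1:1 ratio) can be sketched with a random forest as follows. The spectral features and class distributions here are synthetic stand-ins, not actual GK-2A/AMI channels or MOD35 labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n_per_class = 500                                 # balanced 1:1:1 sampling
means = {0: 0.8, 1: 0.45, 2: 0.15}                # cloud, transparent cloud, clear sky
# six synthetic "spectral" features per pixel, clustered around each class mean
X = np.vstack([rng.normal(m, 0.1, (n_per_class, 6)) for m in means.values()])
y = np.repeat([0, 1, 2], n_per_class)

rf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)
acc = rf.score(X, y)   # training accuracy only; a real study evaluates held-out scenes
print(f"OA on training pixels: {acc:.2%}")
```

Balancing the three classes before training is the key step for this problem: without it, the comparatively rare transparent-cloud pixels would be drowned out by thick cloud and clear-sky pixels.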

A Study for Factors Influencing the Usage Increase and Decrease of Mobile Data Service: Based on The Two Factor Theory (모바일 데이터 서비스 사용량 증감에 영향을 미치는 요인들에 관한 연구: 이요인 이론(Two Factor Theory)을 바탕으로)

  • Lee, Sang-Hoon;Kim, Il-Kyung;Lee, Ho-Geun;Park, Hyun-Jee
    • Asia Pacific Journal of Information Systems / v.17 no.2 / pp.97-122 / 2007
  • Conventional networking and telecommunications infrastructure, characterized by wires, fixed locations, and inflexibility, is giving way to mobile technologies. Numerous research reports point to the ultimate dominance of wireless communication. With the increasing prevalence of advanced cell phones, various mobile data services (hereafter MDS) are gaining popularity. Although cellular networks were originally introduced for voice communication, statistics indicate that data services are replacing the matured voice service as the growth engine for telecom service providers. For example, SK Telecom, Korea's largest mobile service provider, reported that 25.6% of revenue and 28.5% of profit came from MDS in 2006, and the share is growing. Statistics also indicate that, in 2006, the average revenue per user (ARPU) for voice did not change while that for MDS grew seven percent from the previous year, further highlighting its growth potential. MDS is defined as "an assortment of digital data services that can be accessed using a mobile device over a wide geographic area." A variety of MDS have been deployed, with a few reaching the status of killer applications. Many of them access the Internet through the cellular-phone infrastructure. In the past, when the cellular network did not have acceptable bandwidth for data services, SMS (short messaging service) dominated MDS. Now, Internet-ready, next-generation cell phones are driving rich digital data services into the fabric of everyday life. These include news on various topics, Internet search, mapping and location-based information, mobile banking and gaming, downloading (e.g., screen savers), multimedia streaming, and various communication services (e.g., email, short messaging, messenger, and chatting). The huge economic stake MDS represents for its stakeholders warrants focused research to understand the dynamics behind its adoption.
Lyytinen and Yoo (2002) pointed out the limitation of traditional adoption models in explaining the rapid diffusion of innovations such as P2P or mobile services. Moreover, despite the increasing popularity of MDS, an unexpected drop in usage has been observed among some users. Intrigued by these observations, an exploratory study was conducted to examine the decision factors of MDS usage. Data analysis revealed that increases and decreases in MDS use were influenced by different forces. These findings triggered our confirmatory research effort to validate the uni-directionality of the studied factors in affecting MDS usage. This differs from extant studies of IS/IT adoption, which are largely grounded on the assumption of bi-directionality of explanatory variables in determining the level of dependent variables (e.g., user satisfaction, service usage). The research goal is, therefore, to examine whether increases and decreases in the usage of MDS are explained by two separate groups of variables pertaining to information quality and system quality. To this end, we investigate the following research questions: (1) Does the information quality of MDS increase service usage? (2) Does the system quality of MDS decrease service usage? (3) Does user motivation for subscribing to MDS moderate the effect that information and system quality have on service usage? The research questions and subsequent analysis are grounded in the two-factor theory pioneered by Herzberg et al. (1959). To answer them, an exploratory study based on 378 survey responses was first conducted to learn about important decision factors of MDS usage. It revealed a discrepancy between the forces influencing usage increase and those influencing usage decrease. Based on these findings and the two-factor theory, we postulated information quality as the motivator and system quality as the de-motivator (or hygiene factor) of MDS. A confirmatory study was then undertaken on their respective roles in encouraging and discouraging the usage of mobile data service.

Analyzing animation techniques used in webtoons and their potential issues (웹툰 연출의 애니메이션 기법활용과 문제점 분석)

  • Kim, Yu-mi
    • Cartoon and Animation Studies / s.46 / pp.85-106 / 2017
  • With the media's shift into the digital era in the 2000s, comic book publishers attempted a transition into the new medium by establishing a distribution structure over internet networks, but that effort stopped short of escaping the parallel-page reading structure of traditional comics. Webtoons, on the other hand, show diverse changes that redesign the structure of traditional sequential art media: they tend to separate and allot spaces according to the vertical-scroll reading method of the internet browser and to include animations, sound effects, and background music. This trend is also in accordance with the preferences of modern readers. Modern society has complicated social structures shaped by the development of various media; the public is therefore exposed to different stimuli and shows differentiated patterns of perception. In other words, while traditional comics require the withdrawn, immersive mode of appreciation typical of other published media, webtoons are more immediate and entertaining, inserting sounds and using moving texts and characters in specific frames. Motion in webtoons is applied selectively, for dramatic tension or to create an effective expression of action. For example, hand-drawn animation is adopted to express motion by dividing motion images into many layers. Sound is also utilized: background music with episode-related lyrics, melodies, ambient sounds, and motion-related sound effects. In addition, webtoons provide readers with new amusement through tactile stimuli via the vibration of a smartphone. The vertical orientation, the time-based nature of animated motion, and the tactile stimuli used in webtoons thus differentiate them from published comics. However, webtoons' use of these innovative techniques has not yet reached its full potential.
Beyond the fact that the software used for webtoon effects is operationally complex, this is a transitional phenomenon: there is still a lack of technical understanding of animation and sound application among practitioners. For example, a sound might be programmed to play when a specific frame scrolls into view on the screen, but the frame may be scrolled faster or slower than the author intended; in that case, the sound can end before or after the reader sees the whole image. The motion of each frame is triggered in a similar fashion, so a reader's scroll speed determines the motion's timing. For this reason, motions can miss their intended timing and feel unnatural because they play out of context, and sound effects that finish at the wrong moment can disturb readers' concentration. These problems stem from a shortage of continuity; to solve them, naturally activated, consecutive sounds or animations, such as the simple rotation of joints when a character moves, are required.

Recent Progress in Air Conditioning and Refrigeration Research - A Review of papers Published in the Korean Journal of Air-Conditioning and Refrigeration Engineering in 1998 and 1999 - (공기조화, 냉동 분야의 최근 연구 동향 - 1998년 1999년 학회지 논문에 대한 종합적 고찰 -)

  • 이재헌;김광우;김병주;이재효;김우승;조형희;김민수
    • Korean Journal of Air-Conditioning and Refrigeration Engineering / v.12 no.12 / pp.1098-1125 / 2000
  • A review of the papers published in the Korean Journal of Air-Conditioning and Refrigeration Engineering in 1998 and 1999 has been conducted, focusing on the current status of research in heating, cooling, ventilation, sanitation, and building environment. The conclusions are as follows. 1) A review of the recent studies on fluid flow, turbomachinery, and pipe networks shows that many experimental investigations concern applications of impingement jets. Research on turbulent flows, pipe flows, and pipe networks is focused on analyses of practical systems and prediction of system performance. Results on noise reduction in turbomachinery are also reported. 2) A review of the recent studies on heat transfer analysis and heat exchangers shows many papers on channel flow with application to heat exchanger design. Various experimental and numerical papers on heat exchangers were also published; however, few papers were available on the analysis of whole systems including the heat exchanger. 3) Recent studies on heat pump systems have focused on multi-type systems and on heat pump cycles that utilize treated sewage as the heat source. Defrosting and frosting behavior in fin-tube heat exchangers was experimentally examined by several authors. Several papers on ice-storage cooling systems present dynamic simulation programs and optimal operating conditions. A study on micro heat pipes for cooling high-power electronic components examines the characteristics of the heat and mass transfer processes involved. In addition, a new type of separate thermosyphon is studied experimentally. 4) Recent studies on refrigeration/air-conditioning systems have focused on system performance and efficiency for new alternative refrigerants. New systems operating with natural refrigerants are drawing much attention.
In addition, the evaporation and condensation heat transfer characteristics of traditional and new refrigerants are investigated for plain tubes and also for microfin tubes. Capillary tubes and orifices are the main topics of research on expansion devices, and studies on the thermophysical properties of new refrigerants and refrigerant/oil mixtures are widely carried out. 5) A review of the recent studies on absorption cooling systems shows numerous experimental and analytical studies on improving absorber performance. Dynamic analyses of compressors have been performed to understand their vibration characteristics. However, research on two-phase flow and heat transfer, which arise in refrigeration systems and various phase-change heat exchangers, appeared insufficient. 6) A review of recent studies on duct systems shows that methods for circuit analysis and flow balancing have been presented. Research on ventilation is focused on measuring ventilation efficiency and on how that efficiency varies with ventilation method, through numerous experimental and numerical studies. Furthermore, many studies have been conducted in real buildings in order to assess indoor thermal environments. Much work aimed at informing cooling tower design has been performed but remains insufficient. 7) A review of the recent studies on architectural thermal environments and building mechanical system design shows that thermal comfort analysis in sitting environments, thermal performance analysis of Korean traditional building structures, and evaluation of building environmental loads have been performed. However, research to improve the performance of mechanical system design and construction technology appeared insufficient.


Deep Learning Approaches for Accurate Weed Area Assessment in Maize Fields (딥러닝 기반 옥수수 포장의 잡초 면적 평가)

  • Hyeok-jin Bak;Dongwon Kwon;Wan-Gyu Sang;Ho-young Ban;Sungyul Chang;Jae-Kyeong Baek;Yun-Ho Lee;Woo-jin Im;Myung-chul Seo;Jung-Il Cho
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.1 / pp.17-27 / 2023
  • Weeds are one of the factors that reduce crop yield through nutrient and photosynthetic competition. Quantification of weed density is an important part of making accurate decisions for precision weeding. In this study, we quantified the density of weeds in images of maize fields taken by an unmanned aerial vehicle (UAV). UAV image data were collected in maize fields from May 17 to June 4, 2021, when the maize was in its early growth stage. The UAV images were labeled into maize and non-maize pixels and then cropped to serve as input data for the semantic segmentation networks of the maize detection model. We trained models to separate maize from background using the deep learning segmentation networks DeepLabV3+, U-Net, LinkNet, and FPN. All four models showed a pixel accuracy of 0.97, and the mIoU scores of DeepLabV3+ and U-Net were 0.76 and 0.74, higher than the 0.69 of LinkNet and FPN. Weed density was calculated as the difference between the green area classified by ExGR (excess green minus excess red) and the maize area predicted by the model. The evaluated images were then recombined to quantify and visualize the distribution and density of weeds across wide maize fields. We propose a method to quantify weed density for accurate weeding by effectively separating weeds, maize, and background in UAV images of maize fields.
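The weed-density computation described above can be sketched as follows: vegetation pixels from the ExGR index minus model-predicted maize pixels. The image tile and the maize mask here are synthetic stand-ins for the UAV orthomosaic and the segmentation output; the index definitions (ExG = 2g - r - b on chromatic coordinates, ExR = 1.4r - g) follow common usage in the vegetation-index literature.

```python
import numpy as np

rng = np.random.default_rng(2)
img = rng.random((64, 64, 3))                     # stand-in RGB orthomosaic tile
r, g, b = img[..., 0], img[..., 1], img[..., 2]
total = r + g + b + 1e-8
rn, gn, bn = r / total, g / total, b / total      # chromatic coordinates

exg = 2 * gn - rn - bn                            # excess green
exr = 1.4 * rn - gn                               # excess red
exgr = exg - exr
green_mask = exgr > 0                             # vegetation = maize + weeds

maize_mask = rng.random((64, 64)) > 0.7           # stand-in segmentation output
weed_mask = green_mask & ~maize_mask              # vegetation that is not maize
weed_density = weed_mask.sum() / weed_mask.size
print(f"weed density: {weed_density:.1%}")
```

In the study's pipeline the maize mask would come from the trained segmentation network rather than random values, and per-tile densities are stitched back together to map weed distribution over the whole field.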

Development and application of prediction model of hyperlipidemia using SVM and meta-learning algorithm (SVM과 meta-learning algorithm을 이용한 고지혈증 유병 예측모형 개발과 활용)

  • Lee, Seulki;Shin, Taeksoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.111-124 / 2018
  • This study aims to develop a classification model for predicting the occurrence of hyperlipidemia, one of the major chronic diseases. Prior studies applying data mining techniques to disease prediction can be classified into model-design studies for predicting cardiovascular disease and studies comparing disease prediction results. In the foreign literature, studies predicting cardiovascular disease with data mining techniques were predominant. Domestic studies were not much different, although they mainly focused on hypertension and diabetes. Since hyperlipidemia is a chronic disease of high importance alongside hypertension and diabetes, this study selected it as the disease to be analyzed. We developed a model for predicting hyperlipidemia using SVM and meta-learning algorithms, which are already known to have excellent predictive power. To achieve the purpose of this study, we used the 2012 Korea Health Panel data set. The Korea Health Panel produces basic data on health expenditure, health level, and health behavior, and has conducted an annual survey since 2008. In this study, 1,088 patients with hyperlipidemia were randomly selected from the inpatient, outpatient, emergency, and chronic disease data of the 2012 Korea Health Panel, and 1,088 non-patients were also randomly extracted, for a total of 2,176 subjects. Three methods were used to select input variables for predicting hyperlipidemia. First, a stepwise method was performed using logistic regression. Among the 17 variables, the categorical variables (except for length of smoking) were expressed as dummy variables, treated as separate variables relative to the reference group, and analyzed.
Six variables (age, BMI, education level, marital status, smoking status, and gender), excluding income level and smoking period, were selected at the 0.1 significance level. Second, the C4.5 decision tree algorithm was used; the significant input variables were age, smoking status, and education level. Finally, a genetic algorithm was used: for SVM it selected six input variables (age, marital status, education level, economic activity, smoking period, and physical activity status), and for the artificial neural network it selected three (age, marital status, and education level). Based on the selected variables, we compared SVM, the meta-learning algorithm, and other prediction models for hyperlipidemia, and compared classification performance using TP rate and precision. The main results of the analysis are as follows. First, the accuracy of the SVM was 88.4% and that of the artificial neural network was 86.7%. Second, the accuracy of classification models using the input variables selected through the stepwise method was slightly higher than that of models using all variables. Third, the precision of the artificial neural network was higher than that of the SVM when only the three variables selected by the decision tree were used as inputs. With the input variables selected through the genetic algorithm, the classification accuracy of SVM was 88.5% and that of the artificial neural network was 87.9%. Finally, this study showed that stacking, the meta-learning algorithm proposed here, performs best when it uses the predicted outputs of SVM and MLP as input variables of an SVM meta-classifier. The purpose of this study was to predict hyperlipidemia, one of the representative chronic diseases.
To do this, we used SVM and meta-learning algorithms, which are known to have high accuracy. As a result, the classification accuracy of stacking as a meta-learner was higher than that of the other meta-learning algorithms, although the predictive performance of the proposed meta-learning algorithm only matches that of the best-performing single model, SVM (88.6%). The limitations of this study are as follows. Although various variable selection methods were tried, most variables used in the study were categorical dummy variables. With a large number of categorical variables, the results may differ if continuous variables are used, because models such as decision trees can suit categorical variables better than models such as neural networks. Despite these limitations, this study has significance in predicting hyperlipidemia with hybrid models such as meta-learning algorithms, which had not been studied previously, and the improvement in model accuracy achieved by applying various variable selection techniques is also meaningful. In addition, we expect the proposed model to be effective for the prevention and management of hyperlipidemia.
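The stacking setup the study found best — SVM and MLP as base learners whose predictions feed an SVM meta-classifier — can be sketched with scikit-learn as follows. The data set and hyperparameters are illustrative stand-ins, not the Korea Health Panel variables.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=6, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=3)

stack = StackingClassifier(
    estimators=[                       # base level: SVM and MLP
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True))),
        ("mlp", make_pipeline(StandardScaler(),
                              MLPClassifier(max_iter=1000, random_state=3))),
    ],
    final_estimator=SVC(),             # SVM as the meta-classifier
    cv=5,                              # out-of-fold predictions feed the meta level
)
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
print(f"stacking accuracy: {acc:.1%}")
```

The `cv=5` setting matters: the meta-classifier is trained on out-of-fold predictions of the base learners, which prevents it from simply memorizing base-learner outputs on the training data.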

Development of a Stock Trading System Using M & W Wave Patterns and Genetic Algorithms (M&W 파동 패턴과 유전자 알고리즘을 이용한 주식 매매 시스템 개발)

  • Yang, Hoonseok;Kim, Sunwoong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.63-83 / 2019
  • Investors prefer to look for trading points based on the shapes shown in a chart rather than on complex analyses such as corporate intrinsic value analysis or technical indicator analysis. However, pattern analysis is difficult and has been computerized less than users need. In recent years, there have been many studies of stock price patterns using various machine learning techniques, including neural networks, in the field of artificial intelligence (AI). In particular, the development of IT has made it easier to analyze huge volumes of chart data to find patterns that can predict stock prices. Although short-term price forecasting power has improved, long-term forecasting power remains limited, so such methods are used in short-term trading rather than long-term investment. Other studies have focused on mechanically and accurately identifying patterns that past technology could not recognize, but this can be vulnerable in practice because whether the patterns found are suitable for trading is a separate matter. When such studies find a meaningful pattern, they locate a point that matches it and measure performance after n days, assuming a purchase at that point in time. Since this approach calculates virtual revenues, there can be many disparities with reality. Existing research tries to find patterns with stock price predictive power; this study instead proposes to define the patterns first and to trade when a pattern with a high success probability appears. The M & W wave patterns published by Merrill (1980) are simple because they can be distinguished by five turning points. Despite reports that some patterns have price predictability, there were no performance reports from use in the actual market. The simplicity of a pattern consisting of five turning points has the advantage of reducing the cost of increasing pattern recognition accuracy.
In this study, the 16 up-conversion patterns and 16 down-conversion patterns are reclassified into ten groups so that they can be easily implemented by the system, and only the one pattern with the highest success rate in each group is selected for trading. Patterns that had a high probability of success in the past are likely to succeed in the future, so we trade when such a pattern occurs. The evaluation reflects a real situation because performance is measured assuming that both the buy and the sell were executed. We tested three ways to calculate the turning points. The first, the minimum-change-rate zig-zag method, removes price movements below a certain percentage and then calculates the vertices. In the second, the high-low-line zig-zag method, a high price that meets the n-day high price line is taken as a peak, and a low price that meets the n-day low price line is taken as a valley. In the third, the swing wave method, a central high price higher than the n high prices on its left and right is taken as a peak, and a central low price lower than the n low prices on its left and right is taken as a valley. The swing wave method was superior to the other methods in our tests; we interpret this to mean that trading after confirming the completion of a pattern is more effective than trading while the pattern is still unfinished. Because the number of cases was too large to search exhaustively in this simulation, genetic algorithms (GA) were the most suitable solution for finding patterns with high success rates. We also performed the simulation using the walk-forward analysis (WFA) method, which tests the training section and the application section separately, so we were able to respond appropriately to market changes. In this study, we optimize at the level of the stock portfolio because there is a risk of over-optimization if we optimize the variables for each individual stock.
We therefore selected 20 constituent stocks to increase the effect of diversified investment while avoiding over-optimization. We tested the KOSPI market by dividing it into six categories. In the results, the small-cap stock portfolio was the most successful, and the high-volatility stock portfolio was the second best. This shows that patterns need some price volatility in order to take shape, but that higher volatility is not always better.
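The swing-wave turning-point rule described above can be sketched as follows: a bar whose high exceeds the n highs on both sides is a peak, and a bar whose low is below the n lows on both sides is a valley. The price series here is synthetic, and the window size n is an assumed parameter.

```python
import numpy as np

def swing_points(high, low, n=3):
    """Return indices of swing-wave peaks and valleys."""
    peaks, valleys = [], []
    for i in range(n, len(high) - n):
        # the n bars on each side of bar i, excluding bar i itself
        side_highs = np.concatenate([high[i - n:i], high[i + 1:i + n + 1]])
        side_lows = np.concatenate([low[i - n:i], low[i + 1:i + n + 1]])
        if high[i] > side_highs.max():
            peaks.append(i)               # central high above all n neighbors
        if low[i] < side_lows.min():
            valleys.append(i)             # central low below all n neighbors
    return peaks, valleys

rng = np.random.default_rng(4)
close = np.cumsum(rng.normal(0, 1, 200)) + 100    # synthetic price path
high, low = close + 0.5, close - 0.5
peaks, valleys = swing_points(high, low, n=3)
print(f"{len(peaks)} peaks, {len(valleys)} valleys")
```

Alternating runs of five such turning points are what form the M & W shapes the trading system matches against; larger n yields fewer, more significant turning points.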