• Title/Summary/Keyword: Information Management Model


A Case Study: Improvement of Wind Risk Prediction by Reclassifying the Detection Results (풍해 예측 결과 재분류를 통한 위험 감지확률의 개선 연구)

  • Kim, Soo-ock;Hwang, Kyu-Hong
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.23 no.3
    • /
    • pp.149-155
    • /
    • 2021
  • Early warning systems for weather risk management in the agricultural sector have been developed to predict potential wind damage to crops. These systems use the daily maximum wind speed to determine the critical wind speed that causes fruit drop and provide weather risk information to farmers. To increase the accuracy of wind risk predictions, an artificial neural network for binary classification was implemented. In the present study, daily wind speed and other weather data measured in 2019 at weather stations at sites of interest in Jeollabuk-do and Jeollanam-do, as well as Gyeongsangbuk-do and part of Gyeongsangnam-do, were used to train the neural network. These weather stations include 210 synoptic and automated weather stations operated by the Korea Meteorological Administration (KMA). The wind speed data collected at the same locations between January 1 and December 12, 2020 were used to validate the neural network model. The data collected from December 13, 2020 to February 18, 2021 were used to evaluate the wind risk prediction performance before and after the use of the artificial neural network. The critical wind speed for damage risk was set to 11 m/s, which is the wind speed reported to cause fruit drop and damage. Furthermore, the maximum wind speeds were expressed using the Weibull probability density function for wind damage warnings. The accuracy of wind damage risk prediction improved from 65.36% to 93.62% after reclassification using the artificial neural network. However, the error rate also increased, from 13.46% to 37.64%. The machine learning approach used in the present study is therefore likely to benefit cases where a failure of the risk warning system to issue a prediction is a relatively serious issue.
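
The abstract mentions expressing maximum wind speeds with a Weibull distribution and a critical wind speed of 11 m/s. As a rough illustration only, the sketch below computes the probability that the maximum wind speed exceeds that threshold under a two-parameter Weibull model. The shape and scale values and the 0.5 warning cutoff are hypothetical and are not taken from the paper.

```python
import math

CRITICAL_WIND_SPEED = 11.0  # m/s, the damage threshold stated in the abstract


def weibull_exceedance(v, shape, scale):
    """P(V > v) for a two-parameter Weibull distribution: exp(-(v/scale)**shape)."""
    if v <= 0:
        return 1.0
    return math.exp(-((v / scale) ** shape))


def wind_risk(shape, scale, warn_prob=0.5):
    """Return the exceedance probability at 11 m/s and whether to warn.

    warn_prob is a hypothetical cutoff chosen for this sketch.
    """
    p = weibull_exceedance(CRITICAL_WIND_SPEED, shape, scale)
    return p, p >= warn_prob


# Hypothetical Weibull parameters for one station and day.
p, warn = wind_risk(shape=2.0, scale=9.0)
print(f"P(max wind > 11 m/s) = {p:.3f}, warn = {warn}")
```

In practice the shape and scale would be fitted to observed or forecast wind speeds; the function only shows how a fitted distribution turns into a warning decision.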

A Study on World University Evaluation Systems: Focusing on U-Multirank of the European Union (유럽연합의 세계 대학 평가시스템 '유-멀티랭크' 연구)

  • Lee, Tae-Young
    • Korean Journal of Comparative Education
    • /
    • v.27 no.4
    • /
    • pp.187-209
    • /
    • 2017
  • The purpose of this study was to highlight the necessity of a conceptual reestablishment of world university evaluations. The best-known and most widely validated world university evaluation systems to date, such as Times Higher Education (THE), Quacquarelli Symonds (QS) and the Academic Ranking of World Universities (ARWU), primarily assess big universities with quantitative evaluation indicators and report performance results as rankings. These systems have instigated a kind of elitism in higher education and neglect numerous small or local institutions of higher education, instead of providing stakeholders with comprehensive information about the real possibilities of tertiary education so that they can choose an institution that is individually tailored to their needs. Also, the management boards of universities and policymakers in higher education have partly been manipulated by, and have partly taken advantage of, the elitist ranking systems with an economic emphasis, as indicated by research-centered evaluations and industry-university cooperation. To supplement such educational defects and to redress the shortcomings of world university evaluation systems, a new system called 'U-Multirank' has been implemented with the financial support of the European Commission since 2012. U-Multirank was designed and is run by an international team of project experts led by CHE (Centre for Higher Education/Germany), CHEPS (Center for Higher Education Policy Studies/Netherlands) and CWTS (Centre for Science and Technology Studies at Leiden University/Netherlands). The significant features of U-Multirank, compared with, e.g., THE and ARWU, are its qualitative, multidimensional, user-oriented and individualized assessment methods. Above all, its website and its assessment results, based on a mobile operating system and designed simply for international users, present a self-organized and evolutionary model of world university evaluation systems in the digital and global era.
To estimate the universal validity of the redefinition of the world university evaluation system using U-Multirank, an epistemological approach will be used that relies on Edgar Morin's Complexity Theory and Karl Popper's Philosophy of Science.

Data-driven Analysis for Developing the Effective Groundwater Management System in Daejeong-Hangyeong Watershed in Jeju Island (제주도 대정-한경 유역 효율적 지하수자원 관리를 위한 자료기반 연구)

  • Lee, Soyeon;Jeong, Jiho;Kim, Minchul;Park, Wonbae;Kim, Yuhan;Park, Jaesung;Park, Heejeong;Park, Gyeongtae;Jeong, Jina
    • Economic and Environmental Geology
    • /
    • v.54 no.3
    • /
    • pp.373-387
    • /
    • 2021
  • In this study, the impact of clustered groundwater usage facilities and the appropriate amount of groundwater usage in the Daejeong-Hangyeong watershed of Jeju Island were evaluated using data-driven analysis methods. The input data were groundwater level data, the corresponding precipitation data, and groundwater usage data for the Jeoji, Geumak, Seogwang, and English-education city facilities. The results show that the Geumak facility has a large influence centered on its own location; the Seogwang facility affects the downstream area; the English-education city facility has a large impact upstream of its location; and the Jeoji facility influences the areas both upstream and downstream of its location. Overall, the influence of operating the clustered groundwater usage facilities in the watershed extends to approximately 5 km. Additionally, the groundwater usage appropriate for maintaining the groundwater base level was analyzed as a function of precipitation. Considering the recent precipitation pattern, there is a need to limit the current amount of groundwater usage to 80%. For a 100 mm increase in precipitation, additional groundwater development of approximately 1,500-1,900 m³ would be reasonable. The results of the developed data-driven estimation model can serve as useful information for sustainable groundwater development in the Daejeong-Hangyeong watershed of Jeju Island.

A Comparative Study on the Growth Performance of Korean Indigenous Chicken Pure Line by Sex and Twelve Strains (토종닭 순계 12계통과 성별에 따른 성장능력 비교 연구)

  • Kim, Kigon;Park, Byoungho;Jeon, Iksoo;Choo, Hyojun;Ham, Jinjoo;Park, Keon;Cha, Jaebeom
    • Korean Journal of Poultry Science
    • /
    • v.48 no.4
    • /
    • pp.193-206
    • /
    • 2021
  • This study aimed to characterize the growth performance, by sex and strain, of twelve Korean indigenous chicken pure-line strains conserved at the Poultry Research Institute, National Institute of Animal Science, Rural Development Administration. Sex and strain had significant effects on body weight in every period, with males heavier than females in all periods. Biweekly weight gain tended to increase rapidly from birth to six weeks of age and to decrease from twelve to fourteen weeks of age, a pattern common to all sexes and strains. The age at peak gain and the number of peaks differed significantly by sex and strain. Regardless of sex and strain, the coefficient of determination and the adjusted coefficient of determination showed high goodness of fit (99.1-99.9%) to the growth functions. However, the goodness of fit of each model varied by sex and strain. The von Bertalanffy function had the best fit to the growth curves of all female strains except strain D, whereas the Gompertz function had the best fit for all male strains except strain C. The logistic function showed the lowest goodness of fit for all sexes and strains. Estimated mature weights were highest for the von Bertalanffy model, followed by the Gompertz and logistic models, while growth ratio and maturing rate followed the order of the logistic, Gompertz, and von Bertalanffy functions. This information could be useful for managing Korean indigenous chickens and for designing crossbreeding tests and breeding programs.
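
The three growth functions compared in the abstract can be written down compactly. The sketch below uses commonly cited parameterizations (A is the mature weight, b an integration constant, k the maturing rate); the abstract does not state which exact forms the authors fitted, so these forms and all parameter names are assumptions.

```python
import math


def gompertz(t, A, b, k):
    """Gompertz growth curve: W(t) = A * exp(-b * exp(-k * t))."""
    return A * math.exp(-b * math.exp(-k * t))


def von_bertalanffy(t, A, b, k):
    """von Bertalanffy growth curve: W(t) = A * (1 - b * exp(-k * t))**3."""
    return A * (1.0 - b * math.exp(-k * t)) ** 3


def logistic(t, A, b, k):
    """Logistic growth curve: W(t) = A / (1 + b * exp(-k * t))."""
    return A / (1.0 + b * math.exp(-k * t))


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted weights."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical parameters; all three curves approach A as age grows.
for curve in (gompertz, von_bertalanffy, logistic):
    print(curve.__name__, round(curve(1000.0, 2500.0, 0.9, 0.1), 3))
```

Fitting the parameters to weight-by-age data would need a nonlinear least-squares routine, which is left out here to keep the sketch dependency-free.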

A Study on Estimation of Environmental Value of Tentatively Named 'East-West Trail' Using CVM (CVM기법을 이용한 가칭 '동서트레일'의 환경가치 추정)

  • Kee-Rae Kang;Yoon-Ho Choi;Bo-Kwang Chung;Dong-Pil Kim;Hyun-Kyeong Oh;Woo-Sung Lee;Su-Bok Chae
    • Korean Journal of Environment and Ecology
    • /
    • v.36 no.6
    • /
    • pp.676-683
    • /
    • 2022
  • Due to rapid changes in the living environment since 2000 and the recent unforeseen pandemic, people are refraining from domestic and international travel, and outdoor activities for health and the public value of forest trails, called Dullegil Trails in Korea, have become more important. This study estimated the environmental value of the tentatively named "East-West Trail," which connects the forest trails crossing Chungcheong and Gyeongsang Provinces, using the Contingent Valuation Method (CVM). Visitors to the East-West Trail were surveyed, and 725 questionnaires were used for analysis. The typical respondent exercised 2-3 times per week, visited a forest trail not far from home with friends or family, and tended to spend 50 thousand Korean won or more per visit. Visitors to the Dullegil Trail felt that there was a shortage of information boards on the forest trail, and they preferred shelters in appropriate locations. We used the double-bounded dichotomous choice (DBDC) logit model proposed by Hanemann to measure the conservation value of the East-West Trail. The environmental value that a visitor to the East-West Trail could obtain was estimated at 30,087 won per trip. Based on the population of the direct-use zone near the East-West Trail, this corresponds to about 94 billion won annually across all visitors. As there has been no previous study on the environmental value of forest trails using CVM, the results of this study can help assess the feasibility of government policies on forest trails.

Effects of Seller's Influence Tactics on Customer's Psychological Obligation, Trust, and Repurchase Intention in Offline Cosmetics Selling Channel: Moderating Effect of Perceived Service Quality (오프라인 화장품 구매경로에서 판매원의 판매설득전술이 고객의 심리적의무감과 판매원 신뢰, 재구매의도에 미치는 영향: 지각된 서비스 품질을 조절효과로)

  • Kang, Byeong Jun;Yi, Ho-Taek
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship
    • /
    • v.17 no.5
    • /
    • pp.205-221
    • /
    • 2022
  • In this study, the authors investigated the effect of salespersons' Selling Influence Tactics (SIT) on customers' psychological obligation, trust in salespersons and repurchase intentions in the offline cosmetics purchase channel. In addition, we examined the moderating effect of service quality perceived by customers. To this end, a survey was conducted on 298 customers who had purchased cosmetics through the offline sales channel, and the authors tested the hypotheses using a structural equation model. The results were as follows. First, among salespersons' selling influence tactics, emotional appeal tactics (H1a), customer ingratiation tactics (H1d), and personal appeal tactics (H1e) were found to affect the psychological obligation of customers, and emotional appeal tactics (H2a), rational persuasion tactics (H2b), information provision tactics (H2c), and customer ingratiation tactics (H2d) were found to affect trust in salespeople. Second, psychological obligation did not have a positive (+) effect on the customer's repurchase intention, whereas the customer's trust in the salesperson had a positive (+) effect on repurchase intention. Third, perceived service quality showed a significant moderating effect on the relationship between psychological obligation and repurchase intention and on the relationship between trust in the salesperson and repurchase intention. Many previous studies of salespersons' Selling Influence Tactics (SIT) specified the sub-variables in a limited way, and few confirmed marketing outcomes such as repurchase intention. Therefore, the empirical results of this study are expected to help set the standard or direction for research on salespersons' Selling Influence Tactics (SIT) in future studies.
In addition, this study describes implications for providing help in employee education and management for small business owners who manage and operate offline cosmetics stores, and sales strategies that should be strategically established to improve perceived service quality for customers.

Home Economics teachers' concern on creativity and personality education in Home Economics classes: Based on the concerns based adoption model(CBAM) (가정과 교사의 창의.인성 교육에 대한 관심과 실행에 대한 인식 - CBAM 모형에 기초하여-)

  • Lee, In-Sook;Park, Mi-Jeong;Chae, Jung-Hyun
    • Journal of Korean Home Economics Education Association
    • /
    • v.24 no.2
    • /
    • pp.117-134
    • /
    • 2012
  • The purpose of this study was to identify the stage of concern, the level of use, and the innovation configuration of Home Economics teachers regarding creativity and personality education in Home Economics (HE) classes. The survey questionnaires were sent by mail and e-mail to middle-school HE teachers across the country selected by systematic sampling and convenience sampling. Questionnaires on the stages of concern and the levels of use developed by Hall (1987) were used in this study. A total of 187 responses were used for the final analysis with the SPSS/window (12.0) program. The results of the study were as follows. First, for the stages of concern of HE teachers regarding creativity and personality education, the information stage of concern (85.51) had the highest response rate, followed in order by: the management stage of concern (81.88), the awareness stage of concern (82.15), the refocusing stage of concern (68.80), the collaboration stage of concern (61.97), and the consequence stage of concern (59.76). Second, the level of use of HE teachers regarding creativity and personality education was highest at the mechanical level (level 3; 21.4%), followed in order by: the orientation level of use (level 1; 20.9%), the refinement level (level 5; 17.1%), the non-use level (level 0; 15.0%), the preparation level (level 2; 10.2%), the integration level (level 6; 5.9%), the renewal level (level 7; 4.8%), and the routine level (level 4; 4.8%). Third, for the innovation configuration of HE teachers regarding creativity and personality education, more than half of the HE teachers (56.1%) mainly focused on personality education in their HE classes; 31.0% of the HE teachers performed both creativity and personality education; a small number of teachers (6.4%) focused on creativity education; and the same number of teachers (6.4%) responded that they focus on neither of the two.
Examining the level and type of performance HE teachers applied, the average score on the performance of creativity and personality education was 3.76 out of 5.00 and the mean of creativity component was 3.59 and of personality component was 3.94, higher than standard. For the creativity education, openness/sensitivity(3.97) education was performed most and the next most in the following order: problem-solving skill(3.79), curiosity/interest(3.73), critical thinking(3.63), problem-finding skill(3.61), originality(3.57), analogy(3.47), fluency/adaptability(3.46), precision(3.46), imagination(3.37), and focus/sympathy(3.37). For the personality education, the following components were performed in order from most to least: power of execution(4.07), cooperation/consideration/just(4.06), self-management skill(4.04), civic consciousness(4.04), career development ability(4.03), environment adaptability(3.95), responsibility/ownership(3.94), decision making(3.89), trust/honesty/promise(3.88), autonomy(3.86), and global competency(3.55). Regarding what makes performing creativity and personality education difficult, most HE teachers(64.71%) chose the lack of instructional materials and 40.11% of participants chose the lack of seminar and workshop opportunity. 38.5% chose the difficulty of developing an evaluation criteria or an evaluation tool while 25.67% responded that they do not know any means of performing creativity and personality education. 
Regarding the better way to support for creativity and personality education, the HE teachers chose in order from most to least: 'expansion of hands-on activities for students related to education on creativity and personality'(4.34), 'development of HE classroom culture putting emphasis on creativity and personality'(4.29), 'a proper curriculum on creativity and personality education that goes along with students' developmental stages'(4.27), 'securing enough human resource and number of professors who will conduct creativity and personality education'(4.21), 'establishment of the concept and value of the education on creativity and personality'(4.09), and 'educational promotion on creativity and personality education supported by local communities and companies'(3.94).


A Study on the Meaning and Strategy of Keyword Advertising Marketing

  • Park, Nam Goo
    • Journal of Distribution Science
    • /
    • v.8 no.3
    • /
    • pp.49-56
    • /
    • 2010
  • At the initial stage of Internet advertising, banner advertising came into fashion. As the Internet developed into a central part of daily life and competition in the online advertising market grew fierce, there was not enough space for banner advertising, which was concentrated on portal sites. All these factors were responsible for an upsurge in advertising prices. Consequently, the high-cost and low-efficiency problems of banner advertising were raised, which led to the emergence of keyword advertising as a new type of Internet advertising to replace its predecessor. In the early 2000s, when Internet advertising took off, display advertising, including banner advertising, dominated the Net. However, display advertising showed signs of gradual decline and registered negative growth in 2009, whereas keyword advertising grew rapidly and started to outdo display advertising as of 2005. Keyword advertising refers to the advertising technique that displays relevant advertisements at the top of search sites when one searches for a keyword. Instead of exposing advertisements to unspecified individuals as banner advertising does, keyword advertising, a targeted advertising technique, shows advertisements only when customers search for a desired keyword, so that only highly prospective customers are given a chance to see them. In this context, it is also referred to as search advertising. It is regarded as more aggressive advertising with a higher hit rate than previous advertising in that, instead of the seller discovering customers and running an advertisement for them as with TV, radio or banner advertising, it exposes advertisements to visiting customers. Keyword advertising makes it possible for a company to seek publicity online simply by making use of a single word and to achieve maximum efficiency at minimum cost.
The strong point of keyword advertising is that customers can directly reach the products in question through advertising that is more efficient than advertisements in mass media such as TV and radio. The weak point of keyword advertising is that a company has to register its advertisement on each and every portal site and finds it hard to exercise substantial supervision over its advertisement, so its advertising expenses may exceed its profits. Keyword advertising serves as the most appropriate method of advertising for the sales and publicity of small and medium enterprises, which need a maximum advertising effect at a low advertising cost. At present, keyword advertising is divided into CPC advertising and CPM advertising. The former is known as the most efficient technique and is also referred to as advertising based on the meter rate system; a company pays for the number of clicks on a keyword that users have searched. This is representatively adopted by Overture, Google's Adwords, Naver's Clickchoice, and Daum's Clicks, etc. CPM advertising depends on a flat rate payment system, making a company pay for its advertisement on the basis of the number of exposures, not the number of clicks. This method fixes a price for an advertisement per 1,000 exposures, and is mainly adopted by Naver's Timechoice, Daum's Speciallink, and Nate's Speedup, etc. At present, the CPC method is most frequently adopted. The weak point of the CPC method is that advertising cost can rise through constant clicks from the same IP. If a company makes good use of strategies for maximizing the strong points of keyword advertising and complementing its weak points, it is highly likely to turn its visitors into prospective customers.
Accordingly, an advertiser should analyze customers' behavior and approach them in a variety of ways, trying hard to find out what they want. With this in mind, he or she has to put multiple keywords into use when running ads. When first running an ad, he or she should give priority to deciding which keyword to select. The advertiser should consider how many individuals using a search engine will click the keyword in question and how much money he or she has to pay for the advertisement. As the popular keywords that the users of search engines frequently use are expensive in terms of unit cost per click, advertisers without much money for advertising at the initial phase should pay attention to detailed keywords suitable to their budget. Detailed keywords are also referred to as peripheral keywords or extension keywords, which can be called a combination of major keywords. Most keyword advertising is in the form of text. The biggest strong point of text-based advertising is that it looks like search results, causing little antipathy. But it fails to attract much attention precisely because most keyword advertising is in the form of text. Image-embedded advertising is easy to notice due to its images, but it is exposed on the lower part of a web page and recognized as an advertisement, which leads to a low click-through rate. However, its strong point is that its prices are lower than those of text-based advertising. If a company owns a logo or a product that is easy enough for people to recognize, the company is well advised to make good use of image-embedded advertising so as to attract Internet users' attention. Advertisers should analyze their logs and examine customers' responses based on the events of the sites in question and the composition of products as a vehicle for monitoring their behavior in detail.
Besides, keyword advertising allows them to analyze the advertising effects of exposed keywords through log analysis. Log analysis refers to a close analysis of the current situation of a site by analyzing information about visitors on the basis of the number of visitors, page views, and cookie values. A user's IP, the pages used, the time of use, and cookie values are stored in the log files generated by each Web server. The log files contain a huge amount of data. As it is almost impossible to analyze these log files directly, one is supposed to analyze them by using log analysis solutions. The generic information that can be extracted from log analysis tools includes the total number of page views, the average number of page views per day, the number of basic page views, the number of page views per visit, the total number of hits, the average number of hits per day, the number of hits per visit, the number of visits, the average number of visits per day, the net number of visitors, average visitors per day, one-time visitors, visitors who have come more than twice, and average hours of use, etc. These sites are deemed to be useful for utilizing data for the analysis of the situation and current status of rival companies as well as benchmarking. As keyword advertising exposes advertisements exclusively on search-result pages, competition among advertisers attempting to preoccupy popular keywords is very fierce. Some portal sites keep on giving priority to the existing advertisers, whereas others provide chances to purchase the keywords in question to all advertisers after the advertising contract is over.
If an advertiser tries to rely on keywords sensitive to seasons and timeliness in case of sites providing priority to the established advertisers, he or she may as well make a purchase of a vacant place for advertising lest he or she should miss appropriate timing for advertising. However, Naver doesn't provide priority to the existing advertisers as far as all the keyword advertisements are concerned. In this case, one can preoccupy keywords if he or she enters into a contract after confirming the contract period for advertising. This study is designed to take a look at marketing for keyword advertising and to present effective strategies for keyword advertising marketing. At present, the Korean CPC advertising market is virtually monopolized by Overture. Its strong points are that Overture is based on the CPC charging model and that advertisements are registered on the top of the most representative portal sites in Korea. These advantages serve as the most appropriate medium for small and medium enterprises to use. However, the CPC method of Overture has its weak points, too. That is, the CPC method is not the only perfect advertising model among the search advertisements in the on-line market. So it is absolutely necessary that small and medium enterprises including independent shopping malls should complement the weaknesses of the CPC method and make good use of strategies for maximizing its strengths so as to increase their sales and to create a point of contact with customers.
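
The abstract contrasts CPC billing (pay per click) with CPM billing (a fixed price per 1,000 exposures). The sketch below illustrates that arithmetic and the click-through rate at which the two cost the same; the function names and all prices are hypothetical and do not reflect any portal's actual rates.

```python
def cpc_cost(clicks, price_per_click):
    """Cost under CPC billing: the advertiser pays for each click."""
    return clicks * price_per_click


def cpm_cost(impressions, price_per_thousand):
    """Cost under CPM billing: a fixed price per 1,000 exposures."""
    return impressions / 1000 * price_per_thousand


def break_even_ctr(price_per_click, price_per_thousand):
    """Click-through rate at which CPC and CPM cost the same.

    Setting impressions * ctr * price_per_click equal to
    impressions / 1000 * price_per_thousand and solving for ctr.
    """
    return price_per_thousand / (1000 * price_per_click)


# Hypothetical prices in won: 500 per click versus 5,000 per 1,000 exposures.
ctr = break_even_ctr(price_per_click=500, price_per_thousand=5000)
print(f"break-even click-through rate: {ctr:.2%}")
```

Below the break-even click-through rate CPC is the cheaper option for the same number of exposures, and above it CPM is cheaper, which is one way to reason about the budget trade-off the abstract describes.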


Estimation of GARCH Models and Performance Analysis of Volatility Trading System using Support Vector Regression (Support Vector Regression을 이용한 GARCH 모형의 추정과 투자전략의 성과분석)

  • Kim, Sun Woong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.107-122
    • /
    • 2017
  • Volatility in stock market returns is a measure of investment risk. It plays a central role in portfolio optimization, asset pricing and risk management as well as in most theoretical financial models. Engle (1982) presented a pioneering paper on stock market volatility that explains the time-varying characteristics embedded in stock market return volatility. His model, Autoregressive Conditional Heteroscedasticity (ARCH), was generalized by Bollerslev (1986) as the GARCH models. Empirical studies have shown that GARCH models describe well the fat-tailed return distributions and the volatility clustering phenomenon appearing in stock prices. The parameters of GARCH models are generally estimated by maximum likelihood estimation (MLE) based on the standard normal density. However, since Black Monday in 1987, stock market prices have become very complex and have shown a great deal of noise. Recent studies have started to apply artificial intelligence approaches to estimating the GARCH parameters as a substitute for MLE. This paper presents an SVR-based GARCH process and compares it with the MLE-based GARCH process for estimating the parameters of GARCH models, which are known to forecast stock market volatility well. The kernel functions used in the SVR estimation process are linear, polynomial and radial. We analyzed the suggested models with the KOSPI 200 Index. This index consists of 200 blue chip stocks listed on the Korea Exchange. We sampled KOSPI 200 daily closing values from 2010 to 2015. The sample comprises 1487 daily observations. We used 1187 days to train the suggested GARCH models, and the remaining 300 days were used as testing data. First, symmetric and asymmetric GARCH models are estimated by MLE. We forecasted KOSPI 200 Index return volatility, and the statistical metric MSE shows better results for the asymmetric GARCH models such as E-GARCH or GJR-GARCH. This is consistent with the documented non-normal return distribution characteristics of fat tails and leptokurtosis.
Compared with the MLE estimation process, SVR-based GARCH models outperform the MLE methodology in KOSPI 200 Index return volatility forecasting. The polynomial kernel function shows exceptionally low forecasting accuracy. We suggested an Intelligent Volatility Trading System (IVTS) that utilizes the forecasted volatility results. The IVTS entry rules are as follows. If tomorrow's volatility is forecasted to increase, buy volatility today. If tomorrow's volatility is forecasted to decrease, sell volatility today. If the forecasted volatility direction does not change, we hold the existing buy or sell position. IVTS is assumed to buy and sell historical volatility values. This is somewhat unrealistic because we cannot trade historical volatility values themselves. But our simulation results are meaningful, since the Korea Exchange introduced a volatility futures contract that traders have been able to trade since November 2014. The trading systems with SVR-based GARCH models show higher returns than MLE-based GARCH in the testing period. The percentages of profitable trades of the MLE-based GARCH IVTS models range from 47.5% to 50.0%, while those of the SVR-based GARCH IVTS models range from 51.8% to 59.7%. MLE-based symmetric S-GARCH shows +150.2% return and SVR-based symmetric S-GARCH shows +526.4% return. MLE-based asymmetric E-GARCH shows -72% return and SVR-based asymmetric E-GARCH shows +245.6% return. MLE-based asymmetric GJR-GARCH shows -98.7% return and SVR-based asymmetric GJR-GARCH shows +126.3% return. The linear kernel function shows higher trading returns than the radial kernel function. The best performance of SVR-based IVTS is +526.4% and that of MLE-based IVTS is +150.2%. SVR-based GARCH IVTS shows higher trading frequency. This study has some limitations. Our models are based solely on SVR. Other artificial intelligence models are needed to search for better performance. We do not consider costs incurred in the trading process, including brokerage commissions and slippage costs.
IVTS trading performance is unreal since we use historical volatility values as trading objects. The exact forecasting of stock market volatility is essential in the real trading as well as asset pricing models. Further studies on other machine learning-based GARCH models can give better information for the stock market investors.
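
The IVTS entry rules stated in the abstract can be sketched as a small position generator. This is one reading of those rules, not the authors' implementation: a forecast above today's volatility means buy (+1), a forecast below it means sell (-1), and an unchanged forecast keeps the existing position. The function names, the initial flat position, and the sample numbers are assumptions for illustration.

```python
def ivts_positions(today_vol, forecast_tomorrow_vol):
    """Turn volatility forecasts into positions: +1 buy, -1 sell, 0 flat.

    today_vol[t] is the volatility on day t and forecast_tomorrow_vol[t]
    is the forecast for day t+1 made on day t.
    """
    positions = []
    current = 0  # assumed flat before the first signal
    for today, forecast in zip(today_vol, forecast_tomorrow_vol):
        if forecast > today:
            current = 1       # volatility expected to rise: buy volatility
        elif forecast < today:
            current = -1      # volatility expected to fall: sell volatility
        # otherwise hold the existing position
        positions.append(current)
    return positions


def daily_pnl(vol, positions):
    """Profit of holding positions[t] over the change vol[t+1] - vol[t]."""
    return [pos * (nxt - cur) for pos, cur, nxt in zip(positions, vol, vol[1:])]


# Hypothetical volatility values and forecasts.
vol = [0.10, 0.12, 0.11, 0.11]
forecast = [0.12, 0.11, 0.11, 0.13]
positions = ivts_positions(vol, forecast)
print(positions)
print(daily_pnl(vol, positions))
```

As the abstract itself notes, trading the volatility values directly is a simplification, and this sketch likewise ignores commissions and slippage.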

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.71-84
    • /
    • 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating the blocks of the aggregated log data, the proposed system offers automatic restore functions that allow it to continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, the strict schemas of relational databases do not allow nodes to be expanded when the stored data must be distributed across multiple nodes as the amount of data rapidly increases. NoSQL databases do not provide the complex computations that relational databases may provide, but they can easily be expanded through node dispersion when the amount of data increases rapidly; they are non-relational databases with a structure appropriate for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a schema-free structure, is used in the proposed system. MongoDB is adopted because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to the type of log data and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the log analysis results of the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation of log data insertion and query performance is carried out against a log data processing system that uses only MySQL; this evaluation demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
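
The routing role of the log collector module described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `LogCollector`, the `type` field, and the example log types are hypothetical, and plain in-memory containers stand in for the MySQL and MongoDB modules so that the example runs without any database.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LogCollector:
    """Hypothetical stand-in for the log collector module."""

    # Log types assumed to need real-time analysis (an assumption for illustration).
    realtime_types: set
    # In-memory stand-in for the MySQL module (fixed, real-time records).
    mysql_store: list = field(default_factory=list)
    # In-memory stand-in for MongoDB collections, keyed by log type.
    mongo_store: dict = field(default_factory=lambda: defaultdict(list))

    def collect(self, record: dict) -> str:
        """Classify a record by its type and route it to one of the two stores."""
        log_type = record.get("type", "unknown")
        if log_type in self.realtime_types:
            self.mysql_store.append(record)
            return "mysql"
        # Records of the same type may carry different fields (schema-free).
        self.mongo_store[log_type].append(record)
        return "mongodb"


collector = LogCollector(realtime_types={"transaction"})
routes = [
    collector.collect({"type": "transaction", "amount": 100}),
    collector.collect({"type": "access", "path": "/login"}),
    collector.collect({"type": "access", "path": "/home", "agent": "mobile"}),
]
```

The two `access` records have different fields, which is the situation a schema-free document store is meant to accommodate; a real deployment would replace the in-memory containers with database clients.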