• Title/Summary/Keyword: Real-time Mining

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.147-168 / 2017
  • Accurate stock market forecasting has long been studied in academia, and many forecasting models based on a variety of techniques now exist. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Both fundamental analysis and technical analysis are used in traditional stock investment, but technical analysis is better suited to short-term transaction prediction and to the application of statistical and mathematical techniques. Most studies based on technical indicators have modeled stock price prediction as a binary classification (rising or falling) of the market's movement over a future period, usually the next trading day. However, binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we predict the stock index trend by extending the existing binary scheme to a multi-class scheme with three states: upward trend, boxed (range-bound), and downward trend. Techniques such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA), or Artificial Neural Networks (ANN) can be used to solve this multi-class problem; we instead build on Multi-classification Support Vector Machines (MSVM), which have proved superior in prediction performance, and propose an optimization model that uses a Genetic Algorithm (GA) as a wrapper to improve MSVM performance. In particular, the proposed model, named GA-MSVM, is designed to maximize performance by simultaneously optimizing the kernel function parameters of MSVM, the selection of input variables (feature selection), and the selection of training instances (instance selection).
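As a rough illustration of how a single GA chromosome could carry all three things that GA-MSVM optimizes at once (kernel parameters, a feature mask, and an instance mask), the sketch below decodes a bit string into those parts. The abstract does not specify the encoding, so the bit layout, the parameter names `C` and `gamma`, their value ranges, and the function names are all assumptions made for illustration.

```python
import random

PARAM_BITS = 8  # assumed number of bits used for each kernel parameter


def bits_to_float(bits, low, high):
    """Map a list of 0/1 bits linearly onto the interval [low, high]."""
    value = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * value / (2 ** len(bits) - 1)


def decode(chromosome, n_features, n_instances):
    """Split one chromosome into kernel parameters and two selection masks.

    Assumed layout: [C bits | gamma bits | feature mask | instance mask].
    """
    expected = 2 * PARAM_BITS + n_features + n_instances
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} bits, got {len(chromosome)}")
    c_bits = chromosome[:PARAM_BITS]
    gamma_bits = chromosome[PARAM_BITS:2 * PARAM_BITS]
    masks = chromosome[2 * PARAM_BITS:]
    return {
        "C": bits_to_float(c_bits, 1.0, 100.0),          # assumed range
        "gamma": bits_to_float(gamma_bits, 0.001, 1.0),  # assumed range
        # Indices of the input variables kept for training.
        "features": [i for i, b in enumerate(masks[:n_features]) if b],
        # Indices of the training instances kept for training.
        "instances": [i for i, b in enumerate(masks[n_features:]) if b],
    }


def random_chromosome(n_features, n_instances, rng=random):
    """Create one random individual for an initial GA population."""
    length = 2 * PARAM_BITS + n_features + n_instances
    return [rng.randint(0, 1) for _ in range(length)]
```

In a wrapper setup of this kind, the GA would typically score each decoded candidate by training the classifier on only the selected rows and columns and using hold-out accuracy as the fitness value; that evaluation loop is omitted here.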
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method outperforms not only the conventional multi-class SVM (MSVM), which has been known to show the best prediction performance so far, but also existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and case-based reasoning (CBR). In particular, instance selection was found to play a very important role in predicting the stock index trend and to contribute more to the model's improvement than the other factors. To verify the usefulness of GA-MSVM, we applied it to forecasting the trend of Korea's KOSPI200 stock index. Our research is primarily aimed at predicting trend segments in order to capture trading signals or short-term trend transition points. The experimental data set covers 2004 to 2017 and includes technical indicators of the KOSPI200 stock index, such as price and the volatility index, together with macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using a variety of statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, takes one of three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for verification. To verify the performance of the proposed model, comparative experiments were conducted with MDA, MLOGIT, CBR, ANN, and MSVM. MSVM adopted the One-Against-One (OAO) approach, which is known as the most accurate among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
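The One-Against-One (OAO) scheme mentioned above trains one binary classifier for each pair of classes, k(k-1)/2 in total (three for the three trend states), and predicts by majority vote. The sketch below shows only the voting step. The pairwise classifiers are stand-in threshold rules on a single made-up feature, not trained SVMs, and every name in the block is hypothetical.

```python
from collections import Counter
from itertools import combinations

CLASSES = (-1, 0, 1)  # downward trend, boxed, upward trend


def make_threshold_classifier(a, b):
    """Stand-in for a binary SVM trained on classes a < b only.

    It predicts b when the single feature x lies above the midpoint
    of a and b, and a otherwise.
    """
    midpoint = (a + b) / 2
    return lambda x: b if x > midpoint else a


# One binary classifier per unordered pair of classes.
pairwise = {
    pair: make_threshold_classifier(*pair)
    for pair in combinations(CLASSES, 2)
}


def predict_oao(x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter(clf(x) for clf in pairwise.values())
    return votes.most_common(1)[0][0]
```

With real classifiers a three-way tie is possible (each class receiving one vote), so a production version would need an explicit tie-breaking rule; the toy thresholds used here cannot produce one.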

A Study on the Application of Outlier Analysis for Fraud Detection: Focused on Transactions of Auction Exception Agricultural Products (부정 탐지를 위한 이상치 분석 활용방안 연구 : 농수산 상장예외품목 거래를 대상으로)

  • Kim, Dongsung;Kim, Kitae;Kim, Jongwoo;Park, Steve
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.93-108 / 2014
  • To support business decision making, interest in and efforts to analyze and use transaction data from different perspectives are increasing. Such efforts are not limited to customer management or marketing; they also extend to monitoring and detecting fraudulent transactions. Fraudulent transactions are evolving into various patterns by taking advantage of information technology. To keep up with this evolution, much work has gone into fraud detection methods and advanced application systems that improve the accuracy and ease of fraud detection. As a case of fraud detection, this study aims to provide effective fraud detection methods for auction exception agricultural products in the largest Korean agricultural wholesale market. The auction exception products policy exists to complement auction-based trades in the agricultural wholesale market. That is, most trades in agricultural products are performed by auction; however, specific products are designated as auction exception products when the total volume of the product is relatively small, the number of wholesalers is small, or wholesalers have difficulty purchasing the product. However, the auction exception products policy raises several problems regarding the fairness and transparency of transactions, which calls for fraud detection. In this study, to generate fraud detection rules, we analyze a large set of real agricultural product transaction data from the market for 2008 to 2010, amounting to more than 1 million transactions and 1 billion US dollars in transaction volume. Agricultural transaction data has unique characteristics such as frequent changes in supply volume and turbulent time-dependent changes in price. Since this was the first attempt to identify fraudulent transactions in this domain, there was no training data set for supervised learning. Fraud detection rules are therefore generated using an outlier detection approach.
We assume that outlier transactions are more likely to be fraudulent than normal transactions. Outlier transactions are identified by comparison against the daily, weekly, and quarterly average unit prices of product items. The quarterly average unit prices of product items for specific wholesalers are also used to identify outlier transactions. The reliability of the generated fraud detection rules was confirmed by domain experts. To determine whether a transaction is fraudulent, the normal distribution and the normalized Z-value are applied. That is, the unit price of a transaction is transformed into a Z-value to calculate its probability of occurrence under the approximation that unit prices follow a normal distribution. A modified Z-value of the unit price is used rather than the original Z-value, because for auction exception agricultural products the number of wholesalers is small, so Z-values are influenced by the outlier fraud transactions themselves. The modified Z-values are called Self-Eliminated Z-scores because they are calculated after excluding the unit price of the specific transaction being checked for fraud. To show the usefulness of the proposed approach, a prototype fraud transaction detection system was developed using Delphi. The system consists of five main menus and related submenus. The first function of the system is to import transaction databases. The next is to set up fraud detection parameters; by changing these parameters, system users can control the number of potential fraud transactions. Execution functions produce fraud detection results based on the fraud detection parameters. The potential fraud transactions can be viewed on screen or exported as files. The study is an initial attempt to identify fraudulent transactions in auction exception agricultural products.
Many research topics on this issue remain. First, the scope of the analyzed data was limited by data availability. More data on transactions, wholesalers, and producers is needed to detect fraudulent transactions more accurately. Next, the scope of fraud transaction detection needs to be extended to fishery products. There are also many possibilities for applying different data mining techniques to fraud detection; for example, a time series approach is a potential technique for this problem. Finally, although outlier transactions are currently detected based on the unit prices of transactions, it is also possible to derive fraud detection rules based on transaction volumes.
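The Self-Eliminated Z-score described in this abstract can be sketched as follows: each unit price is standardized against the mean and standard deviation of all the other prices in its comparison group, so that an extreme price cannot inflate the very statistics it is judged by. The use of the sample standard deviation, the threshold of 3.0, and the function names are assumptions for illustration; the abstract does not state these details.

```python
from statistics import mean, stdev


def self_eliminated_z_scores(prices):
    """Z-score of each price relative to all the other prices in the group."""
    if len(prices) < 3:
        raise ValueError("need at least 3 prices (2 others per transaction)")
    scores = []
    for i, price in enumerate(prices):
        # Leave the transaction under inspection out of the statistics.
        others = prices[:i] + prices[i + 1:]
        spread = stdev(others)  # sample standard deviation (assumed choice)
        if spread == 0:
            # All other prices are identical: any deviation is unbounded.
            diff = price - others[0]
            z = 0.0 if diff == 0 else (float("inf") if diff > 0 else float("-inf"))
        else:
            z = (price - mean(others)) / spread
        scores.append(z)
    return scores


def flag_outliers(prices, threshold=3.0):
    """Indices of prices whose absolute self-eliminated Z-score exceeds threshold."""
    scores = self_eliminated_z_scores(prices)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]
```

For a small made-up group such as `[10, 11, 9, 10, 12, 30]`, the ordinary Z-score of the price 30 is only about 2.0, because that price pulls the mean and standard deviation toward itself, while its self-eliminated score is far above 3. This is the small-group effect the abstract attributes to the small number of wholesalers.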

A Study on Intelligent Value Chain Network System based on Firms' Information (기업정보 기반 지능형 밸류체인 네트워크 시스템에 관한 연구)

  • Sung, Tae-Eung;Kim, Kang-Hoe;Moon, Young-Su;Lee, Ho-Shin
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.67-88 / 2018
  • Until recently, in recognition of the importance of the sustainable growth and competitiveness of small and medium-sized enterprises (SMEs), governmental support has mainly been provided for tangible resources such as R&D, manpower, and funds. However, the inefficiency of support systems, such as under-provided or redundant support, has been raised as an issue because conflicting policies exist in terms of the appropriateness, effectiveness, and efficiency of business support. From the perspective of the government or a company, we believe that, because SMEs have limited resources, technology development and capacity enhancement through collaboration with external sources is the basis for creating competitive advantage, and we also emphasize the value creation activities that support it. This is why value chain network analysis is necessary: to analyze inter-company deal relationships across a series of value chains and to visualize the results by establishing knowledge ecosystems at the corporate level. Existing systems include the Technology Opportunity Discovery (TOD) system, which provides information on the relevant products or technology status of patent-holding companies through searches by patent, product, or company name, and CRETOP and KISLINE, which both allow users to view company (financial) information and credit information. However, no online system provides a list of similar (competing) companies based on value chain network analysis, or information on potential clients or demanders with whom business deals may be possible in the future. Therefore, we focus on the "Value Chain Network System (VCNS)", a support partner for planning corporate business strategy that is developed and managed by KISTI, and investigate the types of embedded network-based analysis modules, the databases (DBs) that support them, and how to use the system efficiently.
We further explore the network visualization function of the intelligent value chain analysis system, which provides the core information for understanding industrial structure and for developing a company's new products. For a company to gain competitive superiority over other companies, it is necessary to identify its competitors in terms of the patents they hold or the products they currently produce, and searching for similar companies or competitors by industry type is the key to securing the target company's competitiveness in commercialization. In addition, transaction information, which reflects business activity between companies, plays an important role in providing information about potential customers when both parties enter similar fields. Identifying competitors at the enterprise or industry level using a network map based on such inter-company sales information can be implemented as a core module of value chain analysis. The Value Chain Network System (VCNS) combines the concepts of value chain and industrial structure analysis with the corporate information collected to date, so that it can capture not only the market competition situation of individual companies but also the value chain relationships of a specific industry. In particular, it can be useful as an information analysis tool at the corporate level for tasks such as identifying industry structure, tracking competitor trends, analyzing competitors, locating suppliers (sellers) and demanders (buyers), following industry trends by item, finding promising items, finding new entrants, finding core companies and items by value chain, and identifying the patents held by the corresponding companies.
In addition, given the objectivity and reliability of analysis results derived from transaction information and financial data, the value chain network system is expected to be used for various purposes such as information support for business evaluation, R&D decision support, and mid-term or short-term demand forecasting, in particular by more than 15,000 member companies in Korea and by employees in R&D service sectors, government-funded research institutes, and public organizations. To strengthen the business competitiveness of companies, technology, patent, and market information has so far been provided mainly by government agencies and private research-and-development service companies. This service has taken the form of patent analysis (mainly rating and quantitative analysis) or market analysis (market prediction and demand forecasting based on market reports). However, it has been of limited help in resolving the lack of information, which is one of the difficulties that firms in Korea often face at the commercialization stage. In particular, it is much more difficult to obtain information about competitors and potential candidates. In this study, the real-time value chain analysis and visualization service module, based on the proposed network map and the data at hand, is compared in terms of expected market share, estimated sales volume, and contact information (which implies potential suppliers of raw materials and parts, and potential demanders of complete products and modules). In future research, we intend to investigate the indices of competitive factors in greater depth through the participation of research subjects, to develop new competitive indices for competitors or substitute items, and to apply additional data mining techniques and algorithms to improve the performance of VCNS.
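To make the idea of a network map built from inter-company sales information more concrete, the sketch below builds a small seller-to-buyer graph and derives competitor candidates and potential buyers from it. This is one hypothetical heuristic (Jaccard overlap of buyer sets), not the method used by VCNS, which the abstract does not describe at this level; the company names, data format, and function names are invented for illustration.

```python
from collections import defaultdict


def build_network(transactions):
    """Index (seller, buyer) pairs in both directions."""
    buyers_of = defaultdict(set)
    suppliers_of = defaultdict(set)
    for seller, buyer in transactions:
        buyers_of[seller].add(buyer)
        suppliers_of[buyer].add(seller)
    return buyers_of, suppliers_of


def competitor_candidates(company, buyers_of):
    """Rank other sellers by Jaccard overlap of their buyer sets (assumed heuristic)."""
    mine = buyers_of.get(company, set())
    ranked = []
    for other, theirs in buyers_of.items():
        if other == company:
            continue
        shared = mine & theirs
        if shared:
            ranked.append((other, len(shared) / len(mine | theirs)))
    # Highest overlap first; ties broken alphabetically for stable output.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


def potential_buyers(company, buyers_of):
    """Buyers served by competitor candidates but not yet by the company."""
    mine = buyers_of.get(company, set())
    found = set()
    for other, _ in competitor_candidates(company, buyers_of):
        found |= buyers_of[other] - mine
    return sorted(found)
```

Given the made-up transactions `[("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Z"), ("C", "W")]`, company B shares buyer X with company A and is therefore listed as a competitor candidate of A, and Z appears as a potential buyer for A. A real system would likely weight edges by sales volume and combine them with patent and product data, as the abstract suggests.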