• Title/Summary/Keyword: input-output data

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.167-181
    • /
    • 2018
  • Over the past decade, deep learning has been in the spotlight among various machine learning algorithms. In particular, CNN (Convolutional Neural Network), which is known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNN to business problem solving. Specifically, this study proposes to apply CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNN has strength in interpreting images. Thus, the model proposed in this study adopts CNN as a binary classifier that predicts the stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future financial price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. The size of the image in which the graph is drawn is 40 × 40 pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices expressing the color value on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset and the remaining 20% as the validation dataset. In the final step, CNN classifiers are trained using the images of the training dataset. Regarding the parameters of CNN-FG, we adopted two convolution filters (5 × 5 × 6 and 5 × 5 × 9) in the convolution layers. In the pooling layer, a 2 × 2 max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend and the other for a downward trend). The activation function for the convolution layers and the hidden layers was set to ReLU (Rectified Linear Unit), and the one for the output layer was set to the Softmax function. To validate CNN-FG, we applied it to the prediction of KOSPI200 over 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups of the dependent variable (i.e., tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset from 80% of the total dataset (1,560 samples) and the validation dataset from the remaining 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators popularly used in previous studies, including Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), and CCI (commodity channel index). To confirm the superiority of CNN-FG, we compared its prediction accuracy with those of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on these graphs can be effective from the perspective of prediction accuracy. Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
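  The abstract gives the layer sizes but not the exact ordering of the convolution and pooling stages, so the following PyTorch sketch is only one plausible arrangement (a pooling step after each convolution) of a network with those dimensions; it is an illustrative reconstruction, not the authors' implementation, and the flattened size (7 × 7 × 9) follows from the assumed ordering.

```python
import torch
import torch.nn as nn

class CNNFG(nn.Module):
    """Binary classifier over 40x40 RGB fluctuation-graph images (sketch)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, kernel_size=5),   # 40x40x3 -> 36x36x6
            nn.ReLU(),
            nn.MaxPool2d(2),                  # -> 18x18x6
            nn.Conv2d(6, 9, kernel_size=5),   # -> 14x14x9
            nn.ReLU(),
            nn.MaxPool2d(2),                  # -> 7x7x9
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(7 * 7 * 9, 900),        # first hidden layer (900 nodes)
            nn.ReLU(),
            nn.Linear(900, 32),               # second hidden layer (32 nodes)
            nn.ReLU(),
            nn.Linear(32, 2),                 # upward vs. downward
        )

    def forward(self, x):
        logits = self.classifier(self.features(x))
        return torch.softmax(logits, dim=1)   # softmax output layer

model = CNNFG()
probs = model(torch.rand(1, 3, 40, 40))       # one dummy 40x40 RGB graph image
```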

Simultaneous Optimization of a KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.139-157
    • /
    • 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of a classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained on a randomly chosen feature subspace from the original feature set, and the predictions from the ensemble members are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving on an individual KNN model. The k parameter of the KNN base classifiers and the feature subsets selected for the base classifiers play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves upon the performance of the KNN ensemble model by optimizing both the k parameters and feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem using a real dataset from Korean companies. The research data included 1,800 externally non-audited firms that filed for bankruptcy (900 cases) or non-bankruptcy (900 cases). Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent-sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as the output variable. Of these, 24 financial ratios were selected using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy on the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model.
A 10-fold cross-validation was implemented to compare the performances of the proposed model and other models. To evaluate the effectiveness of the proposed model, the classification accuracy of the proposed model was compared with that of other models. The Q-statistic values and average classification accuracies of base classifiers were investigated. The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
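  A minimal sketch of the underlying random-subspace KNN ensemble may help make the search space concrete. It uses scikit-learn, majority voting, and randomly drawn (k, feature subset) pairs; in the paper these pairs are what the genetic algorithm encodes and optimizes against a hold-out fitness set, which is not reproduced here, and the data below are placeholders.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fit_knn_subspace_ensemble(X_train, y_train, members):
    """Train one KNN base classifier per (k, feature subset) pair."""
    ensemble = []
    for k, feat_idx in members:
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_train[:, feat_idx], y_train)
        ensemble.append((clf, feat_idx))
    return ensemble

def predict_majority(ensemble, X):
    """Combine base-classifier predictions by majority voting."""
    votes = np.stack([clf.predict(X[:, idx]) for clf, idx in ensemble])
    return (votes.mean(axis=0) >= 0.5).astype(int)

# A chromosome in the genetic algorithm would encode, for every base classifier,
# its k value and its feature subset; fitness = accuracy on a hold-out set.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 24)), rng.integers(0, 2, 200)   # stand-in for the 24 ratios
members = [(int(rng.integers(1, 15)), rng.choice(24, size=12, replace=False))
           for _ in range(10)]
ensemble = fit_knn_subspace_ensemble(X[:150], y[:150], members)
print(predict_majority(ensemble, X[150:])[:10])
```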

Finite Element Method Modeling for Individual Malocclusions: Development and Application of the Basic Algorithm (유한요소법을 이용한 환자별 교정시스템 구축의 기초 알고리즘 개발과 적용)

  • Shin, Jung-Woog;Nahm, Dong-Seok;Kim, Tae-Woo;Lee, Sung Jae
    • The korean journal of orthodontics
    • /
    • v.27 no.5 s.64
    • /
    • pp.815-824
    • /
    • 1997
  • The purpose of this study is to develop a basic algorithm for finite element method modeling of individual malocclusions. Usually, a great deal of time is spent in preprocessing. To reduce the time required, we developed a standardized procedure for measuring the position of each tooth and a program to perform the preprocessing automatically. The following procedures were carried out to complete this study. 1. The morphologies of twenty-eight teeth were constructed three-dimensionally for the finite element analysis and saved as separate files. 2. Standard brackets were attached so that the FA points coincide with the centers of the brackets. 3. A study model of the patient was made. 4. Using the study model, the crown inclination, the angulation, and the vertical distance from the tip of each tooth were measured with specially designed tools. 5. The arch form was determined from a picture of the model with an image processing technique. 6. The measured data were input in the form of a rotation matrix. 7. The program produces an output file containing the necessary information about the three-dimensional positions of the teeth, which is applicable to several commonly used finite element programs. The program for the basic algorithm was written in Turbo-C, and the resulting output file was applied to ANSYS. This standardized model measuring procedure and the program reduce the time required, especially for preprocessing, and can easily be applied to other malocclusions.
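  The abstract does not state the axis conventions behind the rotation matrix, so the small numpy sketch below is only a schematic of the idea: turn the measured inclination and angulation into rotations and apply them, together with the measured vertical offset, to a tooth's node coordinates. The function names and axis choices are assumptions, not the authors' program.

```python
import numpy as np

def rotation_x(deg):
    """Rotation about the x-axis (e.g., crown inclination), angle in degrees."""
    a = np.radians(deg)
    return np.array([[1, 0, 0],
                     [0, np.cos(a), -np.sin(a)],
                     [0, np.sin(a),  np.cos(a)]])

def rotation_y(deg):
    """Rotation about the y-axis (e.g., crown angulation), angle in degrees."""
    a = np.radians(deg)
    return np.array([[ np.cos(a), 0, np.sin(a)],
                     [0, 1, 0],
                     [-np.sin(a), 0, np.cos(a)]])

def place_tooth(nodes, inclination, angulation, vertical_offset):
    """Apply the measured orientation and vertical position to one tooth's FE nodes."""
    R = rotation_y(angulation) @ rotation_x(inclination)   # combined rotation matrix
    return nodes @ R.T + np.array([0.0, 0.0, vertical_offset])

tooth_nodes = np.random.rand(50, 3)   # stand-in for one tooth's node coordinates
positioned = place_tooth(tooth_nodes, inclination=7.0, angulation=5.0, vertical_offset=-1.2)
```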

Direct Reconstruction of Displaced Subdivision Mesh from Unorganized 3D Points (연결정보가 없는 3차원 점으로부터 차이분할메쉬 직접 복원)

  • Jung, Won-Ki;Kim, Chang-Heon
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.29 no.6
    • /
    • pp.307-317
    • /
    • 2002
  • In this paper we propose a new mesh reconstruction scheme that produces a displaced subdivision surface directly from unorganized points. The displaced subdivision surface is a mesh representation that defines a detailed mesh with a displacement map over a smooth domain surface, but the original displaced subdivision surface algorithm requires an explicit polygonal mesh, since it is not a mesh reconstruction algorithm but a mesh conversion (remeshing) algorithm. The main idea of our approach is that we sample surface detail from unorganized points without any topological information. For this, we predict a virtual triangular face from the unorganized points for each sampling ray cast from a parametric domain surface. Direct displaced subdivision surface reconstruction from unorganized points is important because the output of this algorithm has several useful properties: it has a compact mesh representation, since most vertices can be represented by a single scalar value; its underlying structure is piecewise regular, so it can easily be transformed into a multiresolution mesh; and smoothness after mesh deformation is automatically preserved. We avoid time-consuming global energy optimization by employing input-data-dependent mesh smoothing, so a good-quality displaced subdivision surface can be obtained quickly.
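  The abstract does not detail how the virtual triangular face is predicted for each sampling ray. The numpy sketch below substitutes a least-squares plane through the nearest input points and intersects the ray with it to obtain the scalar displacement; it conveys the sampling idea under that assumption rather than reproducing the paper's algorithm.

```python
import numpy as np

def displacement_along_ray(origin, normal, points, k=8):
    """Scalar displacement of the domain surface toward the unorganized points.

    origin, normal: a sampling ray from the smooth domain surface (normal is unit length).
    points: (N, 3) unorganized input points with no connectivity.
    """
    # Take the k nearest input points to the ray origin.
    nearest = points[np.argsort(np.linalg.norm(points - origin, axis=1))[:k]]
    centroid = nearest.mean(axis=0)
    # Least-squares plane through them: its normal is the smallest singular vector.
    _, _, vt = np.linalg.svd(nearest - centroid)
    plane_n = vt[-1]
    # Intersect the ray origin + t*normal with the plane; t is the displacement.
    denom = plane_n @ normal
    if abs(denom) < 1e-8:          # ray (nearly) parallel to the fitted plane
        return 0.0
    return float(plane_n @ (centroid - origin) / denom)

rng = np.random.default_rng(1)
pts = rng.normal(size=(500, 3))    # stand-in for a scanned point cloud
d = displacement_along_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]), pts)
```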

A Scalable and Modular Approach to Understanding of Real-time Software: An Architecture-based Software Understanding(ARSU) and the Software Re/reverse-engineering Environment(SRE) (실시간 소프트웨어의 조절적·단위적 이해 방법 : ARSU(Architecture-based Software Understanding)와 SRE(Software Re/reverse-engineering Environment))

  • Lee, Moon-Kun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.12
    • /
    • pp.3159-3174
    • /
    • 1997
  • This paper reports on research to develop a methodology and a tool for understanding very large and complex real-time software. The methodology and the tool, mostly developed by the author, are called the Architecture-based Real-time Software Understanding (ARSU) and the Software Re/reverse-engineering Environment (SRE), respectively. Due to size and complexity, it is commonly very hard to understand the software during the reengineering process. The research, however, facilitates scalable re/reverse-engineering of such real-time software based on the architecture of the software in three perspectives: the structural, functional, and behavioral views. First, the structural view reveals the overall architecture, the specification (outline), and the algorithm (detail) views of the software, based on hierarchically organized parent-child relationships. The basic building block of the architecture is a Software Unit (SWU), generated by user-defined criteria. The architecture facilitates navigation of the software in a top-down or bottom-up way. It captures the specification and algorithm views at different levels of abstraction, and it also shows the functional and behavioral information at these levels. Second, the functional view includes graphs of data/control flow, input/output, definition/use, variable/reference, etc. Each feature of the view captures a different kind of functionality of the software. Third, the behavioral view includes state diagrams, interleaved event lists, etc. This view shows the dynamic properties of the software at runtime. Besides these views, there are a number of other documents: capabilities, interfaces, comments, code, etc. One of the most powerful characteristics of this approach is the capability of abstracting and exploding this dimensional information in the architecture through navigation. These capabilities establish the foundation for scalable and modular understanding of the software. This approach also allows engineers to extract reusable components from the software during the reengineering process.
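  A toy data structure can make the SWU hierarchy and the three views concrete. The sketch below is purely illustrative; the class and field names are invented, not those of ARSU/SRE. Each unit carries its structural, functional, and behavioral information and can be navigated top-down.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class SoftwareUnit:
    """One SWU node: navigable parent-child hierarchy plus per-view information."""
    name: str
    parent: Optional["SoftwareUnit"] = None
    children: List["SoftwareUnit"] = field(default_factory=list)
    structural: Dict[str, str] = field(default_factory=dict)   # specification / algorithm text
    functional: Dict[str, list] = field(default_factory=dict)  # e.g. data/control-flow edges
    behavioral: Dict[str, list] = field(default_factory=dict)  # e.g. state diagrams, event lists

    def add_child(self, child: "SoftwareUnit") -> "SoftwareUnit":
        child.parent = self
        self.children.append(child)
        return child

    def walk_top_down(self):
        """Navigate the architecture from this SWU downward."""
        yield self
        for c in self.children:
            yield from c.walk_top_down()

system = SoftwareUnit("flight_control")                       # hypothetical example system
task = system.add_child(SoftwareUnit("attitude_task"))
task.functional["input_output"] = [("gyro_raw", "attitude_estimate")]
print([u.name for u in system.walk_top_down()])
```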

Development of the Information Delivery System for the Home Nursing Service (가정간호사업 운용을 위한 정보전달체계 개발 I (가정간호 데이터베이스 구축과 뇌졸중 환자의 가정간호 전산개발))

  • Park, J.H;Kim, M.J;Hong, K.J;Han, K.J;Park, S.A;Yung, S.N;Lee, I.S;Joh, H.;Bang, K.S
    • Journal of Home Health Care Nursing
    • /
    • v.4
    • /
    • pp.5-22
    • /
    • 1997
  • The purpose of the study was to develop an information delivery system for the home nursing service and to demonstrate and evaluate its efficiency. The research was conducted from September 1996 to August 31, 1997. In the first stage, an assessment tool for patients with cerebrovascular disease, who have the first priority for home nursing service among patients with various health problems at home, was developed through a literature review. Next, after the home care nurse identified the patient's nursing problems with the assessment tool, the patient classification system developed by Park (1988), consisting of 128 nursing activities under 6 categories, was used to identify the home care nurse's activities for the patient with CVA at home. The research team held several workshops with 5 clinical nurse experts to refine it; in the end, 110 nursing activities under 11 categories were derived for patients with CVA. In the second stage, algorithms were developed to connect the 110 nursing activities with the patient nursing problems identified by the assessment tool. The algorithms were computerized as follows. They were realized as a computer program using software engineering techniques. Development followed the prototyping method, starting from requirement analysis of the software specifications. The basic qualities of usability, compatibility, adaptability, and maintainability were taken into consideration, and particular emphasis was given to the efficient construction of the database. To enhance database efficiency and establish structural cohesion, the data fields were categorized by their weight of relevance to the particular disease. This approach permits easy adaptation when numerous diseases are added in the future. In parallel, expandability and maintainability were stressed throughout program development, which led to a modular design. Since the number of diseases covered increases as the project progresses, and since they are interrelated and coupled with each other, expandability and maintainability should be given high priority. Furthermore, since the system is to be integrated with other medical systems in the future, these properties are very important. The prototype developed in this project is to be evaluated through system testing. There are various evaluation metrics, such as cohesion, coupling, and adaptability. Unfortunately, direct measurement of these metrics is very difficult, so analytical and quantitative evaluation is almost impossible. Therefore, instead of analytical evaluation, experimental evaluation will be applied through test runs by various users. This system testing will provide analysis from the users' viewpoint, and the detailed and additional requirement specifications arising from users' real situations will be fed back into the system modeling. Also, the degree of freedom of the input and output will be improved, and the hardware limitations will be investigated. After refinement, the prototype system will be used as a design template for developing a more extensive system. In detail, relevant modules will be developed for the various diseases and integrated through a macroscopic design process focusing on inter-modularity, generality of the database, and compatibility with other systems. The home care evaluation system comprises three main modules: (1) general information on a patient, (2) general health status of a patient, and (3) the cerebrovascular disease patient. The general health status module has five sub-modules: physical measurement, vitality, nursing, pharmaceutical description, and emotional/cognitive ability. The CVA patient module is divided into ten sub-modules, covering subjective sense, consciousness, memory, language pattern, and so on. The typical sub-modules are described in Appendix 3.
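  The algorithms of the second stage link assessed nursing problems to the activities they call for. As a purely illustrative sketch of that mapping idea (the problem and activity names below are hypothetical placeholders, not entries from the study's 110-activity catalog):

```python
# Each assessed nursing problem is linked to the nursing activities it triggers,
# grouped by (hypothetical) category, mirroring the problem-to-activity algorithms.
ACTIVITY_MAP = {
    "impaired_swallowing": [("nutrition", "supervise tube feeding"),
                            ("education", "teach aspiration precautions")],
    "limited_mobility":    [("exercise", "assist range-of-motion exercise"),
                            ("skin_care", "reposition and inspect pressure points")],
}

def plan_activities(assessed_problems):
    """Return the nursing activities selected for the problems found by the assessment tool."""
    plan = []
    for problem in assessed_problems:
        plan.extend(ACTIVITY_MAP.get(problem, []))
    return plan

print(plan_activities(["impaired_swallowing", "limited_mobility"]))
```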

A Study of Textured Image Segmentation using Phase Information (페이즈 정보를 이용한 텍스처 영상 분할 연구)

  • Oh, Suk
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.2
    • /
    • pp.249-256
    • /
    • 2011
  • Finding a new set of features representing textured images is one of the most important problems in textured image analysis. This is because it is impossible to construct a perfect set of features representing every textured image, so it is inevitable to choose some relevant features that are efficient for the image processing task at hand. This paper intends to find relevant features that are efficient for textured image segmentation. In this regard, this paper presents a different method for the segmentation of textured images based on the Gabor filter. The Gabor filter is known to be a very efficient and effective tool that models the human visual system for texture analysis. Filtering a real-valued input image with the Gabor filter yields complex-valued output data defined in the spatial frequency domain. This complex value, as usual, gives the modulus and the phase. This paper focuses its attention on the phase information rather than the modulus information. The modulus information is generally considered very useful for region analysis in texture, while the phase information has been considered almost useless. This paper shows, however, that the phase information can also be fully useful and effective for region analysis in texture once a good method is introduced. We propose the "phase derivated method", an efficient and effective way to compute useful phase information directly from the filtered values. This new method effectively reduces the computing burden and widens the range of applicable textured images.
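  As a rough illustration of the kind of output the paper works with (this is not the authors' "phase derivated method"), the sketch below builds a complex Gabor kernel, filters an image with its real and imaginary parts, and extracts the modulus and phase maps; the kernel parameters are arbitrary.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(size=15, wavelength=6.0, theta=0.0, sigma=3.0):
    """Complex-valued Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.exp(2j * np.pi * x_rot / wavelength)
    return envelope * carrier

def gabor_modulus_phase(image, kernel):
    """Filter a real-valued image and return the modulus and phase of the response."""
    real = convolve(image, kernel.real)
    imag = convolve(image, kernel.imag)
    response = real + 1j * imag
    return np.abs(response), np.angle(response)   # modulus map, phase map

rng = np.random.default_rng(2)
img = rng.random((64, 64))                        # stand-in for a textured image
modulus, phase = gabor_modulus_phase(img, gabor_kernel(theta=np.pi / 4))
```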

Analysis of Image Processing Characteristics in Computed Radiography System by Virtual Digital Test Pattern Method (Virtual Digital Test Pattern Method를 이용한 CR 시스템의 영상처리 특성 분석)

  • Choi, In-Seok;Kim, Jung-Min;Oh, Hye-Kyong;Kim, You-Hyun;Lee, Ki-Sung;Jeong, Hoi-Woun;Choi, Seok-Yoon
    • Journal of radiological science and technology
    • /
    • v.33 no.2
    • /
    • pp.97-107
    • /
    • 2010
  • The objective of this study is to figure out the unknown image processing methods of a commercial CR system. We obtained the processing curve of each look-up table (LUT) in the REGIUS 150 CR system by using the virtual digital test pattern method, and the characteristics of the dry imager were also measured. First of all, we generated a virtual digital test pattern file with a binary file editor. This file was used as input data for the CR system (REGIUS 150, KONICA MINOLTA). The DICOM files automatically generated as output by the CR system were used to derive the processing curves of each LUT mode (THX, ST, STM, LUM, BONE, LIN). The gradation curves of the dry imager were also measured to characterize the hard-copy image. From the results for each parameter, we identified the characteristics of the image processing parameters in the CR system. The processing curves measured by the proposed method showed the characteristics of the CR system, and we found that the dry imager is linear in the middle area of the processing curves. From these results, we found the relationships between the curves and each parameter: the G value is related to the slope, and the S value is related to the shift along the x-axis of the processing curves. In conclusion, the image processing methods of commercial CR systems differ from one another and are not disclosed. The proposed method, which uses a virtual digital test pattern, can measure the characteristics of the image processing parameters in a CR system. We expect the proposed method to be useful for inferring the image processing methods not only of this CR system but also of other commercial CR systems.
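  The abstract does not describe the test-pattern file layout, so the sketch below only illustrates the idea under assumed parameters: write a raw 16-bit ramp image that sweeps the input digital values, feed it to the system, and trace the LUT by averaging the processed output over each input step. The dimensions, bit depth, and file name are assumptions.

```python
import numpy as np

# Assumed layout: a raw little-endian 16-bit image whose pixel values ramp
# across the columns over the detector's digital input range.
WIDTH, HEIGHT, MAX_VALUE = 512, 512, 4095

ramp = np.linspace(0, MAX_VALUE, WIDTH, dtype=np.uint16)   # one input step per column
pattern = np.tile(ramp, (HEIGHT, 1))                       # each column holds one constant value
pattern.astype("<u2").tofile("virtual_test_pattern.raw")   # candidate input for the CR system

def estimate_lut(input_pattern, output_image):
    """Average the processed output over each input step to trace the processing curve."""
    curve = {}
    for value in np.unique(input_pattern):
        curve[int(value)] = float(output_image[input_pattern == value].mean())
    return curve   # maps input digital value -> processed output value
```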

Analysis of Industrial Linkage Effects for Farm Land Base Development Project -With respect to the Hwangrak Benefited Area with Reservoir - (농업생산기반 정비사업의 산업연관효과분석 -황락 저수지지구를 중심으로-)

  • Lim, Jae Hwan;Han, Seok Ho
    • Korean Journal of Agricultural Science
    • /
    • v.26 no.2
    • /
    • pp.77-93
    • /
    • 1999
  • This study aims at identifying the forward and backward linkage effects of the farmland base development project. The Korean Government has continuously carried out farmland base development projects, including the integrated agricultural development projects, large and medium scale irrigation projects, and the comprehensive development of the four big river basins (including tidal land reclamation and estuary dam construction for all-weather farming), since 1962, the starting year of the five-year economic development plans. Consequently, the irrigation rate of paddy fields in Korea reached 75% in 1998, and to raise the irrigation rate further, the Government procured heavy investment funds from the IBRD, IMF, OECF, and other sources. To cope with agricultural problems such as trade liberalization under WTO policy, the Government has tried to address issues such as a new farmland base development policy, preservation of farmland, and expansion of farmland to meet food self-sufficiency in the future. In particular, farmland base development projects have been challenged by environmental and ecological concerns in evaluating economic benefits and costs, since the value of non-market goods has not been included in them. To date, evaluation of the benefits and costs of farmland base development projects has been confined to the direct incremental value of farm products and its related costs. Therefore, the projects' efficiency as a decision-making criterion has shown a low level of economic efficiency. Studies estimating the economic efficiencies of such projects with Leontief's input-output analysis could not be found in Korea at present. Accordingly, this study pursues the following objectives: (1) to identify the problems related to the financial support of the Government in implementing the proposed projects, and (2) to estimate the backward and forward linkage effects of the proposed project from the viewpoint of the national economy as a whole. To achieve these objectives, the Hwangrak benefited area with reservoir, located in the Seosan-Haemi district of Chungnam Province, was selected as a case study. The main results of the study are summarized as follows: a. The present value of the investment and O&M costs amounted to 3,510 million won, and the present value of the value added in related industries was estimated at 5,913 million won over the economic life of 70 years. b. The total discounted value of farm products in the related industries derived from the project was estimated at 10,495 million won, and the forward and backward linkage effects of the project amounted to 6,760 and 5,126 million won, respectively. c. The total number of employment opportunities derived from the related industries over the project life was 3,136 man-years. d. For the farmland base development projects, the backward linkage effects estimated by the index of the sensitivity of dispersion were larger than the forward linkage effects estimated by the index of the power of dispersion; on the other hand, the forward linkage effect of the rice production value over the project life was larger than the backward linkage effect. e. The rate of creation of new job opportunities through civil engineering works was higher than in any other field, and the production linkage effects of the project investment were derived mainly from the metal and non-metal fields. f. According to the industrial linkage effect analysis, farmland base development projects were identified as economically feasible from the viewpoint of the national economy as a whole, even though the economic efficiency of the project decreased markedly owing to the delayed construction period and increased project costs.
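  For readers unfamiliar with the linkage indices mentioned in (d), a minimal numpy sketch of the standard Leontief input-output calculation is given below; the 3-sector coefficient matrix and demand vector are invented purely for illustration and are unrelated to the study's data.

```python
import numpy as np

# Hypothetical 3-sector technical coefficient matrix A (illustration only).
A = np.array([[0.10, 0.20, 0.05],
              [0.15, 0.05, 0.10],
              [0.05, 0.10, 0.15]])
n = A.shape[0]

L = np.linalg.inv(np.eye(n) - A)                         # Leontief inverse (I - A)^-1

# Rasmussen-style linkage indices: values above 1 mark sectors with
# stronger-than-average linkages.
power_of_dispersion = n * L.sum(axis=0) / L.sum()        # backward linkage, per sector
sensitivity_of_dispersion = n * L.sum(axis=1) / L.sum()  # forward linkage, per sector

# Total output induced by an exogenous final-demand injection (e.g., project investment).
delta_demand = np.array([100.0, 0.0, 0.0])
induced_output = L @ delta_demand
print(power_of_dispersion, sensitivity_of_dispersion, induced_output)
```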

Optimization of Supercritical Water Oxidation(SCWO) Process for Decomposing Nitromethane (Nitromethane 분해를 위한 초임계수 산화(SCWO) 공정 최적화)

  • Han, Joo Hee;Jeong, Chang Mo;Do, Seung Hoe;Han, Kee Do;Sin, Yeong Ho
    • Korean Chemical Engineering Research
    • /
    • v.44 no.6
    • /
    • pp.659-668
    • /
    • 2006
  • The optimization of the supercritical water oxidation (SCWO) process for decomposing nitromethane was studied by means of a design of experiments. The optimum operating region for the SCWO process minimizing the COD and T-N of the treated water was obtained in a lab scale unit. The authors then compared the results from a SCWO pilot plant with those from the lab scale system to explore the scale-up problems of the SCWO process. The COD and T-N of the treated water were selected as key process output variables (KPOV) for the optimization, and the reaction temperature (Temp) and the mole ratio of nitromethane to ammonium hydroxide (NAR) were selected as key process input variables (KPIV) through preliminary tests. A central composite design, as a statistical design of experiments, was applied to the optimization, and the experimental results were analyzed by means of the response surface method. From the main effects analysis, it was found that the COD of the treated water decreased steeply with increasing Temp but only slightly with increasing NAR, and that T-N decreased with increases in both Temp and NAR. At lower Temp (420~430°C), T-N decreased steeply with an increase in NAR; however, its variation was negligible at Temp above 450°C. The regression equations for COD and T-N were obtained as quadratic models in coded Temp and NAR, and they were checked with the coefficient of determination (r²) and the normality of the standardized residuals. The optimum operating region was defined as Temp 450-460°C and NAR 1.03-1.08 from the intersection of the regions satisfying COD < 2 mg/L and T-N < 40 mg/L according to the regression equations, also taking corrosion prevention into account. To confirm the optimization results and investigate the scale-up problems of the SCWO process, nitromethane was decomposed in a pilot plant, and the experimental results from the pilot plant were compared with the regression equations for COD and T-N, respectively. The COD and T-N results from the pilot plant could be predicted well with the regression equations derived in the lab scale SCWO system, although the errors of the pilot plant data were larger than those of the lab data. The predictability was confirmed by parity plots and normality analyses of the standardized residuals.
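  To illustrate the kind of quadratic response-surface model fitted in such a study (the design points and COD responses below are invented placeholders, not the paper's results), a least-squares fit on coded Temp and NAR might look like this:

```python
import numpy as np

def fit_quadratic_surface(temp_coded, nar_coded, response):
    """Fit y = b0 + b1*T + b2*N + b11*T^2 + b22*N^2 + b12*T*N by least squares."""
    X = np.column_stack([np.ones_like(temp_coded), temp_coded, nar_coded,
                         temp_coded**2, nar_coded**2, temp_coded * nar_coded])
    coeffs, *_ = np.linalg.lstsq(X, response, rcond=None)
    predicted = X @ coeffs
    ss_res = np.sum((response - predicted) ** 2)
    ss_tot = np.sum((response - response.mean()) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot          # model coefficients and r^2

# Coded levels of a face-centered central composite design in two factors
# (4 factorial, 4 axial, 1 center point) plus made-up COD responses.
T = np.array([-1, -1, 1, 1, 0, 0, 0, -1, 1], dtype=float)
N = np.array([-1, 1, -1, 1, 0, -1, 1, 0, 0], dtype=float)
cod = np.array([9.5, 8.8, 2.4, 1.9, 3.1, 3.6, 2.9, 6.7, 2.1])
coeffs, r2 = fit_quadratic_surface(T, N, cod)
```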