• Title/Summary/Keyword: Classification Variables

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.167-181
    • /
    • 2018
  • Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the CNN (Convolutional Neural Network), known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNNs to business problem solving. Specifically, this study proposes applying a CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNNs are strong at interpreting images. Thus, the model proposed in this study adopts a CNN as a binary classifier that predicts stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future financial price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. Each graph is drawn in an image of $40 \times 40$ pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices to express the color values on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset and the remaining 20% as the validation dataset. In the final step, CNN classifiers are trained using the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5 \times 5 \times 6$ and $5 \times 5 \times 9$) in the convolution layer. In the pooling layer, a $2 \times 2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend, the other for a downward trend). The activation functions for the convolution layer and the hidden layers were set to ReLU (Rectified Linear Unit), and the one for the output layer was set to the Softmax function. To validate CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups in the dependent variable (i.e., tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples) and the validation dataset using the remaining 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators popularly used in previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on these graphs can be effective from the perspective of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
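
The data preparation described in this abstract (5-day intervals, balanced up/down classes, 80/20 split) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say whether the 5-day intervals overlap, so overlapping windows are assumed here, and the function names `make_samples` and `balance_and_split` as well as the toy price series are hypothetical. Graph rendering and the CNN itself are left out.

```python
import random

def make_samples(closes, window=5):
    """Pair each `window`-day slice of closing prices with the next day's direction."""
    samples = []
    for end in range(window, len(closes)):
        history = closes[end - window:end]                  # the days a graph would show
        label = 1 if closes[end] > closes[end - 1] else 0   # 1 = up, 0 = down or flat
        samples.append((history, label))
    return samples

def balance_and_split(samples, train_frac=0.8, seed=0):
    """Undersample the larger class, shuffle, then split into train/validation."""
    rng = random.Random(seed)
    ups = [s for s in samples if s[1] == 1]
    downs = [s for s in samples if s[1] == 0]
    n = min(len(ups), len(downs))                           # equal count per class
    balanced = rng.sample(ups, n) + rng.sample(downs, n)
    rng.shuffle(balanced)
    cut = round(len(balanced) * train_frac)
    return balanced[:cut], balanced[cut:]

# Made-up price series for illustration only (not KOSPI200 data).
closes = [100 + (7 * i) % 11 for i in range(60)]
train, valid = balance_and_split(make_samples(closes))
print(len(train), len(valid))
```

With 1,950 balanced samples and `train_frac=0.8`, the same split would give the 1,560 and 390 samples reported in the abstract.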

Community Classification of Forest Vegetation in Mt. Myeonsan (면산(綿山) 일대(一帶) 산림식생(山林植生)의 군락분류(群落分類)에 관(關)한 연구(硏究))

  • Lee, Byung Chun;Yun, Chung Weon;Shin, Joon Hwan;Oh, Jeong Soo
    • Journal of Korean Society of Forest Science
    • /
    • v.90 no.4
    • /
    • pp.548-557
    • /
    • 2001
  • This study was carried out to classify the forest vegetation of Mt. Myeonsan by phytosociological analysis of the ZM School and to explain the community types with CCA ordination. The research sites were located in the northern part of Kyungsangpook-Do and the southern part of Kangwon-Do. A total of 73 plots containing 291 plant species were investigated in the study areas. The forest vegetation was classified into the Quercus mongolica community group, the Juglans mandshurica community group, and the Larix leptolepis plantation. The Q. mongolica community group was divided into 9 communities: Rhododendron schlippenbachii community, Symplocos paniculata community, Pinus densiflora community, Betula schmidtii community, Fraxinus rhynchophylla community, Populus davidiana community, Tilia amurensis community, Betula costata community and Betula davurica community. The J. mandshurica community group was divided into 5 communities and 4 groups: Acer mono community (subdivided into Quercus serrata group and typical group), Fraxinus mandshurica community, Ulmus laciniata community (subdivided into typical group and Ribes mandshuricum group), Malus baccata community and Carpinus laxiflora community. In total, the vegetation was classified into 17 vegetation units. According to CCA, the Q. mongolica community group showed a high positive correlation with topography, altitude and slope degree, and the J. mandshurica community group showed a high positive correlation with bare rock. However, the L. leptolepis community group was not significantly ordinated along the gradient of environmental factors. In the correlation of axes, the important environmental variables were topography and altitude.

The guideline for choosing the right-size of tree for boosting algorithm (부스팅 트리에서 적정 트리사이즈의 선택에 관한 연구)

  • Kim, Ah-Hyoun;Kim, Ji-Hyun;Kim, Hyun-Joong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.5
    • /
    • pp.949-959
    • /
    • 2012
  • This article seeks the size of decision trees that gives the best performance for the boosting algorithm. First, we defined the tree size D as the depth of a decision tree. Then we compared the performance of the boosting algorithm with different tree sizes in an experiment. Although it is a usual practice to set the tree size in the boosting algorithm to be small, we found that the choice of D has a significant influence on the performance of boosting. Furthermore, we found that the tree size D needs to be sufficiently large for some datasets. The experimental results show that there exists an optimal D for each dataset and that choosing the right size D is important for improving the performance of boosting. We also tried to find a model for estimating the right size D for the boosting algorithm, using variables that describe the nature of a given dataset. The suggested model reveals that the optimal tree size D for a given dataset can be estimated from the error rate of a stump tree, the number of classes, the depth of a single tree, and the Gini impurity.
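
Two of the dataset descriptors named in this abstract, the number of classes and the Gini impurity, can be computed from the class labels alone. The sketch below uses the standard definition of Gini impurity, 1 minus the sum of squared class proportions. That the authors computed it on the overall label distribution is an assumption, and the helper names `gini_impurity` and `dataset_summary` are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def dataset_summary(labels):
    """Label-based descriptors that could feed a model for the tree size D."""
    return {"n_classes": len(set(labels)), "gini": gini_impurity(labels)}

# Two equally frequent classes give the maximum two-class impurity of 0.5.
print(dataset_summary(["a", "a", "b", "b"]))
```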

Minimum Weight Design for Web Frames of Cargo Tanks in the LPG Carrier (LPG 운반선 화물창의 웨브 프레임 최소중량설계)

  • Park, Myeong-Chul;Shin, Sang-Hoon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.6
    • /
    • pp.103-108
    • /
    • 2020
  • Generally, the cargo tank of an LPG carrier corresponds to the independent tank Type A defined by the International Maritime Organization (IMO). The outside of the tank is insulated with polyurethane foam, and the tank is made of expensive low-temperature steel that can withstand temperatures as low as -50℃. The cargo tank is composed of outer shell plates, bulkheads, stiffeners, web frames, and stringers. Among them, the outer shell plates, bulkheads, and stiffeners can be designed by the Classification Rules without structural analysis, so their optimal design is straightforward. On the other hand, because web frames and stringers must be designed and approved through structural analysis, an optimal design, which would require numerous structural analyses, is not performed for them. Only an adequate design, which determines the design dimensions through several dozen structural analyses, is carried out. In this study, eight loading conditions were applied in the finite element analysis, and the deformation of the entire ship under each loading condition was considered. The minimum weight design was performed for the web frames of the cargo tanks in an 82,000 ㎥ LPG carrier using a gradient-based optimization technique, and the weight was reduced by approximately 108 tons per ship.

Predicting Highway Concrete Pavement Damage using XGBoost (XGBoost를 활용한 고속도로 콘크리트 포장 파손 예측)

  • Lee, Yongjun;Sun, Jongwan
    • Korean Journal of Construction Engineering and Management
    • /
    • v.21 no.6
    • /
    • pp.46-55
    • /
    • 2020
  • The maintenance cost of highway pavement is gradually increasing due to the continuous growth in total road length and in the number of old routes that have exceeded their service period. As a result, there is a need for a method of minimizing costs through preventive maintenance. Preventive maintenance requires a strategic plan based on accurate prediction of damage to aged highway pavement. Therefore, in this study, XGBoost, a machine learning classification model, was used to develop a highway pavement damage prediction model. First, we addressed the imbalanced data issue through data sampling, and then developed a predictive model using XGBoost. The predictive model was evaluated with performance indicators such as accuracy and F1 score. The over-sampling method showed the best performance. The main variables affecting road damage were, in order of importance, the number of years of service, ESAL, and the number of days with a minimum temperature below -2 degrees Celsius. If the performance of the prediction model is improved through further data accumulation and more detailed data pre-processing, more accurate prediction of sections requiring maintenance is expected to be possible. In addition, the model is expected to provide important basic information for estimating the highway pavement maintenance budget.
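
The two ingredients this abstract highlights, over-sampling for class imbalance and the F1 score, can be illustrated without any external library. This is a minimal sketch: random duplication of minority rows is assumed as the over-sampling method (the abstract does not say which variant was used), the XGBoost model itself is omitted, and `oversample` and `f1_score` are hypothetical helper names.

```python
import random

def oversample(rows, labels, seed=0):
    """Duplicate random rows of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up example: 2 damaged sections (label 1) among 10.
rows, labels = oversample(list(range(10)), [1, 1] + [0] * 8)
print(labels.count(1), labels.count(0))
```

Over-sampling is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.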

Diagnosis of Diabetes Using Voltage Analysis Based on EIS (Electro Interstitial Scan) (EIS 기반 전압신호 분석을 통한 당뇨병 진단 가능성 평가)

  • Bae, Jang-Han;Kim, Soochan;Kaewkannate, Kanitthika;Jun, Min-Ho;Kim, Jaeuk U.
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.53 no.11
    • /
    • pp.114-122
    • /
    • 2016
  • EIS (Electro interstitial scan) is a simple, non-invasive method for obtaining physio-pathological information inferred from the electric current response to a low direct current applied between remote sites of the body. Although a few EIS-based devices for diagnosing diabetes have been commercialized, they have succeeded neither in demonstrating clinical validity nor in confirming the diagnostic principle. In this study, we measured the voltage responses of diabetic patients and normal subjects with a commercialized EIS device to test the usefulness of EIS in screening for diabetes. For this purpose, voltage was measured between pairs of electrodes in contact with both palms, both soles of the feet, and the left and right forehead above the eyes. After feature extraction from the voltage signals, we computed the AUC (area under the curve) for discriminating the two groups and found seven variables that exceeded 60%. In addition, we applied the k-NN (k-nearest neighbors) method and found that the classification accuracy between the two groups reached 76.2%. This result implies that voltage response analysis based on EIS has potential as a diabetes screening method.
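
A per-feature AUC of the kind described here can be computed directly as the fraction of (patient, control) pairs that a feature orders correctly. The sketch below assumes that definition, which is the standard rank-based one. The voltage values are invented for illustration, `auc` and `feature_auc` are hypothetical names, and the abstract does not state how the authors handled direction or ties.

```python
def auc(positives, negatives):
    """Fraction of (positive, negative) pairs where the positive is larger; ties count 0.5."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def feature_auc(diabetic, normal):
    """Direction-free AUC: a feature that is lower in patients is equally informative."""
    a = auc(diabetic, normal)
    return max(a, 1.0 - a)

# Invented feature values for two small groups.
diabetic = [1.0, 3.0]
normal = [2.0, 4.0]
print(auc(diabetic, normal), feature_auc(diabetic, normal))
```

A feature would then be kept for the k-NN step if its `feature_auc` exceeds a chosen threshold such as 0.6.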

The impact of social support and job-related tension on burnout of social workers in public sector (사회복지 전담공무원들의 소진에 영향을 미치는 요인에 관한 연구)

  • Sun, Su-Kyung;Lim, Hyun-Sung;Kang, Sung-Ok
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship
    • /
    • v.7 no.1
    • /
    • pp.135-144
    • /
    • 2012
  • The purpose of this study is to find factors associated with burnout among social workers in the public sector. It examined the impact of social support and job-related tension on their burnout. A total of 300 public-sector social workers were recruited from Incheon city and completed the questionnaires in August 2011. Data were analyzed with simple regression and hierarchical multiple regression. First, women had a slightly higher mean burnout score than men, but the difference was not statistically significant. Job classification and years of work experience were very important predictors of burnout. Second, there was a strong positive relationship between job-related tension and burnout. Third, there was a strong negative relationship between social support and burnout: as social workers' perception of social support increased, the level of burnout tended to decrease. Finally, a set of analyses examined the impact of social support after controlling for the effects of demographic variables and job-related tension. Perceived social support and job-related tension made significant contributions to burnout. The results provide several implications for social work practice. More research is needed on the effects of social support and job-related tension on burnout of social workers in the public sector. For social workers in the public sector, supervision from supervisors appeared to have a protective effect against burnout.

The Relationship between Perceived Stress Scale and Carotid Intima-Media Thickness Using Ultrasonography in University Students (대학생들의 스트레스 지각 정도와 경동맥 내막-중막 두께의 관계 연구)

  • Kim, Dae-Sik;Sung, Hyun-Ho;Lee, Jong-Woo;Cho, Eun-Kyung
    • Korean Journal of Clinical Laboratory Science
    • /
    • v.49 no.3
    • /
    • pp.308-315
    • /
    • 2017
  • The carotid artery intima-media thickness (CIMT) test is a standardized ultrasound procedure that screens for and monitors atherosclerosis in a safe and non-invasive manner, even in individuals with no symptoms of heart disease. The purpose of this study was to examine the relationship between CIMT, as measured by ultrasonography, and the degree of individual perceived stress. The subjects were 168 university students (male: 46, female: 122) in Suwon-si, Gyeonggi-do, Korea, without other diseases. Written informed consent was obtained from all participants. CIMT was measured in the right and left common carotid arteries 1 cm below the bifurcation, and in the internal carotid artery, using non-invasive high-resolution Medison Accuvix V20 prestige ultrasound equipment. Perceived stress was measured with the 10-item PSS (Perceived Stress Scale) questionnaire. This study indicates that PSS may be appropriate for estimating perceived stress levels and weight. Failure to control stress and anxiety could lead to weight and CIMT rising to dangerous levels, resulting in a myocardial infarction or cerebrovascular accident. There were statistically significant differences in left CIMT (p<0.05) and weight (p<0.05) across the stress classifications. The results suggest that weight is influenced in part by one's generalized stress level. Further studies should test the influence of perceived stress and generalized anxiety on CIMT when many continuous variables are involved.

Development of a Nursing Fee Schedule Model (적정간호수가 산정모형 개발을 위한 연구)

  • 조소영;박정호
    • Journal of Korean Academy of Nursing
    • /
    • v.23 no.1
    • /
    • pp.68-89
    • /
    • 1993
  • This study was conducted to develop a model of a fee schedule for nursing services. Regardless of the demand for skilled and professional nursing service today, the Korean health insurance system does not furnish a chapter for the nursing service fee schedule. A nation-wide survey of hospital nursing service fee schedules was conducted to provide practical and realistic data about how the variety of nursing services is being charged. From September 1990 to April 1991, data were collected from the fee schedules used by twenty hospitals located in eight large cities that are designated large medical regions in the Korea Health Care and Patient Referral System. Nursing services and the fees charged for them were analyzed. The nursing services were subjected to a secondary analysis with reference to reports on “nursing services to be charged in Korea”. The total number of nursing services recommended in the literature was 177; of these, 141 types were finally selected by the investigator as chargeable nursing services. In addition, data on the managerial characteristics of the hospitals were collected to discover influential variables for a nursing fee schedule model. Under the assumption that all the managerial characteristics of the hospitals influenced the fee schedule, the following model was tested: Fee of nursing services (C) = f(A₁, A₂, A₃, A₄, A₅, A₆, A₇, A₈), where A₁ = number of nurses; A₂ = the starting salary of a nurse with a four-year education; A₃ = scale of the nursing management division; A₄ = location of the hospital; A₅ = the type of hospital management (profit / non-profit); A₆ = number of hospital beds; A₇ = years of hospital operation; A₈ = number and kinds of clinical divisions. The results showed that the model should be built as follows: C = f(A₁, A₄, A₅). Each nursing service was entered in the fee schedule with consideration for the professional level and the time taken to provide the service.
Detailed fee schedules were presented in the related tables. Of the 141 kinds of nursing services, 24.8% were chargeable to the Korea Health Insurance and 32.6% were being paid for directly by the patients. The rest of the nursing services (42.6%) were not being charged to any source. It was recommended that the Korea Health Insurance Reimbursement system add a classification system for nursing services that can be used in the national health care program. Further study is needed on how to include in the health insurance system the 32.6% of nursing services now being paid for directly by the patients.

BORDERLINE DISORDER OF CHILDHOOD : 8 CASES (아동기 경계선 장애 : 8증례)

  • Hong, Kang-E;Lee, Jeong-Seop;Shin, Min-Sup
    • Journal of the Korean Academy of Child and Adolescent Psychiatry
    • /
    • v.6 no.1
    • /
    • pp.3-17
    • /
    • 1995
  • The so-called borderline children are characterised by disturbances in the sense of reality and interpersonal relationships, lack of control, fluctuation of functioning, uneven development and excessive anxiety. However, the concept of borderline disorder of childhood is very difficult to define and diagnose in the current classification system. The present study adapted the consensus symptoms in borderline children by Bemporad and Vera; eight cases aged 7-11 were examined on 37 variables. The results are as follows. 1) All subjects (boys and girl) could hardly be diagnosed in the current diagnostic system and had many concurrent diagnoses. Common chief complaints involved the sense of reality. 2) In the KEDI-WISC test, the borderline children showed average intelligence, but performance IQ tended to be higher than verbal IQ. In the Rorschach test, they showed a high thought disorder index, emotional instability and aggressive impulses. The results of the TOVA suggested attentional deficit in half of the subjects. Organicity was not prominent. 3) Many of the borderline children were unwanted babies. Although the primary caretakers of all the subjects were their mothers, there were moderate problems in the mothers' caring attitude toward their children and in their marital relationships with their husbands. Socioeconomic status was generally below middle class. Most of the subjects had delayed language development but subsequently overcame it. Many subjects were rejected by peers because of their aggression. 4) The first visit of the subjects was at about 6 years of age. The average duration of treatment was 2 years. All but one were treated on an outpatient basis. The effect of pharmacotherapy was doubtful, and the necessity of long-term play therapy was suggested. Although the present study had many methodological limitations, it suggests that further research is needed on diagnostic criteria, epidemiology and treatment.
