• Title/Summary/Keyword: Feature Variables

A credit classification method based on generalized additive models using factor scores of mixtures of common factor analyzers (공통요인분석자혼합모형의 요인점수를 이용한 일반화가법모형 기반 신용평가)

  • Lim, Su-Yeol;Baek, Jang-Sun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.2
    • /
    • pp.235-245
    • /
    • 2012
  • Logistic discrimination is a useful statistical technique for quantitative analysis in the financial service industry. In particular, it is easy to implement and achieves a good classification rate. The generalized additive model is useful for credit scoring because it shares the advantages of logistic discrimination while also accounting for nonlinear effects of the explanatory variables. It may, however, need too many additive terms when the number of explanatory variables is very large, and there may be dependencies among the variables. Mixtures of factor analyzers can be used for dimension reduction of high-dimensional features. This study proposes using the low-dimensional factor scores of mixtures of factor analyzers as new features in the generalized additive model. Its application is demonstrated on the classification of some real credit scoring data. A comparison of correct classification rates among competing techniques shows the superiority of the generalized additive model using factor scores.

Design of Regression Model and Pattern Classifier by Using Principal Component Analysis (주성분 분석법을 이용한 회귀다항식 기반 모델 및 패턴 분류기 설계)

  • Roh, Seok-Beom;Lee, Dong-Yoon
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.10 no.6
    • /
    • pp.594-600
    • /
    • 2017
  • This paper introduces a new design methodology for prediction models and pattern classifiers based on the dimension reduction algorithm known as principal component analysis. Principal component analysis is a dimension reduction technique used to reduce the dimension of the input space and extract good features from the original input variables. The extracted variables are then used as the inputs to the prediction model and the pattern classifier. The key point of the paper is that both the prediction model and the pattern classifier are based on a very simple regression. Their structural simplicity helps reduce the over-fitting problem. Several machine learning data sets are used to validate the proposed prediction model and pattern classifier.
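The idea of feeding PCA-extracted features into a very simple regression can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the use of power iteration to find the first principal component, and all function names are assumptions made for illustration.

```python
import math

def center(rows):
    """Subtract the column means so each feature has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows], means

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=200):
    """Leading eigenvector of cov via power iteration (unit length)."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def fit_line(z, y):
    """Ordinary least squares for y = a + b*z with a single input z."""
    n = len(z)
    zm, ym = sum(z) / n, sum(y) / n
    sxy = sum((zi - zm) * (yi - ym) for zi, yi in zip(z, y))
    sxx = sum((zi - zm) ** 2 for zi in z)
    b = sxy / sxx
    return ym - b * zm, b

# Hypothetical toy data: two correlated inputs and one target.
X = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
y = [1.1, 2.0, 3.1, 3.9, 5.0]

Xc, _ = center(X)
pc1 = first_component(covariance(Xc))
# Project each centered row onto the first component: one feature instead of two.
scores = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in Xc]
intercept, slope = fit_line(scores, y)
predictions = [intercept + slope * s for s in scores]
```

Because the regression sees a single extracted feature rather than every original input, it has fewer parameters to fit, which is the kind of structural simplicity the abstract associates with reduced over-fitting.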

Short-Term Water Quality Prediction of the Paldang Reservoir Using Recurrent Neural Network Models (순환신경망 모델을 활용한 팔당호의 단기 수질 예측)

  • Jiwoo Han;Yong-Chul Cho;Soyoung Lee;Sanghun Kim;Taegu Kang
    • Journal of Korean Society on Water Environment
    • /
    • v.39 no.1
    • /
    • pp.46-60
    • /
    • 2023
  • Climate change causes fluctuations in water quality in the aquatic environment, which can change water circulation patterns and have severe adverse effects on aquatic ecosystems in the future. Therefore, research is needed to predict and respond in advance to water quality changes caused by climate change. In this study, we tried to predict the dissolved oxygen (DO), chlorophyll-a, and turbidity of the Paldang reservoir for about two weeks using long short-term memory (LSTM) and gated recurrent units (GRU), which are deep learning algorithms based on recurrent neural networks. The model was built on real-time water quality data and meteorological data. The observation period was set from July to September in the summer of 2021 (Period 1) and from March to May in the spring of 2022 (Period 2). We tried to select the algorithm with the best predictive power for each water quality parameter. In addition, to improve the predictive power of the model, a random-forest-based important-variable extraction technique was used to select only the important variables as inputs. In both Periods 1 and 2, predictive power improved further after the important variables were extracted. Except for DO in Period 2, GRU was selected as the best model for all water quality parameters. This methodology can be useful for preventive water quality management by identifying the variability of water quality in advance and predicting water quality over a short period.
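For readers unfamiliar with gated recurrent units, the following is a minimal sketch of one common formulation of a single GRU update, reduced to scalar input and scalar hidden state. The parameter names, weight values, and sensor readings are hypothetical; the study's actual model, library, and hyperparameters are not given in the abstract.

```python
import math

def sigmoid(a):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(x, h, p):
    """One GRU update for scalar input x and scalar hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    # Candidate state: the reset gate scales how much of the old state is used.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Blend the old state and the candidate according to the update gate.
    return (1.0 - z) * h + z * h_cand

def run_gru(series, p, h0=0.0):
    """Feed a sequence through the cell and return the final hidden state."""
    h = h0
    for x in series:
        h = gru_step(x, h, p)
    return h

# Hypothetical weights; in practice these are learned from training data.
params = {"wz": 0.5, "uz": -0.3, "bz": 0.1,
          "wr": 0.8, "ur": 0.2, "br": 0.0,
          "wh": 1.2, "uh": 0.7, "bh": -0.1}
readings = [0.2, 0.4, 0.1, 0.5]  # hypothetical scaled sensor readings
final_state = run_gru(readings, params)
```

A full forecasting model would use vector-valued states, a trained output layer, and inputs restricted to the variables ranked as important; this sketch shows only the gating arithmetic.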

The Relationships among Students' Mapping Understanding, Mapping Errors and Cognitive/Affective Variables in Learning with Analogy (비유를 사용한 수업에서 학생들의 인지적.정의적 특성과 대응 이해 및 대응 오류 유형과의 관계)

  • Kim, Kyung-Sun;Hwang, Sun-Young;Noh, Tae-Hee
    • Journal of the Korean Chemical Society
    • /
    • v.54 no.1
    • /
    • pp.150-157
    • /
    • 2010
  • In this study, we investigated differences in mapping understanding and in the types of mapping errors by the levels of students' cognitive/affective variables, and the relationships between mapping understanding and these variables, in learning 'concentration and reaction rate' with analogy. After tests of logical thinking ability, visual imagery ability, analogical reasoning ability, self-efficacy, and need for cognition were administered as pretests, students learned with analogy. Then, students' familiarity and mapping understanding were examined. Analyses of the results revealed that the mapping understanding scores of students with higher levels of all cognitive/affective variables except visual imagery ability and familiarity were significantly higher than those of students with lower levels. Differences in the types of mapping errors, such as overmapping, failure to map, impossible mapping, artificial mapping, mismapping, rash mapping, and retention of a base feature, were also found by the levels of students' cognitive and affective variables. Students' mapping understanding scores were positively correlated with those of all cognitive and affective variables. The results of multiple regression analysis indicated that students' science achievement, logical thinking ability, and familiarity were significant predictors of mapping understanding. Educational implications of these findings are discussed.

A Study on Determinants of Commercial Land Values in Gwangju City (광주시 상업지 지가의 형성요인에 관한 연구)

  • Lee, Hyun-Wook
    • Journal of the Korean association of regional geographers
    • /
    • v.2 no.2
    • /
    • pp.159-171
    • /
    • 1996
  • This study examines which factors affect commercial land values in Gwangju city and how they act on them, using multiple regression analysis of the distribution of commercial land values. The major findings are as follows: (1) Between 1989 and 1996, the area of higher commercial land values extended along the main arterial roads, which is related to urbanization of the urban fringe, while commercial land values declined in the city center, a long-established commercial district. This decline is due to poor adaptation to rapid changes in the commercial environment caused by fragmented lots, old buildings, traffic congestion, etc. (2) The regions where commercial land values rose most are the west, where the new planned city center of Sangmu-dong was built, and the southwest, where the rise is related to the expansion of high-density apartments and the location of large discount stores. (3) Comparing the maps of changes in commercial land value distribution with road and topographical maps shows that commercial land values are related to various factors, namely distance from the CBD, traffic convenience, reputation of the commercial district, road condition, size of supplementary, degree of commercial land use, etc. (4) From these related factors, six variables were extracted by operational definition: the spatial distance from the city center, the walking distance to a stop, the road width, the volume of bus traffic, the volume of pedestrian traffic, and the number of shops. (5) Data on seven variables were collected at the highest-value point of each dong. Multiple regression analysis was applied with commercial land value as the dependent variable and the six extracted variables as independent variables. (6) The regression results show that the variables most strongly related to commercial land values are the volume of pedestrian traffic and the spatial distance from the city center. These two variables explain 65% of the variance in commercial land values. (7) To account for the unexplained 35%, residual analysis was carried out. It shows low estimates in the downtown area and high estimates in the urban fringe. This is due to the single-core structure of Gwangju city and the limits of this regression model.
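The statement that two variables explain 65% of the variance refers to the coefficient of determination of a multiple regression. A minimal sketch of how such a figure and the residuals are computed is given below. The data are invented toy values, not the Gwangju measurements, and the function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    Z = [[1.0] + list(row) for row in X]
    k = len(Z[0])
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    Xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(XtX, Xty)

def predict(coef, row):
    """Fitted value for one observation."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def r_squared(X, y, coef):
    """Share of the variance in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(coef, row)) ** 2 for row, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: (pedestrian volume, distance from center) -> land value.
X = [(10, 5.0), (20, 4.0), (30, 3.5), (40, 2.0), (50, 1.0), (60, 0.5)]
y = [12.0, 19.0, 31.0, 38.0, 52.0, 58.0]

coef = ols(X, y)
r2 = r_squared(X, y, coef)
# A positive residual means the model's estimate is lower than the observed value.
residuals = [yi - predict(coef, row) for row, yi in zip(X, y)]
```

Mapping the sign of each residual back to its location is the kind of residual analysis described in finding (7): clusters of same-signed residuals suggest spatial structure that the regression does not capture.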

Improving Efficiency of Food Hygiene Surveillance System by Using Machine Learning-Based Approaches (기계학습을 이용한 식품위생점검 체계의 효율성 개선 연구)

  • Cho, Sanggoo;Cho, Seung Yong
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.53-67
    • /
    • 2020
  • This study employs a supervised learning prediction model to detect nonconformity in advance among processed food manufacturing and processing businesses. The study followed the standard machine learning procedure: definition of the objective function, data preprocessing, feature engineering, and model selection and evaluation. The dependent variable was the number of detections in supervised inspections over the five years from 2014 to 2018, and the objective function was to maximize the probability of detecting nonconforming companies. The data were preprocessed to reflect not only basic attributes such as revenue, operating duration, and number of employees, but also inspection track records and external climate data. After the feature variable extraction method was applied, the company's risk, item risk, environmental risk, and past violation history were derived as feature variables that affect the determination of nonconformity, and the machine learning algorithms were applied to the data. The F1-score of the decision tree, one of the ensemble models, was much higher than those of the other models. Based on the results of this study, official food control for food safety management is expected to be strengthened and geared toward data- and evidence-based management as well as a scientific administrative system.
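Since the models are compared by F1-score, a short sketch of that metric may help. The labels below are hypothetical (1 = nonconforming, 0 = conforming) and are not taken from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical inspection outcomes and model predictions.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
score = f1_score(actual, predicted)  # 3 TP, 1 FP, 1 FN -> 0.75
```

F1 is a reasonable choice when nonconforming businesses are rare, because plain accuracy can look high for a model that never flags anyone.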

A Research in the Characteristic of Arthritis Patients (관절염환자(關節炎患者)의 특성(特性)에 대한 조사(調査) 연구(硏究))

  • Kang Jeam-Dug;Nam Chul-Hyun;Kim Gi-Yeol
    • Journal of Society of Preventive Korean Medicine
    • /
    • v.1 no.1
    • /
    • pp.149-165
    • /
    • 1997
  • To provide basic data for planning and carrying out preventive health programs, this study investigated the characteristics of arthritis patients and the primary factors affecting the disease. A total of 320 arthritis patients who received physiotherapy in hospital in the Taegu area over the five months from November 1, 1995 to March 30, 1996 were surveyed. The findings are summarized as follows.
    1. General characteristics of the subjects: By sex, men accounted for 26.9% and women for 73.1%. By age, those aged 60 or older were the largest group (27.2%) and those aged 20 to 29 the smallest (14.0%). By occupation, housewives were the largest group (34.7%) and students (unemployed) the smallest (19.3%). By marital status, married persons were the largest group (76.7%) and those living alone (divorced, widowed, or separated) the smallest (11.4%). By economic status, the middle class was the largest group (73.5%) and the upper class the smallest (2.9%). By education, primary school graduates were the largest group (26.9%), university graduates accounted for 14.1%, and patients able to read the national language accounted for 11.3%. By housing type, those living in detached houses were the largest group (76.4%) and residents of apartments with elevators the smallest (9.4%).
    2. Recurrence: The recurrent group (62.2%) was larger than the first-onset group. Recurrence was more frequent among men in their 50s to 60s or older, those living alone (divorced, widowed, or separated), students (unemployed), those working in farming, stockbreeding, forestry, or fisheries, unskilled laborers, primary school graduates, those able to read the national language, the upper class, those who had had arthritis for more than two years, and those with a family history of arthritis.
    3. Affected site: The knee joint was most common (50.2%), the finger joints accounted for 8.8%, and multiple sites at once accounted for 35.1%. Knee arthritis was more frequent among men in their 20s to 30s, unmarried persons, those with a university education, the middle class, and residents of apartments with stairs. Multiple-site arthritis was more frequent among women in their 50s to 60s or older, those living alone (divorced, widowed, or separated), those able to read the national language, primary school graduates, the upper class, and residents of apartments with elevators.
    4. Duration of illness: Two years or more was most common (51.6%), and one to two years accounted for 15.3%. A duration of two years or more was more frequent among women in their 50s to 60s or older, those living alone (divorced, widowed, or separated), the upper class, those able to read the national language, and residents of apartments with elevators.
    5. Treatment before the visit: No prior treatment was most common (39.1%); general, orthopedic, or neurological surgery (physical therapy) accounted for 20.0%, Chinese medicine therapy (acupuncture, moxibustion, Chinese medicine) for 17.5%, and pharmacy (drug therapy) for 13.4%. No prior treatment was more frequent among men in their 20s, unmarried persons, the lower class, those with a university education, and residents of apartments with elevators.
    6. Satisfaction with treatment: "Average" was the most common response (54.4%), and "somewhat satisfied" accounted for 32.8%. "Average" was more frequent among men in their 50s to 60s, those living alone (divorced, widowed, or separated), married persons, the upper class, those with a university education, residents of detached houses, and residents of apartments with elevators.
    7. Perceived cause of arthritis: Habitual overuse of the joints was cited most often (60.1%), and poor nutrition was cited by 2.5%. Overuse was cited more often by men in their 20s and in their 60s or older, unmarried persons, the lower class, those able to read the national language, those with a university education, and residents of apartments with stairs.
    8. Exercise before onset: Patients who had generally exercised a great deal were the largest group (40.3%), and those who had not exercised accounted for 34.7%. Exercising a great deal was more frequent among men in their 50s to 60s or older, those living alone (divorced, widowed, or separated), the lower class, those able to read the national language, primary school graduates, and residents of apartments with elevators.
    9. Personal tastes: On the whole, both the first-onset group and the recurrent group did not smoke or drink alcohol or coffee, while those in the recurrent group drank, on the whole, one cup of distilled liquor and three or more cups of coffee per day. In usual food preferences, the first-onset group liked vegetables, and the recurrent group strongly liked vegetables, fish, and meat.
    10. Toilet and dining table: The first-onset group mostly used seated, flush toilets, and the recurrent group used seated and conventional-type toilets. Portable dining tables were common in both the first-onset group and the recurrent group (66.9% and 6.3%, respectively).
    11. Stress: Feeling arthritis symptoms more severely under stress was high in the recurrent group (78.6%), and "don't know" was high in the first-onset group (61.5%). For the usual level of stress, generally high stress was high in the recurrent group (72.4%), and generally low stress reached 60.0%.
    12. Symptoms and degree of impairment: The recurrent group was higher on each variable. Within the recurrent group, the figures were: by symptom, strongly feeling that the joint is stiff, 71.6%; by treatment before the visit, Chinese medicine therapy (physical therapy), 84.4%, and Chinese medicine therapy (acupuncture, moxibustion, Chinese medicine), 73.2%; by satisfaction with treatment, "average", 72.3%; by cause of arthritis, not recovering one's health after childbirth, 68.5%, and incomplete recovery of an injured joint, 62.8%; by exercise before onset, generally exercising a great deal, 74.2%; and by subjective impairment, being almost unable to act, 66.7%, and being able to act but with some difficulty, 66.3%.
    13. Correlations among arthritis-related variables: Recurrence frequency was positively correlated with the usual level of stress and with exercise before onset, and negatively correlated with education, weight, and coffee consumption. Cigarette smoking was correlated with weight, height, and alcohol consumption. Based on these results, health professionals involved in formal education, public education, or program development should attend to the prevention of arthritis.

Intelligent Simulation of Three-Dimensional Forging Process (삼차원 단조공정의 지능적 시뮬레이션)

  • Lee, M.C.;Joun, M.S.
    • Proceedings of the Korean Society for Technology of Plasticity Conference
    • /
    • 2007.05a
    • /
    • pp.155-159
    • /
    • 2007
  • We conduct intelligent simulation of three-dimensional forging processes in this paper. A new remeshing technique is employed for this purpose. Not only the state variables including strain and strain-rate but also the geometrical features including die-material contact conditions and the characteristic lines or surfaces are taken into account during remeshing. The presented approach is applied to the Baden-Baden benchmark test example and its influence on the simulated results is discussed particularly in terms of the deformed shape with emphasis on the characteristic line.

A Selection of Optimal EEG Channel for Emotion Analysis According to Music Listening using Stochastic Variables (확률변수를 이용한 음악에 따른 감정분석에의 최적 EEG 채널 선택)

  • Byun, Sung-Woo;Lee, So-Min;Lee, Seok-Pil
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.62 no.11
    • /
    • pp.1598-1603
    • /
    • 2013
  • Recently, research on the relationship between emotional state and musical stimuli has been increasing. In many previous works, data sets from all extracted channels are used for pattern classification, but these methods suffer from computational complexity and inaccuracy. This paper proposes selecting the optimal EEG channel that efficiently reflects the emotional state during music listening by analyzing stochastic feature vectors. This makes EEG pattern classification relatively simple by reducing the number of data sets to process.

Analysis of Feature Variables for Breast Cancer Diagnosis

  • Jung, Yong Gyu;Kim, Jang Il;Sihn, Sung Chul;Heo, Jun
    • International journal of advanced smart convergence
    • /
    • v.2 no.2
    • /
    • pp.36-39
    • /
    • 2013
  • Diagnosis is becoming more important as health information grows and the number of diagnosed cancer patients gradually increases over time. Among the various types of cancer, we focus on breast cancer diagnosis. The accuracy of breast cancer diagnosis increases when the diagnosis is based on evidence and statistics. To this end, we use the Weka data mining tool and analysis algorithms closely associated with decision tree rules. In addition, data preprocessing and cross-validation are used to increase the reliability of the results. The number of cases and the causes of the disease become important for supporting evidence-based medical practice. In evidence-based medicine, data obtained from past patients are used to calculate probabilities for future patients, in order to diagnose and predict disease and plan treatment. This can play an important role in improving the survival rate.
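The combination of a decision tree with cross-validation can be sketched in a few lines. This is not Weka and not the authors' setup: the classifier below is a depth-1 tree (a decision stump) on one feature, and the feature values and labels are hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def stump_predict(x, threshold):
    """Depth-1 decision tree: predict class 1 when x exceeds the threshold."""
    return 1 if x > threshold else 0

def fit_stump(xs, ys):
    """Choose the threshold with the fewest training errors."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum(1 for x, y in zip(xs, ys) if stump_predict(x, t) != y)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def cross_validate(xs, ys, k):
    """Mean held-out accuracy over k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        t = fit_stump([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        correct = sum(1 for i in test_idx if stump_predict(xs[i], t) == ys[i])
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k

# Hypothetical single feature (e.g. a scaled measurement) and binary labels.
feature = [2.1, 7.5, 3.0, 8.2, 1.5, 6.9, 2.8, 9.1, 3.3, 7.0]
label   = [0,   1,   0,   1,   0,   1,   0,   1,   0,   1]
mean_accuracy = cross_validate(feature, label, k=5)
```

Each fold is scored by a model that never saw it during fitting, so the averaged accuracy is a more reliable estimate than accuracy measured on the training data itself.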