• Title/Summary/Keyword: Multi-Class Classification

Comparison Between Methods for Suitability Classification of Wild Edible Greens (산채류 재배적지 기준설정 방법 간의 비교 분석)

  • Hyun, Byung-Keun;Jung, Sug-Jae;Sonn, Yeon-Kyu;Park, Chan-Won;Zhang, Young-Seon;Song, Kwan-Cheol;Kim, Lee-Hyun;Choi, Eun-Young;Hong, Suk-Young;Kwon, Sun-Ik;Jang, Byoung-Choon
    • Korean Journal of Soil Science and Fertilizer
    • /
    • v.43 no.5
    • /
    • pp.696-704
    • /
    • 2010
  • The objective of this study was to compare two methods of land suitability classification for wild edible greens: the maximum limiting factor method (MLFM) and the multi-regression method (MRM). The investigation was carried out in the Pyeongchang, Hongcheon, Hoengseong, and Yanggu regions of Korea. The results showed that the factors relevant to classifying land suitability for wild edible green cultivation were land slope, altitude, soil morphology, gravel content, and so on. The soils best suited to wild edible greens were fine loamy (silty), of valley or fan morphology, well drained, on a B slope (2~7%), with an available soil depth greater than 100 cm and an altitude above 501 m. The contributions of soil properties to crop yield estimated by the multi-regression method were, in order, slope 0.30, altitude 0.22, soil morphology 0.13, drainage class 0.09, available soil depth 0.07, and soil texture 0.01. Using MLFM, the best suitable land covered 0.2% of the Hongcheon region of Gangwon province, suitable soil 15.0%, possible soil 16.7%, and low-productive soil 68.0%, whereas MRM gave 35.1%, 30.7%, 10.3%, and 23.9%, respectively. There was thus a large difference in suitable-soil area between the two methods. The choice of analysis method should therefore be considered carefully when classifying land suitability for wild edible green cultivation, and MRM appears preferable to MLFM for establishing land suitability criteria for this crop.
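
The following is an illustrative sketch (not the authors' code) of how the two rating rules differ: MLFM grades a parcel by its single worst factor, while MRM takes a weighted sum using the contribution weights reported above. The per-factor scores and the grade cut-offs are hypothetical.

```python
# Illustrative sketch contrasting MLFM and MRM land-suitability rating.
# Only the regression weights (slope 0.30, altitude 0.22, ...) come from
# the abstract; the parcel scores and class cut-offs are hypothetical.

FACTOR_WEIGHTS = {          # contribution to yield reported in the study
    "slope": 0.30,
    "altitude": 0.22,
    "soil_morphology": 0.13,
    "drainage": 0.09,
    "available_depth": 0.07,
    "texture": 0.01,
}

def mlfm_grade(scores: dict) -> float:
    """MLFM: the parcel is only as good as its most limiting factor."""
    return min(scores.values())

def mrm_grade(scores: dict) -> float:
    """MRM: weighted average of factor scores using the regression contributions."""
    total_w = sum(FACTOR_WEIGHTS.values())
    return sum(FACTOR_WEIGHTS[f] * scores[f] for f in FACTOR_WEIGHTS) / total_w

def classify(grade: float) -> str:
    # Hypothetical cut-offs for the four suitability classes.
    if grade >= 0.8:
        return "best suitable"
    if grade >= 0.6:
        return "suitable"
    if grade >= 0.4:
        return "possible"
    return "low productive"

parcel = {"slope": 0.9, "altitude": 0.8, "soil_morphology": 0.7,
          "drainage": 0.9, "available_depth": 0.3, "texture": 0.8}
print("MLFM:", classify(mlfm_grade(parcel)))   # dominated by the single worst factor
print("MRM: ", classify(mrm_grade(parcel)))    # weighted average is more forgiving
```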

A screening of Alzheimer's disease using basis synthesis by singular value decomposition from Raman spectra of platelet (혈소판 라만 스펙트럼에서 특이값 분해에 의한 기저 합성을 통한 알츠하이머병 검출)

  • Park, Aaron;Baek, Sung-June
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.14 no.5
    • /
    • pp.2393-2399
    • /
    • 2013
  • In this paper, we propose a method for screening Alzheimer's disease (AD) from platelet Raman spectra by synthesizing basis spectra obtained with singular value decomposition (SVD). Platelet Raman spectra from AD transgenic mice are preprocessed by denoising, background removal, and normalization. The column vectors of each data matrix are the Raman spectra of the AD and normal (NR) classes, respectively. Each matrix is factorized with the SVD algorithm, and the basis spectra of AD and NR are taken as 12 column vectors of each factorized matrix. Classification is performed by selecting the class that minimizes the root-mean-square error between the validation spectrum and its linear synthesis from the class basis spectra. In experiments involving 278 Raman spectra, the proposed method achieved a classification rate of about 97.6%, about 6.1 percentage points better than a multi-layer perceptron (MLP) applied to features extracted with principal component analysis (PCA). The results show that SVD basis spectra are well suited to the diagnosis of AD from platelet Raman spectra.
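
The classification rule described above is simple enough to sketch directly. The following minimal example, using synthetic spectra in place of the preprocessed platelet Raman data, builds a 12-vector SVD basis per class and assigns a test spectrum to the class whose basis reconstructs it with the smaller RMSE.

```python
# Minimal sketch of the SVD basis-synthesis classifier described above.
# Synthetic spectra stand in for the preprocessed platelet Raman data;
# 12 basis vectors per class follow the abstract.
import numpy as np

rng = np.random.default_rng(0)
n_points, n_train = 500, 100                      # spectrum length, spectra per class
ad_train = rng.normal(size=(n_points, n_train))   # columns = training spectra (class AD)
nr_train = rng.normal(size=(n_points, n_train))   # columns = training spectra (class NR)

def class_basis(spectra, k=12):
    """Left singular vectors of the class data matrix serve as its basis spectra."""
    u, _, _ = np.linalg.svd(spectra, full_matrices=False)
    return u[:, :k]

def rmse_to_basis(spectrum, basis):
    """RMSE between the spectrum and its best linear synthesis from the basis."""
    coeffs, *_ = np.linalg.lstsq(basis, spectrum, rcond=None)
    residual = spectrum - basis @ coeffs
    return np.sqrt(np.mean(residual ** 2))

bases = {"AD": class_basis(ad_train), "NR": class_basis(nr_train)}
test_spectrum = rng.normal(size=n_points)
predicted = min(bases, key=lambda c: rmse_to_basis(test_spectrum, bases[c]))
print("predicted class:", predicted)
```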

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.1
    • /
    • pp.95-108
    • /
    • 2017
  • Recently AlphaGo, Google DeepMind's Baduk (Go) artificial intelligence program, won a decisive victory over Lee Sedol. Many people had expected that a machine could not beat a human at Go because, unlike chess, the number of possible move sequences exceeds the number of atoms in the universe, but the result was the opposite of what was predicted. After the match, artificial intelligence came into focus as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning, the core artificial intelligence technique used in the AlphaGo algorithm, has drawn much attention. Deep learning is already being applied to many problems and shows especially good performance in image recognition. It also performs well on high-dimensional data such as speech, images, and natural language, where existing machine learning techniques struggled. In contrast, it is hard to find deep learning research on traditional business data and structured data analysis. In this study, we examined whether deep learning techniques, which have so far been studied mainly for recognizing high-dimensional data, can also be used for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction, and we compared their performance with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. They include input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable recording whether the customer opened an account. To evaluate the applicability of deep learning to binary classification, we compared models built with the CNN and LSTM algorithms and the dropout technique, which are widely used in deep learning, against MLP models, the traditional artificial neural network. Because not every network design alternative can be tested, the experiment was conducted under restricted settings for the number of hidden layers, the number of neurons per hidden layer, the number of output filters, and the conditions for applying dropout. The F1 score was used to evaluate the models, since it shows how well a model classifies the class of interest rather than overall accuracy. The deep learning techniques were applied as follows. The CNN algorithm recognizes features by reading values adjacent to a given value, but in business data the proximity of fields is usually irrelevant because the fields are independent. We therefore set the CNN filter size equal to the number of fields so that the whole record is learned at once, and added a hidden layer so that decisions can be based on the resulting features. For the model with two LSTM layers, the input direction of the second layer was reversed relative to the first layer to reduce the influence of field position.
For the dropout technique, neurons in each hidden layer were dropped with a probability of 0.5. The experimental results show that the model with the highest F1 score was the CNN model with dropout, followed by the MLP model with two hidden layers and dropout. Several findings emerged from the experiments. First, models using dropout make slightly more conservative predictions than those without it and generally classify better. Second, CNN models classify better than MLP models, which is interesting because CNNs performed well not only in the fields where their effectiveness is established but also in binary classification problems to which they have rarely been applied. Third, the LSTM algorithm appears unsuitable for binary classification problems because its training time is too long relative to the performance improvement. From these results, we confirm that some deep learning algorithms can be applied to business binary classification problems.
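
A hedged sketch of the two best-performing model types described above (a CNN with a record-wide filter plus dropout, and an MLP with two hidden layers plus dropout) is given below in tf.keras; the paper does not state its framework, and the feature count, layer sizes, and data here are placeholders.

```python
# Sketch, not the authors' implementation: CNN whose filter spans all fields
# (so one filter reads the whole record at once) and an MLP baseline, both
# with dropout probability 0.5, evaluated with the F1 score.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.metrics import f1_score

n_features = 16
X = np.random.rand(1000, n_features).astype("float32")   # placeholder feature matrix
y = np.random.randint(0, 2, size=1000)                   # binary target (account opened or not)

cnn = models.Sequential([
    layers.Input(shape=(n_features, 1)),
    layers.Conv1D(filters=32, kernel_size=n_features, activation="relu"),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

mlp = models.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])

for name, model, data in [("cnn", cnn, X[..., None]), ("mlp", mlp, X)]:
    model.compile(optimizer="adam", loss="binary_crossentropy")
    model.fit(data, y, epochs=2, batch_size=64, verbose=0)
    preds = (model.predict(data, verbose=0) > 0.5).astype(int).ravel()
    print(name, "F1 (training data):", f1_score(y, preds))
```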

Clinical Manifestation and Treatment Outcome of Lupus Nephritis in Children (소아 루프스 신염의 임상양상 및 치료결과)

  • Park Jee-Min;Shin Jae-Il;Kim Pyung-Kil;Lee Jae-Seung
    • Childhood Kidney Diseases
    • /
    • v.6 no.2
    • /
    • pp.155-168
    • /
    • 2002
  • Purpose: Systemic lupus erythematosus (SLE) is an autoimmune disease with multi-system involvement, and renal damage is a major cause of morbidity and mortality in children. Renal involvement is more common and severe in children than in adults, so renal biopsy plays a crucial role in planning effective therapy. In this study, we investigated the clinical and pathological findings of lupus nephritis in children to aid clinical care of the disease. Methods: The clinical and pathological data of 40 patients diagnosed with SLE with renal involvement at Shinchon Severance Hospital from January 1990 to September 2002 were analyzed retrospectively. Results: The male-to-female ratio was 1:3 and the median age at diagnosis was 12.1 (2-18) years. FANA (95.0%), anti-dsDNA antibody (87.5%), and malar rash (80.0%) were the most common findings among the ARA classification criteria. Microscopic hematuria with proteinuria (75.0%), nephrotic syndrome (55.0%), and microscopic hematuria alone (15.0%) were the most common renal presentations at diagnosis, in that order. Renal biopsy confirmed WHO class IV lupus nephritis in 27 cases, and 3 cases showed pathological changes of WHO class type. Different treatment modalities were used: prednisolone only in 5 cases, prednisolone+azathioprine in 9 cases, prednisolone+azathioprine+intravenous cyclophosphamide in 14 cases, prednisolone+cyclosporine A+intravenous cyclophosphamide in 12 cases, plasma exchange in 9 cases, and intravenous gamma-globulin in 2 cases. The average follow-up period was 51.8±40.5 months, during which 4 patients died. The risk factors associated with mortality were male sex, WHO class IV nephritis, and acute renal failure at diagnosis. Conclusion: Renal involvement was noted in 63.5% of childhood SLE, and 67.5% of renal lesions were WHO class IV lupus nephritis, which is known to be associated with a poor prognosis. Aggressive treatment with immunosuppressants during the early stages of disease could therefore help improve the long-term prognosis, but careful attention should be given to optimizing treatment because of the unique problems of growth, psychosocial development, and gonadal toxicity in children.

Ensemble of Nested Dichotomies for Activity Recognition Using Accelerometer Data on Smartphone (Ensemble of Nested Dichotomies 기법을 이용한 스마트폰 가속도 센서 데이터 기반의 동작 인지)

  • Ha, Eu Tteum;Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.123-132
    • /
    • 2013
  • As smartphones are equipped with various sensors such as the accelerometer, GPS, gravity sensor, gyroscope, ambient light sensor, proximity sensor, and so on, there has been much research on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by welfare applications such as support for the elderly, measurement of calorie consumption, and analysis of lifestyles and exercise patterns. One challenge in using smartphone sensors for activity recognition is that the number of sensors should be minimized to save battery power. When the number of sensors is restricted, it is difficult to build a highly accurate activity recognizer, or classifier, because subtly different activities are hard to distinguish from such limited information, and the difficulty becomes especially severe when the number of activity classes is large. In this paper, we show that a fairly accurate classifier distinguishing ten different activities can be built using data from only a single sensor, the smartphone accelerometer. Our approach to this ten-class problem is the ensemble of nested dichotomies (END) method, which transforms a multi-class problem into multiple two-class problems. END builds a committee of binary classifiers in a nested fashion using a binary tree. At the root of the tree, the set of all classes is split into two subsets by a binary classifier; at each child node, a subset of classes is again split into two smaller subsets by another binary classifier. Continuing in this way, we obtain a binary tree in which each leaf node contains a single class. This tree can be viewed as a nested dichotomy that makes multi-class predictions. Depending on how the classes are split at each node, different trees are obtained, and because some classes may be correlated, a particular tree may perform better than others; however, the best tree can hardly be identified without deep domain knowledge. The END method copes with this problem by building multiple dichotomy trees randomly during learning and combining their predictions during classification. END is generally known to perform well even when the base learner cannot model complex decision boundaries. As the base classifier at each node of the dichotomy, we used another ensemble classifier, the random forest. A random forest is built by repeatedly generating decision trees, each with a different random subset of features and a bootstrap sample. By combining bagging with random feature-subset selection, a random forest enjoys more diverse ensemble members than simple bagging. Overall, our ensemble of nested dichotomies can be seen as a committee of committees of decision trees that handles a multi-class problem with high accuracy. The ten activity classes distinguished in this paper are 'Sitting', 'Standing', 'Walking', 'Running', 'Walking Uphill', 'Walking Downhill', 'Running Uphill', 'Running Downhill', 'Falling', and 'Hobbling'.
The features used to classify these activities include not only the magnitude of the acceleration vector at each time point but also the maximum, minimum, and standard deviation of the vector magnitude within a time window covering the last 2 seconds. For experiments comparing END with other methods, accelerometer data were collected every 0.1 second for 2 minutes per activity from 5 volunteers. Of the 5,900 (= 5 × (60 × 2 − 2) / 0.1) examples collected per activity (the data from the first 2 seconds are discarded because they lack a full time window), 4,700 were used for training and the rest for testing. Although 'Walking Uphill' is often confused with similar activities, END classified all ten activities with a fairly high accuracy of 98.4%, while a decision tree, a k-nearest-neighbor classifier, and a one-versus-rest support vector machine achieved 97.6%, 96.5%, and 97.6%, respectively.
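
A minimal sketch of the core ideas, with synthetic data in place of the collected accelerometer windows, is shown below: the window features described above and a single randomly built nested dichotomy with random-forest base classifiers. The full END method would build several such dichotomy trees and combine their predictions.

```python
# Illustrative sketch (not the authors' Weka-based setup): window features
# and one random nested dichotomy with random-forest base classifiers.
# An ensemble (END) would build several such trees with different random
# class splits and vote; the data here are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(magnitudes):
    """Features of one 2-second window of acceleration-vector magnitude."""
    m = np.asarray(magnitudes)
    return [m[-1], m.max(), m.min(), m.std()]

class NestedDichotomy:
    def __init__(self, rng):
        self.rng = rng

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        if len(self.classes_) == 1:
            return self
        # Randomly split the class set into two non-empty subsets.
        perm = self.rng.permutation(self.classes_)
        cut = self.rng.integers(1, len(perm))
        self.left_classes_ = set(perm[:cut])
        side = np.array([0 if c in self.left_classes_ else 1 for c in y])
        self.clf_ = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, side)
        left = side == 0
        self.left_ = NestedDichotomy(self.rng).fit(X[left], y[left])
        self.right_ = NestedDichotomy(self.rng).fit(X[~left], y[~left])
        return self

    def predict(self, X):
        if len(self.classes_) == 1:
            return np.full(len(X), self.classes_[0])
        side = self.clf_.predict(X)
        out = np.empty(len(X), dtype=self.classes_.dtype)
        if (side == 0).any():
            out[side == 0] = self.left_.predict(X[side == 0])
        if (side == 1).any():
            out[side == 1] = self.right_.predict(X[side == 1])
        return out

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))        # e.g. rows produced by window_features(...)
y = rng.integers(0, 10, size=500)    # ten activity classes
nd = NestedDichotomy(rng).fit(X, y)
print("predictions:", nd.predict(X[:5]))
```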

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.147-168
    • /
    • 2017
  • There have been many studies on accurate stock market forecasting in academia for a long time, and various forecasting models using a range of techniques now exist. Recently, many attempts have been made to predict the stock index with machine learning methods, including deep learning. Although both fundamental and technical analysis are used in traditional stock investment, technical analysis is more useful for short-term trading prediction and for applying statistical and mathematical techniques. Most studies using these technical indicators have modeled stock price prediction as a binary classification of future market movement (usually the next trading day) into rising or falling. However, such binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we predict the stock index by extending the existing binary scheme to a multi-class formulation of the stock index trend (upward, boxed, downward). This multi-class problem could be addressed with techniques such as multinomial logistic regression (MLOGIT), multiple discriminant analysis (MDA), or artificial neural networks (ANN); here we adopt multi-class support vector machines (MSVM), which have proved superior in prediction performance, and propose an optimization model that uses a genetic algorithm as a wrapper to improve the MSVM. In particular, the proposed model, named GA-MSVM, is designed to maximize performance by optimizing not only the kernel-function parameters of the MSVM but also the selection of input variables (feature selection) and of training instances (instance selection). To verify its performance, we applied the proposed method to real data. The results show that the proposed method is more effective than the conventional MSVM, which has been known to show the best prediction performance so far, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was confirmed to play a very important role in predicting the stock index trend, contributing more to the model's improvement than the other factors. To verify the usefulness of GA-MSVM, we applied it to forecasting the trend of Korea's KOSPI200 stock index. Our research is primarily aimed at predicting trend segments so as to capture signals or short-term trend transition points. The experimental data set includes technical indicators of the KOSPI200 index, such as price and volatility (2004-2017), and macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, the trend class, took three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data were used for training and the remaining 30% for validation. To verify the performance of the proposed model, comparative experiments with MDA, MLOGIT, CBR, ANN, and MSVM were conducted.
The MSVM adopted the one-against-one (OAO) approach, known to be the most accurate among the various MSVM formulations. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
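
A rough sketch of the GA-MSVM wrapper idea follows, with placeholder data and GA settings: a chromosome encodes a feature mask, an instance mask, and indices into kernel-parameter grids, and fitness is the validation accuracy of an RBF-kernel SVC (which handles multi-class problems one-against-one, as in the paper).

```python
# Illustrative GA wrapper for joint feature selection, instance selection and
# SVM parameter tuning. Data, population size and GA settings are placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 15))                       # 15 candidate indicators
y = rng.integers(-1, 2, size=300)                    # -1 down, 0 boxed, +1 up
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

C_GRID, GAMMA_GRID = [0.1, 1, 10, 100], [0.001, 0.01, 0.1, 1]
n_feat, n_inst = X_tr.shape[1], X_tr.shape[0]
chrom_len = n_feat + n_inst + 2                      # feature bits + instance bits + 2 param genes

def fitness(ch):
    feats = ch[:n_feat].astype(bool)
    insts = ch[n_feat:n_feat + n_inst].astype(bool)
    if feats.sum() == 0 or insts.sum() < 10:
        return 0.0
    C = C_GRID[ch[-2] % len(C_GRID)]
    gamma = GAMMA_GRID[ch[-1] % len(GAMMA_GRID)]
    clf = SVC(C=C, gamma=gamma, kernel="rbf")        # multi-class handled one-against-one
    clf.fit(X_tr[insts][:, feats], y_tr[insts])
    return clf.score(X_val[:, feats], y_val)

pop = rng.integers(0, 2, size=(20, chrom_len))
pop[:, -2:] = rng.integers(0, 4, size=(20, 2))       # parameter genes index the grids
for gen in range(10):
    scores = np.array([fitness(ch) for ch in pop])
    parents = pop[np.argsort(scores)[-10:]]          # truncation selection
    children = parents.copy()
    mask = rng.random(children.shape) < 0.5          # uniform crossover with shuffled partners
    children = np.where(mask, children, parents[rng.permutation(10)])
    flip = rng.random(children.shape) < 0.02         # bit-flip mutation on the binary genes
    children[:, :-2] ^= flip[:, :-2].astype(children.dtype)
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ch) for ch in pop])]
print("best validation accuracy:", fitness(best))
```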

Contactless User Identification System using Multi-channel Palm Images Facilitated by Triple Attention U-Net and CNN Classifier Ensemble Models

  • Kim, Inki;Kim, Beomjun;Woo, Sunghee;Gwak, Jeonghwan
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.3
    • /
    • pp.33-43
    • /
    • 2022
  • In this paper, we propose an ensemble model that uses multi-channel palm images, attention U-Net models, and pretrained convolutional neural networks (CNNs) to build a contactless palm-based user identification system with conventional, inexpensive camera sensors. Attention U-Net models extract the regions of interest, including the hand (with fingers), the palm (without fingers), and the palm lines, which are combined into three channels fed into the ensemble classifier. The proposed palm-information-based user identification system then predicts the user class with a classifier ensemble of three outperforming pretrained CNN models. The proposed model achieves a classification accuracy, precision, recall, and F1-score of 98.60%, 98.61%, 98.61%, and 98.61%, respectively, indicating that it is effective even with very inexpensive image sensors. We believe that, under the COVID-19 pandemic, the proposed contactless palm-based user identification system can be a safe and reliable alternative to the currently dominant contact-based systems.
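
A minimal sketch of the ensemble stage is given below, assuming the three attention U-Net outputs (hand, palm, and palm-line masks) are already available; the three pretrained backbones chosen here are placeholders, since the paper only states that three outperforming pretrained CNNs were combined by ensemble.

```python
# Sketch only: stack three U-Net-derived masks as image channels and soft-vote
# over three CNN branches. Backbone choice and user count are assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, applications

N_USERS, IMG = 50, 224

def build_branch(backbone_fn, name):
    base = backbone_fn(include_top=False, weights=None,   # weights="imagenet" in practice
                       input_shape=(IMG, IMG, 3), pooling="avg")
    return models.Sequential([base, layers.Dense(N_USERS, activation="softmax")], name=name)

branches = [
    build_branch(applications.ResNet50, "resnet50"),
    build_branch(applications.DenseNet121, "densenet121"),
    build_branch(applications.EfficientNetB0, "efficientnetb0"),
]

# Stack the three U-Net masks as the three image channels (placeholder arrays).
hand_mask = np.random.rand(1, IMG, IMG)
palm_mask = np.random.rand(1, IMG, IMG)
line_mask = np.random.rand(1, IMG, IMG)
x = np.stack([hand_mask, palm_mask, line_mask], axis=-1).astype("float32")

# Soft-voting ensemble: average the class probabilities of the three branches.
probs = np.mean([b.predict(x, verbose=0) for b in branches], axis=0)
print("predicted user id:", int(np.argmax(probs, axis=1)[0]))
```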

Research on ITB Contract Terms Classification Model for Risk Management in EPC Projects: Deep Learning-Based PLM Ensemble Techniques (EPC 프로젝트의 위험 관리를 위한 ITB 문서 조항 분류 모델 연구: 딥러닝 기반 PLM 앙상블 기법 활용)

  • Hyunsang Lee;Wonseok Lee;Bogeun Jo;Heejun Lee;Sangjin Oh;Sangwoo You;Maru Nam;Hyunsik Lee
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.11
    • /
    • pp.471-480
    • /
    • 2023
  • Construction order volume in South Korea grew significantly, from 91.3 trillion won in public orders in 2013 to a total of 212 trillion won in 2021, with particularly strong growth in the private sector. As the domestic and overseas markets grew, EPC (Engineering, Procurement, Construction) projects became larger and more complex, and risk management of project management and ITB (Invitation to Bid) documents became a critical issue. The time granted to construction companies for bid review after an EPC project is announced is limited, and reviewing all risk terms in the ITB document is extremely challenging because of manpower and cost constraints. Previous research attempted to categorize risk terms in EPC contract documents and detect them with AI, but data-related problems such as limited labeled data and class imbalance restricted practical use. This study therefore aims to develop an AI model that classifies contract terms in detail according to the FIDIC (Federation Internationale des Ingenieurs-Conseils) Yellow Book 2017 contract terms standard, rather than defining and classifying risk terms as in previous research. A multi-class text classification capability is necessary because the contract terms that need detailed review vary with the scale and type of the project. To improve the performance of the classification model, we developed an ELECTRA PLM (pre-trained language model) capable of learning the context of the text data efficiently from the pre-training stage, and conducted a four-step experiment to validate its performance. As a result, an ensemble of the self-developed ITB-ELECTRA model and Legal-BERT achieved the best performance, with a weighted average F1 score of 76% in classifying 57 contract terms.
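
As an illustration only, the ensembling step might look like the following sketch with Hugging Face Transformers: class probabilities from two fine-tuned sequence classifiers are averaged over the 57 contract-term classes. The authors' ITB-ELECTRA checkpoint is not public, so a generic ELECTRA discriminator stands in for it, and both models would first need fine-tuning on labelled ITB clauses.

```python
# Hedged sketch of a two-PLM soft-voting ensemble over 57 contract-term classes.
# The checkpoints and example clause are stand-ins, not the paper's artifacts.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_CLASSES = 57
checkpoints = [
    "google/electra-base-discriminator",     # stand-in for the authors' ITB-ELECTRA
    "nlpaueb/legal-bert-base-uncased",       # Legal-BERT
]
ensemble = [
    (AutoModelForSequenceClassification.from_pretrained(c, num_labels=NUM_CLASSES),
     AutoTokenizer.from_pretrained(c))
    for c in checkpoints
]

clause = "The Contractor shall indemnify the Employer against all third-party claims."

probs = []
for model, tokenizer in ensemble:
    model.eval()
    inputs = tokenizer(clause, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs.append(torch.softmax(logits, dim=-1))

ensemble_probs = torch.mean(torch.stack(probs), dim=0)   # soft voting over the two PLMs
print("predicted contract-term class:", int(ensemble_probs.argmax(dim=-1)))
```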

The Basic Data Analysis of Lupus Nephritis in Children (소아 루프스 신염에 대한 기초 조사)

  • Min Jae Hong;Paek Kyung Hoon;Park Kyung Mi;Kim Jung Sue;Ha Il Soo;Cheong Hae Il;Kim Joong Gon;Choi Yong
    • Childhood Kidney Diseases
    • /
    • v.3 no.1
    • /
    • pp.80-87
    • /
    • 1999
  • Purpose: Renal involvement is a potentially serious complication of systemic lupus erythematosus (SLE), yet there have been only a few studies of lupus nephritis in the pediatric age group. In this study, the clinical manifestations, pathologic findings, response to treatment, and clinical course of lupus nephritis in children were analyzed; the results will provide basic data for a future nation-wide prospective multi-center study. Methods: The medical records of 46 children clinically and pathologically diagnosed with lupus nephritis at Seoul National University Children's Hospital from 1986 to 1997 were analyzed retrospectively. Results: 1) The median age at diagnosis of lupus nephritis was 12.8 years (2 years to 15 years 8 months), and the sex ratio was 1:2.5. 2) FANA (85.7%), anti-ds-DNA antibody (78.0%), and malar rash (60.8%) were the most common findings among the ARA classification criteria, and decreased C3 was detected in 88.9% of patients. 3) Hematuria (87.0%) was the most common renal symptom, and WHO class IV lupus nephritis was identified in 41 cases by renal biopsy. 4) In most patients, disease activity was controlled relatively well with single or combined therapy of prednisolone, azathioprine, or cyclophosphamide, and the response showed no difference according to the mode of treatment. 5) Infection, especially with varicella-zoster virus and Candida, was the most common complication during the disease course. Conclusion: Renal involvement was noted in 87.0% of childhood SLE, and 89.1% of renal lesions were WHO class IV lupus nephritis, known to be associated with a poor long-term prognosis. Aggressive treatment with immunosuppressants early in the disease course may therefore help improve the long-term prognosis of lupus nephritis. A prospective multi-center study is needed to analyze the therapeutic efficacy of the various treatment modalities.

Research on Classification of Human Emotions Using EEG Signal (뇌파신호를 이용한 감정분류 연구)

  • Zubair, Muhammad;Kim, Jinsul;Yoon, Changwoo
    • Journal of Digital Contents Society
    • /
    • v.19 no.4
    • /
    • pp.821-827
    • /
    • 2018
  • Affective computing has gained increasing interest in recent years with the development of potential applications in human-computer interaction (HCI) and healthcare. Although considerable research has been done on human emotion recognition, physiological signals have received less attention than speech and facial expression. In this paper, electroencephalogram (EEG) signals from different brain regions were investigated using modified wavelet energy features. To minimize redundancy and maximize relevance among features, the mRMR algorithm was applied. EEG recordings from the publicly available DEAP database were used to classify four classes of emotions with a multi-class support vector machine. The proposed approach shows significant performance gains compared with existing algorithms.
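
A minimal sketch of such a pipeline appears below, with synthetic signals in place of the DEAP recordings: per-channel wavelet sub-band energies (via PyWavelets) as features, a mutual-information ranking standing in for mRMR (true mRMR additionally penalises redundancy among the selected features), and a multi-class SVM classifier.

```python
# Illustrative EEG emotion-classification pipeline: wavelet energy features,
# relevance-based feature selection, multi-class SVM. Data are synthetic.
import numpy as np
import pywt
from sklearn.feature_selection import mutual_info_classif
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 120, 32, 512
eeg = rng.normal(size=(n_trials, n_channels, n_samples))
labels = rng.integers(0, 4, size=n_trials)            # four emotion classes

def wavelet_energies(signal, wavelet="db4", level=4):
    """Relative energy of each wavelet sub-band of one channel."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    return energies / energies.sum()

# One feature vector per trial: sub-band energies of every channel.
X = np.array([np.concatenate([wavelet_energies(trial[ch]) for ch in range(n_channels)])
              for trial in eeg])

# Rank features by relevance to the labels and keep the top 30
# (a simple stand-in for the mRMR criterion used in the paper).
relevance = mutual_info_classif(X, labels, random_state=0)
top = np.argsort(relevance)[-30:]

clf = SVC(kernel="rbf", C=10)                         # one-against-one multi-class SVM
clf.fit(X[:, top], labels)
print("training accuracy:", clf.score(X[:, top], labels))
```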