Search | Korea Science

A research on the key factors for classification of diabetes based on random forest

Shin, Yong sub;Lee, Namju;Hwang, Chigon
- International Journal of Internet, Broadcasting and Communication / v.12 no.3 / pp.102-107 / 2020
The number of people visiting hospitals because of diabetes has been increasing. According to the Korean Diabetes Association, 1 in 7 adults over the age of 30 has diabetes, making it one of the most common diseases today. In this paper, in addition to blood sugar, which is widely used to identify diabetes, we studied BMI, which is known to be related to diabetes, and triglycerides and cholesterol, which cause various complications in diabetics, using random forests and decision trees, both known to be effective for classification. The importance of each factor was confirmed using the classification results and the feature importances derived from the two techniques. Through this, we studied how BMI, triglycerides, and cholesterol relate to diabetes, as well as blood sugar, a factor to which diabetic patients should pay close attention.
https://doi.org/10.7236/IJIBC.2020.12.3.102

The Predictive QSAR Model for hERG Inhibitors Using Bayesian and Random Forest Classification Method

Kim, Jun-Hyoung;Chae, Chong-Hak;Kang, Shin-Myung;Lee, Joo-Yon;Lee, Gil-Nam;Hwang, Soon-Hee;Kang, Nam-Sook
- Bulletin of the Korean Chemical Society / v.32 no.4 / pp.1237-1240 / 2011
In this study, we developed a ligand-based in-silico prediction model that uses Bayesian and random forest modeling methods to classify chemical structures as hERG blockers. The models were built on patch clamp experimental results. The findings indicate that Laplacian-modified naive Bayesian classification with diverse selection is useful for predicting hERG inhibitors when a large data set is not available.
https://doi.org/10.5012/bkcs.2011.32.4.1237

Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem

Hwang, Wook-Yeon;Jun, Chi-Hyuck
- Industrial Engineering and Management Systems / v.13 no.4 / pp.421-431 / 2014
The market basket data in the form of a binary user-item matrix or a binary item-user matrix can be modelled as a binary classification problem. The binary logistic regression approach tackles the binary classification problem, where principal components are predictor variables. If users or items are sparse in the training data, the binary classification problem can be considered as a cold-start problem. The binary logistic regression approach may not function appropriately if the principal components are inefficient for the cold-start problem. Assuming that the market basket data can also be considered as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches: random forest regression, random forest classification, and elastic net to tackle the cold-start problem, comparing the performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.
https://doi.org/10.7232/iems.2014.13.4.421

Tree size determination for classification ensemble

Choi, Sung Hoon;Kim, Hyunjoong
- Journal of the Korean Data and Information Science Society / v.27 no.1 / pp.255-264 / 2016
Classification is predictive modeling for a categorical target variable. Classification ensemble methods, which predict with better accuracy by combining multiple classifiers, have become a powerful machine learning and data mining paradigm. Well-known classification ensemble methods are boosting, bagging, and random forest. In this article, we assume that decision trees are used as the classifiers in the ensemble, and we hypothesize that tree size affects classification accuracy. To study how tree size influences accuracy, we performed experiments using twenty-eight data sets. We then compare the performance of four ensemble algorithms (bagging, double-bagging, boosting, and random forest) with different tree sizes.
https://doi.org/10.7465/jkdi.2016.27.1.255

Simple hypotheses testing for the number of trees in a random forest

Park, Cheol-Yong
- Journal of the Korean Data and Information Science Society / v.21 no.2 / pp.371-377 / 2010
In this study, we propose two informal hypothesis tests that may be useful in determining the number of trees in a random forest used for classification. The first test declares a case 'easy' if the hypothesis that the probabilities of the two most popular classes are equal is rejected. The second test declares a case 'hard' if the hypothesis that the relative difference, or margin of victory, between the probabilities of the two most popular classes is greater than or equal to some small number, say 0.05, is rejected. We propose to continue generating trees until all (or all but a small fraction) of the training cases are declared easy or hard. The advantage of combining the second test with the first is that the number of trees required before stopping becomes much smaller than with the first test alone, where all (or all but a small fraction) of the training cases must be declared easy.
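The stopping rule described in this abstract can be sketched roughly as follows. The abstract does not give the test statistics, so this sketch assumes a one-sided normal approximation to the binomial distribution of the votes for the two most popular classes, and it reads "relative difference" as (p1 - p2) / (p1 + p2). The names `classify_case` and `can_stop`, the 5% significance level, and the `max_undecided_fraction` parameter are hypothetical and are not taken from the paper.

```python
import math

# One-sided 5% critical value of the standard normal (assumed significance level).
Z_ALPHA = 1.6449


def classify_case(n1, n2, delta=0.05, z_alpha=Z_ALPHA):
    """Label one training case as 'easy', 'hard', or 'undecided'.

    n1, n2: tree votes for the most and second most popular classes.
    delta:  the small margin threshold (0.05 in the abstract).
    """
    n = n1 + n2
    if n == 0:
        return "undecided"

    # Test 1 (assumed form): H0 is p1 == p2, so n1 ~ Binomial(n, 0.5).
    # Rejecting H0 in favour of p1 > p2 marks the case as easy.
    z_equal = (n1 - n2) / math.sqrt(n)
    if z_equal > z_alpha:
        return "easy"

    # Test 2 (assumed form): H0 is (p1 - p2) / (p1 + p2) >= delta,
    # i.e. the share of n1 among the two classes is at least (1 + delta) / 2.
    # Rejecting H0 means the two classes are nearly tied, so the case is hard.
    q0 = (1 + delta) / 2
    z_margin = (n1 - n * q0) / math.sqrt(n * q0 * (1 - q0))
    if z_margin < -z_alpha:
        return "hard"

    return "undecided"


def can_stop(vote_pairs, max_undecided_fraction=0.0, delta=0.05):
    """Return True when all but a small fraction of cases are easy or hard."""
    undecided = sum(
        1 for n1, n2 in vote_pairs
        if classify_case(n1, n2, delta) == "undecided"
    )
    return undecided <= max_undecided_fraction * len(vote_pairs)
```

In use, one would recompute the vote counts after each batch of new trees and call `can_stop` on the list of `(n1, n2)` pairs, growing the forest until it returns True.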

Basic Study on Safety Accident Prediction Model Using Random Forest in Construction Field (랜덤 포레스트 기법을 이용한 건설현장 안전재해 예측 모형 기초 연구)

Kang, Kyung-Su;Ryu, Han-Guk
- Proceedings of the Korean Institute of Building Construction Conference / 2018.11a / pp.59-60 / 2018
The purpose of this study is to predict and classify accident types based on KOSHA (Korea Occupational Safety & Health Agency) data and weather data. We also attempt to suggest important management methods for each accident type by deriving feature importance. We designed two models, one based on accident data and weather data (model (a)) and one based on weather data only (model (b)). With the random forest method, model (b) showed poor prediction accuracy, whereas model (a) produced more accurate predictions. We therefore presented a safety management plan based on these results. In future work, this study will be extended to real-time prediction of accident types, supplemented with real-time accident and weather data, to help prevent safety accidents.

Application and evaluation of machine-learning model for fire accelerant classification from GC-MS data of fire residue

Park, Chihyun;Park, Wooyong;Jeon, Sookyung;Lee, Sumin;Lee, Joon-Bae
- Analytical Science and Technology / v.34 no.5 / pp.231-239 / 2021
Detecting fire accelerants in fire residues is critical for determining whether a fire was arson or accidental. However, developing a standardized model for determining the presence or absence of fire accelerants has been difficult, because high temperatures cause components of the accelerants to disappear or combust. In this study, logistic regression, random forest, and support vector machine models were trained and evaluated on a total of 728 GC-MS analysis data sets obtained from actual fire residues. The mean classification accuracies of the three models were 63%, 81%, and 84%, respectively, and the mean AU-PR values were 0.68, 0.86, and 0.86, respectively, indicating good performance of the random forest and support vector machine models.
https://doi.org/10.5806/AST.2021.34.5.231
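For readers unfamiliar with the AU-PR figure reported in the entry above, the following is a minimal sketch of one common estimator of the area under the precision-recall curve, namely average precision. The abstract does not say how AU-PR was computed, so the choice of estimator is an assumption, and the function name `average_precision` is hypothetical. Ties between scores are broken by input order here, which a more careful implementation would handle explicitly.

```python
def average_precision(labels, scores):
    """Average precision: the mean of precision@k over the ranks k
    at which a positive example (label 1) appears.

    labels: iterable of 0/1 ground-truth labels.
    scores: iterable of classifier scores, higher meaning more positive.
    """
    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    total_positives = sum(1 for _, label in ranked if label == 1)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision among the top `rank` predictions.
            precision_sum += hits / rank

    return precision_sum / total_positives
```

A "mean AU-PR" such as the one quoted would then presumably be the average of this quantity over repeated train/test splits, though the abstract does not describe the evaluation protocol.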

Feature Extraction of Non-proliferative Diabetic Retinopathy Using Faster R-CNN and Automatic Severity Classification System Using Random Forest Method

Jung, Younghoon;Kim, Daewon
- Journal of Information Processing Systems / v.18 no.5 / pp.599-613 / 2022
Non-proliferative diabetic retinopathy is a representative complication in diabetic patients and is known to be a major cause of impaired vision and blindness. There has been ongoing research on the automatic detection of diabetic retinopathy; however, there is also a growing need for research on automatic severity classification. This study proposes an automatic detection system for pathological symptoms of diabetic retinopathy, such as microaneurysms, retinal hemorrhage, and hard exudate, by applying the Faster R-CNN technique. An automatic severity classification system was then devised by training and testing a Random Forest classifier on data obtained by preprocessing the detected features. In an experiment classifying 228 test fundus images, the proposed classification system achieved 97.8% accuracy.
https://doi.org/10.3745/JIPS.04.0252

A Real-Time Sound Recognition System with a Decision Logic of Random Forest for Robots (Random Forest를 결정로직으로 활용한 로봇의 실시간 음향인식 시스템 개발)

Song, Ju-man;Kim, Changmin;Kim, Minook;Park, Yongjin;Lee, Seoyoung;Son, Jungkwan
- The Journal of Korea Robotics Society / v.17 no.3 / pp.273-281 / 2022
In this paper, we propose a robot sound recognition system that detects various sound events in real time using a microphone on the robot. To achieve real-time performance, we use a VGG11 model, which includes several convolutional neural networks, with a real-time normalization scheme. The VGG11 model is trained on a database augmented across 24 environments (12 reverberation times and 2 signal-to-noise ratios). Additionally, a decision logic based on the random forest algorithm is designed to generate event signals for robot applications. For specific classes of acoustic events, this logic performs better than using the outputs of the network model alone. Experimental results demonstrate the performance of the proposed sound recognition system on a real-time device for robots.
https://doi.org/10.7746/jkros.2022.17.3.273

Random Forest Classifier-based Ship Type Prediction with Limited Ship Information of AIS and V-Pass

Jeon, Ho-Kun;Han, Jae Rim
- Korean Journal of Remote Sensing / v.38 no.4 / pp.435-446 / 2022
Identifying ship types is an important step in preventing illegal activities in territorial waters and in the assessment of marine traffic by Vessel Traffic Services Officers (VTSOs). However, over 50% of the vessels in the Terrestrial Automatic Identification System (T-AIS) data collected at the ground station lack ship type information. This study therefore proposes a method of identifying ship types with a Random Forest Classifier (RFC), using one year of dynamic and static AIS and V-Pass data from the Ulsan waters. Under the hypothesis that six features (speed, course, length, breadth, time, and location) make it possible to estimate the ship type, four classification models were generated depending on the availability of length or breadth information, since 81.9% of ships have both. The average accuracy was 96.4% with size information and 77.4% without it. The results show that the proposed method can be applied to identifying ship types.
https://doi.org/10.7780/kjrs.2022.38.4.10

Search Result 1,014, Processing Time 0.027 seconds
