• Title/Summary/Keyword: Random Forest Classification

Crop Yield Estimation Utilizing Feature Selection Based on Graph Classification (그래프 분류 기반 특징 선택을 활용한 작물 수확량 예측)

  • Ohnmar Khin;Sung-Keun Lee
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1269-1276 / 2023
  • Crop yield estimation is essential for meeting global food demand, and it depends on many factors, such as soil, rainfall, climate, and atmosphere, and on the relations among them. Climate change affects agricultural yields. We use a dataset that includes temperature, rainfall, humidity, and other variables. This research focuses on feature selection with a variety of classifiers to assist farmers and agriculturalists. Crop yield estimation with the feature selection approach reaches 96% accuracy. Feature selection affects a machine learning model's performance. Additionally, the graph classifier reaches 81.5%. Finally, the random forest regressor without feature selection attains 78% accuracy, and the decision tree regressor without feature selection attains 67% accuracy. The contribution of this research is to report experimental results with and without feature selection for the ten proposed algorithms. These findings support learners and students in choosing appropriate models for crop classification studies.

Study on the ensemble methods with kernel ridge regression

  • Kim, Sun-Hwa;Cho, Dae-Hyeon;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.375-383 / 2012
  • The purpose of ensemble methods is to increase prediction accuracy by combining many classifiers. Recent studies have shown that random forests and forward stagewise regression achieve good accuracy in classification problems. However, they have large prediction errors near the separation boundary because they use decision trees as base learners. In this study, we use kernel ridge regression instead of decision trees in random forests and boosting. The usefulness of the proposed ensemble methods is shown by simulation results on the prostate cancer and Boston housing data.

Medical Image Classification and Keyword Annotation Using Combination of Random Forests and Relation Weight (Random Forests와 관계 가중치 결합을 이용한 의료 영상 분류 및 주석 자동 생성)

  • Lee, Ji-hyun;Kim, Seong-hoon;Ko, Byoung-chul;Nam, Jae-Yeal
    • Annual Conference of KIPS / 2010.11a / pp.596-598 / 2010
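  • This paper presents a method that classifies X-ray images, a type of medical image, and generates multiple keywords according to the classification results. Because most X-ray images are grayscale, Local Binary Patterns (LBP) are used to extract the relationships between pixels as features, and a Random Forests classifier, which supports real-time training and classification, assigns the images to 30 classes. In addition, predefined relation weights between body parts are combined with the classification scores to produce confidence values, and multiple annotations are assigned to each image on that basis. These multiple annotations enable keyword-based medical image retrieval and thus provide an easier and more efficient search environment.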
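
The LBP feature extraction mentioned in this abstract can be sketched as follows. This is a minimal illustration of the basic 3x3 LBP operator, not the authors' implementation: the neighbour ordering, the `>=` comparison, and the function names `lbp_code` and `lbp_histogram` are assumptions made for the example.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of the center pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner
    # (the ordering is an assumption; any fixed ordering works).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the center sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Build a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The resulting histogram is the kind of fixed-length feature vector that could then be passed to a Random Forests classifier.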

Analysis of Land Cover Changes Based on Classification Result Using PlanetScope Satellite Imagery

  • Yoon, Byunghyun;Choi, Jaewan
    • Korean Journal of Remote Sensing / v.34 no.4 / pp.671-680 / 2018
  • Compared to the imagery produced by traditional satellites, PlanetScope satellite imagery has made it possible to easily capture remotely sensed imagery every day through dozens or even hundreds of satellites on a relatively small budget. This study aimed to detect changed areas and update a land cover map using a PlanetScope image. To generate a classification map, pixel-based Random Forest (RF) classification was performed using additional features, such as the Normalized Difference Water Index (NDWI) and the Normalized Difference Vegetation Index (NDVI). The classification result was converted to vector data and compared with the existing land cover map to estimate the changed area. To estimate the accuracy and trends of the changed area, the quantitative quality of the supervised classification result using the PlanetScope image was evaluated first. In addition, the patterns of the changed area that corresponded to the classification result were analyzed using the PlanetScope satellite image. Experimental results showed that the PlanetScope image can be used to effectively detect changed areas on large-scale land cover maps, and that supervised classification results can be used to update the changed areas.
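
The two index features named in this abstract can be computed per pixel as in the sketch below. It assumes four reflectance bands (blue, green, red, near-infrared) and the green/NIR form of NDWI; the abstract does not state which NDWI variant or band set the authors used, and the function names are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """NDWI in its green/NIR form (an assumption): (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)


def pixel_features(blue, green, red, nir):
    """Stack the raw bands with both indices into one feature vector per pixel."""
    return [blue, green, red, nir, ndvi(nir, red), ndwi(green, nir)]
```

Each feature vector would then serve as one training or prediction sample for a pixel-based classifier.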

Design of Fetal Health Classification Model for Hospital Operation Management (효율적인 병원보건관리를 위한 태아건강분류 모델)

  • Chun, Je-Ran
    • Journal of Digital Convergence / v.19 no.5 / pp.263-268 / 2021
  • The purpose of this study was to propose a model suited to the actual delivery system by designing a hospital operation management scheme for fetal delivery and a fetal health classification model. The number of deaths during childbirth is similar to the maternal mortality figure of 295,000 as of 2017, and 94% of those deaths are preventable. Therefore, in this paper, we propose a model that uses a random forest to predict the health condition of the fetus from data such as fetal heart rate, fetal movements, and uterine contractions extracted from cardiotocogram (CTG) tests. Even if the data are unbalanced, the proposed model ensures stable management of the fetal delivery health management system. To secure the accuracy of the system, we removed outliers by setting upper and lower thresholds based on the standard deviation. In addition, because the class labels represent the health status of the fetus, the minority classes were replicated through data resampling to balance the classes. This yielded a 4-5% improvement, and the model reached an accuracy of 97.75%. The developed model is expected to contribute to preventing deaths, to effective fetal health management, and to disease prevention by accurately predicting fetal deaths and diseases in advance.
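
The two preprocessing steps described in this abstract, standard-deviation thresholds for outliers and replication of minority classes, might look like the sketch below. The threshold multiplier `k=3.0`, the dictionary-based row layout, and the function names are assumptions; the abstract gives no such details.

```python
import random
import statistics


def remove_outliers(rows, column, k=3.0):
    """Keep rows whose value in `column` lies within mean +/- k standard deviations."""
    values = [row[column] for row in rows]
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    low, high = mean - k * std, mean + k * std
    return [row for row in rows if low <= row[column] <= high]


def oversample(rows, label_key, seed=0):
    """Replicate minority-class rows at random until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw extra copies with replacement to fill the gap to the target size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced
```

Resampling of this kind is normally applied to the training split only, so that replicated rows do not leak into the evaluation data.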

Classification of Transport Vehicle Noise Events in Magnetotelluric Time Series Data in an Urban area Using Random Forest Techniques (Random Forest 기법을 이용한 도심지 MT 시계열 자료의 차량 잡음 분류)

  • Kwon, Hyoung-Seok;Ryu, Kyeongho;Sim, Ickhyeon;Lee, Choon-Ki;Oh, Seokhoon
    • Geophysics and Geophysical Exploration / v.23 no.4 / pp.230-242 / 2020
  • We performed a magnetotelluric (MT) survey to delineate the geological structures below a depth of 20 km in the Gyeongju area, where an earthquake with a magnitude of 5.8 occurred in September 2016. The measured MT data were severely distorted by electrical noise caused by subways, power lines, factories, houses, and farmlands, and by vehicle noise from passing trains and large trucks. Using machine-learning methods, we classified the MT time series data obtained near the railway and highway into two groups according to the inclusion of traffic noise. We applied three schemes, stochastic gradient descent, support vector machine, and random forest, to the time series data for the high-speed train noise. We formulated three datasets, Hx, Hy, and Hx & Hy, for the time series data of the large truck noise and applied the random forest method to each dataset. To evaluate the effect of removing the traffic noise, we compared the time series data, amplitude spectra, and apparent resistivity curves before and after removing the traffic noise from the time series data. We also examined the frequency range affected by traffic noise and whether artifact noise occurred during the traffic noise removal process as a result of the residual difference.

Classification of Diabetic Retinopathy using Mask R-CNN and Random Forest Method

  • Jung, Younghoon;Kim, Daewon
    • Journal of the Korea Society of Computer and Information / v.27 no.12 / pp.29-40 / 2022
  • In this paper, we studied a system that detects and analyzes the pathological features of diabetic retinopathy using Mask R-CNN, a deep learning technique, and a Random Forest classifier in order to diagnose diabetic retinopathy automatically. Diabetic retinopathy can be diagnosed through fundus images taken with special equipment. Brightness, color tone, and contrast may vary depending on the device. Artificial intelligence makes it possible to develop automatic diagnosis systems that help ophthalmologists make medical judgments. This system detects pathological features such as microvascular perfusion and retinal hemorrhage using the Mask R-CNN technique. It also diagnoses normal and abnormal conditions of the eye by using a Random Forest classifier after pre-processing. To improve the detection performance of the Mask R-CNN algorithm, image augmentation was performed and the model was trained. Dice similarity coefficients and mean accuracy were used as evaluation indicators to measure detection accuracy. The Faster R-CNN method was used as a baseline, and the Mask R-CNN method in this study achieved an average Dice coefficient of 90% and a mean accuracy of 91%. When diabetic retinopathy was diagnosed by training a Random Forest classifier on the detected pathological symptoms, the accuracy was 99%.
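
The Dice similarity coefficient used as an evaluation indicator here can be computed for two binary masks as in the sketch below. The nested-list mask layout, the function name, and the convention of returning 1.0 for two empty masks are assumptions made for the example.

```python
def dice_coefficient(predicted, reference):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = 0
    total = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            intersection += 1 if (p and r) else 0
            total += (1 if p else 0) + (1 if r else 0)
    # Two empty masks agree perfectly (a convention, not a universal rule).
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A value of 1.0 means the predicted and reference masks overlap exactly, and 0.0 means they share no foreground pixels.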

The Investigation of Employing Supervised Machine Learning Models to Predict Type 2 Diabetes Among Adults

  • Alhmiedat, Tareq;Alotaibi, Mohammed
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.9 / pp.2904-2926 / 2022
  • Currently, diabetes is the most common chronic disease in the world, affecting 23.7% of the population in the Kingdom of Saudi Arabia. Diabetes may cause lower-limb amputations, kidney failure, and blindness among adults. Therefore, diagnosing the disease in its early stages is essential in order to save human lives. With the revolution in technology, Artificial Intelligence (AI) could play a central role in the early prediction of diabetes by employing Machine Learning (ML) technology. In this paper, we developed a diagnosis system using machine learning models for the detection of type 2 diabetes among adults, adopting two different diabetes datasets, one for training and the other for testing, to analyze and enhance the prediction accuracy. This work offers enhanced classification accuracy as a result of employing several pre-processing methods before applying the ML models. According to the obtained results, the implemented Random Forest (RF) classifier offers the best classification accuracy, with a classification score of 98.95%.

Application of data mining and statistical measurement of agricultural high-quality development

  • Yan Zhou
    • Advances in nano research / v.14 no.3 / pp.225-234 / 2023
  • In this study, we aim to use big data resources and statistical analysis to obtain reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type, rainfall, and temperature data, as well as annual wheat production, are collected for a specific region. Using statistical methodology, the acquired data were cleaned to remove incomplete and defective records. Afterwards, using several machine learning classification methods, we tried to distinguish between different factors and their influence on the final crop yields. The efficacy of the machine learning methods is discussed by comparing the models' predictions using the correlation factor and the mean squared error between predicted and actual crop yield values. The results of the analysis show that machine learning methods predict crop yields with high accuracy. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.
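
The two comparison metrics named in this abstract can be sketched as below. The example assumes that "correlation factor" refers to the Pearson correlation coefficient, which the abstract does not confirm, and the function names are hypothetical.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(actual, predicted):
    """Pearson correlation between actual and predicted values (assumed metric)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # Undefined when either series is constant; callers should guard against that.
    return cov / math.sqrt(var_a * var_p)
```

The two metrics are complementary: predictions can correlate perfectly with the actual yields and still have a large mean squared error if they are systematically scaled or shifted.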

Classification of Network Traffic using Machine Learning for Software Defined Networks

  • Muhammad Shahzad Haroon;Husnain Mansoor
    • International Journal of Computer Science & Network Security / v.23 no.12 / pp.91-100 / 2023
  • As SDN devices and systems hit the market, security in SDN must be raised on the agenda. SDN has become an area of interest in both academia and industry. SDN promises many benefits, which attract IT managers and leading IT companies and motivate them to switch to SDN. Over the last three decades, network attacks have become more sophisticated and more complex to detect. The goal is to study how traffic information can be extracted from an SDN controller and open virtual switches (OVS) using SDN mechanisms. The testbed environment is created using the RYU controller and Mininet. The extracted information is then used to detect these attacks efficiently with a machine learning approach. A machine learning approach requires a dataset, but no public SDN-based dataset is currently available. In this paper, an SDN-based dataset is created that includes legitimate and non-legitimate traffic. Classification is divided into two categories: binary and multiclass. Traffic is classified with and without dimension reduction techniques such as PCA and LDA. Our approach achieves 98.58% accuracy using a random forest algorithm.