• Title/Summary/Keyword: Random Forest Classification

Inhalation Configuration Detection for COVID-19 Patient Secluded Observing using Wearable IoTs Platform

  • Sulaiman Sulmi Almutairi;Rehmat Ullah;Qazi Zia Ullah;Habib Shah
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.6 / pp.1478-1499 / 2024
  • Coronavirus disease (COVID-19) is an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus. COVID-19 became an active epidemic disease as it spread around the globe. It spreads mainly through personal interaction and the transmission of droplets from coughing and sneezing. The spread can be minimized by isolating susceptible patients. However, isolation requires remote monitoring of the patient's breathing so that interactions, and therefore transmission, are kept to a minimum. Thus, in this article, we offer a wearable-IoT-centered framework for remote monitoring and recognition of breathing patterns and for abnormal breath detection, so that the required oxygen level can be provided in time. We propose breathing time-series data acquisition based on wearable accelerometer and gyroscope sensors, temporal feature extraction, and machine learning algorithms for pattern detection and abnormality identification. The sensors send the data over Bluetooth to a server for further processing and recognition. We collect six breathing patterns from twenty subjects, and each pattern is recorded for about five minutes. We compare the prediction accuracies of all machine learning models under study (i.e., Random Forest, Gradient Boosting Tree, Decision Tree, and K-Nearest Neighbor). Our results show that normal breathing and Bradypnea are the most correctly recognized breathing patterns. However, in some cases, the algorithms also recognize Kussmaul breathing well. Overall, the classification results of Random Forest and Gradient Boosting Tree are better than those of the other two algorithms.
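
The abstract above mentions temporal feature extraction from breathing time series but does not list the features used. The following is a minimal sketch of one plausible approach, assuming fixed-length overlapping windows and a few simple statistics. The function name `window_features`, the window sizes, and the synthetic signal are illustrative assumptions, not the authors' pipeline.

```python
import math
from statistics import mean, pstdev

def window_features(signal, fs, win_s, step_s):
    """Split a 1-D signal into overlapping windows and compute simple
    temporal features per window (hypothetical feature set)."""
    win = int(win_s * fs)    # samples per window
    step = int(step_s * fs)  # samples between window starts
    feats = []
    for start in range(0, len(signal) - win + 1, step):
        w = signal[start:start + win]
        m = mean(w)
        centered = [x - m for x in w]
        # Count upward crossings of the window mean: roughly one per breath.
        ups = sum(1 for a, b in zip(centered, centered[1:]) if a < 0 <= b)
        feats.append({
            "mean": m,
            "std": pstdev(w),
            "range": max(w) - min(w),
            "breaths_per_min": ups * 60.0 / win_s,
        })
    return feats

# Synthetic chest-motion signal (assumption): a 0.25 Hz sine, i.e.
# 15 breaths per minute, sampled at 20 Hz for 60 seconds.
fs = 20
signal = [math.sin(2 * math.pi * 0.25 * (n / fs) + 0.3) for n in range(60 * fs)]
features = window_features(signal, fs, win_s=20, step_s=10)
```

Each feature dictionary could then serve as one row of input to a classifier such as a random forest. A rate feature of this kind would help separate slow patterns such as Bradypnea from normal breathing, although real accelerometer and gyroscope data would need filtering first.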

Analysis of facial expression recognition (표정 분류 연구)

  • Son, Nayeong;Cho, Hyunsun;Lee, Sohyun;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.31 no.5 / pp.539-554 / 2018
  • Effective interaction between user and device is considered an important ability of IoT devices. For some applications, it is necessary to recognize human facial expressions in real time and make accurate judgments in order to respond to situations correctly. Therefore, much research on facial image analysis has been conducted in order to construct more accurate and faster recognition systems. In this study, we constructed an automatic recognition system for facial expressions in two steps: a facial recognition step and a classification step. We compared various models on different sets of data: pixel information, landmark coordinates, Euclidean distances among landmark points, and arctangent angles. We found a fast and efficient prediction model that uses only 30 principal components of face landmark information. We applied several prediction models, including linear discriminant analysis (LDA), random forests, support vector machine (SVM), and bagging; of these, the SVM model gives the best result. The LDA model gives the second-best prediction accuracy, but it can fit and predict data faster than SVM and the other methods. Finally, we compared our method to the Microsoft Azure Emotion API and a Convolutional Neural Network (CNN). Our method gives a very competitive result.
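
The landmark-based features named in the abstract above (Euclidean distances among landmark points and arctangent angles) can be sketched as follows. This is only an illustration: the abstract does not say which landmark pairs are used or how the angles are defined, so the sketch assumes all pairs and uses `math.atan2` on the coordinate differences. The function name and the three toy landmarks are hypothetical.

```python
import math
from itertools import combinations

def landmark_features(points):
    """Pairwise Euclidean distances and arctangent angles for a list of
    (x, y) landmark coordinates (all pairs, an assumption)."""
    distances, angles = [], []
    for (x1, y1), (x2, y2) in combinations(points, 2):
        dx, dy = x2 - x1, y2 - y1
        distances.append(math.hypot(dx, dy))  # straight-line distance
        angles.append(math.atan2(dy, dx))     # direction of the segment, radians
    return distances, angles

# Three toy landmarks forming a 3-4-5 right triangle.
landmarks = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
distances, angles = landmark_features(landmarks)
```

With a realistic number of landmarks, the resulting feature vectors become long, which is consistent with the abstract's use of principal components to reduce them before classification.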

A Risk Classification Based Approach for Android Malware Detection

  • Ye, Yilin;Wu, Lifa;Hong, Zheng;Huang, Kangyu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.2 / pp.959-981 / 2017
  • Existing Android malware detection approaches have mostly concentrated on superficial features such as requested or used permissions, which cannot reflect the essential differences between benign apps and malware. In this paper, we propose a quantitative calculation model of application risks based on the key observation that the essential differences between benign apps and malware actually lie in the way permissions are used, or rather the way their corresponding permission methods are used. Specifically, we perform a fine-grained analysis of Android application risks. We first classify application risks into five specific categories and then introduce a comprehensive risk, computed from those five, to describe the overall risk of an application. Given that users' risk preference and risk-bearing ability are naturally fuzzy, we design and implement a fuzzy logic system to calculate the comprehensive risk. On the basis of the quantitative calculation model, we propose a risk-classification-based approach for Android malware detection. The experiments show that our approach can achieve high accuracy with a low false positive rate using the RandomForest algorithm.

Developing an Intrusion Detection Framework for High-Speed Big Data Networks: A Comprehensive Approach

  • Siddique, Kamran;Akhtar, Zahid;Khan, Muhammad Ashfaq;Jung, Yong-Hwan;Kim, Yangwoo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.4021-4037 / 2018
  • In network intrusion detection research, two characteristics are generally considered vital to building efficient intrusion detection systems (IDSs): an optimal feature selection technique and robust classification schemes. However, the emergence of sophisticated network attacks and the advent of big data concepts in intrusion detection domains require two more significant aspects to be addressed: employing an appropriate big data computing framework and utilizing a contemporary dataset to deal with ongoing advancements. As such, we present a comprehensive approach to building an efficient IDS with the aim of strengthening academic anomaly detection research in real-world operational environments. The proposed system has the following four characteristics: (i) it performs optimal feature selection using information gain and branch-and-bound algorithms; (ii) it employs machine learning techniques for classification, namely, Logistic Regression, Naïve Bayes, and Random Forest; (iii) it introduces bulk synchronous parallel processing to handle the computational requirements of large-scale networks; and (iv) it utilizes a real-time contemporary dataset generated by the Information Security Centre of Excellence at the University of New Brunswick (ISCX-UNB) to validate its efficacy. Experimental analysis shows the effectiveness of the proposed framework, which is able to achieve high accuracy, low computational cost, and reduced false alarms.
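
Information gain, one of the two feature selection criteria named in the abstract above, can be illustrated with a short sketch. It computes the standard entropy-based information gain for categorical features and ranks them. The toy feature names and values are assumptions for illustration, and the branch-and-bound step mentioned in the abstract is not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy traffic records (hypothetical): two attacks, two normal flows.
labels = ["attack", "attack", "normal", "normal"]
features = {
    "protocol": ["tcp", "tcp", "udp", "udp"],  # separates the classes perfectly
    "flag": ["a", "b", "a", "b"],              # carries no class information
}
gains = {name: information_gain(values, labels) for name, values in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
```

Features with higher gain would be kept first. In a real intrusion detection dataset, continuous features would need to be discretized before this calculation applies.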

Financial Instruments Recommendation based on Classification Financial Consumer by Text Mining Techniques (비정형 데이터 분석을 통한 금융소비자 유형화 및 그에 따른 금융상품 추천 방법)

  • Lee, Jaewoong;Kim, Young-Sik;Kwon, Ohbyung
    • Journal of Information Technology Services / v.15 no.4 / pp.1-24 / 2016
  • With innovation in information technology, non-face-to-face robo-advisors offering high accessibility and convenience are spreading. Current robo-advisors recommend appropriate investment products after assessing investment propensity from structured data entered directly or indirectly by individuals. However, asking financial consumers to report or input their own subjective investment propensity is inconvenient and obtrusive. Hence, this study proposes a way to infer investment propensity from unstructured data that customers voluntarily disclose during consultations or online. Since prediction performance on unstructured documents differs according to the characteristics of the text, this study evaluates the prediction performance of various learning-based classification algorithms, selects the one best suited to the characteristics of text left by financial consumers, and proposes an intelligent method that automatically recommends investment products. User tests were conducted with MBA students. After being shown the recommended investment and the list of investment products, they were asked about their satisfaction. Financial consumers' satisfaction was measured separately for investment propensity and for recommended products. The results suggest that users were highly satisfied with the investment products recommended by the method proposed in this paper, and that the method can be applied to non-face-to-face robo-advisors.

Enhancing the Reliability of Wi-Fi Network Using Evil Twin AP Detection Method Based on Machine Learning

  • Seo, Jeonghoon;Cho, Chaeho;Won, Yoojae
    • Journal of Information Processing Systems / v.16 no.3 / pp.541-556 / 2020
  • Wireless networks have become integral to society as they provide mobility and scalability advantages. However, their disadvantage is that they cannot control the media, which makes them vulnerable to various types of attacks. One example of such attacks is the evil twin access point (AP) attack, in which an authorized AP is impersonated by mimicking its service set identifier (SSID) and media access control (MAC) address. Evil twin APs are a major source of deception in wireless networks, facilitating message forgery and eavesdropping. Hence, it is necessary to detect them rapidly. To this end, numerous methods using clock skew have been proposed for evil twin AP detection. However, clock skew is difficult to calculate precisely because wireless networks are vulnerable to noise. This paper proposes an evil twin AP detection method that uses a multiple-feature-based machine learning classification algorithm. The features used in the proposed method are clock skew, channel, received signal strength, and duration. The results of experiments conducted indicate that the proposed method has an evil twin AP detection accuracy of 100% using the random forest algorithm.
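
Clock skew, one of the four features listed in the abstract above, is commonly estimated as the slope of the drift between a sender's timestamps and the receiver's clock. The abstract does not state how the authors compute it, so the following is a sketch under that assumption, using an ordinary least-squares fit. The function name and the synthetic beacon data are hypothetical.

```python
def estimate_clock_skew_ppm(receive_times, beacon_timestamps):
    """Least-squares slope of the (beacon - receive) offset against
    receive time, expressed in parts per million."""
    offsets = [b - r for r, b in zip(receive_times, beacon_timestamps)]
    n = len(receive_times)
    mean_t = sum(receive_times) / n
    mean_o = sum(offsets) / n
    num = sum((t - mean_t) * (o - mean_o) for t, o in zip(receive_times, offsets))
    den = sum((t - mean_t) ** 2 for t in receive_times)
    return num / den * 1e6

# Synthetic beacons (assumption): the AP clock runs 50 ppm fast and
# starts with a constant 2 ms offset; times are in seconds.
receive = [float(i) for i in range(10)]
beacon = [t * (1 + 50e-6) + 0.002 for t in receive]
skew = estimate_clock_skew_ppm(receive, beacon)
```

Noise in real wireless captures makes this slope unstable, which matches the abstract's motivation for combining clock skew with channel, received signal strength, and duration rather than relying on it alone.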

A Review of the Methodology for Sophisticated Data Classification (정교한 데이터 분류를 위한 방법론의 고찰)

  • Kim, Seung Jae;Kim, Sung Hwan
    • Journal of Integrative Natural Science / v.14 no.1 / pp.27-34 / 2021
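  • Efforts to implement artificial intelligence (AI) are increasing worldwide. In implementing AI, the importance of data cannot be overlooked, including large volumes of data and the classification of data to suit the purpose. The technologies that generate and process such data include the Internet of Things (IoT) and big data analysis, which can be regarded as the driving force of the Fourth Industrial Revolution. These technologies are widely used at both the national and individual levels. In particular, there is a growing trend of attempts to present a future vision by applying big data analysis to data concentrated in a specific field, discovering new models, and using those models to infer and predict new values. Conclusions drawn from data analysis can vary considerably depending on the accuracy of the information the data contain, and such variation can lead to incorrect results. Data analysis therefore depends heavily on classifying data appropriately for the information they contain or for the purpose of the analysis. In addition, to obtain reliable and precise statistics from big data analysis, the analysis must take into account the meaning of each variable, the correlations among variables, and multicollinearity. In other words, before big data analysis, the data should be classified well to suit the purpose of the analysis. Accordingly, this review classifies data using decision tree (DT), random forest (RF), linear discriminant analysis (LDA), and quadratic discriminant analysis (QDA), which are classification analysis (CA) methods within the machine learning (ML) techniques used to implement AI, and then evaluates the degree of classification in order to seek ways to improve the classification rate of data.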

A Multi-category Task for Bitrate Interval Prediction with the Target Perceptual Quality

  • Yang, Zhenwei;Shen, Liquan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4476-4491 / 2021
  • Video service providers tend to face user network problems in the process of transmitting video streams. They strive to provide users with superior video quality in a limited-bitrate environment. It is therefore necessary to accurately determine the target bitrate range of a video under different quality requirements. Recently, several schemes have been proposed to meet this requirement. However, they do not take visual factors into account. In this paper, we propose a new multi-category model that uses machine learning to accurately predict the target bitrate range for a target visual quality. First, a dataset is constructed for generating multi-category models by machine learning. The quality score ladders and the corresponding bitrate-interval categories are defined in the dataset. Second, several types of spatial-temporal features related to VMAF evaluation metrics and visual factors are extracted and processed statistically for classification. Finally, bitrate prediction models trained on the dataset with a RandomForest classifier can be used to accurately predict the target bitrate of input videos for a target video quality. The classification prediction accuracy of the model reaches 0.705, and video encoded at the bitrate predicted by the model can achieve the target perceptual quality.

A Machine Learning-based Real-time Monitoring System for Classification of Elephant Flows on KOREN

  • Akbar, Waleed;Rivera, Javier J.D.;Ahmed, Khan T.;Muhammad, Afaq;Song, Wang-Cheol
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2801-2815 / 2022
  • With the advent and realization of the Software Defined Network (SDN) architecture, many organizations are now shifting towards this paradigm. SDN brings more control, higher scalability, and seamless elasticity. SDN changes the network configuration automatically according to the dynamic network requirements inside constrained environments. Therefore, a monitoring system that can monitor the physical and virtual entities is needed to operate this type of network technology efficiently. In this manuscript, we propose a real-time monitoring system for data collection and visualization that comprises Prometheus, node exporter, and Grafana. A node exporter is configured on the physical devices to collect resource utilization logs of the physical and virtual entities. A real-time Prometheus database is configured to collect and store the data from all the exporters. Furthermore, Grafana is connected to Prometheus to visualize the current network status and device provisioning. The monitoring system is deployed on the physical infrastructure of the KOREN topology. Data collected by the monitoring system is further pre-processed and restructured into a dataset. The monitoring system is further enhanced with machine learning techniques applied to the formatted datasets to identify elephant flows. Additionally, a Random Forest is trained on our generated labeled datasets, and the classification model's performance is verified using accuracy metrics.

Comprehensive System Framework for Visual Fatigue and Cognitive Performance Management based on Predictive Models

  • Dahyun JUNG;Hakpyeong KIM;Seunghoon JUNG;Hyuna KANG;Seungkeun YEOM;Juui KIM;Taehoon HONG
    • International conference on construction engineering and project management / 2024.07a / pp.988-995 / 2024
  • With the modern workplace's increasing dependence on computer-based tasks, traditional lighting standards have been identified as insufficient for optimal occupant comfort and productivity. Therefore, this paper presents a comprehensive system framework designed to manage visual fatigue and cognitive performance within office environments. Classification and regression models using gradient boosting machine and random forest to predict visual fatigue and cognitive performance were developed based on data collected from 16 subjects in experiments. To this end, the proposed system consists of two modules: the first module predicts visual fatigue and cognitive performance levels using classification models, offering immediate feedback to occupants. The second module, targeted at facility managers, uses regression models and a genetic algorithm to identify optimal lighting settings, aiming to minimize visual fatigue and enhance cognitive performance. This system can help to manage visual fatigue and cognitive performance simultaneously, contributing to improvement of eye health and productivity.