• Title/Summary/Keyword: automatic machine learning

Search Result 302, Processing Time 0.037 seconds

Wellness Prediction in Diabetes Mellitus Risks Via Machine Learning Classifiers

  • Saravanakumar M, Venkatesh;Sabibullah, M.
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.4
    • /
    • pp.203-208
    • /
    • 2022
  • The occurrence of Type 2 Diabetes Mellitus (T2DM) is hoarding globally. All kinds of Diabetes Mellitus is controlled to disrupt over 415 million grownups worldwide. It was the seventh prime cause of demise widespread with a measured 1.6 million deaths right prompted by diabetes during 2016. Over 90% of diabetes cases are T2DM, with the utmost persons having at smallest one other chronic condition in UK. In valuation of contemporary applications of Big Data (BD) to Diabetes Medicare by sighted its upcoming abilities, it is compulsory to transmit out a bottomless revision over foremost theoretical literatures. The long-term growth in medicine and, in explicit, in the field of "Diabetology", is powerfully encroached to a sequence of differences and inventions. The medical and healthcare data from varied bases like analysis and treatment tactics which assistances healthcare workers to guess the actual perceptions about the development of Diabetes Medicare measures accessible by them. Apache Spark extracts "Resilient Distributed Dataset (RDD)", a vital data structure distributed finished a cluster on machines. Machine Learning (ML) deals a note-worthy method for building elegant and automatic algorithms. ML library involving of communal ML algorithms like Support Vector Classification and Random Forest are investigated in this projected work by using Jupiter Notebook - Python code, where significant quantity of result (Accuracy) is carried out by the models.

Compositional Feature Selection and Its Effects on Bandgap Prediction by Machine Learning (기계학습을 이용한 밴드갭 예측과 소재의 조성기반 특성인자의 효과)

  • Chunghee Nam
    • Korean Journal of Materials Research
    • /
    • v.33 no.4
    • /
    • pp.164-174
    • /
    • 2023
  • The bandgap characteristics of semiconductor materials are an important factor when utilizing semiconductor materials for various applications. In this study, based on data provided by AFLOW (Automatic-FLOW for Materials Discovery), the bandgap of a semiconductor material was predicted using only the material's compositional features. The compositional features were generated using the python module of 'Pymatgen' and 'Matminer'. Pearson's correlation coefficients (PCC) between the compositional features were calculated and those with a correlation coefficient value larger than 0.95 were removed in order to avoid overfitting. The bandgap prediction performance was compared using the metrics of R2 score and root-mean-squared error. By predicting the bandgap with randomforest and xgboost as representatives of the ensemble algorithm, it was found that xgboost gave better results after cross-validation and hyper-parameter tuning. To investigate the effect of compositional feature selection on the bandgap prediction of the machine learning model, the prediction performance was studied according to the number of features based on feature importance methods. It was found that there were no significant changes in prediction performance beyond the appropriate feature. Furthermore, artificial neural networks were employed to compare the prediction performance by adjusting the number of features guided by the PCC values, resulting in the best R2 score of 0.811. By comparing and analyzing the bandgap distribution and prediction performance according to the material group containing specific elements (F, N, Yb, Eu, Zn, B, Si, Ge, Fe Al), various information for material design was obtained.

Application of machine learning for merging multiple satellite precipitation products

  • Van, Giang Nguyen;Jung, Sungho;Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2021.06a
    • /
    • pp.134-134
    • /
    • 2021
  • Precipitation is a crucial component of water cycle and play a key role in hydrological processes. Traditionally, gauge-based precipitation is the main method to achieve high accuracy of rainfall estimation, but its distribution is sparsely in mountainous areas. Recently, satellite-based precipitation products (SPPs) provide grid-based precipitation with spatio-temporal variability, but SPPs contain a lot of uncertainty in estimated precipitation, and the spatial resolution quite coarse. To overcome these limitations, this study aims to generate new grid-based daily precipitation using Automatic weather system (AWS) in Korea and multiple SPPs(i.e. CHIRPSv2, CMORPH, GSMaP, TRMMv7) during the period of 2003-2017. And this study used a machine learning based Random Forest (RF) model for generating new merging precipitation. In addition, several statistical linear merging methods are used to compare with the results of the RF model. In order to investigate the efficiency of RF, observed data from 64 observed Automated Synoptic Observation System (ASOS) were collected to evaluate the accuracy of the products through Kling-Gupta efficiency (KGE), probability of detection (POD), false alarm rate (FAR), and critical success index (CSI). As a result, the new precipitation generated through the random forest model showed higher accuracy than each satellite rainfall product and spatio-temporal variability was better reflected than other statistical merging methods. Therefore, a random forest-based ensemble satellite precipitation product can be efficiently used for hydrological simulations in ungauged basins such as the Mekong River.

  • PDF

Feature Engineering and Evaluation for Android Malware Detection Scheme

  • Jaemin Jung;Jihyeon Park;Seong-je Cho;Sangchul Han;Minkyu Park;Hsin-Hung Cho
    • Journal of Internet Technology
    • /
    • v.22 no.2
    • /
    • pp.423-439
    • /
    • 2021
  • Android is one of the most popular platforms for the mobile and Internet of Things (IoT) devices. This popularity has made Android-based devices a valuable target of malicious apps. Thus, it is essential to devise automatic and portable malware detection approaches for the Android platform. There are many studies on detecting mobile malware using machine learning techniques. In these studies, however, the dataset is imbalanced or is not large enough to generalize the machine learning model, or the dimensionality of features is too high to apply nonlinear classifiers. In this article, we propose a machine learning-based Android malware detection scheme that uses API calls and permissions as features. To restrict the dimensionality of features, we propose minimal domain knowledge-based and Gini importance-based feature selection. We construct large and balanced real-world datasets to build a generalized and non-skewed model and verify our model through experiments. We achieve 96.51% classification accuracy using Random Forest classifier with low overhead. In addition, we also provide an analysis on falsely classified samples in detail. The analysis results show that API hiding can degrade the performance of API call information-based malware detection systems.

Structural review of the intelligent online judge system (지능형 온라인 평가 시스템의 구조적 고찰)

  • Lim, Isaac;Cho, Minwoo;Lee, Jisu;Jang, Jiwon;Choi, Jiyoung;Jung, Heokyung
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.10a
    • /
    • pp.499-501
    • /
    • 2021
  • Recently, artificial intelligence and SW have occupied an important position worldwide as the foundation technology of the era of the 4th industrial revolution, and web browser-based programming learning systems are becoming common due to changes in the learning environment caused by COVID-19. In accordance with this trend, this paper proposes a functionally scalable microservice-based system structure for an online evaluation system as a tool for learning algorithms that are the basis of artificial intelligence and SW. In addition, a functional structure for applying machine learning to automatic evaluation functions under the proposed system structure is also proposed.

  • PDF

Combining Multiple Classifiers for Automatic Classification of Email Documents (전자우편 문서의 자동분류를 위한 다중 분류기 결합)

  • Lee, Jae-Haeng;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.3
    • /
    • pp.192-201
    • /
    • 2002
  • Automated text classification is considered as an important method to manage and process a huge amount of documents in digital forms that are widespread and continuously increasing. Recently, text classification has been addressed with machine learning technologies such as k-nearest neighbor, decision tree, support vector machine and neural networks. However, only few investigations in text classification are studied on real problems but on well-organized text corpus, and do not show their usefulness. This paper proposes and analyzes text classification methods for a real application, email document classification task. First, we propose a combining method of multiple neural networks that improves the performance through the combinations with maximum and neural networks. Second, we present another strategy of combining multiple machine learning classifiers. Voting, Borda count and neural networks improve the overall classification performance. Experimental results show the usefulness of the proposed methods for a real application domain, yielding more than 90% precision rates.

Joint streaming model for backchannel prediction and automatic speech recognition

  • Yong-Seok Choi;Jeong-Uk Bang;Seung Hi Kim
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.118-126
    • /
    • 2024
  • In human conversations, listeners often utilize brief backchannels such as "uh-huh" or "yeah." Timely backchannels are crucial to understanding and increasing trust among conversational partners. In human-machine conversation systems, users can engage in natural conversations when a conversational agent generates backchannels like a human listener. We propose a method that simultaneously predicts backchannels and recognizes speech in real time. We use a streaming transformer and adopt multitask learning for concurrent backchannel prediction and speech recognition. The experimental results demonstrate the superior performance of our method compared with previous works while maintaining a similar single-task speech recognition performance. Owing to the extremely imbalanced training data distribution, the single-task backchannel prediction model fails to predict any of the backchannel categories, and the proposed multitask approach substantially enhances the backchannel prediction performance. Notably, in the streaming prediction scenario, the performance of backchannel prediction improves by up to 18.7% compared with existing methods.

Smart Home Service System Considering Indoor and Outdoor Environment and User Behavior (실내외 환경과 사용자의 행동을 고려한 스마트 홈 서비스 시스템)

  • Kim, Jae-Jung;Kim, Chang-Bok
    • Journal of Advanced Navigation Technology
    • /
    • v.23 no.5
    • /
    • pp.473-480
    • /
    • 2019
  • The smart home is a technology that can monitor and control by connecting everything to a communication network in various fields such as home appliances, energy consumers, and security devices. The Smart home is developing not only automatic control but also learning situation and user's taste and providing the result accordingly. This paper proposes a model that can provide a comfortable indoor environment control service for the user's characteristics by detecting the user's behavior as well as the automatic remote control service. The whole system consists of ESP 8266 with sensor and Wi-Fi, Firebase as a real-time database, and a smartphone application. This model is divided into functions such as learning mode when the home appliance is operated, learning control through learning results, and automatic ventilation using indoor and outdoor sensor values. The study used moving averages for temperature and humidity in the control of home appliances such as air conditioners, humidifiers and air purifiers. This system can provide higher quality service by analyzing and predicting user's characteristics through various machine learning and deep learning.

Machine Learning-Based Malicious URL Detection Technique (머신러닝 기반 악성 URL 탐지 기법)

  • Han, Chae-rim;Yun, Su-hyun;Han, Myeong-jin;Lee, Il-Gu
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.3
    • /
    • pp.555-564
    • /
    • 2022
  • Recently, cyberattacks are using hacking techniques utilizing intelligent and advanced malicious codes for non-face-to-face environments such as telecommuting, telemedicine, and automatic industrial facilities, and the damage is increasing. Traditional information protection systems, such as anti-virus, are a method of detecting known malicious URLs based on signature patterns, so unknown malicious URLs cannot be detected. In addition, the conventional static analysis-based malicious URL detection method is vulnerable to dynamic loading and cryptographic attacks. This study proposes a technique for efficiently detecting malicious URLs by dynamically learning malicious URL data. In the proposed detection technique, malicious codes are classified using machine learning-based feature selection algorithms, and the accuracy is improved by removing obfuscation elements after preprocessing using Weighted Euclidean Distance(WED). According to the experimental results, the proposed machine learning-based malicious URL detection technique shows an accuracy of 89.17%, which is improved by 2.82% compared to the conventional method.

A Deep Convolutional Neural Network with Batch Normalization Approach for Plant Disease Detection

  • Albogamy, Fahad R.
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.9
    • /
    • pp.51-62
    • /
    • 2021
  • Plant disease is one of the issues that can create losses in the production and economy of the agricultural sector. Early detection of this disease for finding solutions and treatments is still a challenge in the sustainable agriculture field. Currently, image processing techniques and machine learning methods have been applied to detect plant diseases successfully. However, the effectiveness of these methods still needs to be improved, especially in multiclass plant diseases classification. In this paper, a convolutional neural network with a batch normalization-based deep learning approach for classifying plant diseases is used to develop an automatic diagnostic assistance system for leaf diseases. The significance of using deep learning technology is to make the system be end-to-end, automatic, accurate, less expensive, and more convenient to detect plant diseases from their leaves. For evaluating the proposed model, an experiment is conducted on a public dataset contains 20654 images with 15 plant diseases. The experimental validation results on 20% of the dataset showed that the model is able to classify the 15 plant diseases labels with 96.4% testing accuracy and 0.168 testing loss. These results confirmed the applicability and effectiveness of the proposed model for the plant disease detection task.