• Title/Summary/Keyword: Automated machine learning

AutoFe-Sel: A Meta-learning based methodology for Recommending Feature Subset Selection Algorithms

  • Irfan Khan;Xianchao Zhang;Ramesh Kumar Ayyasam;Rahman Ali
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1773-1793 / 2023
  • Automated machine learning (AutoML) is the process of automating the time-consuming, iterative procedures involved in building machine learning models. Significant contributions have been made at several stages of a data-mining task, including model selection, hyper-parameter optimization, and preprocessing method selection. Among these, preprocessing method selection is a relatively new and fast-growing research area. The current work focuses on recommending one kind of preprocessing method, namely feature subset selection (FSS) algorithms. One limitation of existing studies on FSS algorithm recommendation is that they use a single learner for meta-modeling, which restricts the capability of the meta-model. Moreover, meta-modeling in existing studies is typically based on a single group of data characterization measures (DCMs). However, several complementary DCM groups exist, and combining them exploits their diversity and improves meta-modeling. This study addresses these limitations by proposing AutoFE-Sel, an architecture for preprocessing method selection that uses ensemble learning for meta-modeling. We evaluated the proposed method in extensive experiments involving 8 FSS algorithms, 3 groups of DCMs, and 125 datasets. The results show that the proposed method outperforms three baseline methods. The proposed architecture can also be easily extended to the selection of other preprocessing methods, e.g., noise-filter selection and imbalance-handling method selection.
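
The idea in the abstract above (concatenate several DCM groups into one meta-feature vector, then let an ensemble of meta-learners vote on which FSS algorithm to recommend) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy meta-dataset, the labels `FSS_A` and `FSS_B`, and the three simple base learners (1-NN, 3-NN, nearest centroid) are hypothetical and are not the learners or measures used in AutoFE-Sel.

```python
import math
from collections import Counter

def combine_groups(*groups):
    """Concatenate several DCM groups into one meta-feature vector."""
    return [value for group in groups for value in group]

def knn_predict(train, x, k):
    """Label chosen by the k nearest meta-examples (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def centroid_predict(train, x):
    """Label whose mean meta-feature vector is closest to x."""
    grouped = {}
    for feats, label in train:
        grouped.setdefault(label, []).append(feats)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

def ensemble_recommend(train, x):
    """Majority vote over three base meta-learners (ties go to the first voter)."""
    votes = [knn_predict(train, x, 1), knn_predict(train, x, 3),
             centroid_predict(train, x)]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical meta-dataset: each row is (DCM vector, best FSS algorithm).
# The first list stands in for one DCM group, the second for another.
train = [
    (combine_groups([0.1, 0.2], [0.9]), "FSS_A"),
    (combine_groups([0.2, 0.1], [0.8]), "FSS_A"),
    (combine_groups([0.9, 0.8], [0.1]), "FSS_B"),
    (combine_groups([0.8, 0.9], [0.2]), "FSS_B"),
]
query = combine_groups([0.15, 0.15], [0.85])
recommendation = ensemble_recommend(train, query)
print(recommendation)
```

In this toy setup all three base learners agree, so the vote is unanimous; with real meta-data the benefit of the ensemble would come from cases where the learners disagree.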

Detection of Coffee Bean Defects using Convolutional Neural Networks (Convolutional Neural Network를 이용한 불량원두 검출 시스템)

  • Kim, Ho-Joong;Cho, Tai-Hoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.316-319 / 2014
  • Interest in coffee is growing as the coffee market expands. As consumers' tastes become more refined, the quality of coffee beans is considered very important. Currently, bean defects are detected mainly by experienced specialists. This paper presents a system that detects bean defects using machine learning. The system concentrates on two main defect types: defects in bean shape and insect damage. It uses two convolutional neural networks: the first detects defects in the bean's shape, and the second detects insect damage. This system could serve as a starting point for automated detection of coffee bean defects. Further research is needed to detect other defect types.

Classification of Security Checklist Items based on Machine Learning to Manage Security Checklists Efficiently (보안 점검 목록을 효율적으로 관리하기 위한 머신러닝 기반의 보안 점검 항목 분류)

  • Hyun Kyung Park;Hyo Beom Ahn
    • Smart Media Journal / v.11 no.11 / pp.75-83 / 2022
  • NIST in the United States has developed SCAP, a protocol that enables automated inspection and management of security vulnerabilities using existing standards such as CVE and CPE. SCAP works by creating a checklist in the XCCDF and OVAL languages and running it with a SCAP tool, such as the SCAP Workbench made by OpenSCAP, which returns the check results. SCAP checklist files for various operating systems are shared through the NCP community, and each file includes the ID, title, description, and inspection method for each item. However, because the inspection items are simply listed in the order in which they were written, they need to be classified by type so that security managers can manage them systematically using the SCAP checklist file. In this study, we propose a method that extracts the description of each inspection item from a SCAP checklist file written in OVAL, classifies the items into categories with a machine learning model, and outputs the SCAP check results for each classified item.

Prediction of intensive care unit admission using machine learning in patients with odontogenic infection

  • Joo-Ha Yoon;Sung Min Park
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons / v.50 no.4 / pp.216-221 / 2024
  • Objectives: This study aimed to develop and validate a model to predict the need for intensive care unit (ICU) admission in patients with dental infections using an automated machine learning (ML) program called H2O-AutoML. Materials and Methods: Two models were created using only the information available at the initial examination. Model 1 was parameterized with only clinical symptoms and blood tests, excluding contrast-enhanced multi-detector computed tomography (MDCT) images available at the initial visit, whereas model 2 was created by adding the MDCT information to the model 1 parameters. Although model 2 was expected to be superior to model 1, we wanted to verify this independently. A total of 210 patients who visited the Department of Oral and Maxillofacial Surgery at the Dankook University Dental Hospital from March 2013 to August 2023 were included in this study. The patients' demographic characteristics (sex, age, and place of residence), systemic factors (hypertension, diabetes mellitus [DM], kidney disease, liver disease, heart disease, anticoagulation therapy, and osteoporosis), local factors (smoking status, site of infection, postoperative wound infection, dysphagia, odynophagia, and trismus), and factors known from initial blood tests were obtained from their medical charts and retrospectively reviewed. Results: The generalized linear model algorithm provided the best diagnostic accuracy, with an area under the receiver operating characteristic curve of 0.8289 in model 1 and 0.8415 in model 2. In both models, the C-reactive protein level was the most important variable, followed by DM. Conclusion: This study provides unprecedented data on the use of ML for successful prediction of ICU admission based on initial examination results. These findings will contribute considerably to the development of the field of dentistry, especially oral and maxillofacial surgery.
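
The results above are reported as the area under the receiver operating characteristic curve (AUC). As a reminder of what that number measures, the following minimal sketch computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The labels and scores are made-up illustrative values, not data from the study, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a positive case (e.g. ICU admission), 0 for a negative case.
    scores: predicted risk for each case, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative values only (not from the paper).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
auc = roc_auc(labels, scores)
print(auc)
```

Here 7 of the 9 positive/negative pairs are ranked correctly, so the AUC is 7/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.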

The Evaluation of a Plastic Material Classification System using Near Field IR (NIR) Spectrum and Decision Tree based Machine Learning (Near Field IR (NIR) 스펙트럼 및 결정 트리 기반 기계학습을 이용한 플라스틱 재질 분류 시스템)

  • Kook, Joongjin
    • Journal of the Semiconductor & Display Technology / v.21 no.3 / pp.92-97 / 2022
  • Plastics are classified into seven types for separation and recycling: PET (PETE), HDPE, PVC, LDPE, PP, PS, and Other. Recently, large corporations advocating ESG management have been replacing them with bioplastics. Incineration and landfilling of plastic waste cause air pollution and damage to the ecosystem. Because it is not easy to classify plastic materials accurately with the naked eye, studies on automated screening systems that use various sensor technologies and AI-based software have been conducted. This paper presents an NIR scanning device that exploits the NIR wavelength characteristics that differ for each plastic material, and a system that identifies the type of plastic by learning the NIR spectrum data collected through the device. The accuracy of plastic material identification was evaluated using a decision tree-based SVM model for multiclass classification on NIR spectral datasets for 8 types of plastic samples, including biodegradable plastic.

User Interface Application for Cancer Classification using Histopathology Images

  • Naeem, Tayyaba;Qamar, Shamweel;Park, Peom
    • Journal of the Korean Society of Systems Engineering / v.17 no.2 / pp.91-97 / 2021
  • A user interface for a cancer classification system is a software application with clinician-friendly tools and functions for diagnosing cancer from pathology images. Pathology has evolved from manual diagnosis to computer-aided diagnosis with the help of artificial intelligence tools and algorithms. In this paper, we explain each block of the project life cycle for implementing automated breast cancer classification software that uses AI and machine learning algorithms to classify normal and invasive breast histology images. The system was designed to help pathologists diagnose breast cancer automatically and efficiently. To design the classification model, Hematoxylin and Eosin (H&E) stained breast histology images were obtained from the ICIAR Breast Cancer challenge. These images were stain-normalized to minimize errors that can occur during model training due to pathological stains. The normalized dataset was fed into ResNet-34 to classify normal and invasive breast cancer images. ResNet-34 achieved 94% accuracy, a 93% F-score, 95% recall, and 91% precision.
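
The reported F-score can be cross-checked against the reported precision and recall. The abstract does not say which F-score variant was used; the sketch below assumes the common F1 (the harmonic mean of precision and recall), and the function name `f_beta` is a hypothetical helper.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives F1, the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Precision and recall as reported in the abstract (91% and 95%).
f1 = f_beta(0.91, 0.95)
print(round(f1, 2))  # 0.93
```

Under the F1 assumption, the three reported figures are mutually consistent.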

Combining Multiple Classifiers for Automatic Classification of Email Documents (전자우편 문서의 자동분류를 위한 다중 분류기 결합)

  • Lee, Jae-Haeng;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.29 no.3 / pp.192-201 / 2002
  • Automated text classification is considered an important method for managing and processing the huge and continuously increasing volume of documents in digital form. Recently, text classification has been addressed with machine learning techniques such as k-nearest neighbor, decision trees, support vector machines, and neural networks. However, few studies address real problems; most are evaluated on well-organized text corpora and therefore do not demonstrate practical usefulness. This paper proposes and analyzes text classification methods for a real application, the email document classification task. First, we propose a method for combining multiple neural networks that improves performance through combination by maximum and by neural networks. Second, we present another strategy that combines multiple machine learning classifiers: voting, Borda count, and neural networks improve the overall classification performance. Experimental results show the usefulness of the proposed methods for a real application domain, yielding precision rates above 90%.
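
Two of the combination schemes named above, voting and Borda count, can be sketched in a few lines. This is a generic illustration of the two rules, not the authors' implementation; the category names and classifier outputs are hypothetical.

```python
from collections import Counter

def plurality_vote(predictions):
    """Each classifier casts one vote for its top label; most votes wins.

    Ties are broken in favour of the label that was voted for first.
    """
    return Counter(predictions).most_common(1)[0][0]

def borda_count(rankings):
    """Each classifier ranks all labels (best first).

    A label in position i of a ranking of n labels earns n - 1 - i points;
    the label with the highest total wins.
    """
    scores = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    return max(scores, key=scores.get)

# Hypothetical rankings from three classifiers for one email.
rankings = [
    ["work", "personal", "spam"],
    ["personal", "work", "spam"],
    ["spam", "personal", "work"],
]
top_choices = [ranking[0] for ranking in rankings]
vote_winner = plurality_vote(top_choices)
borda_winner = borda_count(rankings)
print(vote_winner, borda_winner)
```

In this example the top choices are a three-way tie, so plain voting falls back to its tie-break, while Borda count uses the full rankings and selects `personal` (4 points against 3 for `work` and 2 for `spam`).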

Learning Tagging Ontology from Large Tagging Data (대규모 태깅 데이터를 이용한 태깅 온톨로지 학습)

  • Kang, Sin-Jae
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.2 / pp.157-162 / 2008
  • This paper presents a method for learning a tagging ontology from large tagging data such as a folksonomy, a classification structure created informally by users. There is no common agreement about the semantics of a tagging, and most social web sites internally use different methods to represent tagging information, which obstructs interoperability between sites and automated processing by software agents. To solve this problem, we need a tagging ontology defined by analyzing the intrinsic attributes of a tagging. By applying several machine learning techniques to tagging data, tag groups and similar user groups are extracted and then used to learn the tagging ontology. A recommender system that adopts the tagging ontology is also suggested as an application.

Construction Scheme of Training Data using Automated Exploring of Boundary Categories (경계범주 자동탐색에 의한 확장된 학습체계 구성방법)

  • Choi, Yun-Jeong;Jee, Jeong-Gyu;Park, Seung-Soo
    • The KIPS Transactions:PartB / v.16B no.6 / pp.479-488 / 2009
  • This paper presents a reinforced scheme for constructing training data that improves text classification through automatic search for boundary categories. Documents that lie in a boundary area are usually misclassified because they contain multiple topics and features, and this is the main factor on which we focus. We propose a methodology, based on previous research, for automatically exploring the optimal boundary category. We treat the boundary area among target categories as a new category that requires training, and it is then added semantically to the target categories. In experiments, we applied our method to complex documents while intentionally introducing errors into the training process. The experimental results show that our system achieves high accuracy and reliability in a noisy environment.

Object Detection of AGV in Manufacturing Plants using Deep Learning (딥러닝 기반 제조 공장 내 AGV 객체 인식에 대한 연구)

  • Lee, Gil-Won;Lee, Hwally;Cheong, Hee-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.1 / pp.36-43 / 2021
  • This research investigated the accuracy of the YOLO v3 algorithm for object detection during AGV (Automated Guided Vehicle) operation. First, an AGV equipped with 2D LiDAR and a stereo camera was prepared. The AGV was driven along a route scanned with SLAM (Simultaneous Localization and Mapping) using the 2D LiDAR, while objects in front of it were detected through the stereo camera. To evaluate the accuracy of the YOLO v3 algorithm, its recall, AP (Average Precision), and mAP (mean Average Precision) were measured according to the degree of training. Experimental results show that mAP, precision, and recall improve by 10%, 6.8%, and 16.4%, respectively, when YOLO v3 is fitted with 4,000 training samples and 500 test samples collected through online search and is then additionally trained with 1,200 samples collected from the stereo camera on the AGV.
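
AP and mAP, the metrics used above, can be illustrated with a small sketch. The abstract does not state which AP variant was used; this example assumes the simple non-interpolated form, in which AP is the sum of the precision values at each true-positive detection divided by the number of ground-truth objects. The class names and detection flags are hypothetical, and matching detections to ground truth (for example by an IoU threshold) is assumed to have been done already.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Non-interpolated AP for one class.

    is_true_positive: booleans for detections sorted by descending confidence.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, tp in enumerate(is_true_positive, start=1):
        if tp:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / num_ground_truth

def mean_average_precision(per_class):
    """mAP: the mean of the per-class AP values."""
    aps = [average_precision(flags, n) for flags, n in per_class.values()]
    return sum(aps) / len(aps)

# Hypothetical detections: (true-positive flags by rank, ground-truth count).
per_class = {
    "class_a": ([True, False, True], 2),
    "class_b": ([True, True], 2),
}
ap_a = average_precision(*per_class["class_a"])
map_value = mean_average_precision(per_class)
print(ap_a, map_value)
```

For `class_a` the two true positives occur at ranks 1 and 3, giving precisions of 1 and 2/3 and an AP of 5/6; `class_b` has an AP of 1, so the mAP is 11/12.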